UNIVERSITY OF GHANA COLLEGE OF BASIC AND APPLIED SCIENCES GENERALISED BERNOULLI MODEL FOR CORRELATED BINARY RESPONSES: APPLICATION TO THE NATIONAL INCOME DYNAMICS STUDY (NIDS) DATASETS BY MAWUTOR FLEKU (ID. NO. 10071460) THIS THESIS IS SUBMITTED TO THE UNIVERSITY OF GHANA, LEGON IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD OF MPHIL STATISTICS DEGREE DEPARTMENT OF STATISTICS JUNE, 2016 University of Ghana http://ugspace.ug.edu.gh i ABSTRACT The bivariate Bernoulli model was used to estimate covariate parameters for conditional as well as marginal models for the NIDs datasets. This is a follow- up research on one conducted by Islam et al. (2012) which introduced the use of the bivariate Bernoulli model to properly specify the dependence among bivariate binary responses. The covariate parameters were estimated by first expressing the proposed model in the exponential family form, finding the log-likelihood function and then the corresponding estimating equations. The Newton Rahpson and the Nelder Mead methods of iteration were used to estimate the covariate parameters. The research revealed that the bivariate Bernoulli model fitted bivariate binary response data significantly better than the conditional logistic and the Generalized Estimating Equation (GEE) logistic models-marginal model. The result was same for both artificial and real-life data. It is worth mentioning that to aim at more efficient covariate estimates, the bivariate Bernoulli model is highly recommended for bivariate binary response data. However, further research is needed to choose an initial value when using the Newton Rahpson and the Nelder Mead methods of iteration, because it posed a serious challenge in this study. Again further research is recommended to probably use the multivariate Bernoulli distribution to fit multivariate binary response data and estimate covariate coefficients. University of Ghana http://ugspace.ug.edu.gh ii DECLARATION Candidate’s Declaration I hereby declare that this submission is my work in partial fulfilment of MPhil and that, to the best of my knowledge, it contains no material previously published by another person nor material which has been accepted for the award of any other degree of the University, except where due acknowledgement has been made in the text. MAWUTOR FLEKU (10071460) SIGNATURE DATE Certified by: DR. ANANI LOTSI (SUPERVISOR) SIGNATURE DATE Certified by: DR. KWABENA DOKU-AMPONSAH (SUPERVISOR) SIGNATURE DATE University of Ghana http://ugspace.ug.edu.gh iii ACKNOWLEDGEMENTS I would like to express my profound gratitude to Dr. Nortey, who helped me to settle on the thesis topic and provided valuable corrections after reading through the final work. I would also like to thank my supervisors; Dr. Anani Lotsi and Dr. Kwabena Doku-Amponsah who provided guidance and explanation whenever I got confused at various stages of this project. Additional, I am very grateful for the encouragement they offered when I felt like giving up. Special thanks to the Acting Commissioner of the Commission of Human Rights and Administrative Justice (CHRAJ), Mr. Richard Quayson, who granted me two-year study leave. Finally, I thank my parents for their support and concern throughout the period of my study. University of Ghana http://ugspace.ug.edu.gh iv TABLE OF CONTENTS ABSTRACT ..................................................................................................................................... i DECLARATION ............................................................................................................................ ii ACKNOWLEDGEMENTS ........................................................................................................... iii LIST OF TABLES ........................................................................................................................ vii LIST OF ABBREVIATIONS ...................................................................................................... viii CHAPTER ONE: INTRODUCTION ............................................................................................. 1 1.1 Preliminary remarks .............................................................................................................. 1 1.2 Statement of problem ............................................................................................................ 2 1.3 Objectives of the study .......................................................................................................... 2 1.4 Relevance of the study .......................................................................................................... 3 1.5 Scope of the study ................................................................................................................. 3 1.6 Sources of data ...................................................................................................................... 3 1.7 Limitation of the study .......................................................................................................... 4 1.8 Approach ............................................................................................................................... 4 1.8.1 Simulation ....................................................................................................................... 4 1.8.2 Application to real life data: ......................................................................................... 10 1.9 Definition of some relevant terms ....................................................................................... 13 CHAPTER TWO:LITERATURE REVIEW ................................................................................ 15 2.1 Background ......................................................................................................................... 15 2.2 Marginal models .................................................................................................................. 16 2.2.1 Response probabilities and correlations ....................................................................... 16 2.2.2 Marginal odd ratios ....................................................................................................... 17 2.2.3 Markov chain ................................................................................................................ 18 2.2.4 Quadratic exponential model ........................................................................................ 18 2.2.5 Generalized multivariate logistic model ....................................................................... 19 2.2.6 Multivariate plackett distribution ................................................................................. 19 2.2.7 Bivariate or multivariate Bernoulli approaches ............................................................ 19 2.3 Conditional models ............................................................................................................. 20 2.4 Measuring dependence ........................................................................................................ 21 University of Ghana http://ugspace.ug.edu.gh v 2.5 Joint model .......................................................................................................................... 22 2.6 Methodology ....................................................................................................................... 22 2.6.1 Simulation ..................................................................................................................... 22 2.6.2 Model checking ............................................................................................................ 30 CHAPTER THREE:THE PROPOSED MODEL ......................................................................... 32 3.1 The Bivariate Bernoulli model ............................................................................................ 32 3.2 Comparison with an alternative test .................................................................................... 42 3.3 Test for the model ............................................................................................................... 43 CHAPTER FOUR:SIMULATION ............................................................................................... 45 4.1 Introduction ......................................................................................................................... 45 4.2 Exploratory analyses ........................................................................................................... 46 4.2.1 Odds ratio ..................................................................................................................... 46 4.2.2 Marshall–Olkin correlation ........................................................................................... 47 4.2.3 3Dependence value based on Ș ..................................................................................... 47 4.3 Regressive model ................................................................................................................ 48 4.4 Conditional ( 1Y = 0 ) ........................................................................................................... 49 4.5 Conditional 1(Y = 1) ........................................................................................................... 50 4.6 The Marginal model ............................................................................................................ 51 4.7 Bivariate Bernoulli model ................................................................................................... 52 4.8 Overall fit of the proposed model ....................................................................................... 53 CHAPTER FIVE:APPLICATION TO REAL -LIFE DATA ...................................................... 54 5.1 Introduction ......................................................................................................................... 54 5.2 Odds ratio ............................................................................................................................ 54 5.3 Marshall–Olkin correlation ................................................................................................. 55 5.4 3Dependence value based on Ș ............................................................................................. 55 5.5 Traditional models............................................................................................................... 56 5.5.1 Marginal model-GEE ................................................................................................... 56 5.5.2 Conditional model 1(Y = 1) .......................................................................................... 58 5.5.3 Conditional model 1(Y = 0) .......................................................................................... 59 University of Ghana http://ugspace.ug.edu.gh vi 5.6 Generalized bivariate Bernoulli model ............................................................................... 59 5.6.1 Marginal........................................................................................................................ 59 5.6.2 Conditional model 1(Y = 0) ....................................................................................... 60 5.6.3 Conditional model 1(Y = 1) .......................................................................................... 60 5.7 Overall fit of the model ....................................................................................................... 61 CHAPTER SIX:RESULTS AND DISCUSSION ........................................................................ 63 6.1 Introduction ......................................................................................................................... 63 6.2 Correlation ........................................................................................................................... 63 6.3 Dependence ......................................................................................................................... 63 6.4 Investigation of the traditional models ................................................................................ 64 6.5 The Bivariate Bernoulli model ............................................................................................ 64 6.5.1 Deriving parameter estimates ....................................................................................... 65 6.5.2 Parameter estimates ...................................................................................................... 65 6.6 Comparison of results.......................................................................................................... 65 6.7 Structure of models ............................................................................................................. 66 6.8 Missing values ..................................................................................................................... 66 CHAPTER SEVEN:CONCLUSION AND RECOMMENDATIONS ........................................ 67 7.1 Conclusion ........................................................................................................................... 67 7.2 Recommendations ............................................................................................................... 67 REFERENCES ............................................................................................................................. 69 APPENDIX A: Simulation tables ................................................................................................. 72 APPENDIX B: Codes used in data analysis ................................................................................. 89 B1:Codes used for simulation (Sample size 50 only) .............................................................. 89 B2:Codes used for real-life-application .................................................................................. 118 University of Ghana http://ugspace.ug.edu.gh vii LIST OF TABLES Table 1.1a: Summary of test statistics generated: Exploratory 7 Table 1.1b: Summary of test statistics generated: Regression model 8 Table 1.1c: Summary of test statistics generated: Conditional model 9 Table 1.1d: Summary of test statistics generated: Marginal Model 9 Table 1.1e: Summary of test statistics generated: Bivariate Bernoulli Model 10 Table 1.2: Transition counts and respective probabilities on the employment status for the two periods 11 Table 1.3: Summary table for testing the usefulness of the models 12 Table 3.1: 2 x 2 contingency table for the bivariate Bernoulli distribution 33 Table 5.1: Transition counts and respective probabilities on the employment status for the two periods 54 Table 5.2a: Fitted models using data from NIDS: Traditional models 55 Table 5.2b: Fitted models using data from NIDS: Generalized bivariate Bernoulli model 56 Table 5.3: Comparison of results: probability of a covariate being employed at the time of second visit 61 Table A1a: Results of estimated models from 1000 simulations with sample of size of 50: Exploratory statistics 72 Table A1b: Results of estimated models from 1000 simulations with sample of size of 50: Regression model 73 Table A1c: Results of estimated models from 1000 simulations with sample of size of 50: Conditional models 73 Table A1d: Results of estimated models from 1000 simulations with sample of size of 50: Marginal model 74 Table A1e: Results of estimated models from 1000 simulations with sample of size of 50: Bivariate Bernoulli model 74 Table A2a: Results of estimated models from 1000 simulations with sample of size of 100: Exploratory statistics 75 Table A2b: Results of estimated models from 1000 simulations with sample of size of 100: Regression model 76 Table A2c : Results of estimated models from 1000 simulations with sample of size of 100: Conditional models 76 Table A2d: Results of estimated models from 1000 simulations with sample of size of 100: Marginal model 77 Table A2e: Results of estimated models from 1000 simulations with sample of size of 100: Bivariate Bernoulli model 77 University of Ghana http://ugspace.ug.edu.gh viii Table A3a: Results of estimated models from 1000 simulations with sample of size of 200: Exploratory statistics 78 Table A3b: Results of estimated models from 1000 simulations with sample of size of 200: Regressive model 79 Table A3c: Results of estimated models from 1000 simulations with sample of size of 200- Conditional model 80 Table A3d: Results of estimated models from 1000 simulations with sample of size of 200: Marginal model 80 Table A3e: Results of estimated models from 1000 simulations with sample of size of 200: Bivariate Bernoulli model 81 Table A4a: Results of estimated models from 1000 simulations with sample of size of 500: Exploratory statistics 82 Table A4b: Results of estimated models from 1000 simulations with sample of size of 500: Regressive model 83 Table A4c: Results of estimated models from 1000 simulations with sample of size of 500: Conditional models 83 Table A4d: Results of estimated models from 1000 simulations with sample of size of 500: Marginal model 84 Table A4e: Results of estimated models from 1000 simulations with sample of size of 500: Bivariate Bernoulli model 84 Table A5a: Results of estimated models from 1000 simulations with sample of size of 1000: Exploratory statistics 85 Table A5b: Results of estimated models from 1000 simulations with sample of size of 1000: Regressive model 86 Table A5c: Results of estimated models from 1000 simulations with sample of size of 1000: Conditional models 86 Table A5d: Results of estimated models from 1000 simulations with sample of size of 1000: Marginal model 87 Table A5e: Results of estimated models from 1000 simulations with sample of size of 1000: Bivariate Bernoulli model 87 University of Ghana http://ugspace.ug.edu.gh ix LIST OF ABBREVIATIONS GEE Generalized Estimating Equations NIDS National Income Dynamics Study University of Ghana http://ugspace.ug.edu.gh 1 CHAPTER ONE INTRODUCTION 1.1 Preliminary remarks In longitudinal studies, outcomes are normally taken from the same subjects over a period of time. In such situations, there is likely to be correlation between the outcomes. For example, in pre and post tests, outcomes are initially taken from subjects treatment is then given to the subjects and outcomes taken thereafter from the same subjects. In such cases, there is the likelihood that the initial outcomes and post outcomes will be correlated. Similarly, in tests where records are taken over a period of time, such as examining the blood pressure level of patients, these records are likely to be correlated. The possibility of correlations between repeated outcomes needs to be taken into account when analyzing such data. Using standard statistical models and assuming independence for correlated responses may lead to misleading results, particularly in the estimation of regression parameters. One therefore needs a statistical model that takes the dependence in response variables into consideration. To address this problem, most studies have employed either marginal or conditional models. Many studies have employed the use of marginal models such as Liang and Zeger (1986) and Molenberghs and Leasaffre (1994). A few however employed the use of conditional models, notable among them are Bonney (1986) and Bonney (1987). However, in all the above studies, it is not easy to precisely specify the measures of dependence in response variables. Islam, Chowdhury, and Briollais (2012), proposed a joint modeling approach for bivariate binary response employing both the marginal and conditional models where University of Ghana http://ugspace.ug.edu.gh 2 the dependence in outcome variables can be measured and tested using a link function of the models. The usefulness of their proposed model was investigated thoroughly by using extensive simulation study and real-life-data. 1.2 Statement of problem Marginal and conditional models proposed by various authors to tackle the problem of dependence between outcomes may fail to provide efficient population parameter estimates, because they do not specify the dependence of binary outcomes in the model, hence the introduction of joint modeling by Islam et al. (2012), for bivariate binary response using both the marginal and conditional models. However, Islam et al. (2012) did not investigate the performance of the regression coefficients and the model as a whole. In this thesis, the usefulness of their proposed model as well as the efficiency of the regression coefficient estimates was investigated by using an extensive simulation study and an application to real life data. 1.3 Objectives of the study The overall objective of this study is to address dependence in binary outcomes, by investigating the usefulness of the proposed model by finding out whether the model fits bivariate binary response data significantly better. Specifically, the study wishes to • unify the marginal and conditional probabilities • test for dependence in outcomes and • test properties of the estimates of regression coefficients • test the overall fit of the model University of Ghana http://ugspace.ug.edu.gh 3 1.4 Relevance of the study In various fields such as epidemiology, public health, economics and anthropology, studies involving repeated measures are carried out. Most clinical trials employ the use of repeated measures. Results from such studies need to have very little percentage of error. A case in point is clinical trials for testing the efficacy of drugs for the Ebola and Zika diseases. Using a statistical model that significantly fits available data better than existing models enhances the accuracy of results. The hope of the researcher therefore, is to affirm the use of the Bivariate Bernoulli Model for any analyses involving bivariate binary response outcomes. 1.5 Scope of the study The thesis made reference to only one stage of repeated study. Specifically, populations included in this study only consisted of “Two Waves”, that is a study on the same population at only two time points. Only binary response outcomes were taken into account. For instance, the presence or absence of a trait in a subject under study. Finally, the effect of any clustering was ignored. 1.6 Sources of data Two sources of data were employed in this study, namely artificial and real-life data: 1.6.1 Artificial data In pursuance of investigating the usefulness of the proposed model, we first make use of artificial correlated data. Correlated binary data were generated from a package called bindata on the R- programming language platform. University of Ghana http://ugspace.ug.edu.gh 4 1.6.2 Real-life data Data from the National Income Dynamics Study (NIDS), a study on the personal records of 28,000 South Africans was used to investigate the usefulness of the proposed model. NIDS is a programme to compile comprehensive longitudinal information on persons selected for the study and to find out who is moving ahead and who is falling behind. This data is also key for research and policy makers. The first study was conducted in 2008, and the data was compiled into the Wave1dataset. . The second, Wave 2 dataset was compiled after the second visit was made to the same group of people between 2010 -2011. 1.7 Limitations of the study The limitations of the study are as follows:  The choice of rho (  ) to generate the simulated binary data was restricted. For example the package did not allow the choice of high values of  such as 0.9 and above. During the simulation exercise, especially where the sample size was 1000, the execution time run for more than 15 minutes and for some cases the process had to be truncated.  Estimating the parameters for the proposed model was very challenging. The initial choice of parameters to aid the running of the estimated parameter through the use of Newton-Raphson and the Nelder-Mead methods of iteration was time consuming. 1.8 Approach Two approaches were employed: first, simulation and then application to real life data. 1.8.1 Simulation According to Kaiser et al. (2011), a typical strategy for testing the usefulness of a statistical model is the utilization of manufactured data, since, the dataset will have the desired set of properties. University of Ghana http://ugspace.ug.edu.gh 5 The models to be tested are fitted to the manufactured dataset and then checked for the presence or otherwise of these effects. The fitted models could also be checked to find out how they behave under different experimental conditions. In this study, simulation was undertaken to investigate the usefulness of the bivariate Bernoulli model. Correlated binary data were generated for simulations, using a technique suggested by Leisch et al. (1990, cited in Islam et al 2012, p852) known as bindata package for R. Leisch et al. (1990) came up with two ways of generating binary random variates namely (i) The inversion method and (ii) Direct Conversion of Real-Valued Random Variates; (see literature review) However, the Direct Conversion of Real-Valued Random Variates approach, provides a faster approach for generating binary vector samples. Accordingly, data was initially generated from multivariate normal random variates and thereafter converted into binary data. Three variables were simulated; two dependent response variables 1Y and 2Y and one covariate, X . Different correlation combinations between the two response variables and pairwise correlations, that is their relationship with the covariate, X were considered. Simulations were performed 1000 times with different sample sizes ( 20, 50 100, 500 and 1000). In generating the required data for simulation, this study used the following inputs: i) Marginal probabilities for 1Y , 2Y and X , ii)  , this describes the correlation between the response variables. Specifically, five different sets of models were generated using different marginal probability combinations for 1Y and 2Y as follows ( it is worth stating that the function rmvbin was used in generating correlated binary data. It required the user to provide the marginal probabilities whereas University of Ghana http://ugspace.ug.edu.gh 6 the function estimates its own pairwise probabilities, hence only marginal probabilities were provided): (i) Low marginal probabilities 1 2( 0.1, 0.1)P P  with 0.2  (ii) Average marginal probabilities 1 2( 0.5, 0.5)P P  with 0.3   (iii) A high and a low marginal probability 1 2( 0.1, 0.8)P P  with 0.1  (iv) An average and a low marginal probability 1 2( 0.5, 0.3)P P  with 0.3  (v) Above average and below average marginal probabilities  1 20.2, 0.6P P  with 0  In other words, the study came up with five (5) different tables with sample sizes 20, 50, 100, 500 and 1000 for null, positive and negative correlations between the bivariate responses ( 1Y , 2Y ) as stated above. The true pairwise correlations between ( 1Y , X) and ( 2Y , X) were also shown for each model.As the dependence in outcome variables depends on the association between outcomes and covariates too, the study also examined the conditional correlations between 2Y and X for given 1 1Y y The following was extracted to form one complete table: (i) the average observed correlations using the Marshall–Olkin method : The Marshall- Olkin estimator for correlation for the bivariate Bernoulli variates 1Y and 2Y is 11 00 10 01 0 1 0 1 ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆMO P P P P P P P P       , where 1 2( , )P Y u Y v  = uvP , 0,1; 0,1u v  and uP  or vP are the marginal probabilities, (ii) the average of the estimated odds ratios given by 11 00 10 01 P P P P , University of Ghana http://ugspace.ug.edu.gh 7 (iii) the average estimates of the parameters from the models, (iv) the average values of the proposed tests and total number of instances where p < 0.05, (v) the average values of the regressive and conditional models approach and corresponding number of instances where p < 0.05, (vi) a check of the overall fit of the bivariate Bernoulli model. The likelihood ratio test was employed. The log-likelihood of the regressive, conditional and marginal models were estimated and compared to the log-likelihood of the proposed test. The formula for the log-likelihood test is given by : LR = -2In(log.likelihood(restrictive.model/ proposed.model) 2(log.likelihood(proposed.model) - log.likelihood(restrictive.model) It is worth mentioning that, data for the simulation exercise is multivariate; the relationship between 1Y and 2Y , 1Y and X as well as 2Y and X were investigated. Also, there is alternative real life data which will result in similar conclusions, when analysed the same way. Tables 1.1a to 1.1e summarize test statistics generated. Table 1.1a: Summary of test statistics generated: Exploratory Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P   1 2Avg.Cor ,Y Y  sd in  1Avg.Cor ,Y X  sd in  2Avg.Cor ,Y X  sd in Avg.Odds Ratio  1 2 ,Y Y University of Ghana http://ugspace.ug.edu.gh 8 Marshall – Olkin correlation  1 2 ,Y Y Marshall – Olkin correlation  2 1, / 0Y X Y  Marshall –Olkin correlation 2 1( , / 1)Y X Y  Marshall – Olkin correlation  1 2/ 1Y X Y , Marshall –Olkin correlation  1 2, / 0Y X Y  Dependence value based on 3 Av means average values Table 1.1b: Summary of test statistics generated: Regression model Regressive model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  . 1Av  % 0.05of p  .Av  % 0.05of p  2.Av  : Wald test % 0.05of p  Av means average values University of Ghana http://ugspace.ug.edu.gh 9 Table 1.1c: Summary of test statistics generated: Conditional models Conditional Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  Conditional 1 0Y  01  % 0.05of p  :Wald test % 0.05of p  Conditional 1 1Y  11  % 0.05of p  :Wald test % 0.05of p  Av means average values Table 1.1d: Summary of test statistics generated: Marginal model Marginal model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  . 1 ( . )Av s e :Wald test % 0.05of p  Av means average values University of Ghana http://ugspace.ug.edu.gh 10 Table 1.1e: Summary of test statistics generated: Bivariate Bernoulli model Bivariate Bernoulli model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  11  % 0.05of p  01  % 0.05of p  . 1VA  % 0.05of p  Likelihood ratio test (proposed model and regressive model) % 0.05of p  Likelihood ratio test (proposed model and conditional 1 0Y  ) % 0.05of p  Likelihood ratio test (proposed model and conditional 1 1Y  ) % 0.05of p  Likelihood ratio test (proposed model and marginal model ) % 0.05of p  Av means average values 1.8.2 Application to real life data This study illustrated the application to real life data by using data from the National Income Dynamics Study (NIDS), a panel study conducted in South Africa. The panel study takes the personal records of 28,000 South Africans starting in 2008 to answer policy and research questions. The data helps to put the spotlight on who is getting ahead and who is falling behind and what factors might be contributing to their state. The NIDS data constitute areas such as health, University of Ghana http://ugspace.ug.edu.gh 11 education, labour market and birth history. Two periods were chosen for this study: 2008 representing Wave 1 and 2011 representing Wave 2. This study focused on the employment status of persons selected for the exercise. The binary response variable measured therefore was whether an individual was employed or not at the time of visit. Specifically, (Not employed=0, Employed=1) Covariates chosen for this exercise include i) Sex of respondent ( Male= 0, Female = 1) ii) Ever been to school (No = 0,Yes = 1 ) and iii) Age (17-25 years = 0, 26-51years = 1) The covariates chosen were tested to find out whether there is significant impact for conditional, marginal models and joint models. Extracts done included finding out the number of persons who had i. not been employed for the two periods, ii. not been employed in the first period but employed in the second period , iii. been employed in the first period but not in the second, iv. been employed for the two periods, with their respective probabilities as shown in Table 1.2 Table 1.2: Transition counts and respective probabilities on the employment status for the two periods Wave 2 Transition count Transition probability Wave 1 0 1 Total 0 1 Total 0 No.(0,0) No.(0,1) Prob.(0,0) Prob.(0,1) 1 No.(1,0) No.(1,1) Prob.(1,0) Prob.(1,1) No. means number and Prob. means probability University of Ghana http://ugspace.ug.edu.gh 12 Additionally, the study i. found out whether there exist dependence between the response variables for the two periods; ii. found the estimated odds ratio for the employment status in Wave 1 and Wave 2. This was done to be find out the proportion of the concordant and the discordant responses; iii. found the Marshall-Olkin correlation coefficient in order to find out whether there is correlation between the two time points; iv. found out whether the covariates show significance for the traditional conditional, marginal models as well the proposed joint model; v. Tested the overall fit of the proposed model using the likelihood ratio test. Table 1.3 provides the summary for testing the usefulness of the models Table 1.3: Summary table for testing the usefulness of the models Traditional models Conditional Marginal Model 01 j Model 11 j Model 1 j -GEE Covariates 01 j se p-value 11 j se p-value 1 j se p-value Constant Age Education Sex Log- likelihood University of Ghana http://ugspace.ug.edu.gh 13 Generalized Bernoulli model Model 01 j Model 11 j Model 1 j Covariates 01 j se p-value 11 j se p-value 1 j se p-value Age Education Sex Log- likelihood Likelihood ratio test ( proposed model vrs traditional methods 1.9 Definition of some relevant terms Two definitions are provided; first the major difference between marginal and conditional models and second, longitudinal studies. 1.9.1 Marginal and conditional models The major difference between marginal and conditional models depends on whether the regression coefficients are intended to describe the marginal response or an individual’s response to changing covariates, that is, one that does not attempt to control for unobserved subjects’ random effects. For example, a marginal gender contrast compares the mean among men to that among women, while a conditional gender contrast compares the mean among men to that among women holding the same value of a random effect (a particular value that corresponds to each individual). Consider the following normal models: First the conditional model ( / )ij i ij iE Y X    University of Ghana http://ugspace.ug.edu.gh 14 and then the marginal model ( )ij ijE Y X  Lee & Nelder (2004). 1.9.2 Longitudinal studies Longitudinal studies “are studies in which the outcome variable is repeatedly measured on two or more occasions over time” (Wilson & Lorenz, 2015). University of Ghana http://ugspace.ug.edu.gh 15 CHAPTER TWO LITERATURE REVIEW 2.1 Background The generalized linear models by McCullagh and Nelder (1989) include special cases such as logit models, probit models and multinomial response models for quantal responses. On the subject matter of models for binary responses, the writers advocated that, to investigate the relationship between the response probability and the vector of covariates, it is expedient to come up with a model that can describe the effect on the response probability as a result of changes in the vector of covariates. They added that, in practice this formal model usually embodies assumptions such as correlation or independence, additivity and so on. They emphasized that the assumptions cannot be taken for granted and should be checked if possible. Furthermore, the behavior of the model as far as possible should be consistent with known mathematical, physical, or biological laws, especially in its limiting behavior. When we have a study where each subject to be investigated has only one observation , generalized linear models can be used, however with repeated observations, possible correlation among the observations must be checked before further analysis is done. Liang and Zeger (1986) introduced an extension of general linear models to the analysis of longitudinal data .They introduced a class of estimating equations that give regression parameters that have consistent estimates. However, one objective of statistical analysis is to account for the correlation of repeated outcomes for subjects in order to arrive at the right conclusion when dealing with repeated outcome observations and a set of covariates. Zeger and Liang (1986) proposed a methodology for discrete and continuous longitudinal data that employs the quasi-likelihood approach because of the scarcity of multivariate distributions for non-normal distributed data. They University of Ghana http://ugspace.ug.edu.gh 16 specified that there should be a known function of the marginal expectation of the response variable that exhibits a linear function of the explanatory variables. They also assumed that the variance is a known function of the mean. In addition, they specify a “working correlation matrix for observations for each subject”. The above set up leads to generalized estimating equations which produces consistent estimators of the regression coefficients and of their variances. For example, with binary outcomes it is assummed that the logit of the probability of response p depends linearly on the covariates, the variance is just (1 )p p .This partial specification of the outcome distribution results in simple techniques for regression analyses of Gaussian, Gamma, Poisson, binomial and ordinal data (McCullagh and Nelder, 1983, as cited in Zeger and Liang, 1986). 2.2 Marginal models 2.2.1 Response probabilities and correlations The Generalized Estimating Equations (GEEs) are a widely used method for estimating the covariate coefficients of marginal models for repeated binary responses. In particular, Liang and Zeger (1986) and Prentice (1988) have introduced moment based GEEs, which only required a specification of the form of the first two moments (the probabilities of success and their correlations) of the vector of binary responses (Fitzmaurice, & Lipsitz, (1995). The GEE models are built on probability of an event of interest and correlations as espoused by Liang and Zeger (1986) and Prentice (1988). Specifically, when regression is the primary focus, Liang and Zeger (1986) presented an extension of generalized linear models to the study of longitudinal data. Their approach employed a generalized linear model for the marginal distribution of ity , where ( 1, ....., )iity t n They did not University of Ghana http://ugspace.ug.edu.gh 17 indicate any form of joint distribution for the repeated measures. However, they introduced estimating equations that resulted in consistent estimates of the regression parameters and of their variances .They modeled the marginal distribution, but did not consider the conditional distribution which takes into account previous observations. The methods they proposed reduced to maximum likelihood when the ity are multivariate Gaussian. Some have argued that binary response models that condition on all or some of binary responses variables are useful for studying only particular types of dependencies. To estimate marginal response probabilities, the conditioning is not needed. In cases where the analysis involves paired data, parametric methods to these data appear not to be complicated, otherwise it is complicated. Hence, Prentice (1988) advocated a generalized estimating equation approach for inference on response probabilities and correlations. 2.2.2 Marginal odd ratios Instead of using marginal correlations in modelling the association between a pair of binary responses ,Lipsitz et al. (1991), Liang et al. (1992) and Carey et al. (1993) proposed the use of a marginal odds ratio. To them, the odds ratio is a natural metric for measuring association and may also be easier to interpret, when dealing with binary responses. Fitzmaurice, & Lipsitz, (1995) also proposed a serial pattern models for the odd ratio. An attractive feature of these models is that they allow the repeated responses for each individual to be unevenly spaced in time and the times of measurement to vary between individuals. Also, the number of observations per individual may vary. University of Ghana http://ugspace.ug.edu.gh 18 2.2.3 Markov chains Azzalini (1994) proposed a marginal model which is founded on the binary Markov chain for a solitary stationary process. He proposed a stochastic model for the “study of the influence of time- dependent covariates on the marginal distribution of the binary response in serially correlated binary data. Markov chains were expressed in terms of transitional rather than marginal probabilities”. 2.2.4 Quadratic exponential model Cox (1972), Zhao and Prentice (1990) and Cox and Wermuth (1994) proposed different models using the quadratic exponential model approach. In suggesting likely distributions for multivariate binary data, Cox (1972) proposed that logistic models, in which we have the logarithm for the joint response probability to be “a quadratic function of the response vector are probably the simplest”. He however, proposed that, some conditional distributions from this set of models have a predominantly simple form, whereas marginal probabilities do not. Zhao and Prentice (1990), went ahead to suggest a re-parameterized version of a probability distribution of this quadratic exponential form done in terms of marginal parameters and ways of estimating such parameters. Cox and Wermuth (1994) on the other hand, proposed the joint distribution of k binary variables “in the quadratic exponential form containing only main effects and two-factor interactions in the log probabilities. Approximate versions of marginalized forms of the distribution were studied based on Taylor expansion”. University of Ghana http://ugspace.ug.edu.gh 19 2.2.5 Generalized multivariate logistic model In another attempt, Glonek and McCullagh (1995) and Glonek (1996) proposed a generalized multivariate logistic model. Glonek and McCullagh (1995) compared the multivariate logistic regression models to that of the log-linear models, and argued that it seems the log-linear approach had gained acceptance largely because of its convenient theoretical properties and the relative ease with which models can be stated and its maximum likelihood found. To restore the balance they defined the class of multivariate logistic regression models in a way that leads to a computational scheme that is feasible for problems of moderate size. Again, Glonek (1996) on the other hand, discussed a set of “link functions that lie between the two extremes of the multivariate logistic transform” established by McCullagh and Nelder (1989) and the log-linear contingency table analysis breakdown. Models resulting from the link functions established from Glonek (1996) showed various desired properties of both the log-linear regression model and the multivariate logistic regression models. The computation for implementing these models were more manageable than that for the multivariate logistic regression models. 2.2.6 Multivariate plackett distribution Employing a multivariate Plackett distribution for correlated data, Molenberghs and Lesaffre (1994) developed a marginal model. They used three link functions for association and marginal parameters. 2.2.7 Bivariate or multivariate Bernoulli approaches A few studies on accounts of bivariate or multivariate Bernoulli approaches such as Lin and Clayton (2005), Lovison (2006) and Sun et al. (2007) were introduced. University of Ghana http://ugspace.ug.edu.gh 20 Lin and Clayton (2005) described the use of quasi-likelihood estimating equations for spatially correlated binary data. In their paper, a logistic function is used to model marginal probability of binary outcomes in terms of regression parameters. Lovison (2006), on the other hand proposed a matrix-valued Bernoulli distribution, based on the log-linear representation introduced by Cox (1972) for “the Multivariate Bernoulli distribution with correlated components”. Sun et al. (2007) presented a new model called the “multivariate Bernoulli mixtures of normal”, and then compared it with “normal component mixtures-of-experts” and the “Rubin–Wu model”. This he did by investigating the joint distribution of observed data looking at each subject under each model. 2.3 Conditional models Compared to the marginal models, a relatively small number of studies have been conducted on the conditional approaches. Among some of the studies on conditional approaches are Muenz and Rubinstein (1995), Bonney (1986), Bonney (1987) and Islam and Chowdhury (2006). Muenz and Rubinstein (1995), assumed individuals from a heterogeneous group who are studied over time each having two states “ 0” or “1” and also assumed a binary Markov chain when the sequence of their state is taken. Through the use of two logistic regressions Muenz and Rubinstein (1995) modeled the transition probabilities for “0” to “0” and “1” to “0”transitions . They showed how the explanatory variables relate to changes in state. Muenz and Rubinstein (1995), also showed “how to use transition probability estimates to test hypotheses about the probability of occupying state “0” at time   2, .., i i T  and the equilibrium probability of state 0”. University of Ghana http://ugspace.ug.edu.gh 21 Bonney (1986) used the regressive logistic models in the study of familial disease and other binary traits. According to him, regressive logistic models are formed by decomposing the probability of the binary response variable conditioned on all the explanatory variables. Bonney (1987) expressed the likelihood of a set of binary dependent outcomes that have or do not have covariates, as a product of conditional probabilities each assumed to be logistic; these models are known as regressive logistic models. They provide a simple form of the multivariate distribution. His paper was intended to stimulate the development and usage of regressive logistic models. For higher order Markov chains especially, Islam and Chowdhury (2006), sought to propose accurate approaches “for linking covariate dependence with transition probabilities” so as to analyze the factors associated with such transitions. It should be noted that the probability of a transition of a point at a time to another at another time hence involves conditional modeling. 2.4 Measuring dependence For correlated binary data, it is worth noting that Le Cessie and van Houwelingen (1994) proposed measures of dependence using logistic regression. According to Le Cessie and Houwelingen (1994), the dependence between a pair of observations can be quantified. Prentice (1988) considered the correlation between 1Y and 2Y . This is instinctively attractive but the correlation is restricted, and the restrictions depend on the marginals P1 and P2. Therefore, they proposed two other measures whose range does not depend on the marginals: the tetrachoric correlation and the odds ratio. This study however, introduces measure of dependence by expressing the bivariate Bernoulli distribution in the exponential family form and picking the link function for 1Y and 2Y . University of Ghana http://ugspace.ug.edu.gh 22 2.5 Joint models The marginal as well as conditional measures may fail to measure the dependence of binary outcomes owing to the deficiency of proper specification of the underlying model. Accordingly, a joint model is proposed in this study which takes into consideration both marginal and conditional probabilities of correlated binary events of interest. The joint function was specified by putting together marginal and conditional probabilities. In addition, the overall fit of the proposed model was tested in this study. 2.6 Methodology 2.6.1 Simulation Leisch et al. (1998) dealt with binary variables, typically encoded by the set {0,1}. In the least complex case the variables are scalar and, the equivalent distribution is likewise known as the alternative distribution. In their illustration, binary random variables were represented byA ,B or 1A , 2A ,…….., respectively. True values of these random variables were represented by corresponding lower case letters. “The alternative distribution is fully determined by the single value ( 1)AP P A  ”, which is also the expectation of A , i.e. E(A)=PA . The variance is given by var( ) (1 )A AA P P ; for convenience they wrote    A A1 A 0 .P Pq     According to Leisch et al. (1998) ,given that “two binary random variables A and B are not necessarily independent ,then the joint distribution of A and B ” is completely defined by AP , BP and either ABP , |A BP or |B AP , where | | ( 1, 1) ( 1| 1) ( 1| 1) AB A B B A P P A B P P A B P P A A          (1) University of Ghana http://ugspace.ug.edu.gh 23 Bayes’ theorem may be used to express the remaining probabilities | |AB A B B B A AP P P P P  or relationships like ( 1, 0) A ABP A B P P   . Leisch et al. (1998) also said, the bivariate binary distribution can simply be summed up to “the multivariate case, where 1 {0,1}( ,....., ) ' ddA A A  is a vector with (possibly dependent) binary components. For a full description of an unrestricted distribution of A we need 2 -1d parameters, e.g., the probabilities of all 2d possible values of A (the last probability is determined by the condition that the sum equals 1)”. They came up with two ways of generating binary random variates namely (iii) The inversion method and (iv) “Direct Conversion of Real-Valued Random Variates”; (i) The Inversion method According to Leisch et al. (1998), “the task of generating a sample of size n of a binary random variable A with given AP can be accomplished if a random number generator is available which produces random numbers U which are uniformly distributed in [0, 1]”. If we want to create a sample 1a ,..........an of A we generate a sample 1u ,.........un of U and set , , , 0 1 i A i i A u qa u q     (2) Referring to Leisch et al. (1998), they assigned M to denote an arbitrary discrete random variable taking values in Z (the set of integers) with probabilities ( )iP P M i  . Then if we apply the uniform random variable U, then the transformation i i ii M i M P U P      can be used. University of Ghana http://ugspace.ug.edu.gh 24 However, the above relationship cannot easily be solved for U or M , because for each transformation a search in the table of 'i sP has to be undertaken. So Leisch et al. (1998), went ahead “to generate a random number m for M” by first generating a random number u for U and then setting m j if i i ii j i j P u P      . Using the above expression, d-dimensional binary data can now be generated when the probabilities of all d-tuples of {0,1} are specified. Leisch et al. (1998) provided that for example, “ 2 d  , one has to specify ( 0, 0), ( 0, 1), ( 1, 0)P A B P A B P A B     , and ( 1, 1)A B  . All d-tuples can be enumerated by interpreting the binary vector as a binary number, resulting in the index numbers {0,.........2 1}d  and corresponding probabilities (0),......., (2 1)dP P ”.They continued by providing another way of expressing the d-tuples. This they did by relying on the size of the iP , bringing about computational gains. Leisch et al. (1998) continued by stating that specifying directly all iP is simply impossible for small d , this is because of the exponential growth of the number of 'iP s . “In practice a lower- dimensional parameterization of the iP is used such that for some function f ( , )iP f i  , 1( ,......, ) 'n   with the dimension 2dn  of the new parameter vector  much smaller than 2d . If pairwise combinations of components are given by their correlation, then only  2On d parameters have to be specified. E.g., autologistic models fall into this category”. They asserted that “the inversion method is very powerful, because it can generate random values from arbitrary distributions. On the other hand, it requires a search in 2d values for each random number generated, which makes it rather slow”. University of Ghana http://ugspace.ug.edu.gh 25 (ii) Direct conversion of real-valued random variables Leisch et al. (1998) established that using a direct conversion of real-valued random variables provides a faster method for generating samples from a binary vector , ...........,1( )dA A A . They initially assigned “ 1( ,......, )dX X X to be a d-dimensional normally distributed vector with mean  and covariance matrix . Here, normally distributed random variates can easily be transformed to binary values by componentwise threshold in 1 0i ia x   Due to the construction ( 1) ( 0)Ai i iP P A P X    and ( 1, 1) ( 0, 0)AiAj i j i jP P A A P X X     where ( 0)iP X  depends, for fixed variances, only on i whereas ( 0, 0)ji jP X X  depends on i , j and on the correlation between iX and jX ” . Generating random numbers using a normal distribution is easier and faster because only one additional comparison and assignment per value is undertaken. Leisch et al. (1998) established that “if only random numbers from the standard normal distribution (zero mean, unit covariance matrix) can be generated, then they have to be transformed to general multivariate normals. Let  be the (matrix) square root of  such that ' = . Then ( , )Y X N     if ~ (0,1)X N ”. Determination of  and  Leisch et al. (1998) proposed the determination of  and  as follows: “Let iY be a 1-dimensional normally distributed random variable with mean i and unit variance. Hence, ( 0) ( ) (( ) )i i i i i iP Y P Y P Y       University of Ghana http://ugspace.ug.edu.gh 26 where the second equality holds, because ( )i iY  is normally distributed with zero mean. They choose i to be the AiP -quantile of the standard normal distribution and restricted all variances to 1, so that ( 0)i AiP Y P  . The desired marginal probabilities AiP are used to determine the mean vector  for the components of A . Allowing for different variances of the iY introduces additional degrees of freedom (and therefore modeling capabilities)”. Leisch et al. (1998) admitted that left out is “the relation between the covariance matrix b of the binary variables and the covariance matrix  of the normal distribution. They argued that using the option of specifying a covariance matrix, only pairwise relations between the components of the d-dimensional sample can be specified”. So, for ease of notation they restricted themselves to the bivariate case. The correlation coefficient ABp of the binary random variables A and B can be expressed as AB A B AB A A B B p p pr p q p q  (3) such that AB AB A A B B A Bp r p q p q p p (4) Leisch et al. (1998) continued by saying that, “if A and B are converted from two normal random variables X and Y as described above, then ABp can be related to the normal distribution by University of Ghana http://ugspace.ug.edu.gh 27 , ,, 0) ( , )( 0 (x y x yAB YP P X P X Y L           (5) Where 1 1 2 2 2 2 1 1 1 1( , ) ( ; ) ( / )P Y y Y y x P Y y Y y x P Y y x     and yY Y  have a standard bivariate normal distribution with correlation coefficient xy   ; and )( , , ) ( , ; ), ( h kL h k P X h Y k x y dydx         (6) with 2 2 2 2 ( ) 1 2 , ; exp 1 2(1 ) x xy y x y                (7) being the density function of ( , )"X Y . Leisch et al. (1998) indicated that “the values of )( , ;L h k  are tabulated (see the references in Patel & Read, 1982, p. 293f). It can also be obtained by numerical integration or Monte Carlo simulation. For x = y = 0 there is the simple relation 1 arcsin( )(0,0, ) 4 2AB P L      (8) or ( )1sin 2 " 4AB P      (9) University of Ghana http://ugspace.ug.edu.gh 28 Pseudocode for direct conversion Leisch et al. (1998) proposed an algorithm for generating a d -dimensional binary vector A with given first and second moments ,( )Ai Ai AjP P P : 1. “ Choose the marginal probabilities AiP and the covariance matrix b , 2. Set 1( )i AiP  where  is the standard normal distribution function, 3. Find an appropriate covariance matrix  for the normal distribution, 4. Generate a sample of a d -dimensional normal variables X with mean and covariance matrix  and 5. Set 1 0i ia x   ”. Leisch et al. (1998) continued by specifying pairwise relations: The pairwise relations between the variables , 1,....,iA i d can be gotten, by specifying the covariance matrix b or the pairwise probabilities P , i jAiAj   . Leisch et al. (1998) indicated that one has to make sure that the values specified are valid in both options. They explained that a covariance matrix b is valid if and only if it is symmetric and positive definite. In other words, all the eigenvalues of b have to be positive. This property they said can be checked easily. However, if the matrix is not positive definite, they did not have a clear solution of which elements of the matrix to change in order to turn the matrix into a positive definite one. Ai; i = 1; : : :; d Using the option of specifying pairwise probabilities, they were not aware of any convincing conditions that will lead to the validity of the specification. They however, provided some conditions for the pairwise probabilities: University of Ghana http://ugspace.ug.edu.gh 29 “ From |AB A B A AP P P P (10) we get min( )AB A BP P P (11) Letting A BP  be the probability that at least one of A and B equals 1. Then, we get 1 A B A B A BP P P P P   (12) which gives ,max( 1 0)AB A BP P P   (13) Fulfilling conditions (11) & (13) is not sufficient as is shown by the following simple example. Let 1/ 2A B CP P P   and 0AB AC BCP P P  . Then, ( A ) & ( B ) are fulfilled and one can easily construct a 2-dimensional binary distribution which is valid for AP , BP , and ABP by specifying that the pairs (0,1) and (1,0) have a probability of 1/ 2 each. However, there is no way to add the third variable C . The reason is that University of Ghana http://ugspace.ug.edu.gh 30 1 A B C AB AC BC ABCP P P P P P P     (14) A B C AB AC BCP P P P P P      (15) which is not fulfilled in the above example ”. Leisch et al. (1998) concluded that the illustrated example above can be applied to d -dimensions by setting 1/ ( 1), 1,....,AiP d i d   and 0,AiAjP i j   . After this, for any ( 1)d  of the d variables, a distribution can be extracted that satisfies all pairwise conditions by giving every ( 1)d  -tuple with exactly 1 the probability1/ ( 1)d  , however, they found no valid distribution for all d variables. Hence, to define a d-dimensional binary distribution by specifying , 1,....,AiP i d and ,AiAjP i j  , they proposed that we have to ensure that for all 2 k d  and all possible choices 1 , ...., kJ J of k elements out of d the condition 1 , 1, 1 i i l i lAji AjiAjl k n P P       is fulfilled. 2.6.2 Model checking According to McCullagh and Nelder (1989), after a selection of a suitable model, the data used to test the model may indicate that the particular model selected is not useful. It is likely that the data as a whole shows some departure from the fitted values, or it may be that a few data values depart from the fitted values. In the view of the writers, the detection of both systematic and isolated inconsistencies is part of the technique of model checking. To exemplify a systematic discrepancy they considered a plot of the residuals r against x. University of Ghana http://ugspace.ug.edu.gh 31 “McCullagh and Nelder (1989) also said that “having selected a particular model, it is required to estimate the parameters and to assess the precision of the estimates.” This study will use various methods for checking the usefulness of the proposed model (as discussed under the approach section) as well as estimate parameters and assess its precision as advocated by McCullagh and Nelder. University of Ghana http://ugspace.ug.edu.gh 32 CHAPTER THREE THE PROPOSED MODEL 3.1 The Bivariate Bernoulli model Marginal or conditional approaches have been proposed as discussed under the literature review in addressing binary responses in repeated studies. Some joint models have been considered as well but the models could not be made useful due to their limitations in estimating the parameters in light of the practical utility of such models to real life data. In this section, we propose the model based on the marginal-conditional approach to obtain joint models. In the univariate case, some distributions such as the binomial, Poisson, negative binomial, hypergeometric, Gamma and normal distributions originated from the Bernoulli distribution. These distributions are obtained as sums or limits, thereby forming an interrelated family of distributions (Marshall & Olkin, 1985). If ( , ) X Y has Bernoulli marginals then ( , ) X Y has only four possible values (0,0), (0,1), (1,0) and (1,1) . Let 00{( , ) (0,0)} ,  P X Y P  01{( , ) (0,0)}P X Y P  10{( , ) (1,0)}P X Y P  , 11{( , ) (1,1)}P X Y P  (Marshall & Olkin, 1985) Islam et al. (2013) initiated the proposed model as follows: The bivariate Bernoulli distribution for outcomes 1Y and 2Y can be expressed as   1 2 1 2 1 2 1 21 (1 )(1 ) (1 )( ) (1 )1 2 2 00 0 10 1, y y y y y y y yP Y y Y y p p p p      (16) University of Ghana http://ugspace.ug.edu.gh 33 The bivariate Bernoulli distribution can be expressed in a 2 X 2 contingency table as follows: Table 3.1: 2 x 2 contingency table for the bivariate Bernoulli distribution 1Y 2Y 0 1 Total 0 00P 01P 1 0( 0)P Y P   1 10P 11P 1 1( 1)P Y P  Total 2 0( 0)P Y P  2 1( 1)P Y P  1 The joint probability can be derived from the marginal and conditional and probabilities as 1 1 2 2 2 2 1 1 1 1( , ) ( ) ( )P Y y Y y P Y y Y y P Y y     (17) The bivariate probabilities as a function of covariate X are as follows: 1 1 2 2 2 2 1 1 1 1( , ) ( ; ) ( / )P Y y Y y x P Y y Y y x P Y y x      (18) The joint probability mass function in Equation (16) can be expressed in terms of the exponential family for the generalized linear models as  1 1 2 2 1 2 00 1 2 01 1 1 2 10 1 2 11, exp[(1 )(1 ) log (1 ) log ( ) log log ]P Y y Y y y y P y y P y y y P y y P          (19) University of Ghana http://ugspace.ug.edu.gh 34 00 2 00 1 00 1 2 00 2 01 1 2 01 1 10 1 2 10 1 2 11exp[(log log log log log log log log log )]P y P y P y y P y P y y P y P y y P y y P         (20) 011 10 1 00 2 01 2 00 1 2 00 1 2 1 2 10 1 2 11 00exp[( log log log log log log log log log )]y P y P y P y P y y P y y P y y P y y P P         (21) 10 01 00 11 1 2 1 2 001 1 2 2 00 00 10 01 ( , ) exp log log log logP P P PP Y y Y y y y y y P                         (22)  1 2, (0,0),(0,1),(1,0),(1,1), 1y y pijij  Let us consider a sample of size n then the log likelihood function in this case is given by 10 01 00 11 1 2 1 2 00 00 00 01 10 log log log log 1 1 i i i i i i i i i i i i i n n P P P Pl l y y y y Pii i                            (23) It follows that the components of the link function can be denoted as follows:   00 1101 10000 1 2 300 00 01 10, , , log log log and log P PP PP                        where 0 is the base line link function, 2 is the link function for 1 Y , 1 is the link function for 2Y and 3 is the link function for dependence between 1Y and 2Y . We have demonstrated the probabilities without function of covariates in the previous expressions. Now let us consider 1 2X (1,  ,  ,. . , )pX X X and 1 2x (1,  ,  ,. . , )px x x where * 1 2X (  ,  ,. . , )pX X X is the vector of covariates and * 1 2x (  ,  ,. . , )px x x is the vector corresponding values of the covariates. We can now express the conditional probabilities in terms of the logit link functions as follows: University of Ghana http://ugspace.ug.edu.gh 35 01 2 1 01 01 ( 1/ 0, ) ( ) 1 xeP Y Y x xxe      (24) 11 2 1 11 11 ( 1/ 1, ) ( ) 1 xeP Y Y x xxe      (25) 2 1 00 01 1( 0 / 0, ) ( ) 1 P Y Y x xxe     (26) and 2 1 10 11 1( 0 / 1, ) ( ) 1 P Y Y x xxe     (27) where 01 010 011 012 01, , , ...( ) 'p      and 11 110 111 112 11, , , ...( ) 'p      The marginal probabilities are as follows  1 11/   ( )P Y X x x   and 1 10 / ) 1 ( )P Y X x x    (28) Now, we assume that 1 1 1 1 ( 1/  )     ( ) 1 xeP Y x xxe     and 1 1 1 1( 0 /  )     1 ( ) 1 P Y x xxe    (29) University of Ghana http://ugspace.ug.edu.gh 36 where 1 10 11 12 1, , , ...( ) 'p      Also we can write 01 01 2 1 1 01 1 . 1( ) ( 1/ 0, ). ( 0 / ) 11 xeP x P Y Y X x P Y X x x xee           00 2 1 01 1 1 .1 1( ) ( 0 / 0, ). ( 0 / ) 11 P x P Y Y X x P Y X x x xee           11 1 11 2 1 1 11 1 .( ) ( 1/ 1, ). ( 1/ ) 1 1 x xe eP x P Y Y X x P Y X x x xe e             1 10 2 1 1 11 1 . 1( ) ( 0 / 1, ). ( 1/ ) 1 1 xeP x P Y Y X x P Y X x x xe e            (30) Now we can show that 000 101 1 01 . 11 1 1( ) ln( ( )) ln ln ln 111 1 x P x xx x x eee e                                 1 0 01( ) ln(1 ) ln(1 );x xx e e       (31) 01 01 01 1 1 01 100 . . ( ) 1( ) ln ln 1 .1 ( ) 11 xP x e x xx e e x xP x ee                         011 ( ) ;x x  (32) University of Ghana http://ugspace.ug.edu.gh 37 1 01 2 11 . . 1( ) ln 1 1 xe xx exe                 011 2 11 1( ) ln ln ln(1 ) 1 xxx e exe                01 1112 ( ) ln(1 ) ln(1 );x xx x e e        (33) 01 111 1 00 11 3 01 1 11 101 1 11 101 10 ( ) ( ) 1 11 1( ) ln ln . . . . . . ( ) ( ) 1 1 111 1 11 x xx xP x P x e ee ex x x x xx x x x e e eee e ee                             11 3 01 ( ) ln xex xe           11 013 ( ) ( )x x    (34) This indicates that if there is no association between 1Y and 2Y then this is true for 01 11   01 and 11 are the estimated covariate coefficients for the conditional logit models ,given covariates X for 1Y = 0 and 2Y = 1 , respectively. The assumption underlying the covariates is that they are time independent. It is noteworthy that Glonek and McCullagh 1995 ( as cited in Islam et al., 2013 ) considered a separate model for the measure of association instead of the dependence link function as a function of two conditional models as shown in equations (31),(32), (33) and (34). In addition, they considered marginal models for the link functions for 1Y and 2Y . The proposed model employs the conditional and marginal models for the outcome variables of interest and thus the measure of University of Ghana http://ugspace.ug.edu.gh 38 association can be linked with the link function as a function of conditional models, which provides a natural measure from the odds ratio. On the other hand, Muenz and Rubinstein in 1995 and Islam and Chowdhury in 2006 (as cited in Islam et al., 2013 ), showed the conditional models only and no attempts were made to obtain the joint mass function for the correlated binary data. Hence, the use of the proposed conditional and marginal models provide the necessary background to obtain the test for dependence in the repeated outcome variables based on the link functions presented in equation (34). This is a new formulation to measure the dependence in terms of the parameters of the conditional models obtained from the joint mass function. We can test the overall significance of a model using the likelihood ratio test and the dependence can be examined on the basis of 3 . This study, came out with a method of estimating the parameters .First, substituting 0 ( )x , 1 ( )x 2 ( )x , 3 ( )x into the likelihood equation (23) will give us the following: 01 01 111 01 11 011 21 1 21 1 ln(1 ) ln(1 ) ( ln(1 )) ln(1 ) ( )ii i i n n x x x xl l e e y x e e y x y y xii i                           (35) The estimating equations will be 1 0101 2 1 2 01 01 011 exp( )exp( ) ,1 e ( ) exp( ) i i ii i i i i i i i n i Y X Xl X X Y X Y Y XX           (36) 11 1 2 11 111 exp( ) ,1 exp( ) i i i i i i n i X Xl Y Y XX      (37) . 1 1 1 11 exp( ) 1 exp( ) i i i i i n i X Xl Y XX       (38) University of Ghana http://ugspace.ug.edu.gh 39 Now, because of the complexity of solving for the estimated values of the parameters, this study adopted the use of the Newton -Raphson and Nelder-Mead algorithms. Both methods are used to solve non-linear -optimization problems .The Newton-Raphson method requires the first and second derivatives of the function whereas the Nelder-Mead does not use derivatives but rather iterates on a simplex and then replaces the worst simplex. Finding the maximum of a function of k variables using the Newton-Raphson algorithm According to Quinn (2001), consider, ( ) f x a b x x Cx    , where a is a scalar, b and x are k vectors, and C is a k x k symmetric, negative definite matrix. The gradient of f at x is ( ) 2f x b Cx   . Since the gradient at the value that maximizes f will be a vector of zeros we know that the maximizer x satisfies 0 2b C x  . Solving for x we would have 1 12 Cx b    . Since C is assumed to be negative definite we know that this is a maximum. Using the R language, an initial value is chosen and then, the function then iterates till the optimal value pops up. The results also show the number of iterations. The Nelder-Mead algorithm According to Gavin (2013), “the Nelder-Mead algorithm provides a means of minimizing an objective function of n design parameters,    1 2 T, ,· · ·, , nx x xf x x . The Nelder-Mead algorithm iterates on a simplex, which is a set of 1n designs, (1) (2) ( 1)[x , x , · · · , x ]n. The Nelder-Mead University of Ghana http://ugspace.ug.edu.gh 40 algorithm specifies a sequence of steps for iteratively updating the worst design in the simplex ( 1)(x )n . The simplex may be thought of as a polygon with 1n vertices. If n = 2, the simplex is a triangle, and the Nelder-Mead algorithm may be easily visualized. If n = 3, the simplex is a tetrahedron. Considering triangular simplexes, [u, v, w] in the 1 2x x plane, the triangle connecting them, and the objective function evaluated at the three points, f (u), f (v), and f (w).” It iteratively improve the vertices of the triangle in order to minimize f (x). Steps for one iteration of the Nelder-Mead algorithm Gavin (2013) provided the following “1. Sort the vertices such that f (u) < f (v) < f (w). Point u is the best point, point v is the next-to- worst point, and point w is the worst point. 2. Reflect the worst point, w, through the centroid of the remaining points (u and v) to obtain the reflected point, r, and evaluate f (r). If the cost at the reflected point, f (r), is between the best and next-to-worst cost (f (u) < f (r) < f (v)) then replace the worst point, w, with the reflected point, r, and go to step 5. 3. If the cost at the reflected point, f (r), is better than f (u) (f (r) < f (u)) then extend the reflected point, r, further past the average of u and v, to point e, and evaluate f (e). (a) If the cost at the extended point, f (e), is better than the reflected point cost, f (r), then replace the worst point, w, with the extended point, e, and go to step 5. (b) Otherwise replace the worst point, w with the reflected point, r, and go to step 5. University of Ghana http://ugspace.ug.edu.gh 41 4. If the inequalities of steps 2 and 3 are not satisfied, then it is certain that the reflected point, r, is worse than the next-to-worst point, v, (f (r) > f (v)) and, a smaller value of f might be found between points w and r. So try to contract the worst point, w, to a point c between w and r, and evaluate f (w). The best distance along the line from w to r can be hard to determine, and, in general, it is not worth trying too hard to find this minimum. Typical values of c are one-quarter and three-quarters of the way from w to r. These are called inside and outside contraction points,          . . . 1 :,1 , 2 :,2 ,f f f f X f X and oc . (a) If the cost at the better of the two contraction points is better than the next-to-worst cost, min [ f (ci), f (co)] < f (v), then replace w with the better contraction point, ic or 0c , and go to step 5. (b) Otherwise shrink the simplex into the best point, u, and go to step 5.” 5. Check convergence According to Gavin (2013), “The simplex has converged if the simplex is “sufficiently small” and if function values at the simplex vertices are “sufficiently close.” “Sufficiently” is quantified by convergence tolerances Ex and Ef. These convergence tolerances are problem-dependent and can be user-specified. (a) If the largest difference between the adjacent vertices is less than Ex times the average of adjacent vertices,         , ,2 max , , x u v v w Eu v v w   for n = 2, as in this example University of Ghana http://ugspace.ug.edu.gh 42 (:,1: ) (:,2 : 1)2 max (:,1: ) (:,2 : 1) x X n X n EX n X n     for n ≥ 2, more generally and if the difference between worst and best function values is less than Ef times the best function value, 9 ( ) ( ) ( ) 10 f f w f u Ef u    for n = 2, as in this example 9 1 1 1 ( ) ( ) ( ) 10 f nf f Ef     for n ≥ 2, more generally then the iterations have converged, and the algorithm is terminated. In the convergence criteria for n ≥ 2, the simplex X is a n-by-(n + 1) matrix, in which each column of X is a vertex of the simplex. The objective function for each column of the simplex is in a vector f of dimension 1-by-(n + 1). The vector of objective functions is sorted in increasing order f (1) < f (2) < · · · < f (n+1). The columns of X correspond to the elements of          . . . 1 :,1 , 2 :,2 ,f f f f f X X etc. In this 2D example,  2x3, ,X u v w and       1 3 , , .f f f    f u v w (b) If the number of function evaluations has exceeded a user-specified limit then the algorithm is terminated. If the convergence criteria are not met and if the number of function evaluations has not been exceeded, then the algorithm returns to step 1 for the next iteration.” 3.2 Comparison with an alternative test We have compared the proposed test for dependence in the bivariate Bernoulli outcome variables, University of Ghana http://ugspace.ug.edu.gh 43 1Y and 2Y , with a test proposed by employing the regressive model (7, 8). The joint mass function for 1Y and 2Y can be expressed as shown: 1 2 1 2 1( , ) ( ) ( | , )P y y P y P y yx x x ,where X = x is the vector of covariate values. It therefore follows that the regressive model includes the previous outcome, 1Y , as a covariate, in addition to the explanatory variables, 1 2, ,. . ., ,px x x as follows: 0 1 1 2 0 1 1 1 2 1 1 ( .......... ) ( .......... )( / , ) 1 p p p p x x y y x x y eP y y x e                  (39) where 0 1, ,. . ., p   and γ are the regressive model parameters. It is noteworthy that γ is the parameter associated with the outcome variable 1Y such that, H0: γ = 0 indicates a lack of dependence between 1Y and 2Y . However, one of the major limitations arises from the fact that dependence in 1Y and 2Y relies on the dependence between the outcome variables and the covariates as well. Hence, in many instances, the regressive model (39) may fail to recognize the true nature of the relationship between 1Y and 2Y in the presence of covariates 1 2, ,. . ., pX X X in the model. However, in the proposed model, the generalized bivariate Bernoulli model clearly takes account of the association between the response variables and the covariates. 3.3 Test for the model To test the overall fit of the joint model as function of the covariates, Islam et al. (2012) suggested the following: _ _ _ _ 01 11 10 0: , ,HH          against _ 1 0: HH   where University of Ghana http://ugspace.ug.edu.gh 44       011 012 0101 111 112 1111 1 1 1 2 11 , , , , , , _ _ _ p p p                     Then 010 110 1 0 01 11 12 ( , , ) ( , , ) ,InL InL                   which is asymptotically distributed as 23 p . (40) For testing 0 1 _ _ : 0 : 0 H H H H     Islam et al. (2012) also suggested the use of the Wald test statistic: ^ ^ ^ ( ) H H W se    (41) University of Ghana http://ugspace.ug.edu.gh 45 CHAPTER FOUR SIMULATION 4.1 Introduction In this chapter, simulation is undertaken to verify the usefulness of the proposed model. First, exploratory analysis is done to examine the characteristics of the data simulated .Second, with the same simulated data, the regressive, conditional logistic and the GEE models are thoroughly examined before the proposed model. Finally, the overall-fit of the proposed model compared to the traditional models (i.e. the regressive, conditional logistic and the GEE models) are then critically examined. Data generation For the purposes of investigating the reliability of the proposed model, correlated binary data were initially generated. The function rmvbin was used in generating correlated binary data. This function creates correlated multivariate binary random variables by thresholding a normal distribution. In this study, the correlated binary data was generated by specifying the marginal probabilities and  . The Sperman’s  is a non-parametric test used to measure the strength of association between two variables. Five different sets of models were generated using different marginal probability combinations for 1Y and 2Y as shown under the Approach section (see 1.8). Details of the simulation results are shown in Tables A1a to A5e (See Appendix A) University of Ghana http://ugspace.ug.edu.gh 46 4.2 Exploratory analyses 4.2.1 Odds ratio If 1 0Y  then the odds of 2 0Y  versus 2 1Y  are 00 01/p p . Similarly, if 1Y =1 , then the odds of 2 0Y  versus 2 1Y  are 10 11/p p . The “odds ratio “is the ratio of the odds of 2Y when 1 0Y  to the odds of 2Y when 1 1Y  , hence 11 00 10 01 P P P P . If the odds ratio is greater than 1 then concordant responses i.e 1 2( 0,  0)Y Y  or 1 2( 1,  1)Y Y  are more common than the discordant responses i.e 1 2( 0,  1)Y Y  or 1 2( 1,  0)Y Y  . In contrast, when the odds ratio is less than 1, then discordant responses are more common. The simulation exercise showed that, models with low marginal probabilities i.e. 1 2( 0.1 , 0.1)P P  , have common concordant responses than that of the discordant. The odds ratio actually increased from 2.57 at sample size 50 to 2.75 at sample size 1000. The highest odds ratio was observed when one of the marginal probabilities is very high while the other is very low, i.e. 1 2( 0.1 , 0.8)P P . At sample size 50, the odds ratio was 3.87 but decreased to 3.77 with sample size 100 and thereafter increased to 3.93 at sample size 200. Even though this group of marginal probabilities produced the highest odds ratio, no pattern was formed as the sample size increased. The lowest odds ratios were identified when the marginal probabilities had average probabilities, i.e 1 2( 0.5 , 0.5)P P  .They were in the range of 0.98 to 1.01, indicating that the number of concordant responses was either slightly higher than that of the discordant or the discordant responses slightly higher than the concordant responses. University of Ghana http://ugspace.ug.edu.gh 47 4.2.2 Marshall–Olkin correlation The model generated using low marginal probabilities i.e. 1 2( 0.1 , 0.1)P P  , had similar values of  and the Marshall–Olkin correlation was approximately 0.2. However, all other models produced different Marshall–Olkin correlation and  values. For instance, the model with probabilities 1 2( 0.5 , 0.3)P P  recorded 0.3 for  but 0.03 or 0.04 for the Marshall–Olkin correlation depending on the sample size. Also, the model with marginal probabilities 1 2( 0.1 , 0.8)P P  produced 0.1 for rho but 0.2 for for Marshall–Olkin correlation .This might be due to the fact that rho on one hand measures correlation using the ranks of the ordinal variable whereas Marshall–Olkin correlation uses the probabilities of concordant and discordant responses. The Marshall–Olkin correlations of 1Y on X or 2Y on X given that the response variable is either 0 or 1 produced very small probabilities, indicating that the concordant as well as discordant responses were almost the same. 4.2.3 3Dependence value based on Ș All but one model had a non-zero value, indicating that there exist dependence between the response variables. The model with marginal probability 0.5 for both response variables 1Y and 2Y had a dependence value of approximately 0. A sample size of 50 for instance, produced a value of -0.02 whereas a sample size of 1000 produced a value of 0.0001.This indicates independence when the data was simulated with marginal probabilities of 0.5 for both response variables. This is rightly so because the marginal probabilities chosen to model the data, add up to one (1) meaning that the pairwise probabilities will be zero, hence indicating independence. University of Ghana http://ugspace.ug.edu.gh 48 4.3 Regressive model The fitted model is of the form 1 1 1 1 1 logit( ) log 1 Y                       , where 1 represents the estimated probability of recording a “1” response at the second measurement. 11  represents the estimated probability of not recording a “1” response at the second measurement, 1 represents the estimated parameter coefficient for covariate X and  represents the estimated parameter coefficient associated with the response variable 1Y . If  equals zero, then it shows that there exist no dependence between 1Y and 2Y .  denotes the value of the logit when the covariates are zero. The regressive model produced more than 90% of parameter estimates that fitted the available data when the sample size was 500. Model 1 1 2( 0.1 , 0.1)P P  and Model 4 1 2( 0.5 , 0.3)P P  for instance, had all sampled estimates simulated fitting the available data when the sample size was 500. However, lower samples did not produce high percentages of parameter estimates that fitted the available data. A case in point is a sample size of 50, where Model 4 1 2( 0.5 , 0.3)P P  had only 40% of parameter estimates fitting the available data. Model 3 1 2( 0.1 , 0.8)P P  on the same hand did not have any of its parameter estimates fitting the available data. It is worth to note that even with a large sample size of 1000, the regressive model was not able to identify that Model 2 1 2( 0.5 , 0.5)P P  had an independent relationship between the response variables (the estimated regressive parameter  was -2.63). On the other hand, the dependence University of Ghana http://ugspace.ug.edu.gh 49 value based on 3 was 0.0001 clearly showing independence .This might be due to the fact that the regressive model only accounted for correlation between two responses only. It did not account for possible correlations between responses and covariates. 4.4 Conditional ( 1Y = 0 ) The fitted conditional model is of the form 01 01 01 01 logit( ) log 1                    , where 01 represents the probability of recording a “0” response at the first measurement and a “1” at the second measurement 011  represents the probability of not recording a “0” response at the first measurement and a “1” at the second measurement. 01 represents the estimated coefficient parameter for covariate X .  is the value of the logit when the covariate is zero. The simulation exercise revealed that, with a sample size of 50, only a small percentage of the parameter estimates were significant .In addition ,only a small percentage of the models fit the data well after running the simulation for 1000 times. Since we are interested in accurate predictions as much as possible, we need a model that fits the data well enough. A case in point is Model 1 1 2( 0.1 , 0.1)P P  which had only 22% of parameter estimates being significant and 33% of all the models run fitted the data well. As the size of the data grew larger, the percentage of significant estimated parameter also grew bigger as well as the percentage of models that fitted the data well. For instance, with a sample size of 500, Model 4 1 2( 0.5 , 0.3)P P  had all its parameter estimates being significant .Moreover, all the models simulated fitted the available data well. University of Ghana http://ugspace.ug.edu.gh 50 In contrast, Model 2 1 2( 0.5 , 0.5)P P  had all estimated parameters being significant as well as all models fitting the available data. We must be cautious not to jump to conclusion here, because it has been established that Model 2 1 2( 0.5 , 0.5)P P has independent response variables; a possible reason for the perfect fit is as follows: When binary data are randomly generated, the covariance of the outcome variables will follow the binomial model (i.e. two possible outcomes; an occurrence of an event of interest or non- occurrence) with constant probability. However, when binary data are not sampled randomly then there is the likelihood that the outcome variances will not follow the binomial model. An example is picking samples from clusters; this will result in different probabilities and hence increased variance compared to variances observed under the binomial model. This phenomenon is known as over dispersion and the effects are that, the standard errors and the conclusions might be affected. 4.5 Conditional 1(Y = 1) The fitted conditional model is of the form 11 1111 11 logit( ) log 1                    , where 11 represents the probability of recording a “1” response at the first measurement and a “1” at the second measurement . 111  represents the probability of not recording a “1” response at the first measurement and a “1” at the second measurement. 11 represents the estimated coefficient parameter for covariate X.  denotes the value of the logit when the covariate is zero. Just like the previous conditional case where 1( 0)Y  , as the sample size increased, the percentage of significant parameter coefficient increased .On the same hand, the percentage of models that fitted the available data also increased. Model 3 1 2( 0.1 , 0.8)P P  for instance, had only 19% University of Ghana http://ugspace.ug.edu.gh 51 of the models fitting the available data well but with a sample size of 500, 94% of all models simulated under the stated marginal probabilities fitted the data well. The same model had none of the estimated parameters attaining significance at sample size 50. However, with a sample size of 500, 66% of the estimated parameters were significant. Model 5  1 20.2 , 0.6P P  in general had very low percentage of estimated parameters and well fit models respectively, irrespective of sample size. At a sample size of 50, none of the estimated parameters was significant and only 5% of the models had a good fit to the simulated data. At a sample size of 1000, none of the estimated parameters was significant and none of the models had a good fit. 4.6 The Marginal model The fitted marginal model is of the form 1 1 1 1 logit( ) log 1                    , where 1 represents the estimated probability of recording a “1” response at the second measurement. 11  represents the estimated probability of not recording a “1” response at the second measurement and 1 represents the estimated parameter coefficient for covariate X.  denotes the value of the logit when the covariate is zero. With sample size of 50, only Model 3 1 2( 0.1 , 0.8)P P  had 14% of its parameter coefficients not equal to zero using the wald test (41).All other models had all the estimated parameter coefficients equal to zero, meaning that they will not serve as a good predictor model for the available data. University of Ghana http://ugspace.ug.edu.gh 52 For a sample size of 100 none of the models had a non-zero estimated parameter coefficient. However, with a sample size of 200, Model 1 1 2( 0.1 , 0.1)P P  had 28% of simulated models with a non-zero parameter coefficient, Model 5  1 20.2 , 0.6P P  had 14%,all others equaled zero. For a sample size of 500 none of the models with a non-zero estimated parameter coefficient whereas with a sample size of 1000, only Model 3 1 2( 0.1 , 0.8)P P  had 29% of models fitting the available data. In short, the marginal model was poor at fitting the available data compared to the conditional model. 4.7 Bivariate Bernoulli model The process of finding parameter estimates for the bivariate Bernoulli model was very tedious, in that an initial estimate for a parameter needs to be provided.Iteration is then done by the Newton- Raphson and the Nelder-Mead maximization methods to estimate the actual parameter that maximizes the likelihood function. To make it easier for the selection of initial parameter estimates, the estimates for the conditional and marginal estimates were referred to so as to give an idea of where the best-fit parameter might fall. The proportion of significant estimates were not calculated, because a lot of choices of the initial parameters had to be inputted before arriving at estimated parameter. However, almost all of the parameters settled on were significant. A more important test was that of the overall fit of the proposed model. University of Ghana http://ugspace.ug.edu.gh 53 4.8 Overall fit of the proposed model The log likelihoods for the regressive, conditional as well as the marginal models were calculated and then compared to the log likelihood of the proposed model. Results showed that the bivariate Bernoulli model came out as a better model than the rest of the models. The likelihood ratio test was employed in comparing the fit of the proposed model to that of the regressive, conditional and marginal models respectively. A model with more parameters is likely to better fit the data .However, it is important to find out whether the right model fits significantly, hence the use of the test. All tests produced highly significant results i.e. significant at 5% level of significance, meaning that the proposed model fits better. (See Tables A1a to A5e under Appendix A) University of Ghana http://ugspace.ug.edu.gh 54 CHAPTER FIVE APPLICATION TO REAL -LIFE DATA 5.1 Introduction In this chapter, real-life data was used to examine the usefulness of the proposed model. First, exploratory analysis was conducted .All the models were examined under the same dataset and the overall fit of the proposed model was also examined. The study went ahead to estimate parameters for all the models under study and compared the results. Table 5.1: Transition counts and respective probabilities on the employment status for the two periods Wave 2 Transition count Transition probability Wave 1 0 1 Total 0 1 Total 0 4132 707 4839 0.8539 0.1461 1.000 1 1442 1559 3001 0.4805 0.5195 1.000 Odds ratio = 6.32 Marshall-Olkin correlation coefficient = 0.37 Table 5.1 displays the transition counts and probabilities on the employment status of persons for the two periods. It is evident that 85.4% of persons between the ages of 17 and 51 interviewed remained unemployed whereas 14.6% gained employment by the second period. Moreover, within the same period, 48.1% moved from being employed to not being employed whereas 52% remained employed. 5.2 Odds ratio The estimated odds ratio for the two periods was 6.40, meaning that there were more concordant responses than discordant responses. Specifically, as many as 4,132 respondents remained unemployed whereas 1,559 respondents remained employed. In contrast, 873 respondents were University of Ghana http://ugspace.ug.edu.gh 55 unemployed at the first visit but gained employment at the second visit whereas 576 respondents were employed at the first visit but had lost employment by the second visit. 5.3 Marshall–Olkin correlation The Marshall-Olkin correlation coefficient estimate of 0.59 indicates that there is a substantial positive correlation between employment statuses between the two periods under study. 5.4 3Dependence value based on Ș The dependence value is 1.84, clearly indicating dependence between the response variables i.e employment status at the two periods. Table 5.2a: Fitted models using data from NIDS: Traditional models Conditional Models Marginal Model Model 01 j Model 11 j Model 1 j -GEE Covariates 01 j se p- value 11 j se p-value 1 j se p-value Constant -4.115 0.436 2e-16 *** 2.517 0.287 2e-16 *** 1.012 0.200 4.44 e-07 *** Age -0.069 0.081 0.386 -0.750 0.069 2e-16 *** -1.429 0.053 2e-16 *** Education 0.826 0.206 6.38 e- 05*** 0.026 0.136 0.846 0.589 0.090 5.53 e-11 *** Sex 0.043 0.085 0.615 -0.310 0.062 5.73e- 07*** -0.740 0.047 2e-16 *** Log- likelihood -2520 -3240 8755.65 University of Ghana http://ugspace.ug.edu.gh 56 Table 5.2b: Fitted models using data from NIDS: Generalized bivariate Bernoulli model Generalized bivariate Bernoulli model Model 01 j Model 11 j Model 1 j Covariates 01 j se p- value 11 j se p- value 1 j se p- value Constant 0.07 0.00 <2e-16 *** 0.11 0.00 <2e-16 *** -0.55 0.0004 <2e-16 *** Age -5.558 0.005 <2e-16 *** 0.145 0.0008 <2e-16 *** -4.892 0.0005 <2e-16 *** Education 0.679 0.002 <2e-16 *** 0.309 0.001 <2e-16 *** 4.663 0.0003 <2e-16 *** Sex -4.092 0.004 <2e-16 *** -2.006 0.002 <2e-16 *** -3.250 0.0009 <2e-16 *** Log- likelihood 2892940 2786759 10672276 Likelihood ratio test ( proposed model vrs traditional methods p< 0.05 p< 0.05 p< 0.05 5.5 Traditional models Table 5.2a provides the following details: 5.5.1 Marginal model-GEE Using the marginal model, all three covariates were significant, meaning that they contribute significantly to the state of employment status. The fitted marginal model is 1 1 1 logit( ) log 1.012 1.429 0.589 0.740 1 age edu sex                , University of Ghana http://ugspace.ug.edu.gh 57 where 1 represents the estimated probability of recording an “employed” response at the second measurement and 11  represents the estimated probability of not recording an “employed” response at the second measurement . The estimated intercept is 1.012 representing the estimated logit when age = 0, edu = 0 and sex = 0. This means that the respondent had no age group, no sex and no educational status which does not really make sense in this particular study. The estimated coefficient for the variable age is -1.429 meaning that for respondents who are in age range 17-25 versus those in the age bracket 26-51, the expected change in the log odds is 1.429, given that education and sex stays constant. We can also claim that based on the odds ratio scale, the odds for an individual in the 17-25 year group to be employed is exp(1.012+0.589-0.740) = 2.366 and the odds for an individual in the 26- 51 year age group is exp(1.012-1.429+0.589-0.740 )= 0.567. Therefore the odds of 2.37 means that individuals in the 17-25 year group are 2.37 times more likely to be employed at the second visit than those in age group 26-51. In terms of probabilities, the probability of an individual in the 17-25 year group to be employed at the time of the second visit will be 2.366/(1+2.366) = 0.70 and that of an individual in the age group 26-51 will be 0.30. All three covariates were significant at a 5% level of significance indicating that if the same sample were run for 100 times 95 of them will have all three estimated coefficients being significant. The interpretation for education status and sex of the respondents was left undone, since that is not the main objective for this study. University of Ghana http://ugspace.ug.edu.gh 58 5.5.2 Conditional model 1(Y = 1) Given that a respondent at the first time of visit was employed, only two of the covariates produced significant values, i.e age and sex. In other words, only age and sex had significant impact on the probability of being employed at the second time of visit given that a respondent was initially employed. The fitted conditional model is 11 11 11 logit( ) log 2.517 0.750 0.026 0.310 1 age edu sex                , where 11 represents the estimated probability of recording an “employed” response at the second measurement, given that the first response recorded “employed” and 111  represents the estimated probability of not recording an “employed” response at the second measurement . The estimated intercept is 2.517 representing the estimated logit when age = 0,edu = 0 and sex = 0. The estimated coefficient for the variable age is -0.750, meaning that for respondents who are in the age range 17-25 versus those in the age bracket 26-51 , the expected change in the log odds is 0.750,given that education and sex is constant. Interpreting based on the odds ratio scale, the odds for an individual in the 17-25 year group to be employed at the second time of visit is exp(2.517+0.026-0.310) = 9.327 and the odds for an individual in the 26-51 year age group is exp(2.517-0.750+0.026-0.310 )= 4.406. In probability terms, the probability of an individual in the 17-25 year group to be employed at the time of the second visit will be 9.327/(1+9.327) =0.90 and that of an individual in the age group 26-51 will be 0.10. University of Ghana http://ugspace.ug.edu.gh 59 5.5.3 Conditional model 1(Y = 0) Given that the respondent was unemployed at the first time of visit, the fitted model will be 01 01 01 logit( ) log 4.115 0.069 0.826 0.043 1 age edu sex                 . This time, only the variable education turned out to be significant. 01 represents the estimated probability of recording an “employed” response at the second measurement, given that the first response recorded “unemployed”. 011  represents the estimated probability of not recording “employed” response at the second measurement, given that the first response recorded “unemployed”. The estimated coefficient for the variable education is 0.826 .Also the odds of the group that had never being to school is exp (-4.115-0.069-0.043) = 0.014. This means that the group that had never been to school are 0.01 times more likely to be employed than persons who have some level of education or the probability of a person in the group that had never been to school to be employed will be (0.014/1+0.014) = 0.01 5.6 Generalized bivariate Bernoulli model Unifying the marginal and conditional probabilities into a joint distribution as specified by the generalized bivariate Bernoulli model, we can estimate parameters by conditioning on the initial response or using the marginal model. Table 5.2b provides the following: 5.6.1 Marginal The fitted logistic model is University of Ghana http://ugspace.ug.edu.gh 60 1 1 1 logit( ) log 0.55 4.892 4.663 3.250 1 age edu sex                 . The estimated population parameters follow the usual interpretation as stated under the traditional models. All the estimated population parameters were significant with very small standard errors. On the odds ratio scale, the odds for an individual in the 17-25 year group to be employed is exp (-0.55+4.663-3.250) = 2.370 which means that persons in the 17-25 year group are 2.37 times more likely to be employed at the second visit than those in the age group 26-51. The probability of an individual in the 17-25 year group to be employed at the time of the second visit will be 2.37/ (1+2.37) = 0.70 and that of an individual in the age group 26-51 will be 0.30. 5.6.2 Conditional model 1(Y = 0) Given that a respondent was employed at the first time of visit, the fitted model will be 11 11 11 logit( ) log 0.11 0.851 1.186 0.954 1 age edu sex                . Again, all the estimated parameters were significant with very small standard errors and the estimated population parameters follow the usual interpretation. 5.6.3 Conditional model 1(Y = 1) Given that a respondent was unemployed at the first time of visit, then by the generalized Bernoulli model, the logistic model will be 01 01 01 logit( ) log 0.07 0.2887 0.7805 0.4651 1 age edu sex                . University of Ghana http://ugspace.ug.edu.gh 61 All the estimated parameters were close to zero. 5.7 Overall fit of the model Even though the proposed model produced estimated parameters with small very small standard errors it is necessarily to find out how good the proposed model fits the available data in comparison with the traditional methods. The likelihood ratio test showed that the generalized bivariate Bernoulli model fits the available data significantly better at 5% level of significance than the traditional methods. Table 5.3: Comparison of results: probability of a covariate being employed at the time of second visit Conditional Marginal Covariate Categories Model 01 j  Generalized bivariate Bernoulli Model 11 j  Generalized bivariate Bernoulli Model 1 j  - GEE Generalized bivariate Bernoulli Age 17-25 0.96 0.97 0.90 0.83 0.70 70 26-51 0.04 0.03 0.10 0.17 0.30 30 Education Ever been to school 0.99 0.9999 0.81 0.86 0.77 0.82 Never been to school 0.01 0.00001 0.19 0.14 0.23 0.18 Sex Male 0.97 0.992 0.86 0.64 0.54 0.69 Female 0.03 0.008 0.14 0.36 0.46 0.31 Table 5.3 shows that the probabilities for the traditional as well as the proposed model were generally similar, but first, under the marginal models , the GEE estimated that the probability of a male employed at the second time of visit will be 0.54 whereas the proposed model estimated a probability of 0.69. University of Ghana http://ugspace.ug.edu.gh 62 Second, given that a respondent was employed at the first time of study, the conditional logistic model estimated the probability of a male employed at the second time of visit to be 0.86 whereas the proposed model estimated a probability of 0.64. Third, the probability of a respondent who has ever gone to school under the GEE was 0.77 whereas the bivariate Bernoulli model recorded a probability of 0.82. University of Ghana http://ugspace.ug.edu.gh 63 CHAPTER SIX RESULTS AND DISCUSSION 6.1 Introduction Key findings from the simulation as well as the application to real life exercise are discussed in this chapter. The challenges encountered while undertaking these exercises are also discussed. The existence of dependence in repeated studies and how to deal with such cases in analysis has been a problem for researchers. This is, because relying on the assumption of independence between responses may lead to inefficient results. Marginal as well conditional models have been used to address this problem in the past. This study however, uses the joint model (i.e. unifying the conditional and marginal) known as the bivariate Bernoulli model to address the challenge of dependence in repeated studies and more importantly estimate parameter coefficients. 6.2 Correlation The simulation study revealed that the Marshall-Olkin correlation and  produced different correlation values. It is likely that the measure of correlation using the Marshall-Olkin correlation was better than  because the Marshall-Olkin correlation employs probabilities of concordant and discordant responses whereas  measures correlation using the ranks of the ordinal variable in this case the response variables. 6.3 Dependence The simulation study also revealed that the measure of dependence ( 3 ) that was derived from the bivariate Bernoulli model was able to identify independence between two response variables when independent response variables were simulated. This is an added plus to the use of the proposed model because it helps to confirm dependence or otherwise of response variables before further tests are carried out. University of Ghana http://ugspace.ug.edu.gh 64 It is important to state that the regressive model failed to identity independence between response variables. However, the dependence measure 3 was able to do so. The inability of the regressive model to account for independence may be due to the fact that the model only accounted for correlation between responses and responses between two periods. It did not account for possible correlations between responses and covariates. 6.4 Investigation of the traditional models In this investigation, the marginal model, which represents the probability of obtaining an occurrence of interest with no conditionality to previous responses, and the conditional model which says that the probability of an event occurring depends on the previous state of employment status were examined. The simulation exercise revealed that the conditional models fitted the available data far better than the marginal. This finding is not surprising, because taking into consideration the response in an initial investigation before taking the second response (or saying the probability of an event is conditional on a past response) is likely to result in a better finding than disregarding the first responses. The measure used to find out how well a model fits available data is the Wald test. The Wald test determines if the estimated coefficients are simultaneously equal to zero, as against at least one estimated coefficient not being zero. 6.5 The Bivariate Bernoulli model The model proposed for this study unified the marginal and conditional models into one model. This resulted in a joint model, which was then used to estimate probabilities. The good thing about this model is that its distribution is uniquely identified. Hence, finding the likelihood function was straightforward, compared to the traditional marginal model which does not employ the use of a University of Ghana http://ugspace.ug.edu.gh 65 known distribution but rather uses the probabilities of success and their correlations of the vector of binary responses. 6.5.1 Deriving parameter estimates Estimating parameter coefficients from the estimating equations (derived from the likelihood function) was tedious, hence iterative methods of finding roots was employed. Two methods were used: the Newton-Raphson and the Nelder-Mead methods of iteration. This study revealed that the Nelder-Mead method was more efficient in arriving at the roots than the Newton-Raphson. Specifically, with the same set of initial values, the Nelder-Mead method was able to provide all the roots as against some of the roots when the Newton –Raphson method used. Picking the initial roots was very tedious. First of all, values close to the roots obtained using the traditional methods were used as initial values.Then a random selection of initial values was taken thereafter, when the earlier values obtained using the traditional models failed to help estimate the roots. It must be noted that some of the estimates were arrived at after several selection trials of initial values; some values were arrived at after about four hours. 6.5.2 Parameter estimates All the parameter estimates produced were significant under the bivariate Bernoulli model. This does not prove that the model is good, but it means that the covariates selected for this study have an impact on the probability of the response variable. 6.6 Comparison of results All three covariates had significant impact under the GEE marginal model and the bivariate Bernoulli model. However, only some of the covariates had significant impact when the conditional logistic models were used. Specifically, age and sex significantly impacted the employment status when the probability of being employed was conditioned on the respondent being employed at the first time of visit. University of Ghana http://ugspace.ug.edu.gh 66 Also, only education significantly impacted the employment status when the probability of being employed was conditioned on the respondent not being employed at the first time of visit. In some cases, there seem to be differences in results; for instance, in the probability of being employed at the second time of visit. Under the GEE marginal model, whereas the probability of a male getting employed at the second time of visit was at 0.54, the bivariate Bernoulli estimated the probability to be 0.69. 6.7 Structure of models It is also important to note that the GEE uses various correlation structures such as “independence”, “unstructured”, “exchangeable” and “user defined” to estimate covariate coefficients. This might lead to inadequate results because a joint modelling approach is not used. The joint modelling provides explicitly the distribution for the available data, they enhance the fit of the data and hence enhance the efficiency of parameter estimates. This study has shown that with the use of the proposed model, all three covariates have significant impact on the probability of the response variable, irrespective of whether there was conditioning or not . 6.8 Missing values The simulation exercise was done without taking into account cases of missing values. So, estimating various statistics was performed without any challenge. However, when real-life application was introduced, the presence of missing values made estimating impossible. All the missing values therefore had to be expunged from the dataset, making use of only responses given at each time of visit. This situation is worrying because valuable data was discarded. Because this study used secondary data there was no way of going back to fill the gaps in the data. University of Ghana http://ugspace.ug.edu.gh 67 CHAPTER SEVEN CONCLUSION AND RECOMMENDATIONS 7.1 Conclusion This study has shown that the bivariate Bernoulli model fits bivariate binary response data significantly better than the GEE marginal model and the conditional logistic models. Since the joint distribution is explicitly known, the likelihood function was found and then the coefficient of the parameters estimated. Findings using the bivariate Bernoulli model against the GEE model were generally similar apart from one case where the bivariate Bernoulli model recorded a probability of 0.69, whereas the marginal model recorded a probability of 0.54. Results were also generally similar using the bivariate Bernoulli model against the conditional logistic models except for a few cases. For instance, the logistic model estimated the probability of a male to be employed at the second time of visit to be 0.86 whereas the proposed model estimated the probability to be 0.64. Even though selecting the initial value in order to estimate the covariate coefficient using the bivariate Bernoulli model was tedious, the bivariate Bernoulli model is highly recommended for analyses involving binary response data for two periods i.e. (t and t+i) because it fits bivariate binary data significantly better. 7.2 Recommendations Investigating the usefulness of the proposed model based on simulation as well as its application to real-life data, the following recommendations should be considered: University of Ghana http://ugspace.ug.edu.gh 68 First, more research should be carried out to find out how to easily select an initial value when using the Newton Raphson or the Nelder Mead methods of iteration. Second, further research should also be conducted using the multivariate Bernoulli model to handle not only two periods of study, but several periods. Third, in developing models, the methods of estimating missing values should be incorporated so that missing values will be addressed while using the model. This is recommended to prevent the deletion of variables containing missing data. Fourth, in order to enhance accuracy, Health Institutions such as the School of Public Health, as well as other Research Institutions such as the Ghana Statistical Service and the Institute of Statistical Social and Economic Research (ISSER) should utilize the use of the bivariate Bernoulli model to estimate regression coefficients when the outcome variables involved is bivariate binary. University of Ghana http://ugspace.ug.edu.gh 69 REFERENCES Azzalini, A. (1994). Logistic regression for autocorrelated data with application to repeated measures. Biometrika, 81(4), 767-775. Bonney, G. E. (1987). Logistic regression for dependent binary observations. Biometrics, 951-973. Bonney, G. E. (1986). Regressive logistic models for familial disease and other binary traits. Biometrics, 611-625. Carey, V., Zeger, S. L., & Diggle, P. (1993). Modelling multivariate binary data with alternating logistic regressions. Biometrika, 80(3), 517-526. Cox, D. R., & Wermuth, N. (1994). A note on the quadratic exponential binary distribution. Biometrika, 81(2), 403-408. Cox, D. R. (1972). The analysis of multivariate binary data. Applied statistics, 113-120. Fitzmaurice, G. M., & Lipsitz, S. R. (1995). A model for binary time series data with serial odds ratio patterns. Applied Statistics, 51-61. Glonek, G. F. (1996). A class of regression models for multivariate categorical responses. Biometrika, 83(1), 15-28. Glonek, G. F., & McCullagh, P. (1995). Multivariate logistic models. Journal of the royal statistical society. Series B (Methodological), 533-546. Gavin, H. P. (2013). The Nelder-Mead Algorithm in Two Dimensions. CEE 201L. Duke U. Islam, M. A., Alzaid, A. A., Chowdhury, R. I., & Sultan, K. S. (2013). A generalized bivariate Bernoulli model with covariate dependence. Journal of Applied Statistics, 40(5), 1064-1075. Islam, M. A., & Chowdhury, R. I. (2006). A higher order Markov model for analyzing covariate dependence. Applied Mathematical Modelling, 30(6), 477-488. Islam, M. A., Chowdhury, R. I., & Briollais, L. (2012). A bivariate binary model for testing dependence in outcomes. Bull. Malays. Math. Sci. Soc.(2), 35(4), 845-858. Kaiser, S., Träger, D., & Leisch, F. (2011). Generating correlated ordinal random values. Le Cessie, S., & Van Houwelingen, J. C. (1994). Logistic regression for correlated binary data. Applied Statistics, 95-108. University of Ghana http://ugspace.ug.edu.gh 70 Lee, Y., & Nelder, J. A. (2004). Conditional and marginal models: another view. Statistical Science, 19(2), 219-238. Leisch, F., Weingessel, A., & Hornik, K. (1998). On the generation of correlated artificial binary data. Liang, K. Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 13-22. Lin, P. S., & Clayton, M. K. (2005). Analysis of binary spatial data by quasi-likelihood estimating equations. The Annals of Statistics, 33(2), 542-555. Lipsitz, S. R., Laird, N. M., & Harrington, D. P. (1991). Generalized estimating equations for correlated binary data: using the odds ratio as a measure of association. Biometrika, 78(1), 153-160. Lovison, G. (2006). A matrix-valued Bernoulli distribution. Journal of Multivariate Analysis, 97(7), 1573-1585. Marshall, A. W., & Olkin, I. (1985). A family of bivariate distributions generated by the bivariate Bernoulli distribution. Journal of the American Statistical Association, 80(390), 332-338 McCullagh, P., & Nelder, J. A. (1989). Generalized Linear Models, no. 37 in Monograph on Statistics and Applied Probability. Molenberghs, G., & Lesaffre, E. (1994). Marginal modeling of correlated ordinal data using a multivariate Plackett distribution. Journal of the American Statistical Association, 89(426), 633-644. Muenz, L. R., & Rubinstein, L. V. (1985). Markov models for covariate dependence of binary sequences. Biometrics, 91-101. Prentice, R. L. (1988). Correlated binary regression with covariates specific to each binary observation. Biometrics, 1033-1048. Quinn, K. (2001). The newton raphson algorithm for function optimization. University of Washington, Seattle. Southern Africa Labour and Development Research Unit. National Income Dynamics Study 2010-2011, Wave 2 [dataset]. Version 3.0. Cape Town: Southern Africa Labour and Development Research Unit [producer], 2015. Cape Town: DataFirst [distributor], 2015 University of Ghana http://ugspace.ug.edu.gh 71 Southern Africa Labour and Development Research Unit. National Income Dynamics Study 2008, Wave 1 [dataset]. Version 6.0. Cape Town: Southern Africa Labour and Development Research Unit [producer], 2015. Cape Town: DataFirst [distributor], 2015 Sun, Z., Rosen, O., & Sampson, A. R. (2007). Multivariate Bernoulli mixture models with application to postmortem tissue studies in schizophrenia. Biometrics, 63(3), 901- 909. Wilson, J. R., & Lorenz, K. A. (2015). Modeling Binary Correlated Responses Using Sas, Spss and R (Vol. 9). Springer. Zeger, S. L., & Liang, K. Y. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 121-130. Zhao, L. P., & Prentice, R. L. (1990). Correlated binary regression using a quadratic exponential model. Biometrika, 77(3), 642-648. University of Ghana http://ugspace.ug.edu.gh 72 APPENDIX A: Simulation tables Table A1a: Results of estimated models from 1000 simulations with sample of size of 50: Exploratory statistics Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P   1 2Avg.Cor ,Y Y  sd in 0.2 -0.30(0.14) 0.1 0.3 0.01  1Avg.Cor ,Y X  sd in 0.2 -0.29 (0.13) 0.1 0.3 0  2Avg.Cor ,Y X  sd in 0.2 -0.30(0.14) 0.1 0.31 0 Avg.Odds Ratio  1 2 ,Y Y 2.57 0.98 3.87 1.32 1.60 Marshall – Olkin correlation  1 2 ,Y Y 0.21 -0.005 0.20 0.03 0.12 Marshall – Olkin correlation  2 1, / 0Y X Y  0.003 -0.03 0.002 0.01 0.003 Marshall –Olkin correlation 2 1( , / 1)Y X Y  -0.003 -.0002 -0.004 -0.004 -0.04 Marshall – Olkin correlation  1 2/ 1Y X Y , 0.003 -0.06 -0.001 -0.003 -0.005 Marshall –Olkin correlation  1 2, / 0Y X Y  5.25^-0.6 -0.07 0.001 1.02^-05 -0.0002 Dependence value based on 3 0.94 -0.02 1.35 0.13 0.47 University of Ghana http://ugspace.ug.edu.gh 73 Table A1b: Results of estimated models from 1000 simulations with sample of size of 50: Regression model Regressive model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  . 1Av  % 0.05of p  1.17 -2.39 0.40 1.01 0.86 0.05 60 0 20 100 .Av  % 0.05of p  0.99 -2.18 1.40 1.27 1.23 15 40 0 40 100 2.Av  : Wald test % 0.05of p  4.8 16.6 2.28 8.1 5.8 20 100 0 50 40 Table A1c: Results of estimated models from 1000 simulations with sample of size of 50: Conditional models Conditional models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  Conditional 1Y = 0 01  % 0.05of p  1.68 -2.5 0.76 1.08 -0.29 22 87 0 20 11 :Wald test % 0.05of p  33 100 20 46 10 Conditional 1Y = 1 11  1.61 -1.96 -0.23 1.21 -0.11 University of Ghana http://ugspace.ug.edu.gh 74 % 0.05of p  22 58 0 30 0 :Wald test % 0.05of p  40 100 19 56 5 Table A1d: Results of estimated models from 1000 simulations with sample of size of 50: Marginal model Marginal model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  . 1( . )VA s e :Wald test % 0.05of p  0.47(0.87) 0.13(0.79) 0.45( 0.64) -0.04(0.43 ) -0.06 0 0 14 0 0 Table A1e: Results of estimated models from 1000 simulations with sample of size of 50: Bivariate Bernoulli model Bivariate Bernoulli model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0 1 2( 0.2, 0.6)P P  11  % 0.05of p  5.9 0.49 -0.48 0.90 0.68 90 90 90 90 90 01  2.81 1.63 3.66 1.45 2.92 % 0.05of p  . 1VA  1.4 1.19 0.57 1.16 0.97 % 0.05of p  Likelihood ratio test (proposed model and regressive model) p<0.00001 (result is significant at 5% level of significance) University of Ghana http://ugspace.ug.edu.gh 75 Likelihood ratio test (proposed model and conditional 1Y = 0 ) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 1 ) p<0.00001 (result is significant at 5% level of significance) Table A2a: Results of estimated models from 1000 simulations with sample of size of 100: Exploratory statistics Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P   1 2Avg.Cor ,Y Y  sd in 0.2 -0.30 0.1 0.3 0  1Avg.Cor ,Y X  sd in 0.2 -0.30 0.1 0.3 0  2Avg.Cor ,Y X  sd in 0.2 -0.30 0.1 0.3 0 Avg.Odds Ratio  1 2 ,Y Y 2.67 1.005 3.77 1.13 1.6 Marshall – Olkin correlation  1 2 ,Y Y 0.21 0.002 0.20 0.03 0.12 Marshall – Olkin correlation  2 1, / 0Y X Y  0.003 -0.003 0.001 0.002 0.002 Marshall –Olkin correlation 2 1( , / 1)Y X Y  0.006 -0.0008 0.003 0.001 -1.61^-0.5 Marshall – Olkin correlation  1 2/ 1Y X Y , -0.003 0.002 -0.0006 -0.005 0 Marshall –Olkin correlation  1 2, / 0Y X Y  0.003 -0.001 0.0007 1.03^-05 0.003 University of Ghana http://ugspace.ug.edu.gh 76 Table A2b: Results of estimated models from 1000 simulations with sample of size of 100: Regression model Table A2c: Results of estimated models from 1000 simulations with sample of size of 100: Conditional models 3Dependence value based on  0.98 0.005 1.33 0.12 0.47 Regressive Model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1VA  % 0.05of p  0.70 -2.57 0.64 0.95 1.26 0.05 95 0 50 22 .Av  % 0.05of p  1.59 -2.61 1.44 1.03 1.17 55 95 0 50 25 2.Av  : Wald test % 0.05of p  7.5 28.7 3.5 18.9 6.5 53 100 20 100 70 Conditional models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  Conditional 1Y = 0 01  % 0.05of p  1.36 -2.68 1.08 1.14 -2.77 33 100 0 50 0 University of Ghana http://ugspace.ug.edu.gh 77 Table A2d: Results of estimated models from 1000 simulations with sample of size of 100: Marginal model Table A2e: Results of estimated models from 1000 simulations with sample of size of 100: Bivariate Bernoulli model :Wald test % 0.05of p  53 100 75 86 0 Conditional 1Y = 1 11  1.03 -1.98 1.22 1.39 -0.09 % 0.05of p  33 70 0 80 12 :Wald test % 0.05of p  72 100 16 90 3 Marginal model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1( . )Av s e :Wald test % 0.05of p  0.18(0.56) 0.09(0.57) 0.23(0.43) 0.009(0.29) -0.04(0.55) 0 0 0 0 0 Bivariate Bernoulli model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  11  % 0.05of p  2.10 3.9 1.6 0.10 2.5 80 University of Ghana http://ugspace.ug.edu.gh 78 Table A3a: Results of estimated models from 1000 simulations with sample of size of 200: Exploratory statistics 01  % 0.05of p  3.56 2.14 1.8 0.56 1.2 . 1VA  2.05 1.4 2.3 0.58 -0.8 % 0.05of p  Likelihood ratio test (proposed model and regressive model) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 0 ) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 1 ) p<0.00001 (result is significant at 5% level of significance) Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P   1 2Avg.Cor ,Y Y  sd in 0.2 -0.30(0.07) 0.1 0.3 0  1Avg.Cor ,Y X  sd in 0.2 -0.30(0.07) 0.1 0.3 0  2Avg.Cor ,Y X  sd in 0.2 -0.30(0.07) 0.1 0.3 0 Avg.Odds Ratio  1 2 ,Y Y 2.68 0.99 3.93 1.16 1.6 Marshall – Olkin correlation  1 2 ,Y Y 0.22 0.001 0.20 0.04 0.12 Marshall – Olkin correlation -0.003 0.002 0.001 0.003 0.002 University of Ghana http://ugspace.ug.edu.gh 79 Table A3b: Results of estimated models from 1000 simulations with sample of size of 200: Regressive model  2 1, / 0Y X Y  Marshall –Olkin correlation 2 1( , / 1)Y X Y  0.0003 -0.002 -0.001 -0.001 -0.001 Marshall – Olkin correlation  1 2/ 1Y X Y , -0.0009 0.0006 -0.001 0.0003 -3.9^-06 Marshall –Olkin correlation  1 2, / 0Y X Y  -0.006 -0.001 -0.003 -0.004 -0.003 3Dependence value based on  0.99 -0.01 1.37 0.15 0.47 Regressive model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1VA  % 0.05of p  1.35 -2.60 0.77 0.96 0.13 60 100 35 75 0 .Av  % 0.05of p  1.22 -1.53 1.39 1.25 0.17 0.55 100 0 75 0 2.Av  : Wald test % 0.05of p  15.9 55.4 8.5 32.3 2.8 86 100 47 100 15 University of Ghana http://ugspace.ug.edu.gh 80 Table A3c: Results of estimated models from 1000 simulations with sample of size of 200- Conditional model Conditional model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  Conditional 1Y = 0 01  % 0.05of p  1.43 -2.77 1.18 1.06 0.23 80 100 0 100 0 :Wald test % 0.05of p  69 100 50 100 25 Conditional 1Y = 1 11  % 0.05of p  1.19 -2.42 1.06 1.22 0.23 70 100 0 100 22 :Wald test % 0.05of p  96 100 50 40 5 Table A3d: Results of estimated models from 1000 simulations with sample of size of 200: Marginal model Marginal Model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1( . )Av s e 0.37(0.35) -0.22(0.4) 0.08( 0.30) 0.02(0.21) 0.35(0.48) University of Ghana http://ugspace.ug.edu.gh 81 :Wald test % 0.05of p  28 0 0 0 14 Table A3e: Results of estimated models from 1000 simulations with sample of size of 200: Bivariate Bernoulli model Bivariate Bernoulli model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  11  3.31 2.4 0.6 1.20 0.17 % 0.05of p  01  % 0.05of p  1.42 3.17 1.2 0.49 1.69 . 1VA  1.72 0.81 0.6 0.84 0.72 Likelihood ratio test (proposed model and regressive model) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 0 ) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 1 ) p<0.00001 (result is significant at 5% level of significance) University of Ghana http://ugspace.ug.edu.gh 82 Table A4a: Results of estimated models from 1000 simulations with sample of size of 500: Exploratory statistics Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P   1 2Avg.Cor ,Y Y  sd in 0.2 -0.30(0.04) 0.1 0.3 0  1Avg.Cor ,Y X  sd in 0.2 -0.30(0.04) 0.1 0.3 0  2Avg.Cor ,Y X  sd in 0.2 -0.30(0.04) 0.1 0.3 0 Avg.Odds Ratio  1 2 ,Y Y 2.74 1.01 3.99 1.16 1.6 Marshall – Olkin correlation  1 2 ,Y Y 0.22 0.002 0.20 0.04 0.12 Marshall – Olkin correlation  2 1, / 0Y X Y  -0.003 0.002 0.0005 0.001 0.002 Marshall –Olkin correlation 2 1( , / 1)Y X Y  -0.002 -0.0005 0.0001 -0.0005 0.002 Marshall – Olkin correlation  1 2/ 1Y X Y , -0.001 0.0009 1.61^-0.05 -0.001 0.002 Marshall –Olkin correlation  1 2, / 0Y X Y  -0.002 -0.002 -.001 -0.001 0 3Dependence value based on  1.01 0.01 1.38 0.15 0.47 University of Ghana http://ugspace.ug.edu.gh 83 Table A4b: Results of estimated models from 1000 simulations with sample of size of 500: Regressive model Table A4c: Results of estimated models from 1000 simulations with sample of size of 500: Conditional models Conditional models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  Conditional 1Y = 0 01  1.09 -2.33 1.15 1.19 -0.27 % 0.05of p  77 100 28 100 38 :Wald test % 0.05of p  100 100 85 100 20 Conditional 1Y = 1 11  1.34 -2.53 1.51 1.11 1.5 Regressive model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1VA  % 0.05of p  1.42 -2.63 0.7 0.94 -0.29 90 100 60 100 11 .Av  % 0.05of p  1.12 -2.67 1.18 1.19 -0.09 85 100 25 100 0 2.Av  : Wald test % 0.05of p  32.9 146.7 16.7 75.1 1.96 100 100 93 100 15 University of Ghana http://ugspace.ug.edu.gh 84 % 0.05of p  100 100 66 100 10 :Wald test % 0.05of p  100 100 94 100 0 Table A4d: Results of estimated models from 1000 simulations with sample of size of 500: Marginal model Marginal Model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1( . )Av s e -0.65(0.24) -0.02(0.25) -0.04(0.18) 0.03(0.13) 0.28(0.49) :Wald test % 0.05of p  0 0 0 0 0 Table A4e: Results of estimated models from 1000 simulations with sample of size of 500: Bivariate Bernoulli model Bivariate Bernoulli Model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  11  2.14 2.99 3.21 1.98 0.99 % 0.05of p  5 5 01  4.2 1.36 0.24 0.60 1.53 % 0.05of p  . 1VA  1.53 1.51 1.86 0.12 0.24 University of Ghana http://ugspace.ug.edu.gh 85 % 0.05of p  Likelihood ratio test (proposed model and regressive model) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 0 ) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 1 ) p<0.00001 (result is significant at 5% level of significance) Table A5a: Results of estimated models from 1000 simulations with sample of size of 1000: Exploratory statistics Models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P   1 2Avg.Cor ,Y Y  sd in 0.2 -0.30(0.03) 0.1 0.3 0  1Avg.Cor ,Y X  sd in 0.2 -0.30(0.03) 0.1 0.3 0  2Avg.Cor ,Y X  sd in 0.2 -0.30(0.03) 0.1 0.3 0 Avg.Odds Ratio  1 2 ,Y Y 2.75 1.00 3.97 1.16 1.62 Marshall – Olkin correlation  1 2 ,Y Y 0.22 0.002 0.20 0.04 0.12 Marshall – Olkin correlation  2 1, / 0Y X Y  0.0008 -0.0004 -0.0008 0.001 0 Marshall –Olkin correlation 2 1( , / 1)Y X Y  -0.003 -0.0009 -0.0008 -0.001 -0.001 University of Ghana http://ugspace.ug.edu.gh 86 Marshall – Olkin correlation  1 2/ 1Y X Y , 0.0005 -0.002 0.0001 9.04^-05 0.001 Marshall –Olkin correlation  1 2, / 0Y X Y  -0.0005 -0.0003 -0.001 0.0003 0 3Dependence value based on  1.01 0.0001 1.38 0.15 0.48 Table A5b: Results of estimated models from 1000 simulations with sample of size of 1000: Regressive model Regressive model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1Av  % 0.05of p  1.40 2.60 0.63 0.94 (used 800) 0.09 100 100 80 100 20 .Av  % 0.05of p  1.19 -2.63 1.15 1.20 0.03 100 100 100 100 10 2.Av  : Wald test % 0.05of p  56.8 Out of 15.7 Out of 2.6 100 100 15 Table A5c: Results of estimated models from 1000 simulations with sample of size of 1000: Conditional models Conditional models Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  Conditional 1Y = 0 01  1.18 -2.71 1.06 1.14 0.09 University of Ghana http://ugspace.ug.edu.gh 87 % 0.05of p  100 100 100 100 20 Conditional 1Y = 1 11  1.25 -2.67 1.40 1.24( used 800 0.02 % 0.05of p  100 100 100 100 0 :Wald test % 0.05of p  100 100 100 Run out of iterations-Did not converge 0 Table A5d: Results of estimated models from 1000 simulations with sample of size of 1000: Marginal model Marginal model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  . 1( . )Av s e -0.2(0.18) -0.06(0.18) 0.16(0.13) 0.03(0.09) -0.01(0.1) :Wald test % 0.05of p  0 0 29 0 0 Table A5e: Results of estimated models from 1000 simulations with sample of size of 1000: Bivariate Bernoulli model Bivariate Bernoulli model Estimates 1 ȡ 0.2 1 2( 0.1, 0.1)P P  2 ȡ 0. 1 2( 0.5, 0.5)P P  3 ȡ 0.1 1 2( 0.1, 0.8)P P  4 ȡ 0. 1 2( 0.5, 0.3)P P  5 ȡ 0  1 20.2, 0.6P P  11  3.69 1.32 0.38 1.07 1.91 % 0.05of p  5 01  -0.46 1.50 1..46 0.61 0.22 % 0.05of p  . 1VA  2.74 1.44 -0.03 0.51 1.91 University of Ghana http://ugspace.ug.edu.gh 88 % 0.05of p  Likelihood ratio test (proposed model and regressive model) p<0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 0 ) p <0.00001 (result is significant at 5% level of significance) Likelihood ratio test (proposed model and conditional 1Y = 1 ) p<0.00001 (result is significant at 5% level of significance) University of Ghana http://ugspace.ug.edu.gh 89 APPENDIX B: Codes used in data analysis B1:Codes used for simulation (Sample size 50 only) ############################################# R CODES ---SIMULATION EXERCISE--- R CODES ############################################ Y1 ~ Bernoulli (p1) Y2 ~ Bernoulli (p2) ###################################### Y1 and Y2 are correlated with correlation \rho ###################################### library (bindata) set.seed(1234) n=50 p1=0.1 p2=0.1 p3=0.5 rho=0.2 ################################################ Generating correlated multivariate binary random variables ################################################ count=0 corY1.Y2=0 corY1.X=0 corY2.X=0 while (count<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) column1 <- dfcmdata[,1:1] column1 column2 <- dfcmdata[,2:2] column2 column3 <- dfcmdata[,3:3] column3 corY1.Y2[count]<-cor(column1,column2) corY1.X[count]<-cor(column1,column3) corY2.X[count]<-cor(column2,column3) print(dfcmdata) print(corY1.Y2) print(corY1.X) print(corY2.X) count=count+1 corY1.Y2[count]=corY1.Y2[count]+0 corY1.X[count]=corY1.X[count]+0 corY2.X[count]=corY1.X[count]+0 } University of Ghana http://ugspace.ug.edu.gh 90 ############################################################# Calculating the true correlation between Y1 and Y2 for various sample sizes ############################################### corY1.Y2 require(psych) describeBy(corY1.Y2) ############################################################# Calculating the true correlation between Y1 and X for various sample sizes ############################################################# corY1.X describeBy(corY1.X) ############################################################# Calculating the true correlation between Y1 and X for various sample sizes ############################################################# corY2.X describeBy(corY2.X) ################################################################ Counting number of (Y1=0,Y2=0),(Y1=0,Y2=1) ,(Y1=1,Y2=0), (Y1=1,Y2=1) ################################################################ library (bindata) set.seed(1234) n=50 p1=0.1 p2=0.1 p3=0.5 rho=0.2 ############################### Counting the number of (Y1=0,Y2=0) ############################### counter=0 N00=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) CountMydata<- function(numrow) { mymatrix <- matrix(c(cmdata),ncol=3,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]== "0") { University of Ghana http://ugspace.ug.edu.gh 91 count = count + 1 } } return(count) } N00[counter]<-CountMydata(numrow) print(cmdata) print(N00) counter=counter+1 N00[counter]=N00[counter]+0 } print(N00) ############################ Average number of (Y1=0,Y2=0) ############################ print(N00) require(psych) describeBy(N00) ############################### Counting the number of (Y1=0,Y2=1) ############################### counter=0 N01=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) CountMydata<- function(numrow) { mymatrix <- matrix(c(cmdata),ncol=3,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]== "1") { count = count + 1 } } return(count) } N01[counter]<-CountMydata(numrow) print(cmdata) print(N01) University of Ghana http://ugspace.ug.edu.gh 92 counter=counter+1 N01[counter]=N01[counter]+0 } print(N01) ############################ Average number of (Y1=0,Y2=1) ############################ print(N01) require(psych) describeBy(N01) ############################### Counting the number of (Y1=0,Y2=1) ############################### counter=0 N10=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) CountMydata<- function(numrow) { mymatrix <- matrix(c(cmdata),ncol=3,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]== "0") { count = count + 1 } } return(count) } N10[counter]<-CountMydata(numrow) print(cmdata) print(N10) counter=counter+1 N10[counter]=N10[counter]+0 } print(N10) University of Ghana http://ugspace.ug.edu.gh 93 ########################### Average number of (Y1=0,Y2=1) ########################### print(N10) require(psych) describeBy(N10) ############################### Counting the number of (Y1=1,Y2=1) ############################### counter=0 N11=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) CountMydata<- function(numrow) { mymatrix <- matrix(c(cmdata),ncol=3,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]== "1") { count = count + 1 } } return(count) } N11[counter]<-CountMydata(numrow) print(cmdata) print(N11) counter=counter+1 N11[counter]=N11[counter]+0 } print(N11) #################### Average number of N11 #################### print(N11) require(psych) describeBy(N11) describeBy(N00) describeBy(N01) University of Ghana http://ugspace.ug.edu.gh 94 describeBy(N10) describeBy(N11) N00av<-31.12 N01av<-7.49 N10av<-7.04 N11av<-4.35 N00av/(N00av+N01av) N01av/(N00av+N01av) ######################### Transition Probabilities matrix ######################### Transitionprob<- matrix(c(N00av/(N00av+N01av),N01av/(N00av+N01av),N10av/(N10av+N11av),N11av/(N10av+N11av) ),nrow=2,byrow=TRUE) Transitionprob ############################### Labelling the Transitional probabilities ############################### dimnames(Transitionprob) <- list(c("0","1"),c("0","1")) Transitionprob ##################### Calculating the odds ratio ##################### Transitionprob.test <- prop.test(Transitionprob) Transitionprob.test$estimate odds <- Transitionprob.test$estimate/(1-Transitionprob.test$estimate) odds odds[1]/odds[2] oddsratio <-(Transitionprob[1,1]*Transitionprob[2,2])/(Transitionprob[2,1]*Transitionprob[1,2]) oddsratio ############################# Calculating odds ratio: Alternatively ############################# myoddsratio<-(N00av*N11av)/(N01av*N10av) myoddsratio ################################################## Calculating Marshall–Olkin correlation estimator (Y1 and Y2) ################################################## MO.cor.num<-(N11av/(N10av+N11av))*(N00av/(N00av+N01av))- (N10av/(N10av+N11av))*(N01av/(N01av+N00av)) University of Ghana http://ugspace.ug.edu.gh 95 P00<-(N00av/(N00av+N01av)) P01<-(N01av/(N01av+N00av)) P10<-(N10av/(N10av+N11av)) P11<-(N11av/(N10av+N11av)) P1<-P00+P01 P1 P2<-P10+P11 P3<-P00+P10 P4<-P01+P11 M0.cor.den<-sqrt(P1*P2*P3*P4) M0.cor.den MO.cor<-MO.cor.num/M0.cor.den MO.cor ################################### Correlation between Y2 and X given Y1=0 ################################### ww.sb<-subset(dfcmdata,X1==0) ww.sb column2 <- ww.sb[,2:2] column2 column3 <- ww.sb[,3:3] column3 cor(column2,column3) ########################################################## Correlation between Y2 and X given Y1=0;Using MO....process begins ########################################################## ############################## Counting the number of (Y2=0,X=0) ############################## counter=0 N00=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) University of Ghana http://ugspace.ug.edu.gh 96 mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N00[counter]<-CountMydata(numrow) print(mymatrix) print(N00) counter=counter+1 N00[counter]=N00[counter]+0 } ########################### Average number of (Y2=0,X=0) ########################## print(N00) require(psych) describeBy(N00) ############################### Counting the number of (Y2=0,X=1) ############################### counter=0 N01=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) University of Ghana http://ugspace.ug.edu.gh 97 for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N01[counter]<-CountMydata(numrow) print(mymatrix) print(N01) counter=counter+1 N01[counter]=N01[counter]+0 } ########################### Average number of (Y2=0,X=1) ########################## print(N01) require(psych) describeBy(N01) ############################### Counting the number of (Y2=1,X=0) ############################### counter=0 N10=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="0") { count = count + 1 University of Ghana http://ugspace.ug.edu.gh 98 } } return(count) } N10[counter]<-CountMydata(numrow) print(mymatrix) print(N10) counter=counter+1 N10[counter]=N10[counter]+0 } ########################## Average number of (Y2=1,X=0) ########################## print(N10) require(psych) describeBy(N10) ############################## Counting the number of (Y2=1,X=1) ############################## counter=0 N11=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } University of Ghana http://ugspace.ug.edu.gh 99 N11[counter]<-CountMydata(numrow) print(mymatrix) print(N11) counter=counter+1 N11[counter]=N11[counter]+0 } ########################## Average number of (Y2=1,X=1) ########################## print(N11) require(psych) describeBy(N11) describeBy(N00) describeBy(N01) describeBy(N10) describeBy(N11) N00a<-4.34 N01a<-4.32 N10a<-4.34 N11a<-4.38 ######################################################### #Correlation between Y2 and X given Y1=0;Using MO....process ends ######################################################### MO.cor.numa<-(N11a/(N10a+N11a))*(N00a/(N00a+N01a))-(N10a/(N10a+N11a))*(N01a/(N01a+N00a)) P00a<-(N00a/(N00a+N01a)) P01a<-(N01a/(N01a+N00a)) P10a<-(N10a/(N10a+N11a)) P11a<-(N11a/(N10a+N11a)) P1a<-P00a+P01a P2a<-P10a+P11a P3a<-P00a+P10a P4a<-P01a+P11a M0.cor.dena<-sqrt(P1a*P2a*P3a*P4a) M0.cor.dena MO.cora<-MO.cor.numa/M0.cor.dena MO.cora ########################################################### #Correlation between Y2 and X given Y1=1;Using MO.... process begins ########################################################### University of Ghana http://ugspace.ug.edu.gh 100 ############################## Counting the number of (Y2=0,X=0) ############################## counter=0 N00=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N00[counter]<-CountMydata(numrow) print(mymatrix) print(N00) counter=counter+1 N00[counter]=N00[counter]+0 } ############################ #Average number of (Y2=0,X=0) ############################ print(N00) require(psych) describeBy(N00) ######################## Counting the number of N01 ####################### counter=0 N01=0 while (counter<=1000) University of Ghana http://ugspace.ug.edu.gh 101 { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N01[counter]<-CountMydata(numrow) print(mymatrix) print(N01) counter=counter+1 N01[counter]=N01[counter]+0 } ########################### #Average number of (Y2=0,X=1) ########################### print(N01) require(psych) describeBy(N01) ############################## Counting the number of (Y2=1,X=0) ############################## counter=0 N10=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==1) ww.sb University of Ghana http://ugspace.ug.edu.gh 102 mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N10[counter]<-CountMydata(numrow) print(mymatrix) print(N10) counter=counter+1 N10[counter]=N10[counter]+0 } ########################## Average number of (Y2=1,X=0) ########################## print(N10) require(psych) describeBy(N10) ############################## Counting the number of (Y2=1,X=1) ############################## counter=0 N11=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X1==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-1)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) University of Ghana http://ugspace.ug.edu.gh 103 mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N11[counter]<-CountMydata(numrow) print(mymatrix) print(N11) counter=counter+1 N11[counter]=N11[counter]+0 } ########################## Average number of (Y2=1,X=1) ########################## print(N11) require(psych) describeBy(N11) describeBy(N00) describeBy(N01) describeBy(N10) describeBy(N11) N00b<-4.46 N01b<-4.4 N10b<-4.41 N11b<-4.3 ######################################################### Correlation between Y2 and X given Y1=1;Using MO..... process ends ######################################################### MO.cor.num.b<-(N11b/(N10b+N11b))*(N00b/(N00b+N01b))- (N10b/(N10b+N11b))*(N01b/(N01b+N00b)) P00b<-(N00b/(N00b+N01b)) P01b<-(N01b/(N01b+N00b)) P10b<-(N10b/(N10b+N11b)) P11b<-(N11b/(N10b+N11b)) P1b<-P00b+P01b P2b<-P10b+P11b P3b<-P00b+P10b University of Ghana http://ugspace.ug.edu.gh 104 P4b<-P01b+P11b M0.cor.den.b<-sqrt(P1b*P2b*P3b*P4b) M0.cor.den.b MO.cor.b<-MO.cor.num.b/M0.cor.den.b MO.cor.b ########################################################## Correlation between Y1 and X given Y2=1;Using MO....process begins ########################################################## ############################## Counting the number of (Y1=0,X=0) ############################## counter=0 N00=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N00[counter]<-CountMydata(numrow) print(mymatrix) print(N00) counter=counter+1 N00[counter]=N00[counter]+0 } ########################## Average number of(Y1=0,X=0) ########################## University of Ghana http://ugspace.ug.edu.gh 105 print(N00) require(psych) describeBy(N00) ############################## Counting the number of (Y1=0,X=1) ############################## counter=0 N01=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N01[counter]<-CountMydata(numrow) print(mymatrix) print(N01) counter=counter+1 N01[counter]=N01[counter]+0 } ########################### Average number of (Y1=0,X=1) ########################### print(N01) require(psych) describeBy(N01) University of Ghana http://ugspace.ug.edu.gh 106 ############################## Counting the number of (Y1=0,X=0) ############################## counter=0 N10=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N10[counter]<-CountMydata(numrow) print(mymatrix) print(N10) counter=counter+1 N10[counter]=N10[counter]+0 } ########################### Average number of (Y1=0,X=0) ########################## print(N10) require(psych) describeBy(N10) ############################## Counting the number of (Y1=1,X=1) ############################## counter=0 N11=0 University of Ghana http://ugspace.ug.edu.gh 107 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==1) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N11[counter]<-CountMydata(numrow) print(mymatrix) print(N11) counter=counter+1 N11[counter]=N11[counter]+0 } ########################### #Average number of (Y1=1,X=1) ########################### print(N11) require(psych) describeBy(N11) describeBy(N00) describeBy(N01) describeBy(N10) describeBy(N11) N00c<-4.39 N01c<-4.38 N10c<-4.32 University of Ghana http://ugspace.ug.edu.gh 108 N11c<-4.37 ######################################## ################ Correlation between Y1 and X given Y2=1;Using MO.... process ends ######################################################### MO.cor.num.c<-(N11c/(N10c+N11c))*(N00c/(N00c+N01c))- (N10c/(N10c+N11c))*(N01c/(N01c+N00c)) P00c<-(N00c/(N00c+N01c)) P01c<-(N01c/(N01c+N00c)) P10c<-(N10c/(N10c+N11c)) P11c<-(N11c/(N10c+N11c)) P1c<-P00c+P01c P2c<-P10c+P11c P3c<-P00c+P10c P4c<-P01c+P11c M0.cor.den.c<-sqrt(P1c*P2c*P3c*P4c) M0.cor.den.c MO.cor.c<-MO.cor.num.c/M0.cor.den.c MO.cor.c ########################################################## #Correlation between Y1 and X given Y2=0;Using MO....process begins ########################################################## ############################## Counting the number of (Y1=0,X=0) ############################## counter=0 N00=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="0") { count = count + 1 University of Ghana http://ugspace.ug.edu.gh 109 } } return(count) } N00[counter]<-CountMydata(numrow) print(mymatrix) print(N00) counter=counter+1 N00[counter]=N00[counter]+0 } ########################## Average number of (Y1=0,X=0) ########################## print(N00) require(psych) describeBy(N00) ############################## Counting the number of (Y1=0,X=1) ############################## counter=0 N01=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="0"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N01[counter]<-CountMydata(numrow) print(mymatrix) University of Ghana http://ugspace.ug.edu.gh 110 print(N01) counter=counter+1 N01[counter]=N01[counter]+0 } ############################ #Average number of (Y1=0,X=1) ########################### print(N01) require(psych) describeBy(N01) ############################## Counting the number of (Y1=0,X=0) ############################## counter=0 N10=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="0") { count = count + 1 } } return(count) } N10[counter]<-CountMydata(numrow) print(mymatrix) print(N10) counter=counter+1 N10[counter]=N10[counter]+0 } University of Ghana http://ugspace.ug.edu.gh 111 ############################ #Average number of (Y1=0,X=0) ############################ print(N10) require(psych) describeBy(N10) ############################### Counting the number of (Y1=1,X=1) ############################### counter=0 N11=0 while (counter<=1000) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) ww.sb<-subset(dfcmdata,X2==0) ww.sb mydata.mat<-as.matrix(ww.sb[,c(-2)]) mydata.mat CountMydata1<- function(numrow) { mymatrix <- matrix(c(mydata.mat),ncol=2,byrow=TRUE) mymatrix count = 0 numrow <- nrow(mymatrix) for (i in 1:numrow) { if(mymatrix[i,1]=="1"& mymatrix[i,2]=="1") { count = count + 1 } } return(count) } N11[counter]<-CountMydata(numrow) print(mymatrix) print(N11) counter=counter+1 N11[counter]=N11[counter]+0 } ########################## Average number of (Y1=1,X=1) ########################## print(N11) require(psych) University of Ghana http://ugspace.ug.edu.gh 112 describeBy(N11) describeBy(N00) describeBy(N01) describeBy(N10) describeBy(N11) N00d<-4.38 N01d<-4.34 N10d<-4.39 N11d<-4.35 ######################################################## Correlation between Y2 and X given Y1=0;Using MO.... process ends ######################################################## MO.cor.numd<-(N11d/(N10d+N11d))*(N00d/(N00d+N01d))- (N10d/(N10d+N11d))*(N01d/(N01d+N00d)) P00d<-(N00d/(N00d+N01d)) P01d<-(N01d/(N01d+N00d)) P10d<-(N10d/(N10d+N11d)) P11d<-(N11d/(N10d+N11d)) P1d<-P00d+P01d P2d<-P10d+P11d P3d<-P00d+P10d P4d<-P01d+P11d M0.cor.dend<-sqrt(P1d*P2d*P3d*P4d) M0.cor.dend MO.cor.d<-MO.cor.numd/M0.cor.dend MO.cor.d ######################################################### #Checking dependence betwwen Y1 and Y2 (if 0,then no dependence) ######################################################### n<-(P00*P11)/(P01*P10) dependence.value<-log(n) dependence.value ############################################ #Conditional logistic regression Y2 conditioned on Y1 ############################################ library (bindata) n=50 p1=0.1 p2=0.1 p3=0.5 rho=0.2 counter=0 University of Ghana http://ugspace.ug.edu.gh 113 while (counter<=5) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) counter=counter+0 library(survival) rmodel<-clogit(X1~X2+X3,data=dfcmdata) print(model) } logLik(rmodel) ############### Conditional Y1=0 ############### library (bindata) n=50 p1=0.1 p2=0.1 p3=0.5 rho=0.2 counter=0 while (counter<=5) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) counter=counter+0 numrow <- nrow(dfcmdata) for (i in 1:numrow) { if(dfcmdata[i,1]=="0"& dfcmdata[i,2]== "1") { b<-dfcmdata[dfcmdata[,1]=="0"& dfcmdata[,2]== "1",] library(survival) c1model<-clogit(X1~X2+X3,data=dfcmdata) print(c1model) } } } logLik(c1model) ############## Conditional Y1=1 ############## library (bindata) n=50 University of Ghana http://ugspace.ug.edu.gh 114 p1=0.1 p2=0.1 p3=0.5 rho=0.2 counter=0 while (counter<=5) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) counter=counter+0 numrow <- nrow(dfcmdata) for (i in 1:numrow) { if(dfcmdata[i,1]=="1"& dfcmdata[i,2]== "1") { b<-dfcmdata[dfcmdata[,1]=="0"& dfcmdata[,2]== "1",] library(survival) c2model<-clogit(X1~X2+X3,data=dfcmdata) print(c2model) } } } logLik(c2model) ########################## Marginal Model: process begins ########################## library (bindata) library(reshape) n=50 p1=0.1 p2=0.1 p3=0.5 rho=0.2 counter=0 while (counter<=5) { cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata ys<-cmdata[,-3] s<-data.frame(t(ys)) melt(s) counter=counter+0 ############################# Generate random binary variable(x) University of Ghana http://ugspace.ug.edu.gh 115 ############################# rnum<-rnorm(100, mean = 0, sd = 1) rnum ##################### #Convert to binary values ##################### X<-ra2ba(rnum) X clmx<-melt(X) dfclmx<-data.frame(clmx) ################## #combine sample data ################## sampledata<-cbind(melt(s),dfclmx) sampledata colnames(sampledata)[1]<-"ys" colnames(sampledata)[2]<-"biresp" colnames(sampledata)[3]<-"cvr" sampledata tm<-1:2 melt(tm) time<-replicate(10,tm) melt(time) t2<-melt(time)[-1] ftime<-t2[-1] colnames(ftime)<-"time" fsampledata<-cbind(sampledata,ftime) counter=counter+0 ########################### Marginal modelling the GEE way ########################### library(MASS) library(geepack) geeglm.out<-geeglm(formula=biresp~cvr+time,data=fsampledata,family=binomial,id=ys) summary(geeglm.out) } p<-print(summary(geeglm.out)) logLik(geeglm.out) ####################################### Fitting the generalized bivariate Bernoulli model ####################################### library(maxLik) library (bindata) n=50 University of Ghana http://ugspace.ug.edu.gh 116 p1=0.1 p2=0.1 p3=0.5 rho=0.2 cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) colnames(dfcmdata)[1]<-"Y1" colnames(dfcmdata)[2]<-"Y2" colnames(dfcmdata)[3]<-"X" head(dfcmdata) log.likelihood <- 0 log.likelihood <- function(param) { B01<-param[1] B11<-param[2] B1<-param[3] N <- length(dfcmdata) N1 <- length("Y1") N2 <- length("Y2") Y1<-dfcmdata[,1] Y2<-dfcmdata[,2] X<-dfcmdata[,3] for (i in 1:length(dfcmdata)) { log.likelihood<- -sum(log(1+(X*exp(B01))))- sum(log(1+(X*exp(B1))))+sum((Y1*X*B1))+sum(Y1*log(1+(X*exp(B01))))- sum(Y1*log(1+(X*exp(B11))))+sum(Y2*X*B01)+sum((Y1*Y2*X*B11)-sum(Y1*Y2*X*B01)) result<-return(log.likelihood) } } mle <- maxLik(logLik =log.likelihood , start = c(B11=0.03,B1=0.013,B01=0.006)) summary(mle) mleNM <- maxLik(logLik = log.likelihood , start = c(B1=0.1,B11=4.03,B01=0.6), method = "NM") summary(mleNM) ############################################# Likelihood ratio test.....Fixing the estimated parameters ############################################# library (bindata) n=50 p1=0.1 University of Ghana http://ugspace.ug.edu.gh 117 p2=0.1 p3=0.5 rho=0.2 cmdata<-rmvbin(n, c(p1,p2,p3), bincorr=(1-rho)*diag(3)+rho) cmdata dfcmdata<-data.frame(cmdata) colnames(dfcmdata)[1]<-"Y1" colnames(dfcmdata)[2]<-"Y2" colnames(dfcmdata)[3]<-"X" N1 <- length("Y1") N2 <- length("Y2") Y1<-dfcmdata[,1] Y2<-dfcmdata[,2] X<-dfcmdata[,3] for (i in 1:length(dfcmdata)) { log.likelihood<- -sum(log(1+(X*2.81)))- sum(log(1+(X*1.4)))+sum((Y1*X*1.4))+sum(Y1*log(1+(X*2.81)))- sum(Y1*log(1+(X*5.9)))+sum(Y2*X*2.81)+sum((Y1*Y2*X*5.9)-sum(Y1*Y2*X*2.81)) } log.likelihood ####################################################################### With the regressive Model:2In(loglikelihood(proposed model)-loglikelihood(regressive) ####################################################################### test.stat<-2*((log.likelihood)-(logLik(rmodel))) ################################################################################ With the Conditional Model:2In(loglikelihood(proposed model)-loglikelihood(conditional model1)) ################################################################################ test.stat<- 2*((log.likelihood)-(logLik(c1model))) ################################################################################ With the Conditional Model:2In(loglikelihood(proposed model)-loglikelihood(conditional model2)) ################################################################################ test.stat<- 2*((log.likelihood)-(logLik(c2model))) University of Ghana http://ugspace.ug.edu.gh 118 B2:Codes used for real-life-application ################################## R CODES :REAL-LIFE-APPLICATION ################################## N00<-4132 N01<-707 N10<-1442 N11<-1559 N00/(N00+N01) N01/(N00+N01) ######################### Transition Probalilities matrix ######################## Transitionprob<- matrix(c(N00/(N00+N01),N01/(N00+N01),N10/(N10+N11),N11/(N10+N11)),nrow=2,byrow=TRUE) Transitionprob ################################ Labelling the Transitional probabilities ############################### dimnames(Transitionprob) <- list(c("0","1"),c("0","1")) Transitionprob ##################### Calculating the odds ratio ##################### Transitionprob.test <- prop.test(Transitionprob) Transitionprob.test$estimate odds <- Transitionprob.test$estimate/(1-Transitionprob.test$estimate) odds odds[1]/odds[2] oddsratio <-(Transitionprob[1,1]*Transitionprob[2,2])/(Transitionprob[2,1]*Transitionprob[1,2]) oddsratio #################### Odds ratio:Alternatively #################### myoddsratio<-(N00av*N11av)/(N01av*N10av) myoddsratio University of Ghana http://ugspace.ug.edu.gh 119 ################################################## Calculating Marshall–Olkin correlation estimator (Y1 and Y2) ################################################## MO.cor.num<-(N11/(N10+N11))*(N00/(N00+N01))-(N10/(N10+N11))*(N01/(N01+N00)) P00<-(N00/(N00+N01)) P01<-(N01/(N01+N00)) P10<-(N10/(N10+N11)) P11<-(N11/(N10+N11)) P1<-P00+P01 P1 P2<-P10+P11 P3<-P00+P10 P4<-P01+P11 M0.cor.den<-sqrt(P1*P2*P3*P4) M0.cor.den MO.cor<-MO.cor.num/M0.cor.den MO.cor ######################################################## Checking dependence betwwen Y1 and Y2 (if 0,then no dependence) ######################################################## n<-(P00*P11)/(P01*P10) dependence.value<-log(n) dependence.value ############### Conditional Y1=0 ############### library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\B01.FINAL.FINAL..22.05.16.sav" ,to.data.frame=TRUE) View(dataset) library(survival) glm.out1=glm(formula= es ~ age+edu+sex,family=binomial(link=logit),data=dataset) summary(glm.out1) logLik(glm.out1) ############### Conditional Y1=1 ############### library(foreign) University of Ghana http://ugspace.ug.edu.gh 120 file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\B11.FINAL.FINAL.22.05.16.INSTsav.sav" , to.data.frame=TRUE) View(dataset) library(survival) glm.out2=glm(formula= es ~ age+edu+sex,family=binomial(link=logit),data=dataset) summary(glm.out2) logLik(glm.out2) ################### Marginal the GEE way ################### library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\B1.FINALFINAL.22.05.16..sav" ,to.data.frame=TRUE) View(dataset) library(MASS) library(geepack) geeglm.out<-geeglm(formula=es~age+edu+sex,data=dataset,family=binomial,id=id) summary(geeglm.out) logLik(geeglm.out)...spss-quasi-likelihood:8755.65 ######### B1 estimates ######### library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.22.05.16.sav" ,to.data.frame=TRUE) View(dataset) colnames(dataset)[1]<-"Y1" colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) University of Ghana http://ugspace.ug.edu.gh 121 log.likelihood <- 0 log.likelihood <- function(param) { B1.a<-param[1] B1.b<-param[2] B1.c<-param[3] N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] for (i in 1:length(dataset)) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*B1.a)+(X2*B1.b)+(X3*B1.c)))/(1+exp((X1*B1.a)+(X2*B1.b)+(X3*B1.c))) +sum((Y1*(X1+X2+X3))) result<-return(log.likelihood) } } mleNM <- maxLik(logLik = log.likelihood , start =c(B1.a=11.43,B1.b=-13.99,B1.c=10.74), method = "NM") summary(mleNM) mle <- maxLik(logLik =log.likelihood , start =c(B1.a=-10.32,B1.b=14.3,B1.c=10.02)) summary(mle) ############ B11 estimates ############ library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.B11.NEW.sav" ,to.data.frame=TRUE) View(dataset) colnames(dataset)[1]<-"Y1" colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) University of Ghana http://ugspace.ug.edu.gh 122 log.likelihood <- 0 library(formatR) tidy_source() log.likelihood <- function(param) { B11.a<-param[1] B11.b<-param[2] B11.c<-param[3] N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] head(dataset) for (i in 1:length(dataset)) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*B11.a)+(X2*B11.b)+(X3*B11.c)))/(1+exp((X1*B11.a)+(X2*B11.b)+(X3* B11.c)))+sum((Y1*Y2*(X1+X2+X3))) result<-return(log.likelihood) } } mleNM <- maxLik(logLik = log.likelihood , start =c(B01.a=-14.11,B01.b=11.31,B01.c=10.02), method = "NM") summary(mleNM) mle <- maxLik(logLik =log.likelihood , start = c(B11.a=6.32,B11.b=10.3,B11.c=12.02)) summary(mle) ########### B01 estimates ########### library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.B01.NEW.sav" ,to.data.frame=TRUE) View(dataset) colnames(dataset)[1]<-"Y1" University of Ghana http://ugspace.ug.edu.gh 123 colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) log.likelihood <- 0 library(formatR) tidy_source() log.likelihood <- function(param) { B01.a<-param[1] B01.b<-param[2] B01.c<-param[3] N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] for (i in 1:length(dataset)) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*B01.a)+(X2*B01.b)+(X3*B01.c)))/(1+exp((X1*B01.a)+(X2*B01.b)+(X3* B01.c)))+sum(Y1*(X1+X2+X3)*exp((X1*B01.a)+(X2*B01.b)+(X3*B01.c)))/(1+exp((X1*B01.a)+(X2* B01.b)+(X3*B01.c)))+sum((Y2*(X1+X2+X3)))+sum((Y1*Y2*(X1+X2+X3))) result<-return(log.likelihood) } } mle <- maxLik(logLik =log.likelihood , start = c(B01.a=-10.2,B01.b=11.3,B01.c=10.02)) summary(mle) mleNM <- maxLik(logLik = log.likelihood , start =c(B01.a=-11.1,B01.b=11.31,B01.c=10.02), method = "NM") summary(mleNM) ########################################################################## Conditional Model.1:2In(loglikelihood(proposed model)-loglikelihood(conditional model1)) ########################################################################## test.stat<-2*((-15454010)-(logLik(glm.out1))) University of Ghana http://ugspace.ug.edu.gh 124 ########################################################################## Conditional Model.2:2In(loglikelihood(proposed model)-loglikelihood(conditional model1)) ########################################################################## test.stat<-2*((3860420)-(logLik(glm.out2))) ######################################################################### the Marginal Model.2:2In(loglikelihood(proposed model)-loglikelihood(marginal.model)) ######################################################################### test.stat<-2*((23652232)-(8755.65)) ##################### Intercept for B1 estimates ##################### library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.22.05.16.sav" ,to.data.frame=TRUE) View(dataset) colnames(dataset)[1]<-"Y1" colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) log.likelihood <- 0 count=0 N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] while (count<=7840) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*0)+(X2*0)+(X3*0)))/(1+exp((X1*0)+(X2*0)+(X3*0)))+sum((Y1*(X1*0)+( X2*0)+(X3*0))) log.likelihood[count]=log.likelihood[count]+0 count=count+1 University of Ghana http://ugspace.ug.edu.gh 125 } require(psych) describeBy(log.likelihood/7840) ###################### Intercept for B11 estimates ###################### library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.B11.NEW.sav" ,to.data.frame=TRUE) View(dataset) colnames(dataset)[1]<-"Y1" colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) log.likelihood <- 0 count=0 N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] while (count<=3001) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*0)+(X2*0)+(X3*0)))/(1+exp((X1*0)+(X2*0)+(X3*0)))+sum((Y1*Y2*(X1 +X2+X3))) log.likelihood[count]=log.likelihood[count]+0 count=count+1 } require(psych) describeBy(log.likelihood/3001) University of Ghana http://ugspace.ug.edu.gh 126 ###################### Intercept for B01 estimates ###################### library(maxLik) library(foreign) file.choose() dataset = read.spss("C:\\Users\\user\\Desktop\\Dataset.Final\\proposed.model.B01.NEW.sav" ,to.data.frame=TRUE) View(dataset) head(dataset) log.likelihood <- 0 count=0 colnames(dataset)[1]<-"Y1" colnames(dataset)[2]<-"Y2" colnames(dataset)[3]<-"X1" colnames(dataset)[5]<-"X2" colnames(dataset)[7]<-"X3" head(dataset) log.likelihood <- 0 N <- length(dataset) N1 <- length("Y1") N2 <- length("Y2") Y1<-dataset[,1] Y2<-dataset[,2] X1<-dataset[,3] X2<-dataset[,5] X3<-dataset[,7] while (count<=4839) { log.likelihood<- - sum((X1+X2+X3)*exp((X1*0)+(X2*0)+(X3*0)))/(1+exp((X1*0)+(X2*0)+(X3*0))+sum(Y1*(X1+X2+ X3)*exp((X1*0)+(X2*0)+(X3*0)))/(1+exp((X1*0)+(X2*0)+(X3*0)))+sum((Y2*(X1+X2+X3)))+sum(( Y1*Y2*(X1+X2+X3))) log.likelihood[count]=log.likelihood[count]+0 count=count+1 } } require(psych) describeBy(log.likelihood/4839) University of Ghana http://ugspace.ug.edu.gh