University of Ghana http://ugspace.ug.edu.gh

UNIVERSITY OF GHANA
COLLEGE OF BASIC AND APPLIED SCIENCES
SCHOOL OF PHYSICAL AND MATHEMATICAL SCIENCES

LONGITUDINAL ANALYSIS OF CEREAL YIELDS IN GHANA

BY
ALFRED KWABENA AMOAH (10507283)

THIS THESIS IS SUBMITTED TO THE UNIVERSITY OF GHANA, LEGON IN PARTIAL FULFILLMENT OF THE REQUIREMENT FOR THE AWARD OF MASTER OF PHILOSOPHY IN STATISTICS DEGREE

JULY, 2016

DECLARATION

I hereby declare that this thesis is the result of my own research work and that no part of it has been presented for another degree in this university or elsewhere.

SIGNATURE…………………………… DATE:………..………
ALFRED KWABENA AMOAH (10507283)

Supervisors’ Declaration

We hereby certify that this thesis was prepared from the candidate’s own research work and supervised in accordance with the guidelines on the supervision of thesis laid down by the University of Ghana.

SIGNATURE……………………………….. DATE:…………………
DR. SAMUEL IDDI (PRINCIPAL SUPERVISOR)

SIGNATURE…………………………………. DATE:…………………
DR. ISAAC BAIDOO (CO-SUPERVISOR)

ABSTRACT

The aim of this study was to investigate cereal crop yields in Ghana. Specifically, the research determines whether yields differ significantly across the ten regions of Ghana and examines how crop yields evolve over time among the regions. Data on two major cereal crops (maize and rice) produced and consumed in Ghana were obtained from the Ministry of Food and Agriculture (MOFA). Multivariate Analysis of Variance (MANOVA) and the Linear Mixed Model (LMM) were employed for the study. Diagnostic plots for the fitted Linear Mixed Model and MANOVA indicated that the models were valid for the analysis. The study revealed that significant differences exist in the yields of the two major cereal crops across the regions of Ghana.
The study found that, for both maize and rice, the average yields of most regions differed significantly from those of other regions. Further analysis by LMM indicated that the yields of maize and rice varied between and within the regions of Ghana. It also indicated a decelerating trend in maize yields and a gradually increasing trend in rice yields across all the regions of Ghana. Based on these findings, we recommend that intensive support in the form of credit facilities and farm inputs be given to farmers engaged in cereal crop production in all the regions of Ghana to help reduce the variability in the yields of the two major cereal crops. Maize production in particular must be encouraged in all the regions to help reverse the declining trend in maize yields, as maize is considered the most consumed cereal crop in Ghana. We further recommend the use of joint models to study the trends of the crops simultaneously, together with factors such as rainfall and climate that may influence cereal crop production in Ghana.

DEDICATION

I dedicate this special study to my dear wife, Ernestina Amoah, and my son, Harvey Adjei-Tachie Amoah.

ACKNOWLEDGEMENT

I would like to express my candid and sincere thanks to Dr. Samuel Iddi and Dr. Isaac Baidoo, lecturers at the Statistics Department of the University of Ghana, whose supervision, helpful comments, suggestions and incessant advice spurred me to carry out the work successfully. I wish to place on record my heartfelt thanks to all the lecturers of the Department of Statistics, whose tuition equipped me with the prerequisite knowledge required for this study, as well as the workers at SRID of the Ministry of Food and Agriculture for helping me with the data.
I also cannot forget the support I received from Madam Felicitas Agana of the Bank of Ghana, Sunyani; Solomon Owusu-Bempah of the Ghana Statistical Service, Head Office; and Sarah Kwakye, my mate at UG, for her encouragement, as well as all my friends. Finally, I sincerely acknowledge the support given to me by my dear wife, Ernestina Amoah, throughout this study. Dear, may God richly bless you for the wonderful encouragement and support you gave me.

TABLE OF CONTENTS

DECLARATION
ABSTRACT
DEDICATION
ACKNOWLEDGEMENT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER ONE
INTRODUCTION
1.0. Background of the study
1.1. Motivation of the research
1.2. Problem Statement
1.3. Objective
1.4. Research Questions
1.5. Significance of the study
1.6. Scope and Methodology

CHAPTER TWO
LITERATURE REVIEW
2.0. Introduction
2.1. Theoretical Framework
2.2. Conceptual Framework
2.2.1. A mixed effects model of crop yield
2.2.2. Modeling for crop yield in Ghana
2.2.3. The uniqueness of the study

CHAPTER THREE
METHODOLOGY
3.0. Introduction
3.1. Data Description
3.2. Multivariate Analysis Of Variance (MANOVA) Model
3.2.1. One-way MANOVA model
3.2.2. Two-way MANOVA with interactions
3.2.3. Hypothesis testing in the MANOVA model
3.2.4. Construction of Test Statistics for MANOVA-model
3.2.5. Assumptions of MANOVA model
3.2.6. Advantage of MANOVA model
3.2.7. Limitations of MANOVA model
3.3. Linear Mixed Model (LMM)
3.3.1. The model formulation
3.3.2. Assumptions of LMM
3.3.3. The implied marginal model
3.4. Estimation of model parameters
3.4.1. Maximum Likelihood Estimation (MLE)
3.4.2. Restricted Maximum Likelihood Estimation (REML)
3.5. Information Criteria
3.6. Diagnostics
3.7. Statistical software used
3.8. Compound Symmetry (CS)

CHAPTER FOUR
DATA ANALYSIS AND DISCUSSION OF RESULTS
4.0. Introduction
4.1. Descriptive analysis
4.2. Preliminary analysis
4.2.1. Mean Plots over time
4.3. MANOVA Analysis
4.3.1. Test on Assumptions
4.3.2. MANOVA Analysis
4.4. LMM analysis
4.4.1. The Model Specification
4.4.2. Diagnostics on the fitted LMM model
4.4.3. Random effects estimates
4.4.4. Fixed effects parameter estimates
4.5. Discussion

CHAPTER FIVE
CONCLUSION AND RECOMMENDATION
5.0. Introduction
5.1. Conclusion
5.2. Recommendation

REFERENCE

APPENDIX
Appendix A: Test of Assumptions
Appendix B: Test on significant differences of maize and rice yields
Appendix C: Approximate 95% confidence intervals
Appendix D: Marginal Variance Covariance Matrix
Appendix E: Pairwise Comparisons of difference in Mean Yields
Appendix F: R Codes used for the study

LIST OF TABLES

Table 4.1: Descriptive statistics of the two major cereal crops
Table 4.2a: MANOVA Test for Differences in crop yields across the ten regions
Table 4.2b: Test of Significant difference in crop yields
Table 4.3: Pairwise Comparisons of differences in mean yields
Table 4.4: Selection of best model
Table 4.5: Random effects parameter estimates
Table 4.6a: Fixed effects parameter estimates for maize yields
Table 4.6b: Fixed effects parameter estimates for rice yields

LIST OF FIGURES

Figure 4.1: Line graph showing average yields by year
Figure 4.2: Boxplot of preliminary analysis
Figure 4.3: Mean profile of maize yields per region
Figure 4.4: Mean profile of rice yields per region
Figure 4.5: Diagnostic plots for fitted Linear Mixed model
Figure 4.6: Distribution of random intercepts (District) for maize and rice yields

LIST OF ABBREVIATIONS

AIC - AKAIKE INFORMATION CRITERION
AMMI - ADDITIVE MAIN EFFECTS AND MULTIPLICATIVE INTERACTION
ANOVA - ANALYSIS OF VARIANCE
BIC - BAYESIAN INFORMATION CRITERION
FAO - FOOD AND AGRICULTURE ORGANIZATION
GDP - GROSS DOMESTIC PRODUCT
GSS - GHANA STATISTICAL SERVICE
KNUST - KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY
LMM - LINEAR MIXED MODEL
MANOVA - MULTIVARIATE ANALYSIS OF VARIANCE
MOFA - MINISTRY OF FOOD AND AGRICULTURE
NRDS - NATIONAL RICE DEVELOPMENT STRATEGY
OECD - ORGANIZATION FOR ECONOMIC COOPERATION AND DEVELOPMENT
SDG - SUSTAINABLE DEVELOPMENT GOALS
SRID - STATISTICS, RESEARCH AND INFORMATION DEPARTMENT
UK - UNITED KINGDOM
UN - UNITED NATIONS
USA -
UNITED STATES OF AMERICA
US - UNITED STATES

CHAPTER ONE
INTRODUCTION

1.0. Background of the study

In the last two decades, concern for food production and its role in guaranteeing food security seems to have moved away from global focus. Investment in research and infrastructure has declined, and advances in technology have decelerated (Rosegrant & Cline, 2003). Agricultural development has been limited and is about to face its biggest challenges. “The past 35 years have seen food reserves been at its lowest because of the demand that faced food production at a time when crop yields are stagnant” (OECD & FAO, 2009; Premanandh, 2011; Rosegrant & Cline, 2003). In 2008 the FAO, the World Bank and the G8 all called for renewed investment in agriculture directed at improving and increasing agricultural productivity and production (OECD & FAO, 2009). The World Bank report (2009) highlights the global availability of agricultural land and the possible ways of improving production, particularly in areas with low agricultural development, such as Africa. A consensus appears to be emerging that “underutilized” land in developing countries can be the key to continued global food security, whether through the introduction of new farmland or the closing of yield gaps (FAO, 2009; Foley, 2011; Kearney, 2010; Tran, 2011; World, 2009). Unlike the US or Europe, where agricultural potential is very close to being maximized and soil quality is degrading, advances in the agricultural sectors of developing countries could improve productivity to a very large degree (Beddington, 2010). This investment in policies, technology, science and infrastructure is essential to unlock the potential productivity of the land and resources of Africa (Bank, 2011; Rosegrant & Cline, 2003).
The sense is that investment in developing nations, which currently realize only a small share of their agricultural potential, could contribute to future global food security by increasing crop yields. This was re-emphasized by the UN General Assembly on 11th August, 2015, which stressed the need to improve agricultural productivity in order to attain Sustainable Development Goal 2 (SDG 2): end hunger, achieve food security and improved nutrition, and promote sustainable agriculture (Griggs et al., 2013).

In Ghana, food crops such as maize, rice, plantain, cassava, yam and vegetables are grown on both small and commercial scales. “Annual production over the years continues to decline in relative term in spite of several programme interventions by the Municipal and District offices of the Ministry of Food and Agriculture” (Agyare et al., 2014). “This is due to increasing cost of farm inputs and the low soil fertility. Crop production is largely rain-fed and traditional technology of production continues to dominate the sector with peasant farmers using simple tools such as hoes and cutlasses. The average land holding per farmer is relatively low and is about 0.5 hectares” (Agyare et al., 2014).

Maize is one of the major cereals produced and consumed in Ghana. The majority of maize is produced by peasant farmers who use local farm inputs under rain-fed conditions. Under this method of production, maize yields have not been encouraging, as they fall below the attainable level of 5.0-5.5 metric tons per hectare. MOFA estimates from 2010 to 2015 put the annual maize deficit at between 84,000 and 145,000 metric tons over the last four years. This represents a shortfall in local production of between 9 and 15 percent of total consumption in these years, with annual growth of only 2.6 percent. Rice follows closely as the second most important cereal crop consumed by most Ghanaians.
It has average annual yields of 1.0-2.4 metric tons per hectare. “Also between 2010 and 2015, rice demand is projected to have grown at a compound annual growth of 11.8 percent from 939,920 metric tons to 1,644,221 metric tons” (SRID, 2007). Nevertheless, the country is unable to feed itself in either of its two most important cereal crops: in recent years Ghana has experienced average shortfalls of 12 percent in local maize supplies and 69 percent in domestic rice supplies (Agyare et al., 2014). The yields of these two crops have therefore become a concern to most agriculturists in Ghana, with intense pressure on the Ministry of Food and Agriculture (MOFA) to find a way of containing the problem. Over the years, the Statistics, Research and Information Department (SRID) of MOFA has analyzed the yields of all major crops in the country to determine the percentage changes (increases or decreases) that occur in each crop/farming season. In this thesis, we used multivariate analysis of variance (MANOVA) and the linear mixed model (LMM) to analyze the yields of the two most important cereal crops (maize and rice), investigating significant differences and trends over ten years across all ten (10) regions of Ghana.

1.1. Motivation of the research

The research is motivated by the fact that agriculture contributes meaningfully to Ghana’s GDP and employs the majority of Ghanaians, especially the rural folk who form the majority of Ghana’s population. Modeling the differences in major food crops will thus help farmers to know for which crops productivity should be increased across the various districts of the region. It is also motivated by the need to redirect stakeholders’ attention to the achievement of Sustainable Development Goal 2 (SDG 2): end hunger, achieve food security and improved nutrition, and promote sustainable agriculture (Griggs et al., 2013).
1.2. Problem Statement

Population growth, development and stagnant agricultural production are contributing to an emerging structural scarcity of food crops in the West Africa sub-region. “Finding ways of effectively coping with this emerging food deficit is critical for fostering economic growth, reducing poverty, and enhancing food/nutrition security for the people of West Africa. Addressing this challenge requires placing agriculture and the associated processes of production, trade, processing, and consumption at the forefront of any economic development strategy for the region” (Oerke et al., 2012). Farming is a major source of income for many people in developing countries. In Ghana it represents 22% of GDP (Ghana Statistical Service, 2014) and is the main source of income for 60% of the population. The agricultural sector of the Ghanaian economy is dominated by peasant farmers who have continued over the years to use traditional technology and farm inputs to produce crops. Although food crops such as maize, rice, plantain, cassava, yam, cocoyam and vegetables are grown at both commercial and subsistence levels all over Ghana, production is not sufficient to feed the Ghanaian populace, and the country imports large quantities of cereals such as rice. Annual production in recent years has continued to decline despite several programmes and interventions by MOFA (Agyare et al., 2014). As Quansah (2010) stated, “increasing human population has led to intensive cultivation without adequately replenishing soil nutrients. The result has been the deterioration in crop yields and depletion of the resource base”. This decline is usually attributed to input costs and soil fertility, and crop production is also strongly reliant on rainfall, which varies with climate change.

“Ghana is in a unique position to not only leverage agriculture as an engine for poverty reduction and improved nutrition, but to become the breadbasket for the West Africa countries in the sub-region” (Quansah, 2010). Ghana has an abundance of fertile land, water, and a generally favorable climate for agricultural production. However, in recent times the yields of agricultural crops, especially cereal crops, have not been encouraging. We therefore sought to assess the trends and significant differences in the yields of major cereal crops across the ten (10) regions of Ghana over the period 2005 to 2014.

1.3. Objective

The main objective of the study was to analyze major cereal crop yields in Ghana over a ten-year period using MANOVA and the Linear Mixed Model. The specific objectives were:
• To determine whether significant differences exist in annual maize yields in the ten regions of Ghana.
• To determine whether significant differences exist in annual rice yields in the ten regions of Ghana.
• To study the evolution of cereal crop yields between the regions in Ghana.

1.4. Research Questions

• Do differences exist in major cereal crop yields across the regions of Ghana?
• What is the trend of major cereal crop yields in the country?

1.5. Significance of the study

The findings from this study will contribute to the country’s agricultural sector in the following ways. First, they will guide farmers in deciding which cereal crop to invest in to increase yields. They will also enable the Ministry of Food and Agriculture (MOFA) to know which cereal crop to intensify education on in order to increase yields across the districts and regions of Ghana. Secondly, by examining yield differences among the districts, policies can be better informed and tailored to respond to the challenges of food crop production among this important group of producers or farmers.
The results will thus enable various stakeholders in agriculture to increase their credit allocations to farmers. Thirdly, the results will redirect stakeholders’ attention to the hunger target of Sustainable Development Goal 2 (SDG 2), which is to end hunger, achieve food security and improved nutrition, and promote sustainable agriculture, by intensively increasing food crop production. Finally, the results will contribute to existing knowledge and the academic literature, and can serve as a basis for research in related fields of study.

1.6. Scope and Methodology

The study was carried out in Ghana. The data were obtained from the Statistics, Research and Information Department (SRID) of the Ministry of Food & Agriculture in Ghana. They comprise crop yield data for two major cereal crops (maize and rice) in ten regions from 2005 to 2014. Crop yield was computed by dividing crop production in metric tons by the crop area in hectares (Mt/Ha). The data were then analyzed using MANOVA and Linear Mixed Models with the aid of R software.

1.7. Organization of the study

The research is organized into five chapters. Chapter One is the introduction, which comprises the background of the study, the motivation, objectives, research questions, significance of the study, scope and methodology, and organization. Chapter Two examines the literature related to the research. The methods used for the analysis of the data constitute Chapter Three. Chapter Four presents the results and discussion of the data, and lastly Chapter Five presents the conclusion and recommendations.

CHAPTER TWO
LITERATURE REVIEW

2.0. Introduction

This chapter explores related work on how other researchers have handled crop yields. It includes the theoretical framework and conceptual topics such as mixed effects models of crop yields and the modeling of crop yields in Ghana.

2.1. Theoretical Framework

“Major scientific research advancement has been considering food production as part of food security for quite a long time” (Ingram, 2011). “For instance in 1843 the Rothamsted Experimental Station was established in UK, whilst the later part of the 19th century saw the swift growth of commercial plant breeding in Germany” (Harwood, 2005). Notwithstanding these many years of research, there is still a need to establish how to produce more food given anticipated demand (Godfray et al., 2010). “However, satisfying these increased demands poses huge challenges for the sustainability of both food production and aquatic ecosystems and the services they deliver to society” (Tilman et al., 2002). Mindful of the adverse environmental consequences of most current food production techniques, it is clear that the necessary gains will have to be made in a more environmentally benign manner (Foresight, 2011; Gregory et al., 2002). “To this end, research has progressively concentrated on the production scheme in search of increasing efficiency by which inputs (especially nitrogen and water) are used, and reduce undesirable externalities such as water pollution, soil degradation, loss of biodiversity and greenhouse gas emissions” (Gregory et al., 2002; Ittersum & Rabbinge, 1997).

Meanwhile, as climate change has intensified, research on food production has rapidly increased. It is clear that climate change will affect crop growth in most parts of the globe, with the most harmful effects anticipated in developing countries (Foresight, 2011; Parry et al., 2004; Parry, Rosenzweig & Livermore, 2005). In this regard, reaching higher yields is part of the strategy for achieving food security while protecting the natural environment in developing countries (Foley, 2011; Lobell, Cassman & Field, 2009).
Onumah and Aning (2009) reported that production of agricultural commodities in Ghana has recorded significant improvement since the late 1990s, with average growth in output rising to 5.8% in 2003-2006 from 3.6% in 1999-2002. They also noted that average yields for the major agricultural commodities have hardly risen since 2000, implying that the enhanced sector growth was chiefly due to strengthening of production. In addition, cereal crops such as maize, rice and millet, and starchy staples such as yam and cassava, amongst others, have substantial potential for productivity improvements, with yields 60 percent lower than attainable yields (Onumah & Aning, 2009). These crops could increase yields by over 60 percent if production technology is further improved. “Major food and industrial commodities production is believed to have accounted for nearly 80% of Ghana’s GDP in the past. But this was stagnated as a result of unimproved farm inputs used for production in the agricultural sector of the economy” (Onumah & Aning, 2009). Improving output and productivity in major crops other than cocoa would significantly reduce poverty among farmers who grow non-cocoa crops, especially in the northern part of Ghana. “Farm productivity has to be improved through measures such as modernizing the country’s agricultural marketing systems; aside the adoption of improved production technology such as planting high-yielding varieties and improved access to farm extension and finance” (Onumah & Aning, 2009). Further work by Wolter (2008) supports the assertion that Ghana’s current food production is low and that yields must improve for the country to fully enjoy the benefits of the commodities exchange.
Wolter affirms that Ghana continues to face food security problems because of declining productivity in the food crop sector and underdeveloped local markets in most parts of the country, especially in the northern part of Ghana (Wolter, 2008). To take advantage of the increasing demand for food in Ghana’s middle-income economy, efforts should be made to increase crop yields and improve local food markets, in addition to solving food security problems (Wolter, 2008). It is, however, disturbing that Ghana, with its favorable natural conditions for agricultural production, continues to rely heavily on imported food. “According to the Ministry of Food and Agriculture (MOFA), Ghana’s agricultural production currently meets only half of domestic cereal and meat needs and 60 per cent of domestic fish consumption (Wolter, 2008). Self-reliance is achieved only in starchy staples such as cassava, yam and plantain, while rice and maize production falls far below demand” (Wolter, 2008). The author further elaborated on various reasons why Ghana’s food crop production lies below potential yields. These factors include untapped potential in irrigation: Ghana’s agriculture is mainly rain-fed, with conventional systems of farming still dominant in most parts of the country (Wolter, 2008). Also, poor technology and small production units prohibit economies of scale and lead to sub-optimal yields. For instance, maize and rice are produced at a third of their potential yields per hectare due to poor technology and farm inputs (OECD & FAO, 2009). In addition, inadequate transportation facilities and storage infrastructure linking farmers and market centers prevent small farmers from accessing domestic and international markets. Currently, Ghana produces less than 30% of the raw materials used by local industry to process some of the food crops (Wolter, 2008).
“The Government of Ghana has implemented incentives such as tax holidays and subsidies to encourage food processing, but results have been minimal as bottlenecks such as lack of infrastructure and finance, amongst others, remain (OECD & FAO, 2009). As internal food processing remains below the potential to meet local demand, high-value food imports have been increasing” (Wolter, 2008). In their paper on the role of agriculture in African development, Diao, Hazell, and Thurlow (2010) “discussed the yields of major crops in Ghana. According to them, the actual yields obtained are much lower than the achievable yields for many crops in most zones of Ghana. According to the Ministry of Food and Agriculture data they used, yields for most crops are 20%-60% below their achievable level under existing technologies combined with the use of modern inputs such as fertilizers and improved seeds, which provides an opportunity for agricultural growth” (Diao, Hazell & Thurlow, 2010).

2.2. Conceptual Framework

This section of the literature review covers the concepts and methods of other researchers that are related to this study. It includes topics such as mixed effects models of crop yields, longitudinal studies of crop yields and modeling of crop yields in Ghana.

2.2.1. A mixed effects model of crop yield

Verma et al. (2014), of the Department of Mathematics & Statistics, Haryana Agriculture University, India, developed a methodology for pre-harvest crop yield prediction for the major mustard growing districts in Haryana (India). They used linear mixed effects models with random time effects at district, zone and state level to fit crop yield estimates.
For mixed modeling, the hierarchical structure of the yield data was represented as

$$y_{ijt} = s_t + z_{it} + d_{ijt}, \tag{2.1}$$

where $y_{ijt}$ is the yield in the $j$th district within the $i$th zone in the $t$th year, $s_t$ is the general state effect in the $t$th year, $z_{it}$ is the effect of the $i$th zone in the $t$th year, and $d_{ijt}$ is the effect of the $j$th district within the $i$th zone in the $t$th year. For each of the three effects (state: $s_t$, zone: $z_{it}$, district: $d_{ijt}$), they set up a time-series model with three components: Regression + Time Trend + White noise. Regression was a fixed part comprising regression on time as well as on the meteorological covariates. Time Trend comprised a random part for serial correlation, with a covariance structure and regression splines. White noise was an additional independently distributed random error term. The purpose of their study was to show the usefulness of the mixed model framework for pre-harvest crop yield forecasting. The findings indicated that linear mixed modeling improved the predictive accuracy of the zonal yield models. The linear mixed models substantially improved the predictive accuracy and produced what the authors considered to be satisfactory district-level yield estimates. They concluded by recommending the use of linear mixed models for pre-harvest crop yield forecasting to enhance the predictive accuracy of the zonal models.

Roberts and Tack (2011) also presented a “paper on a mixed effects model of crop yields for purposes of premium determination at the Agricultural & Applied Economics Association at Pittsburgh, Pennsylvania, USA. The goal of their research was to determine empirical estimation of farm-level yield distributions, calculation of actuarially fair risk premiums, and prediction of potential efficiency gains using empirical mixed model for premium determination given below” (Roberts & Tack, 2011).
$$y_{it} = a_i + b_c t + c_c\,\mathrm{ins} + \varepsilon_{ct} + \varepsilon_{it}, \tag{2.2}$$

where $y_{it}$ represents “yield on farm $i$ in year $t$, $a_i$ is a farm-specific intercept, $b_c$ is a county-specific trend, $c_c$ is a county-specific insurance effect, $\varepsilon_{ct}$ is a county-specific random shock in year $t$, and $\varepsilon_{it}$ is an idiosyncratic random shock on farm $i$ in year $t$. The farm-specific intercept according to them can be written as a county-specific mean plus a farm-specific shock, $a_i = a_c + \varepsilon_i$, and the county-specific slope parameters can be written as a population mean plus a county shock”, $b_c = b + \varepsilon_{c1}$ and $c_c = c + \varepsilon_{c2}$ (Roberts & Tack, 2011). These modifications generated the model

$$y_{it} = a_c + bt + c\,\mathrm{ins} + \varepsilon_i + \varepsilon_{ct} + \varepsilon_{c1}t + \varepsilon_{c2}\,\mathrm{ins} + \varepsilon_{it}. \tag{2.3}$$

They indicated two types of effects in this model. The first part, $a_c + bt + c\,\mathrm{ins}$, represents the fixed effects of crop yields, and the second part, $\varepsilon_i + \varepsilon_{ct} + \varepsilon_{c1}t + \varepsilon_{c2}\,\mathrm{ins} + \varepsilon_{it}$, represents the random effects. They cited two random intercepts, farm-specific $\{\varepsilon_i\}$ and county/time-specific $\{\varepsilon_{ct}\}$, and two random slopes at the county level, for the trend parameter $(\varepsilon_{c1})$ and for the insurance parameter $(\varepsilon_{c2})$. “Following conventional methods for estimating linear mixed models, they assumed that the random components follow a multivariate normal distribution and employed maximum likelihood estimation. With a separate model for each state-crop combination in their dataset, their empirical results suggested that there were several important sources of heterogeneity for crop yield distribution” (Roberts & Tack, 2011). For example, for Arkansas rice, the farm-specific random intercept accounted for 42% of the random variation of their model, the county-specific intercept accounted for 7.5%, the random slope for the time trend accounted for 0.1% and the random slope for insurance accounted for 1% (Roberts & Tack, 2011).
These findings supported their expectation that there were significant farm-level shifters of the crop yield distribution across space and time.

Another study, on longitudinal and spatial analyses applied to corn yield data from a long-term rotation trial (Brownie, Larry & Tina, 1993), investigated a number of analyses of a rotation trial of corn yields. The study used several types of split plot, repeated measures and spatial analyses, each with and without soil covariates, and restricted attention to models that could be implemented using standard software. Important features of the yield data from the authors’ trial include considerable unbalance, possible correlations across time and space, possible heterogeneity in the error variances across years, and the availability of pretreatment measurements of soil properties. Ignoring the soil covariates, a linear mixed model for the corn yields in their methodology was given as

$$Y_{ijk} = \mu + \beta_i + (RY)_{jk} + \delta_{ij} + \varepsilon_{ijk}, \tag{2.4}$$

where $Y_{ijk}$ is the yield in year $k$ for rotation $j$ in block $i$, $k = 1, \ldots, 6, 8, 9, 10$, $j = 1, \ldots, 27$, $i = 1, \ldots, 4$; $\beta_i$ is a random effect for the $i$th block or rep; $(RY)_{jk}$ is a fixed effect for rotation $j$ in year $k$, with $\sum_{jk} (RY)_{jk} = 0$; $\delta_{ij}$ is a random effect for the $ij$th plot (the plot in block $i$ containing rotation $j$); and $\varepsilon_{ijk}$ is a random error associated with the $ij$th plot in year $k$.

“The corn yield data were highly unbalanced because of the different crop sequences with corn planted in 155 of the possible 9 x 27 = 243 rotation-year combinations” (Brownie et al., 1993). “There were also several missing yields, thus instead of including main and interaction effects for rotation and year, they fitted an effect for each observed rotation-year combination, represented as $(RY)_{jk}$ in the model above. The authors’ diagnostic graphs and AIC values indicated that the model should allow for heterogeneity across years in the error variance.
The precision of contrasts in years with high error variance was, according to the authors, underestimated in the analysis that assumed constant variance. Allowing spatial correlations resulted in a small reduction in estimates of precision, and soil covariates were important in two of the nine years used. The authors indicated that the best models were repeated measures with banded covariance, as well as autoregressive repeated measures and isotropic spatial models, both modified to allow heterogeneity of the error variance across years” (Brownie et al., 1993).

A study on climate variability and crop production in Tanzania by Rowhani et al. (2011) examined the relationship between seasonal climate and crop yields, focusing on maize, sorghum and rice. The study made use of the linear mixed model outlined below. In order to determine the effects of climate on agricultural yields, and to exploit the cross-sectional and temporal attributes of their dataset, they developed linear mixed models for each of the three crops (maize, rice, and sorghum). The method was appropriate for longitudinal designs (Pinheiro et al., 2007; Zuur, 2010), where observations within a group are often more similar than would be predicted on a pooled-data basis. The model was given as

$$y_{ij} = \beta_0 + \beta_1 T_{i,j} + \beta_2 P_{i,j} + \beta_3 P_{i,j}^2 + \beta_4 CVT_{i,j} + \beta_5 CVP_{i,j} + a_i + \varepsilon_{ij}, \tag{2.5}$$

where $y_{ij}$ is yield, $i$ represents the regions and $j$ the observations within a region, $\beta_0, \ldots, \beta_5$ represent model parameters, $a_i$ represents the random intercept term, $T$ represents temperature, $P$ represents precipitation and $\varepsilon_{ij}$ is an error term. The model thus included a fixed part comprising $P$ and $T$ (and their interaction term), $CVT$ and $CVP$ (and their interaction term), as well as $P^2$, and random intercepts.
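The structure of model (2.5) can be illustrated on simulated data. The sketch below is not the code or data of Rowhani et al.: the panel size, coefficient values and variance components are invented for illustration, and CVT and CVP are treated simply as two further covariates. It generates yields with a region-level random intercept $a_i$, fits a pooled least-squares regression that ignores $a_i$, and then estimates how much of the residual variance is shared within regions (the intraclass correlation), which is the feature a random intercept is meant to capture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel: 100 regions with 10 observations each (assumed sizes).
n_regions, n_obs = 100, 10
region = np.repeat(np.arange(n_regions), n_obs)
N = region.size

# Made-up standardised covariates standing in for T, P, CVT and CVP.
T = rng.normal(size=N)
P = rng.normal(size=N)
CVT = rng.normal(size=N)
CVP = rng.normal(size=N)

# Illustrative fixed effects beta_0..beta_5 and variance components.
beta = np.array([1.5, -0.2, 0.3, -0.1, -0.05, -0.05])
sigma_a, sigma_e = 0.5, 0.5
a = rng.normal(scale=sigma_a, size=n_regions)   # random intercepts a_i
eps = rng.normal(scale=sigma_e, size=N)         # error terms

# Fixed-effects design matrix of (2.5): 1, T, P, P^2, CVT, CVP.
X = np.column_stack([np.ones(N), T, P, P**2, CVT, CVP])
y = X @ beta + a[region] + eps

# Pooled OLS that ignores the grouping by region.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = (y - X @ beta_ols).reshape(n_regions, n_obs)

# One-way ANOVA (method-of-moments) estimate of the intraclass correlation.
msw = resid.var(axis=1, ddof=1).mean()            # within-region mean square
msb = n_obs * resid.mean(axis=1).var(ddof=1)      # between-region mean square
var_a = max((msb - msw) / n_obs, 0.0)
icc = var_a / (var_a + msw)
print(f"estimated intraclass correlation: {icc:.2f}")
```

With the assumed variances the true intraclass correlation is 0.5, so a clearly positive estimate reflects the within-region similarity that a pooled regression does not account for.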
“For comparison, the coefficients resulting from their mixed models were compared to those obtained from simple linear regression models that included the different regions as a dummy variable to account for fixed effects. Again a time variable (year from 1992) was also used in the linear models to capture yield changes related to non-climatic factors and other technological development” (Zuur, 2010).

$$y_j = \beta_0 + \beta_1 T_j + \beta_2 P_j + \beta_3 P_j^2 + \beta_4 CVT_j + \beta_5 CVP_j + \beta_6 \mathrm{Region}_j + \beta_7 \mathrm{Year}_j + \varepsilon_j. \tag{2.6}$$

“The linear models were developed using stepwise model selection based on the AIC. In order to compare climate data effects on yield estimates, both sets of models, the mixed and linear models, were also developed using climate data they extracted from the CRU dataset. The results of their study indicated that both intra and inter seasonal changes in temperature and precipitation influenced cereal yields in Tanzania. They projected that, by 2050, seasonal temperature increases of 2°C would reduce average maize, sorghum and rice yields in Tanzania by 13%, 8.8% and 7.6% respectively” (Rowhani et al., 2011).

2.2.2. Modeling for crop yield in Ghana

Some researchers in Ghana have also conducted studies on crop yields using mixed modeling and other methods, but have usually concentrated on a single crop. For instance, a study conducted by Isaac (2012) of KNUST used multiple comparisons and a random effects model to analyze cocoa production in Ghana from the 1969/70 to the 2010/11 production years. In his study, he set up the linear mixed effects model as

$$Y_i = X_i\beta + Z_i b_i + \varepsilon_i, \quad i = 1, 2, \ldots, N, \tag{2.7}$$

where $Y_i$ is a vector of observations with mean $E(Y_i) = X_i\beta$, $X_i$ and $Z_i$ are the design matrices corresponding to the fixed and random effects respectively, $\beta$ is the fixed effects vector, and $b_i$ is a vector of independent and identically distributed (iid) random effects with mean $E(b_i) = 0$ and variance-covariance matrix $\mathrm{var}(b_i) = D$.
Further, $\varepsilon_i$ is a vector of iid random errors with mean $E(\varepsilon_i) = 0$ and variance $\mathrm{var}(\varepsilon_i) = R$. It was assumed that $b_i \sim N(0, D)$ and $\varepsilon_i \sim N(0, R)$, with $b_i$ independent of $\varepsilon_i$. His analysis using the mixed effects model revealed that, from the 1969/70 production year onwards, all six regions used in the study experienced an increasing trend in cocoa production, with the exception of the Volta region. The Western and Ashanti regions had the highest production over the years.

Again, a study was carried out by Azinu (2014), of the Department of Crop Science, University of Ghana, on the evaluation of hybrid maize varieties in three agro-ecological zones (transition forest, Guinea savannah and coastal savannah) in Ghana. The study, which was undertaken to assess the relative yielding abilities and stability of 20 hybrids selected from the maize breeding programme of the West Africa Centre for Crop Improvement, used an analysis of variance (ANOVA) model. ANOVA per location and across locations or environments for agronomic traits was carried out using Genstat 12th edition. Genotypes were considered as fixed effects, whilst environments and replications were considered as random effects. For each agronomic and morphological trait, an individual ANOVA was conducted by the researcher to determine the statistical significance of the genotypes at each environment and across environments. According to the author, the significant genotype-environment interaction revealed by the additive main effects and multiplicative interaction (AMMI) analysis of variance for grain yield suggested that the relative performance of the genotypes for grain yield changed across environments. The study also identified Wenchi as the location with the best grain yield performance and Tamale as the environment with low grain yield in both seasons.

2.2.3. The uniqueness of the study

The exploration of the literature related to our study has revealed some similarities and differences between those studies and our work.
For instance, the study by Rowhani et al. (2011) in Tanzania used LMM on three cereal crops (maize, rice and sorghum), together with climate data, to assess how inter-seasonal variations in temperature and rainfall influenced cereal yields. This is similar to our study on two major cereal crops (maize and rice) in Ghana, which also uses LMM, together with MANOVA, to investigate significant differences in and the trend of yields over time. However, our study does not consider climatic data and was not carried out for prediction purposes. Unlike the studies conducted by Azinu (2014) and Isaac (2012) in Ghana, which considered only one crop at a time, our work considers two cereal crops at the same time. Also, our study uses longitudinal data, which most researchers in our setting have not explored. Conducting research on crop yields is also somewhat challenging because of missing values which might be present in the data, and that was not different in our work. Missing values usually occurred as a result of unrecorded production by farmers. Being able to model the yields of the two most important cereals in Ghana therefore demonstrates the uniqueness of our study. Our work might be among the few studies conducted in this area of research in Ghana.

CHAPTER THREE

METHODOLOGY

3.0. Introduction

The chapter describes the models used and the methods of analyzing the available data to satisfy the objectives of the study. The models discussed include MANOVA, used to establish the differences in yields among the regions for each year, and linear mixed models, used to assess the evolution of crop yields among the regions.

3.1. Data Description

The data used for this study is secondary data from the Department of Statistics, Research and Information of the Ministry of Food & Agriculture (MOFA) in Ghana. It comprises crop yield data for the two major cereal crops (maize and rice) in the ten (10) regions.
Crop yield was defined as crop production in metric tons divided by the crop area in hectares (Mt/Ha). Maize and rice data were studied in each of the ten regions for the period 2005 to 2014. At the time this study was carried out, maize and rice were the first and second most used cereals by Ghanaians. Also, the most recent data on the two cereals available at MOFA ran up to 2014, hence the ten-year period 2005 to 2014.

3.2. Multivariate Analysis Of Variance (MANOVA) Model

“MANOVA is an extension of analysis of variance (ANOVA) in which the effects of factors are assessed on a linear combination of several response variables. A multivariate generalization of the ANOVA-model was first addressed” by Wilks (1932). “MANOVA methodology is well established and widely used in many research areas ranging from Agriculture to psychology” (Casella & Berger, 2002; Zhang & Xiao, 2012). The model was considered for our data because of the multivariate nature of the data and also because it usually gives appropriate results for testing and comparing differences in the means (yields) of crops (Zhang & Xiao, 2012).

3.2.1. One-way MANOVA model

In the one-way MANOVA model, a single factor explains the variation in a set of response variables. The one-way MANOVA model for comparing $g$ population mean vectors is the following:

$$y_{\ell i} = \mu + \tau_\ell + \varepsilon_{\ell i}, \tag{3.1}$$

where $y_{\ell i}$: $p \times 1$ is the vector of $p$ response variables for the $i$th replicate on the $\ell$th level of factor A, $\mu$ is the overall mean, $\tau_\ell$: $p \times 1$ is the vector of effects (the group effect) for level $\ell$ of factor A, $i = 1, 2, \ldots, n_\ell$ and $\ell = 1, 2, \ldots, g$. More specifically, $\tau_\ell = \mu_\ell - \mu$, showing that the vector of effects can be interpreted as the deviation of the group mean vector from the vector of overall means.
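As a small numerical illustration of the effect vectors in (3.1), the sketch below uses made-up yield pairs (maize, rice) in Mt/Ha for three hypothetical groups; the numbers are not taken from the MOFA data. It computes the sample analogue of $\tau_\ell = \mu_\ell - \mu$, that is, each group mean vector minus the overall mean vector.

```python
import numpy as np

# Made-up (maize, rice) yields in Mt/Ha for g = 3 hypothetical groups;
# the group sizes are deliberately unequal.
groups = {
    "A": np.array([[1.6, 2.1], [1.8, 2.4], [1.7, 2.0]]),
    "B": np.array([[1.2, 1.9], [1.1, 2.2]]),
    "C": np.array([[2.0, 2.8], [2.2, 2.6], [2.1, 3.0], [1.9, 2.4]]),
}

# Overall mean vector over all observations (sample analogue of mu).
all_obs = np.vstack(list(groups.values()))
grand_mean = all_obs.mean(axis=0)

# Estimated effect vector for each group: group mean minus overall mean.
effects = {name: obs.mean(axis=0) - grand_mean for name, obs in groups.items()}

# Size-weighted sum of the estimated effects.
weighted_sum = sum(len(groups[name]) * effects[name] for name in groups)

for name, tau_hat in effects.items():
    print(name, np.round(tau_hat, 3))
print("weighted sum of effects:", np.round(weighted_sum, 10))
```

Because the overall mean is the size-weighted average of the group means, the size-weighted sum of these estimated effects is zero by construction.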
Further, it is assumed that the errors are independently normally distributed with zero mean and constant covariance matrix, $\varepsilon_{\ell i} \sim N_p(0, \Sigma)$; thus

$$E(y_{\ell i}) = \mu + \tau_\ell \quad \text{and} \quad V(y_{\ell i}) = V(\varepsilon_{\ell i}) = \Sigma. \tag{3.2}$$

3.2.2. Two-way MANOVA with interactions

Similarly to the one-way model (3.1), the two-way MANOVA model with interactions which is considered in this thesis is expressed as

$$y_{\ell j i} = \mu + \tau_\ell + \beta_j + \gamma_{\ell j} + \varepsilon_{\ell j i}, \tag{3.3}$$

where $y_{\ell j i}$: $p \times 1$ is the vector of $p$ response variables for the $i$th replicate on the $\ell$th level of factor A and the $j$th level of factor B, $\ell = 1, 2, \ldots, g$, $j = 1, 2, \ldots, b$, $i = 1, 2, \ldots, n$. In the two-way MANOVA model, the vectors $\tau_\ell$ and $\beta_j$ represent the main effects and $\gamma_{\ell j}$ the interaction effects. Also, it is assumed that $\varepsilon_{\ell j i} \overset{iid}{\sim} N_p(0, \Sigma)$, so that

$$E(y_{\ell j i}) = \mu + \tau_\ell + \beta_j + \gamma_{\ell j} \quad \text{and} \quad V(y_{\ell j i}) = V(\varepsilon_{\ell j i}) = \Sigma. \tag{3.4}$$

We note from (3.1) that the model is over-parameterized, and so we impose the identifiability constraint $\sum_{\ell=1}^{g} n_\ell \tau_\ell = 0$; that is, the group effects, weighted by the number $n_\ell$ of individuals in each group, add up to zero. The following terminology is sometimes useful:

1. Systematic part: $\mu + \tau_\ell$
2. Random part: $\varepsilon_{\ell i}$

The observations can be decomposed as

$$x_{\ell i} = \bar{x} + (\bar{x}_\ell - \bar{x}) + (x_{\ell i} - \bar{x}_\ell). \tag{3.5}$$

That is, each observation is equal to the overall mean + treatment effect (or group effect) + residual. This implies

$$\begin{aligned}
(x_{\ell i} - \bar{x})(x_{\ell i} - \bar{x})' &= [(\bar{x}_\ell - \bar{x}) + (x_{\ell i} - \bar{x}_\ell)][(\bar{x}_\ell - \bar{x}) + (x_{\ell i} - \bar{x}_\ell)]' \\
&= (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})' + (\bar{x}_\ell - \bar{x})(x_{\ell i} - \bar{x}_\ell)' \\
&\quad + (x_{\ell i} - \bar{x}_\ell)(\bar{x}_\ell - \bar{x})' + (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)'.
\end{aligned}$$

Summing over $i$, we obtain

$$\begin{aligned}
\sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x})(x_{\ell i} - \bar{x})' &= n_\ell (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})' + (\bar{x}_\ell - \bar{x}) \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)' \\
&\quad + \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(\bar{x}_\ell - \bar{x})' + \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)' \\
&= n_\ell (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})' + \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)',
\end{aligned}$$

since $\sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell) = 0$. Next, we sum over $\ell$ and obtain

$$\sum_{\ell=1}^{g} \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x})(x_{\ell i} - \bar{x})' = \sum_{\ell=1}^{g} n_\ell (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})' + \sum_{\ell=1}^{g} \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)'. \tag{3.6}$$

In words, the Total (corrected) Sum of Squares and Cross Products (SSCP) = Treatment (Between) Sum of Squares and Cross Products + Residual (Within) Sum of Squares and Cross Products, or

$$T = B + W,$$

where $T$: $p \times p$ is the total SSCP matrix, $B$ is the “between” matrix, which is often denoted by $H$, the “hypothesis” matrix, and the “within” matrix $W$ is often denoted by $E$, the “error” matrix. The within matrix is only meaningful if $\Sigma$ is constant over samples. Indeed,

$$E = \sum_{\ell=1}^{g} \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)' = (n_1 - 1) S_1 + (n_2 - 1) S_2 + \ldots + (n_g - 1) S_g. \tag{3.7}$$

3.2.3. Hypothesis testing in the MANOVA model

“The testing of hypotheses for the MANOVA-model is based on the partitioning of sums of squares, which becomes more complex because of the interrelationships between the p included ANOVA-models. Unlike the univariate models, one must now consider not only sums of squares but also cross products for the factors in the MANOVA-model. In the resulting matrices, called sums of squares and cross products (SSCP), diagonal elements correspond to the usual sums of squares for each of the p response variables whereas the off-diagonal elements correspond to the cross products for each response variable pair” (Zhang & Xiao, 2012). When data is balanced, the partitioning of SSCP matrices is independent, in analogy with the ANOVA models described earlier. For instance, in the MANOVA model:

$$T = H + E, \tag{3.8}$$

where $T$: $p \times p$ is the total SSCP matrix, $H$: $p \times p$ is the hypothesis SSCP matrix and $E$: $p \times p$ is the error SSCP matrix. The general hypothesis under MANOVA can be written as

$$H_0: \mu_1 = \mu_2 = \ldots = \mu_g \tag{3.9}$$

against $H_1$: at least one group centroid is different, with $p$ responses, 1 factor and $g$ groups. The assumptions are given as

1. Independent groups, independent observations
2. Responses are multivariate normal within each group
3. Population covariance matrices are equal across groups

3.2.4. Construction of Test Statistics for MANOVA-model

From (3.6), we recall that

$$H = \sum_{\ell=1}^{g} n_\ell (\bar{x}_\ell - \bar{x})(\bar{x}_\ell - \bar{x})', \qquad E = \sum_{\ell=1}^{g} \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x}_\ell)(x_{\ell i} - \bar{x}_\ell)', \qquad H + E = \sum_{\ell=1}^{g} \sum_{i=1}^{n_\ell} (x_{\ell i} - \bar{x})(x_{\ell i} - \bar{x})'.$$

The simplest way to characterize the information contained in positive semi-definite matrices is through eigenvalues $(\theta, \phi, \lambda)$. We consider the root equations

$$|E - \theta(H + E)| = 0, \qquad |H - \phi(H + E)| = 0 \qquad \text{and} \qquad |H - \lambda E| = 0,$$

that is, the eigenvalues of the matrices $E(H + E)^{-1}$, $H(H + E)^{-1}$ and $E^{-1}H$ respectively. Indeed, $|H - \lambda E| = 0 \Leftrightarrow |E^{-1}H - \lambda I| = 0$. Rephrasing the first and the second root equations, we obtain

$$|(1 - \theta)E - \theta H| = 0 \;\Rightarrow\; \left|H - \frac{1 - \theta}{\theta} E\right| = 0 \qquad \text{and} \qquad |(1 - \phi)H - \phi E| = 0 \;\Rightarrow\; \left|H - \frac{\phi}{1 - \phi} E\right| = 0.$$

Thus

$$\lambda = \frac{1 - \theta}{\theta} = \frac{\phi}{1 - \phi}, \qquad \theta = \frac{1}{1 + \lambda} = 1 - \phi, \qquad \phi = \frac{\lambda}{1 + \lambda} = 1 - \theta. \tag{3.10}$$

Johnson and Wichern (2007) outline four (4) statistics commonly used in the MANOVA model, which are considered in this study:

1. Wilks’ Lambda ($\Lambda$): Under the null hypothesis of no factor effects,

$$H_0: \tau_1 = \tau_2 = \ldots = \tau_g = 0, \tag{3.11}$$

where $\tau_\ell$ is the effect of region $\ell$ on the average yields. The likelihood ratio test statistic is given as

$$\Lambda = \frac{|E|}{|E + H|} = \det\!\left(E(H + E)^{-1}\right) = \prod_{j=1}^{p} \theta_j = \prod_{j=1}^{p} (1 - \phi_j) = \prod_{j=1}^{p} \frac{1}{1 + \lambda_j}, \tag{3.12}$$

and is generally known as Wilks’ $\Lambda$ after Wilks (1932). The null hypothesis is rejected for small values of $\Lambda$, showing that $E$ is small compared to the total SSCP matrix $E + H$.

2. Hotelling-Lawley Trace: The test statistic is given by

$$U = \mathrm{tr}(HE^{-1}) = \sum_{j=1}^{p} \lambda_j = \sum_{j=1}^{p} \frac{\phi_j}{1 - \phi_j}, \tag{3.13}$$

and is often referred to as the Hotelling-Lawley Trace after Lawley (1938) and Hotelling (1951), who took part in developing the statistic. Naturally, a large $H$ relative to $E$ would indicate stronger support for the alternative hypothesis and a larger trace.
Hence the null hypothesis (3.11) of no effects is rejected for large values of $U$.

3. Pillai’s Trace: Pillai (1955) developed the following statistic:

$$V = \mathrm{tr}\!\left((E + H)^{-1} H\right) = \sum_{j=1}^{p} \phi_j, \tag{3.14}$$

which is commonly known as Pillai’s Trace. As with the Hotelling-Lawley Trace, the null hypothesis is rejected for large values of $V$, indicating a large $H$ relative to $E$ (Khatri & Pillai, 1968).

4. Roy’s Greatest Root: Roy (1953) developed a statistic based on the largest root $\lambda_{max}$ of the equation $|H - \lambda E| = 0$:

$$\phi_{max} = \frac{\lambda_{max}}{1 + \lambda_{max}}, \tag{3.15}$$

which is commonly known as Roy’s Greatest Root and is also one of the statistics usually used.

“Wilks’ Lambda is used for measuring the amount of variability not explained by the levels of the independent variables. Pillai’s Trace is best for general use, especially when the sample size is uneven and data is correlated. Hotelling-Lawley Trace is generally converted to Hotelling’s T-square, which is used when the independent variables form two groups and represents the most significant linear combination of the dependent variable. Roy’s greatest root or eigenvalue is generally preferred if the outcome variables are highly inter-correlated, reflecting a single construct” (Harrar, 2008; Khatri, 1966). “Wilks’ Lambda, Hotelling-Lawley Trace, Pillai’s Trace and Roy’s Greatest Root are exact tests, meaning that the probability of rejecting $H_0$ in (3.11) when it is true is equal to α” (Rencher, 2003). “However, these tests have different probabilities of rejecting $H_0$ when it is false, thus implying that the tests have different power for a given sample. In general none of these multivariate tests is uniformly better than the other, although there might be situations where one test is preferred (Harrar & Bathke, 2008; Littell, Stroup & Freund, 2002). Also for large samples, all these tests are essentially equivalent” (Johnson & Wichern, 2007).

3.2.5. Assumptions of MANOVA model

The following assumptions of MANOVA, as discussed by Casella and Berger (2002), are outlined here.

Normal Distribution: “The dependent variable should be normally distributed within groups. Overall, the F test is robust to non-normality, if the non-normality is caused by skewness rather than by outliers. Tests for outliers should be run before performing a MANOVA, and outliers should be transformed or removed. Normal probability plot is used to test for normality. Here each residual is plotted against its expected value under normality. A plot that is nearly linear suggests agreement with normality, whereas a plot that departs substantially from linearity suggests that the error distribution is not normal” (Casella & Berger, 2002).

Linearity: “MANOVA assumes that there are linear relationships among all pairs of dependent variables, all pairs of covariates, and all dependent variable-covariate pairs in each cell. Therefore, when the relationship deviates from linearity, the power of the analysis will be compromised. This can be checked by conducting scatter plot matrix between the residuals of the dependent variables” (Casella & Berger, 2002). For a good fit, the scatter plot should not show any pattern. Linearity should be met for each group of the MANOVA separately.

Homogeneity of Variances: “Homogeneity of variances assumes that the dependent variables exhibit equal levels of variance across the range of predictor variables” (Littell et al., 2002). We recall that the error variance (SS error) is computed by adding up the sums of squares within each group. “If the variances in the two groups are different from each other, then adding the two together is not appropriate, and will not yield an estimate of the common within-group variance.
Homoscedasticity can be examined graphically or by means of a number of statistical tests such as Bartlett’s and the Brown-Forsythe test. A significant p-value (p < 0.05) for both tests indicates that variances are not identical” (Littell et al., 2002).

Homogeneity of Covariance: Equality of covariance matrices is an assumption checked by running a Box’s M test. Unlike most tests, the Box’s M test tends to be very strict, and thus the level of significance is typically 0.001. So as long as the p value for the test is above 0.001, the assumption is met (Box, 1949). Thus, with Box’s M, one is interested in testing the null hypothesis

$$H_0: \Sigma_1 = \Sigma_2 = \ldots = \Sigma_i = \ldots = \Sigma_k = \Sigma, \tag{3.16}$$

where $\Sigma_i$: $p \times p$ is the covariance matrix of the $i$th combination of factors, $i = 1, 2, \ldots, k$, in the MANOVA model with $p$ response variables. Setting $n = \sum_{i=1}^{k} n_i$ and $\nu_i = n_i - 1$, under the null hypothesis (3.16) the pooled estimator of the common covariance matrix is

$$S = \sum_{i=1}^{k} \frac{\nu_i S_i}{n - k}, \tag{3.17}$$

where $n_i$ is the number of replicates on the $i$th factor combination and $S_i$ is an unbiased estimator of $\Sigma_i$. A generalized likelihood ratio test statistic can then be calculated as

$$M = (n - k) \log |S| - \sum_{i=1}^{k} \nu_i \log |S_i|. \tag{3.18}$$

Using scale factors, Box’s M can be approximated by either a $\chi^2$ or an $F$ distribution. For both approximations, the null hypothesis of homoscedasticity is rejected for large values of the scaled test statistics (Box, 1949). As Timm (2002) notes, the $\chi^2$ approximation is preferred when $n_i < 20$, $p < 6$ and $k < 6$. Otherwise, the $F$ approximation is recommended.

Multicollinearity: “This problem occurs when the correlations among the independent variables are too high to be acceptable.
When this happens the effects of the independent variables cannot be separated. In this situation, the estimates are unbiased but the assessment of the explanatory variables and their joint effects is not reliable. As a rule of thumb, when inter-correlation among independent variables is above 0.80, it signals a possible problem of multicollinearity” (Garson, 2012).

3.2.6. Advantage of MANOVA model

The MANOVA model has several advantages over the separate estimation of several ANOVA models. MANOVA tests whether there are significant differences among the means of factor levels on several dependent variables. Thus, using MANOVA, one is able to test joint hypotheses of all the univariate ANOVA models and is more likely to observe differences between factor levels. For instance, two factors may have no main or interaction effects on two different response variables separately but only jointly (Sawyer, 2009). “Fitting one MANOVA-model instead of several ANOVA-models decreases the experiment-wise Type I error probability. As a simple example, suppose that α = 5% for F-tests in 6 separate ANOVA-models. Then, the experiment type I error would be equal to 30% whereas an overall F-test for models in the MANOVA-model would imply a 5% Type I error probability” (Littell et al., 2002). The 30% figure is the Bonferroni upper bound $6 \times 0.05$; if the six tests were independent, the exact experiment-wise error rate would be $1 - 0.95^6 \approx 26.5\%$.

Several ANOVA models estimated individually do not take into account the covariance pattern among dependent variables. On the other hand, the MANOVA model is sensitive not only to mean differences between factor levels but also to the covariation between response variables. When response variables are studied together, they are likely to be correlated to at least some extent, and by conducting several ANOVA analyses this correlation would be lost (Littell et al., 2002).

3.2.7. Limitations of MANOVA model

Outliers: Like ANOVA, MANOVA is extremely sensitive to outliers.
Outliers may produce either a Type I or Type II error and give no indication as to which type of error is occurring in the analysis. There are several programs available to test for univariate and multivariate outliers. Tests for outliers should be run before performing a MANOVA, and outliers should be transformed or removed.

Multicollinearity and Singularity: “When there is high correlation between dependent variables, one dependent variable becomes a near-linear combination of the other dependent variables. Under such circumstances, it would become statistically redundant and suspect to include both combinations. Absence of multicollinearity is checked by conducting correlations among the dependent variables. The dependent variables should all be moderately related, but any correlation over 0.80 is a concern for multicollinearity” (Zhang & Xiao, 2012).

“Another setback of MANOVA as compared to the univariate ANOVA-model is that the complexity of the MANOVA-model increases rapidly as the number of factors included in the model grows. The model specification is in many ways similar to its univariate analogues presented earlier. The assumptions for the MANOVA-model are the same as for the ANOVA-model, but extended to comprise multivariate normality” (Littell et al., 2002). Also, equality of covariance matrices for factor combinations is assumed, that is,

$$\Sigma_{11} = \Sigma_{12} = \ldots = \Sigma_{1b} = \ldots = \Sigma_{a1} = \ldots = \Sigma_{ab} = \Sigma, \tag{3.19}$$

where $\Sigma$: $p \times p$ is an unknown covariance matrix.

3.3. Linear Mixed Model (LMM)

Another model used in the study is the linear mixed model, which is an extension of the linear model for data that are collected and summarized in groups (profiles) over a period (Faraway, 2005; Johnson & Wichern, 2007; West, Welch & Galecki, 2014). The model is used for outcome variables whose residuals are normally distributed but may not be independent or have constant variance.
We used LMM because our data are longitudinal (repeated measures): the subjects (yields) are measured repeatedly over time. An LMM may include both fixed-effects and random-effects parameters.

3.3.1. The model formulation

Unlike the linear model, where $E(Y_i) = X_i\beta$ with $\beta$ as fixed effects, in the LMM $\beta$ is used for the fixed effects and $b_i$ for the random effects. The model is specified conditionally as

\[ E(Y_i \mid b_i) = X_i\beta + Z_i b_i, \tag{3.20} \]

where $b_i$ is a realization of the random-effects vector on which the mean is conditioned. The distributional assumptions are

\[ b_i \sim N(0, D) \quad \text{and} \quad \varepsilon_i \sim N(0, \Sigma_i). \tag{3.21} \]

The random components $b_i$ and $\varepsilon_i$ are independent. Also,

\[ \operatorname{var}(Y \mid b) = \Sigma \quad \text{and} \quad Y \sim N(X\beta,\; ZDZ^{T} + \Sigma). \tag{3.22} \]

The parameters of this model are the fixed-effects and random-effects components and the unknowns in the variance matrices $D$ and $\Sigma$. The unknown variance elements are also referred to as the covariance parameters and are collected in the vector $\theta$. The vector of covariance parameters,

\[ \theta = \begin{pmatrix} \theta_D \\ \theta_\Sigma \end{pmatrix}, \tag{3.23} \]

combines all parameters from the covariance matrices $D$ and $\Sigma_i$ in the vectors $\theta_D$ and $\theta_\Sigma$ respectively.

3.3.2. Assumptions of LMM

The linear mixed model has assumptions similar to those of the linear model, but in the case of the LMM the data need not have constant variance, although they may be normally distributed and linear. This is because of the correlation among repeated observations on the same subject. The assumptions of the linear model are given as

1. $E(\varepsilon_i) = 0$ for all $i = 1, 2, \dots, n$ or, equivalently, $E(Y_i) = \beta_0 + \beta_1 X_i$.
2. $\operatorname{Var}(\varepsilon_i) = \sigma^2$ for all $i = 1, 2, \dots, n$ or, equivalently, $\operatorname{var}(Y_i) = \sigma^2$.
3. $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j$ or, equivalently, $\operatorname{cov}(Y_i, Y_j) = 0$.

The first assumption is that $Y_i$ depends only on $X_i$ and that all other variation in $Y_i$ is random. The second assumption asserts that the variance of $\varepsilon_i$ (or $Y_i$) does not depend on the value of $X_i$.
Assumption two is also known as the assumption of homoscedasticity, homogeneous variance or constant variance. Under the third assumption, the $\varepsilon$ variables (or the $Y$ variables) are uncorrelated with each other (Rencher & Schaalje, 2008).

3.3.3. The implied marginal model

The linear mixed effects model (3.20) implies the marginal linear model

\[ Y_i = X_i\beta + \varepsilon_i^{*}, \tag{3.24} \]

where $\varepsilon_i^{*} = Z_i b_i + \varepsilon_i$. Thus, $\varepsilon_i^{*}$ is normally distributed with expected value

\[ E(\varepsilon_i^{*}) = E(Z_i b_i) + E(\varepsilon_i) = Z_i E(b_i) + E(\varepsilon_i) = Z_i 0 + 0 = 0 \]

and covariance matrix

\[ \operatorname{Cov}(\varepsilon_i^{*}) = \operatorname{Cov}(Z_i b_i) + \operatorname{Cov}(\varepsilon_i) = Z_i \operatorname{Cov}(b_i) Z_i^{T} + \operatorname{Cov}(\varepsilon_i) = Z_i D Z_i^{T} + \Sigma_i. \]

By defining the marginal variance-covariance matrix as

\[ V_i = Z_i D Z_i^{T} + \Sigma_i, \tag{3.25} \]

we get $\varepsilon_i^{*} \sim N_{n_i}(0, V_i)$. Hence, the marginal distribution of $Y_i$ is

\[ Y_i \sim N(X_i\beta, V_i). \tag{3.26} \]

3.4. Estimation of model parameters

Model parameters are most commonly estimated by maximum likelihood estimation (MLE) and restricted maximum likelihood estimation (REML). Both estimation procedures were employed in this study.

3.4.1. Maximum Likelihood Estimation (MLE)

This requires maximizing the implied marginal likelihood. Bruce Schaalje (2008) and Rencher and Schaalje (2008) developed the likelihood by specifying the likelihood contribution from measurement $Y_i$ conditioned on the random effects $b_i$. The marginal distribution of $Y_i$ in (3.26) has the multivariate normal density function

\[ f(Y_i \mid \beta, \theta) = (2\pi)^{-n_i/2} \det(V_i)^{-1/2} \exp\!\left( -\tfrac{1}{2} (Y_i - X_i\beta)^{T} V_i^{-1} (Y_i - X_i\beta) \right), \]

where $\det$ is the determinant and $V_i$ is given by (3.25). Hence, given the observed data $Y_i = y_i$, the likelihood contribution of the $i$th subject is

\[ L_i(\beta, \theta) = (2\pi)^{-n_i/2} \det(V_i)^{-1/2} \exp\!\left( -\tfrac{1}{2} (y_i - X_i\beta)^{T} V_i^{-1} (y_i - X_i\beta) \right). \]

We have observed $m$ independent subjects, and the product of these $m$ likelihood contributions gives the joint likelihood function

\[ L(\beta, \theta) = \prod_{i=1}^{m} L_i(\beta, \theta) = \prod_{i=1}^{m} (2\pi)^{-n_i/2} \det(V_i)^{-1/2} \exp\!\left( -\tfrac{1}{2} (y_i - X_i\beta)^{T} V_i^{-1} (y_i - X_i\beta) \right). \]

Hence, the log-likelihood function is

\[ l(\beta, \theta) = -\tfrac{1}{2} n \ln(2\pi) - \tfrac{1}{2} \sum_{i=1}^{m} \ln(\det(V_i)) - \tfrac{1}{2} \sum_{i=1}^{m} (y_i - X_i\beta)^{T} V_i^{-1} (y_i - X_i\beta). \tag{3.27} \]

By assuming that $\theta$ is known, the log-likelihood function becomes a function of $\beta$ only. Maximizing the log-likelihood function (3.27) is then equivalent to minimizing the negative of its last term,

\[ q(\beta) = \tfrac{1}{2} \sum_{i=1}^{m} (y_i - X_i\beta)^{T} V_i^{-1} (y_i - X_i\beta). \]

By using the method of generalized least squares, we minimize $q(\beta)$ to find $\hat\beta$:

\[ \frac{\partial q(\beta)}{\partial \beta} = \frac{1}{2} \frac{\partial}{\partial \beta} \sum_{i=1}^{m} \left( y_i^{T} V_i^{-1} y_i - y_i^{T} V_i^{-1} X_i\beta - \beta^{T} X_i^{T} V_i^{-1} y_i + \beta^{T} X_i^{T} V_i^{-1} X_i \beta \right) = 0 \]

\[ \Rightarrow \sum_{i=1}^{m} \left( -X_i^{T} V_i^{-1} y_i + X_i^{T} V_i^{-1} X_i \beta \right) = 0 \]

\[ \Rightarrow \hat\beta = \left( \sum_{i=1}^{m} X_i^{T} V_i^{-1} X_i \right)^{-1} \sum_{i=1}^{m} X_i^{T} V_i^{-1} y_i. \tag{3.28} \]

$\hat\beta$ can be written as $b^{T}Y$, which is the best linear unbiased estimator (BLUE) of $\beta$. This means that $E[b^{T}Y] = \beta$ and that it has the smallest variance among all unbiased linear estimators.

3.4.2. Restricted Maximum Likelihood Estimation (REML)

REML estimation is an alternative way of estimating the covariance parameters in $\theta$. It is often preferred to ML estimation because it produces unbiased estimates of the covariance parameters by taking into account the degrees of freedom lost in estimating the fixed effects in $\beta$ (Bruce Schaalje, 2008). The REML log-likelihood function is given by

\[ l_{REML}(\theta) = -\tfrac{1}{2} (n - p) \ln(2\pi) - \tfrac{1}{2} \sum_{i=1}^{m} \ln(\det(V_i)) \tag{3.29} \]

\[ \qquad\qquad - \tfrac{1}{2} \sum_{i=1}^{m} r_i^{T} V_i^{-1} r_i - \tfrac{1}{2} \sum_{i=1}^{m} \ln(\det(X_i^{T} V_i^{-1} X_i)), \tag{3.30} \]

where $r_i$ is given as

\[ r_i = y_i - X_i\hat\beta = y_i - X_i \left( \left( \sum_{i=1}^{m} X_i^{T} V_i^{-1} X_i \right)^{-1} \sum_{i=1}^{m} X_i^{T} V_i^{-1} y_i \right). \]

Here we observe that the REML log-likelihood differs from the ML log-likelihood in two ways: the first term uses $n - p$ in place of $n$, and there is an extra term $\tfrac{1}{2} \sum_{i=1}^{m} \ln(\det(X_i^{T} V_i^{-1} X_i))$. The general motivation for using REML is to obtain unbiased estimates of the covariance parameters.

3.5. Information Criteria

The study used two information criteria that are often used to choose the best-fitting model for the data: the Akaike information criterion and the Bayes information criterion, developed by (Akaike, 1981) and (Burnham & Anderson, 2004), the latter also known as the Schwarz Criterion (Schwarz, 1978). The Akaike information criterion, AIC, is defined by

\[ AIC = -2\, l(\hat\beta, \hat\theta) + 2p, \tag{3.31} \]

where $l(\hat\beta, \hat\theta)$ can be either the ML or the REML log-likelihood function and $p$ represents the total number of parameters, both fixed and random effects, being estimated in the model. The model with the lowest AIC value is taken to be the best fit for the data. The Bayes information criterion, BIC, is defined by

\[ BIC = -2\, l(\hat\beta, \hat\theta) + p \ln(n), \tag{3.32} \]

where $l(\hat\beta, \hat\theta)$ is the ML log-likelihood function, $p$ represents the total number of parameters, both fixed and random effects, being estimated in the model and $n$ is the total number of observations used in the estimation of the model, that is, $n = \sum_{i=1}^{m} n_i$.

According to Bates and Pinheiro (2006), we can calculate the REML version of the BIC by simply using the REML log-likelihood function and replacing $\ln(n)$ by $\ln(n - p_{fixed})$ in Equation (3.32), where $p_{fixed}$ is the number of estimated fixed-effect parameters in the model. In other words, the BIC applies a greater penalty than the AIC to models with more parameters. As with the AIC, the model with the lowest BIC value is taken to be the best-fitting and preferable model for the data. West et al.
(2014) noted that no single information criterion stands apart as the best criterion to use when selecting linear mixed effects models.

3.6. Diagnostics

After a linear mixed effects model is fitted, it is necessary to check whether the underlying distributional assumptions for the random effects and the residuals appear valid for the data. Diagnostic methods for linear models are well established, but diagnostics for linear mixed effects models are more difficult to perform and interpret because of the complexity of the model. “The most useful method for diagnostics according to Bates and Pinheiro (2006) are based on plots of the residuals, the fitted values and the estimated random effects”. In this thesis, we carry out diagnostics using the functions qqnorm.lme, qqline, histogram and plot.lme in (Pinheiro et al., 2011). Here, the standardized residuals, defined as the raw residuals divided by the corresponding estimated standard deviation, are used.

3.7. Statistical software used

The MANOVA and linear mixed model procedures in the statistical software R were used in fitting the models, since R allows for numerical integration of the normal random effects by Gaussian and adaptive Gaussian quadrature methods (Bates & Pinheiro, 2006; Jose & Bates, 2000). The conditional likelihood can be fed to the program as well to complete the full marginalization. SPSS and Excel were used for data processing and manipulation.

3.8. Compound Symmetry (CS)

The compound symmetry structure specifies that observations on the same subject have homogeneous covariance and homogeneous variance. The variance of the response variable is equal to the sum of the between-subject and within-subject variances (Littell, Pendergast & Natarajan, 2000).
Thus,

\[ \operatorname{cov}(Y_{ijt}, Y_{ijk}) = \sigma^2_{CS,b} \ \text{ if } t \neq k, \qquad V(Y_{ijk}) = \sigma^2_{CS,b} + \sigma^2_{CS,w}, \tag{3.33} \]

where $Y_{ijt}$ and $Y_{ijk}$ are the values of the response variable (yield) measured at years $t$ and $k$ on region $j$ in district $i$, $i = 1, \dots, g$, $j = 1, \dots, n_{10}$ and $t = 1, \dots, 10$.

Also, $\operatorname{cov}(Y_{ijt}, Y_{ijk}) = \sigma_{t,k}$ is the covariance between the measures at years $t$ and $k$ on the same subject (yield) and $\sigma_{tt} = \sigma_t^2$ is the variance at time $t$. The correlation function of CS is given as

\[ \operatorname{corr}_{CS}(\text{yield}) = \sigma^2_{CS,b} / (\sigma^2_{CS,b} + \sigma^2_{CS,w}). \tag{3.34} \]

We notice that the correlation does not depend on the value of the yield, in the sense that the correlation is the same for all pairs of observations on the same region. The compound symmetry structure is occasionally called the 'variance components' structure, since the two parameters $\sigma^2_{CS,b}$ and $\sigma^2_{CS,w}$ denote the between-subject and within-subject variances. The structure of compound symmetry is given as

\[
\begin{pmatrix}
\sigma^2_{CS,b} + \sigma^2_{CS,w} & \sigma^2_{CS,b} & \cdots & \sigma^2_{CS,b} \\
\sigma^2_{CS,b} & \sigma^2_{CS,b} + \sigma^2_{CS,w} & \cdots & \sigma^2_{CS,b} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma^2_{CS,b} & \sigma^2_{CS,b} & \cdots & \sigma^2_{CS,b} + \sigma^2_{CS,w}
\end{pmatrix},
\ \text{thus} \
(\sigma^2_{CS,b} + \sigma^2_{CS,w})
\begin{pmatrix}
1 & \rho & \cdots & \rho \\
\rho & 1 & \cdots & \rho \\
\vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \cdots & 1
\end{pmatrix},
\]

where $\rho$ is the correlation in (3.34).

CHAPTER FOUR

DATA ANALYSIS AND DISCUSSION OF RESULTS

4.0. Introduction

The various analyses of the study and the discussion of the results obtained are presented in this chapter.

4.1. Descriptive analysis

This thesis applies MANOVA and linear mixed models to analyze cereal crop yields in Ghana. All the analyses in this study are limited to the yields of the two major cereal crops (maize and rice) recorded over the period 2005 to 2014. The unit of measurement is the 2016 districts in the country. The methodologies of the previous chapter are applied to the data and the results are discussed in this chapter. Table 4.1 presents descriptive analyses of the results.
The lowest average maize yield for the period under study was 0.13 metric tons per hectare (Mt/Ha), recorded in 2006 at Bongo in the Upper East region, and the maximum average maize yield was 9.56 Mt/Ha, recorded in 2012 at Dormaa East District in the Brong Ahafo region. In the same period, the lowest average rice yield recorded was 0.68 Mt/Ha, in 2011 at Bolga in the Upper East region. The maximum average rice yield for the ten-year period was 6.72 Mt/Ha, recorded in 2005 at Dangme West District in the Greater Accra region. We observed from Table 4.1 that the average yield of rice in the period is higher than the average maize yield. This indicates that rice yields in Ghana for the period under study are on average higher than maize yields in most of the regions. Figure 4.1 gives a graphical representation of these results.

Table 4.1: Descriptive statistics of the two major cereal crops

| Yield | Minimum | 1st Quartile | 3rd Quartile | Mean | Maximum |
|---|---|---|---|---|---|
| Maize | 0.130 | 1.252 | 1.900 | 1.605 | 9.560 |
| Rice | 0.680 | 1.265 | 2.665 | 2.098 | 6.720 |

Source: SRID of MOFA

[Figure 4.1: Line graph showing average yields by year. Series: maize yields and rice yields; x-axis: Year (2005 to 2014); y-axis: Average yields.]

4.2. Preliminary analysis

The box plot appears to indicate steady improvement in the yields of rice over the years, especially between 2009 and 2014. However, the yields of maize within the same period seem to be declining. Figure 4.2 demonstrates this assessment.

[Figure 4.2: Box plot of preliminary analysis]

4.2.1 Mean plots over time

The average yields of maize and rice for the ten regions are presented in the mean profiles in Figures 4.3 and 4.4. We observed that from 2005 to 2006, the yields of maize decreased slowly for all the ten (10) regions.
The yields increased in 2008, 2010 and 2012, and then decreased in 2009, 2011, 2013 and 2014. The overall yields of maize, as can be seen from Figure 4.3, indicate an unstable trend in all the regions.

[Figure 4.3: Mean profile of maize yields. One line per region (Western, Central, Eastern, Greater Accra, Volta, Ashanti, Brong Ahafo, Northern, Upper West, Upper East); x-axis: Year (2005 to 2014); y-axis: Average yields (Mt/Ha).]

Unlike maize yields, which decreased steadily within the period, rice yields increased gradually from 2005 to 2009, with a sharp increase between 2009 and 2010. The increase then continued progressively until 2014, as indicated in Figure 4.4. The progressive trend in rice yields from 2009 might be a result of the National Rice Development Strategy (NRDS) implemented in 2009 by MOFA, which increased domestic production of rice between 2009 and 2014 from 235,000 Mt to 417,000 Mt (GhanaWeb, 2015).

[Figure 4.4: Mean profile of rice yields. One line per region (Western, Central, Eastern, Greater Accra, Volta, Ashanti, Brong Ahafo, Northern, Upper West, Upper East); x-axis: Year (2005 to 2014); y-axis: Average yields (Mt/Ha).]

4.3. MANOVA Analysis

4.3.1. Test on Assumptions

Diagnostic checks were carried out to assess whether the underlying distributional assumptions for the MANOVA model appear valid. These assumptions were as follows.

Linearity: The linearity assumption was assessed by scatter plots for both maize and rice yields. The result is shown in Figure 4.9 in Appendix A. The plot suggests that the response variables were linear, as it does not show patterns.

Normality: The box plots for both maize and rice yields, the Q-Q normality plot and the histogram with density curve in Figures 4.8 and 4.9 in Appendix A seem to suggest that the error points have not deviated from the fitted line for both maize and rice yields.
A formal test for normality was conducted using the Shapiro-Wilk test. From Table 4.6c in Appendix A, the test returned a p-value of 0.00713, which is less than the significance level of 0.05 (α = 0.05). Strictly, a p-value below α rejects the null hypothesis of normality, so the normality assumption here rests on the graphical evidence above rather than on the formal test.

Homogeneity of Variance: Bartlett's test and the Brown-Forsythe test on the yields, in Tables 4.6d and 4.7 in Appendix A, indicate that the variances were not identical, as the p-values of both tests were less than α = 0.05.

Outliers: From Table 4.6c in Appendix A, the Bonferroni test obtained a p-value of 0.0420, which is less than the significance level (0.05), suggesting the presence of outliers. The most extreme observations in both yields were removed from the data set.

Homogeneity of Covariance: Box's M-test, also in Table 4.6c in Appendix A, confirmed that the equality of covariance assumption was not met, as the p-value of 2.2e-16 is less than 0.001. This indicates that M is significant, which also implies that the groups differ from each other.

Multicollinearity: The assumption was tested by the inter-correlation among the dependent variables. As indicated in Tables 4.8a and 4.8b in Appendix A, none of the correlations is above 0.80, signifying the absence of multicollinearity effects.

4.3.2. MANOVA Analysis

The results of the multivariate analysis of variance conducted on the data set, using the four most commonly used statistics discussed in section 3.2.5, revealed that significant differences exist in the yields of the two major cereal crops in the country. For instance, the results of Pillai's test in Table 4.2a indicate that the yields of the two cereal crops are significantly different across the various regions. All the statistics have p-values less than 0.05, signifying significant evidence against the null hypothesis that there are no significant differences in the average yields of the two major cereal crops across the regions.
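The four statistics referred to above are all functions of the eigenvalues of the hypothesis-by-error matrix. The sketch below is a hedged illustration, not the R routine used in the study: it assumes that Roy's greatest root in Table 4.2a is reported as the largest eigenvalue and that, with two response variables, the Hotelling-Lawley trace is the sum of two eigenvalues. The function name `manova_statistics` is hypothetical.

```python
def manova_statistics(eigenvalues):
    """Return the four MANOVA test statistics from the eigenvalues of E^{-1}H."""
    pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)
    wilks = 1.0
    for lam in eigenvalues:
        wilks *= 1.0 / (1.0 + lam)
    return {
        "Pillai's trace": pillai,
        "Wilks' lambda": wilks,
        "Hotelling-Lawley": sum(eigenvalues),
        "Roy's greatest root": max(eigenvalues),
    }


# Eigenvalues implied by Table 4.2a under the assumptions stated above:
# Roy's root is the largest eigenvalue, Hotelling-Lawley is the sum of both.
lambda_1 = 0.90123
lambda_2 = 1.30240 - 0.90123

stats = manova_statistics([lambda_1, lambda_2])
for name, value in stats.items():
    print(f"{name}: {value:.5f}")
```

Under these assumptions the computed Pillai's trace and Wilks' lambda agree with the estimates reported in Table 4.2a to rounding, which suggests the four reported statistics are mutually consistent.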
The analysis of which crop has significant differences, in Table 4.2b, indicates that differences occurred in both maize and rice yields across the regions. Further analysis on the mean vectors in Table 4.3 revealed significant differences in the yields between the regions.

Table 4.2a: MANOVA Test for Differences in crop yields across the ten regions

| Test Statistic | Estimates | DF (Region) | DF (Residual) | Approx. F | Pr(>F) |
|---|---|---|---|---|---|
| Pillai's trace | 0.76033 | 9 | 1185 | 80.756 | <2.2e-16*** |
| Wilks' lambda | 3.75e-01 | 9 | 1185 | 83.164 | <2.2e-16*** |
| Hotelling-Lawley | 1.30240 | 9 | 1185 | 85.596 | <2.2e-16*** |
| Roy's greatest root | 0.90123 | 9 | 1185 | 118.66 | <2.2e-16*** |

***significant at 5%

Table 4.2b: Test of Significant difference in crop yields

| Dependent Variables | Test | Value | F | Hypothesis df | Error df | Pr(>F) |
|---|---|---|---|---|---|---|
| MAIZE AND RICE YIELDS | Pillai's Trace | 0.458 | 4.595 | 178 | 2754 | 0.000*** |
| | Wilks' Lambda | 0.586 | 4.731 | 178 | 2752 | 0.000*** |
| | Hotelling's Trace | 0.63 | 4.868 | 178 | 2750 | 0.000*** |
| | Roy's Largest Root | 0.469 | 7.261 | 89 | 1377 | 0.000*** |
| MAIZE YIELDS | Pillai's Trace | 0.185 | 3.521 | 89 | 1377 | 0.000*** |
| | Wilks' Lambda | 0.815 | 3.521 | 89 | 1377 | 0.000*** |
| | Hotelling's Trace | 0.228 | 3.521 | 89 | 1377 | 0.000*** |
| | Roy's Largest Root | 0.228 | 3.521 | 89 | 1377 | 0.000*** |
| RICE YIELDS | Pillai's Trace | 0.298 | 6.561 | 89 | 1377 | 0.000*** |
| | Wilks' Lambda | 0.702 | 6.561 | 89 | 1377 | 0.000*** |
| | Hotelling's Trace | 0.424 | 6.561 | 89 | 1377 | 0.000*** |
| | Roy's Largest Root | 0.424 | 6.561 | 89 | 1377 | 0.000*** |

***significant at 5%

Pairwise comparisons on the mean vectors of the yields were further performed to determine where the differences occurred across the regions. The results of the test for both maize and rice yields are presented in Table 4.3 for five regions and in Table 4.16a in Appendix E for all the regions. Table 4.3 indicates that within the ten-year study period, significant differences in both maize and rice yields occurred in most of the regions' average yields. This is shown by the significance levels below 0.05 (sig. < 0.05) of the regions' pairwise comparisons with other regions. For instance, differences occurred in the comparisons between the average maize yields of Ashanti region and six of the regions. However, no differences occurred between the average yields of Ashanti region and those of Upper East, Upper West and Western regions. Similarly, significant differences occurred between the average rice yields of Ashanti region and all the regions except Northern region.

Pairwise comparisons of the average maize yields of Brong Ahafo region with other regions also indicate that significant differences occurred between the region and seven other regions, the exceptions being Central and Eastern regions. In the same way, significant differences occurred between the average rice yields of Brong Ahafo region and seven other regions, the exceptions being Central and Upper West regions, where there were no significant differences. Details of the results for other regions, as well as 95% confidence intervals for both yields, are presented in Table 4.3 and in Table 4.16a of Appendix E.

Table 4.3: Pairwise Comparisons of differences in mean yields

| (I) Region | (J) Region | Mean difference (I-J), Maize | Mean difference (I-J), Rice | Std. Error, Maize | Std. Error, Rice | Sig., Maize | Sig., Rice |
|---|---|---|---|---|---|---|---|
| ASHANTI | B/AHAFO | -0.613* | 0.449* | 0.039 | 0.074 | 0.0000 | 0.0000 |
| | CENTRAL | -0.548* | 0.565* | 0.043 | 0.082 | 0.0000 | 0.0000 |
| | EASTERN | -0.701* | -0.350* | 0.040 | 0.076 | 0.0000 | 0.0000 |
| | G/ACCRA | 0.280* | -1.400* | 0.053 | 0.101 | 0.0000 | 0.0000 |
| | NORTHERN | -0.130* | 0.090 | 0.040 | 0.075 | 0.0490 | 1.0000 |
| | UPPER EAST | -0.009 | -0.469* | 0.052 | 0.098 | 1.0000 | 0.0000 |
| | UPPER WEST | -0.130 | 0.684* | 0.052 | 0.098 | 0.5530 | 0.0000 |
| | VOLTA | -0.287* | -0.620* | 0.042 | 0.079 | 0.0000 | 0.0000 |
| | WESTERN | -0.035 | 0.902* | 0.043 | 0.081 | 1.0000 | 0.0000 |
| B/AHAFO | ASHANTI | 0.613* | -0.449* | 0.039 | 0.074 | 0.0000 | 0.0000 |
| | CENTRAL | 0.065 | 0.115 | 0.044 | 0.084 | 1.0000 | 1.0000 |
| | EASTERN | -0.088 | -0.800* | 0.041 | 0.078 | 1.0000 | 0.0000 |
| | G/ACCRA | 0.893* | -1.850* | 0.054 | 0.103 | 0.0000 | 0.0000 |
| | NORTHERN | 0.483* | -0.359* | 0.041 | 0.077 | 0.0000 | 0.0000 |
| | UPPER EAST | 0.604* | -0.919* | 0.053 | 0.099 | 0.0000 | 0.0000 |
| | UPPER WEST | 0.483* | 0.235 | 0.053 | 0.100 | 0.0000 | 0.8510 |
| | VOLTA | 0.326* | -1.069* | 0.043 | 0.081 | 0.0000 | 0.0000 |
| | WESTERN | 0.578* | 0.452* | 0.044 | 0.084 | 0.0000 | 0.0000 |
| CENTRAL | ASHANTI | 0.548* | -0.565* | 0.043 | 0.082 | 0.0000 | 0.0000 |
| | B/AHAFO | -0.065 | -0.115 | 0.044 | 0.084 | 1.0000 | 1.0000 |
| | EASTERN | -0.153* | -0.915* | 0.045 | 0.085 | 0.0340 | 0.0000 |
| | G/ACCRA | 0.827* | -1.965* | 0.057 | 0.108 | 0.0000 | 0.0000 |
| | NORTHERN | 0.418* | -0.475* | 0.045 | 0.085 | 0.0000 | 0.0000 |
| | UPPER EAST | 0.538* | -1.034* | 0.056 | 0.105 | 0.0000 | 0.0000 |
| | UPPER WEST | 0.418* | 0.119 | 0.056 | 0.106 | 0.0000 | 1.0000 |
| | VOLTA | 0.260* | -1.185* | 0.047 | 0.088 | 0.0000 | 0.0000 |
| | WESTERN | 0.513* | 0.337* | 0.048 | 0.091 | 0.0000 | 0.0090 |
| EASTERN | ASHANTI | 0.701* | 0.350* | 0.040 | 0.076 | 0.0000 | 0.0000 |
| | B/AHAFO | 0.088 | 0.800* | 0.041 | 0.078 | 1.0000 | 0.0000 |
| | CENTRAL | 0.153* | 0.915* | 0.045 | 0.085 | 0.0340 | 0.0000 |
| | G/ACCRA | 0.980* | -1.050* | 0.055 | 0.104 | 0.0000 | 0.0000 |
| | NORTHERN | 0.571* | 0.440* | 0.042 | 0.079 | 0.0000 | 0.0000 |
| | UPPER EAST | 0.691* | -0.119 | 0.053 | 0.101 | 0.0000 | 1.0000 |
| | UPPER WEST | 0.571* | 1.034* | 0.054 | 0.101 | 0.0000 | 0.0000 |
| | VOLTA | 0.413* | -0.270* | 0.044 | 0.083 | 0.0000 | 0.0490 |
| | WESTERN | 0.666* | 1.252* | 0.045 | 0.085 | 0.0000 | 0.0000 |

*significant at 5%

Another statistical test that was used to determine which regions differ significantly from others was Bonferroni post hoc
comparisons on the regions' yields for both maize and rice. This test, as indicated in Tables 4.16b and 4.16c of Appendix E, gave results similar to the pairwise comparisons of differences in mean yields.

4.4. LMM analysis

This section of the analysis deals with the linear mixed model considered for the study. The model was based on the methodology in 3.3. Two types of mixed models were considered for the study: a model with random intercepts and a model with random intercepts and slopes.

4.4.1. The Model Specification

The model with random intercept is given as

\[ y_{ij} = \beta_0 + \sum_{r=1}^{9} \beta_r X_{ij} + \beta_{10}\,\mathrm{Year}_i + \sum_{r=1}^{9} \beta_{10+r} X_{ij}\,\mathrm{Year}_i + b_{0i} + \varepsilon_{ij}, \tag{4.1} \]

where

$\beta_0$ is the intercept of the fixed effects $\sum_{r=1}^{9} \beta_r X_{ij} + \beta_{10}\,\mathrm{Year}_i$,
$y_{ij}$ is an $(n \times 1)$ vector of the $j$th region yields for the $i$th year,
$\sum_{r=1}^{9} \beta_{10+r} X_{ij}\,\mathrm{Year}_i$ is the interaction between regions and years,
$X_{ij}$ is the $j$th region of year $i$,
$\beta_r$ is a $(p \times 1)$ vector of fixed effects parameters, that is, region,
$b_{0i}$ is a $(g \times 1)$ vector of random effects that occur in the data vector $y_{ij}$, and
$\varepsilon_{ij}$ is an $(n \times 1)$ vector of errors (also random) specific to the $j$th region of year $i$.

The model with random intercept and slope is given as

\[ y_{ij} = \beta_0 + \sum_{r=1}^{9} \beta_r X_{ij} + \beta_{10}\,\mathrm{Year}_i + \sum_{r=1}^{9} \beta_{10+r} X_{ij}\,\mathrm{Year}_i + b_{0i} + b_{1i}\,\mathrm{Year} + \varepsilon_{ij}, \tag{4.2} \]

\[ z_{ij} = (1 \;\; \mathrm{Year}), \qquad b = (b_{0i}, b_{1i})', \]

where $b_{0i}$ is the random intercept and $b_{1i}$ is the random slope for year; all other variables retain their usual meaning.

To determine which model best fits the data, we used the Akaike Information Criterion (AIC) and the Bayes Information Criterion (BIC) developed by (Akaike, 1981) and (Burnham & Anderson, 2004). Table 4.4 presents the two models and their AIC and BIC values. For all analyses, a p-value < 0.05 was considered statistically significant.
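The comparison summarized in Table 4.4 can be traced by hand from the log-likelihoods. The sketch below is a hedged illustration rather than the R output itself: it applies Equation (3.31) and a likelihood-ratio test to the maize log-likelihoods reported in Table 4.4, and it assumes a plain chi-square reference distribution with 2 degrees of freedom (the difference 24 - 22 in the number of parameters), for which the upper-tail probability is exactly exp(-x/2). The function names are hypothetical.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion, Equation (3.31): -2*logLik + 2*p."""
    return -2.0 * loglik + 2.0 * n_params


def likelihood_ratio_test_df2(loglik_small, loglik_large):
    """Likelihood-ratio statistic and p-value for a 2-parameter difference.

    Assumes a chi-square reference with exactly 2 degrees of freedom,
    whose survival function is exp(-x / 2).
    """
    statistic = 2.0 * (loglik_large - loglik_small)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Maize log-likelihoods and parameter counts as reported in Table 4.4.
loglik_intercept, df_intercept = -641.1527, 22
loglik_slope, df_slope = -640.9902, 24

aic_intercept = aic(loglik_intercept, df_intercept)
aic_slope = aic(loglik_slope, df_slope)
lr_statistic, lr_p_value = likelihood_ratio_test_df2(loglik_intercept, loglik_slope)

print(f"AIC random intercept:           {aic_intercept:.3f}")
print(f"AIC random intercept and slope: {aic_slope:.3f}")
print(f"L.Ratio = {lr_statistic:.4f}, p-value = {lr_p_value:.3f}")
```

Under these assumptions the AIC values and the likelihood-ratio p-value agree with the maize rows of Table 4.4 to rounding. A chi-square reference can be conservative when a variance parameter is tested on the boundary of its parameter space, so the p-value is best read as approximate.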
Table 4.4: Selection of best model

| Crop | Model | No. | df | AIC | BIC | logLik | Test | L.Ratio | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Maize | Random intercept | 1 | 22 | 1326.305 | 1442.632 | -641.1527 | | | |
| Maize | Random intercept and slope | 2 | 24 | 1329.980 | 1456.882 | -640.9902 | 1 vs 2 | 0.3250162 | 0.85 |
| Rice | Random intercept | 1 | 22 | 3084.077 | 3200.403 | -1520.039 | | | |
| Rice | Random intercept and slope | 2 | 24 | 3087.685 | 3214.587 | -1519.843 | 1 vs 2 | 0.3917895 | 0.821 |

From Table 4.4, we observed that the p-value associated with the random intercept and slope model for maize is not statistically significant (0.85 > 0.05), which indicates that the random intercept model fits the data better than the random intercept and slope model. This is confirmed by the lower AIC and BIC values associated with the random intercept model (1326.305 and 1442.632).

The rice model revealed similar findings. It also indicated that the model with random intercept fits the data better than the model with random intercept and slope. As can be seen from Table 4.4, the p-value for the random intercept and slope model is not statistically significant (0.821 > 0.05), which is supported by the lower AIC and BIC values of the random intercept model (3084.077 and 3200.403).

4.4.2. Diagnostics on the fitted LMM model

Further assessment of the fitted LMM for both maize and rice was conducted. The plots were (1) a QQ-plot, (2) a scatter plot and (3) a histogram with a density curve of the fitted LMM. For a good fit, the QQ-plot should produce a straight one-to-one line of points, and the scatter plot should not show a pattern. Generally, the QQ-plot is preferred to the scatter plot. The normal shape of the density curve on the histogram and the approximate linearity of the points on the QQ-plot indicate that the model is adequate. Hence we conclude that the diagnostic plots support the fitted model, and so the LMM is a good model for fitting maize and rice yields. This is clearly shown in Figure 4.5.
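The QQ-plot check described above can be sketched numerically. This is a minimal illustration on simulated residuals, not the thesis data or the `qqnorm.lme` routine: the residual standard deviation `sigma_hat`, the sample size and the seed are hypothetical, and the helper names are made up for this example. It standardizes the residuals as defined in section 3.6, pairs them with standard normal quantiles, and summarizes the straightness of the QQ line by a correlation.

```python
import math
import random
from statistics import NormalDist, mean


def standardized_residuals(raw_residuals, sigma_hat):
    """Divide raw residuals by the estimated residual standard deviation."""
    return [r / sigma_hat for r in raw_residuals]


def qq_points(values):
    """Pair each sorted value with the standard normal quantile at (i - 0.5) / n."""
    n = len(values)
    ordered = sorted(values)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical residuals: simulated from a normal distribution for illustration.
rng = random.Random(2016)
sigma_hat = 0.32
raw = [rng.gauss(0.0, sigma_hat) for _ in range(500)]

theoretical, ordered = qq_points(standardized_residuals(raw, sigma_hat))
qq_correlation = pearson(theoretical, ordered)
print(f"QQ correlation: {qq_correlation:.4f}")
```

A correlation close to 1 corresponds to points lying near a straight line on the QQ-plot; values clearly below 1 would point to departures from normality that the plot would show as curvature or heavy tails.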
[Figure 4.5: Diagnostic plots for fitted Linear Mixed model]

4.4.3. Random effects estimates

The first part of the output that we discuss is the random effect estimates. The random effects (random intercepts) are random values associated with the levels of a random factor (District) in the LMM. These values, which are specific to a given level of the random factor District, represent random deviations of a given district from the overall fixed effects, as presented in Table 4.5.

Table 4.5: Random effects estimates

| Crop | Groups | Name | Variance | Std.Dev. |
|---|---|---|---|---|
| Maize | DISTRICT | (Intercept) | 0.5885 | 0.2426 |
| Maize | Residual | | 0.1314 | 0.3212 |
| Rice | DISTRICT | (Intercept) | 0.1911 | 0.4371 |
| Rice | Residual | | 0.2668 | 0.5165 |

| Crop | 95% Conf. Interval, Lower | Est. | Upper |
|---|---|---|---|
| Maize | 0.220 | 0.251 | 0.285 |
| Rice | 0.542 | 0.609 | 0.685 |

The standard deviation column measures the variability of each random effect added to the model. As we can see from Table 4.5, maize has much less within variability (24.26%) than rice (43.71%). The variance column shows the variability of the intercept (Districts) across the regions. Our model indicates 58.85% of variability in maize yields and 19.11% in rice yields across the ten regions of Ghana. Not all the variability was accounted for by our model; the remainder is indicated by the residuals, which stand for variability within districts. This value is 32.12% and 51.65% for maize and rice yields respectively. It is believed that there is other variability from unknown covariates which also contributes to deviations in the yields of these cereal crops. The graphical visualization of the distributions of districts as intercepts is given below. Clearly, Figure 4.6 shows that maize yields have on average less within variability than rice yields.
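The random-effects estimates connect directly to the compound symmetry correlation in Equation (3.34). As a hedged sketch, the code below squares the Std.Dev. entries of Table 4.5 to obtain between-district and within-district variances, then computes the implied correlation and covariance matrix. Using the standard deviations (rather than the Variance column) as inputs is a choice made for this illustration, and the function names are hypothetical.

```python
def intraclass_correlation(sd_between, sd_within):
    """Equation (3.34): between variance over total variance."""
    var_between = sd_between ** 2
    var_within = sd_within ** 2
    return var_between / (var_between + var_within)


def compound_symmetry(sd_between, sd_within, n_times):
    """Covariance matrix with total variance on the diagonal and
    the between-subject variance everywhere else, as in (3.33)."""
    var_between = sd_between ** 2
    var_within = sd_within ** 2
    return [
        [var_between + var_within if t == k else var_between
         for k in range(n_times)]
        for t in range(n_times)
    ]


# Standard deviations taken from the Std.Dev. column of Table 4.5.
maize_icc = intraclass_correlation(0.2426, 0.3212)
rice_icc = intraclass_correlation(0.4371, 0.5165)
rice_cs = compound_symmetry(0.4371, 0.5165, n_times=3)

print(f"maize correlation: {maize_icc:.4f}")
print(f"rice correlation:  {rice_icc:.4f}")
for row in rice_cs:
    print([round(v, 4) for v in row])
```

Read this way, the correlation is the share of total variance attributable to differences between districts, so any two yearly yields from the same district share that common correlation regardless of how far apart the years are.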
[Figure 4.6: Distribution of random intercepts (District) for maize and rice yields]

4.4.4. Fixed effects parameter estimates

This portion of the model output contains the fixed effect parameter estimates, their corresponding standard errors, the degrees of freedom, the t-test values and the corresponding p-values, as well as the interactions. The intercept estimate, which corresponds to the reference category (Ashanti region), gives the expected average yield in metric tons per hectare when all the predictors or covariates are zero. From Table 4.6a, our model indicated that, on average, the expected maize yield for the reference region is 1.37 Mt/Ha when the indicators for all other regions and the year covariate are zero. The estimate column tells us that Brong Ahafo region has average maize yields higher than Ashanti region. This is indicated by the positive coefficient (0.71) of the region's estimate. Again, the estimate column indicates that Central region has average maize yields lower than Ashanti region for the study period, as indicated by the negative sign of the coefficient (-0.49) of the region's estimate. The estimates for the rest of the regions in Table 4.6a are interpreted in a similar way.

The most important aspect of the fixed effects analysis is the interaction between the regions and the year. An interaction term can be interpreted as the estimated change in a region's maize yield trend over the years relative to the reference region. Thus, the interaction between the two suggests what trend is found in the maize yields for the period under study. As a demonstration, suppose the interaction between two regions and year gives the model

\[ Y = \beta_0 + \beta_1 R_1 + \beta_2 R_2 + \beta_3\,\mathrm{Year} + \beta_4 R_1\,\mathrm{Year} + \beta_5 R_2\,\mathrm{Year}, \tag{4.3} \]

where $Y$ is the response variable (yields), $\beta_0$ is the intercept, $\beta_i$ is the coefficient parameter, $R_1$ is region 1, $R_2$ is region 2 and $R_3$ is the reference region.
Model for $R_1$:

\[ Y_{R_1} = \beta_0 + \beta_1 + \beta_3\,\mathrm{Year} + \beta_4\,\mathrm{Year} = (\beta_0 + \beta_1) + (\beta_3 + \beta_4)\,\mathrm{Year} = \beta_0^{*} + \beta_1^{*}\,\mathrm{Year}. \tag{4.4} \]

Model for $R_2$:

\[ Y_{R_2} = \beta_0 + \beta_2 + \beta_3\,\mathrm{Year} + \beta_5\,\mathrm{Year} = (\beta_0 + \beta_2) + (\beta_3 + \beta_5)\,\mathrm{Year} = \beta_0^{*} + \beta_1^{*}\,\mathrm{Year}. \tag{4.5} \]

Model for $R_3$:

\[ Y_{R_3} = \beta_0 + \beta_3\,\mathrm{Year}. \tag{4.6} \]

Now the differences between the models for regions 1 and 2 and the reference region model give

\[ Y_{R_1} - Y_{R_3} = \beta_1 + \beta_4\,\mathrm{Year}, \qquad Y_{R_2} - Y_{R_3} = \beta_2 + \beta_5\,\mathrm{Year}. \tag{4.7} \]

[Diagram: lines of Y against Year illustrating the intercept and slope differences above.]

Thus, the difference between the interaction of the regions and the year determines the trend in the model. The results of these interactions between the regions and year for the maize and rice yield models are given in Tables 4.6a and 4.6b respectively.

The results in Table 4.6a suggest a decelerating trend in the maize yields of six regions, as indicated by the negative (-) sign attached to the estimated values. Only three of the regions (Brong Ahafo, Northern and Western), aside from Ashanti region, have positive interaction fixed effect values, which means that they have a progressive trend in maize yields for the study period.

Table 4.6a: Fixed effects parameter estimates for maize yields

| Maize | Estimates | Std.error | df | t-value | P-value |
|---|---|---|---|---|---|
| (INTERCEPT) | 1.37177 | 0.07036 | 1231 | 19.495406 | 0.0000*** |
| BRONG. AHAFO | 0.71144 | 0.10177 | 231 | 6.990433 | 0.0000*** |
| CENTRAL | -0.49031 | 0.11528 | 231 | -4.253051 | 0.0000*** |
| EASTERN | -0.30681 | 0.10129 | 231 | -3.029021 | 0.0027*** |
| G. ACCRA | -0.51641 | 0.12798 | 231 | -4.035011 | 0.0001*** |
| NORTHERN | 0.01693 | 0.12798 | 231 | 0.168178 | 0.8666 |
| UPPER EAST | -0.36260 | 0.10068 | 231 | -3.053178 | 0.0025*** |
| UPPER WEST | -0.14901 | 0.12956 | 231 | -1.149953 | 0.2514 |
| VOLTA | -0.06050 | 0.10997 | 231 | 0.550207 | 0.5827 |
| WESTERN | 0.02617 | 0.11342 | 231 | 0.230733 | 0.8177 |
| YEAR | 0.00451 | 0.00797 | 1231 | 0.569605 | 0.569 |
| B. AHAFO:YEAR | 0.01065 | 0.01141 | 1231 | 0.933553 | 0.0357*** |
| CENTRAL:YEAR | -0.00680 | 0.01339 | 1231 | -0.507377 | 0.0010*** |
| EASTERN:YEAR | -0.06290 | 0.01184 | 1231 | -5.317655 | 0.0000*** |
| G. ACCRA:YEAR | -0.04810 | 0.01461 | 1231 | -3.293161 | 0.0000*** |
| NORTHERN:YEAR | 0.01296 | 0.01142 | 1231 | 1.135008 | 0.0006*** |
| UPPER EAST:YEAR | -0.05130 | 0.01490 | 1231 | -3.441122 | 0.2566 |
| UPPER WEST:YEAR | -0.04080 | 0.01462 | 1231 | -2.787415 | 0.0054*** |
| VOLTA:YEAR | -0.03590 | 0.01241 | 1231 | -2.889813 | 0.0039*** |
| WESTERN:YEAR | 0.00415 | 0.01332 | 1231 | 0.311693 | 0.7553 |

***significant at 5%

Table 4.6b presents the fixed effects parameter estimates for rice yields. As for maize yields, the intercept (Ashanti region) is the expected average yield of rice in metric tons per hectare when all covariates are zero. Our model gives an intercept estimate of -1.035 Mt/Ha for the reference region when the indicators for all other regions and the year covariate are zero. The estimate column tells us that Brong Ahafo region has average rice yields higher than Ashanti region. This is indicated by the positive coefficient (0.49) of the region's estimate. The estimate for Central region also indicates that it has average rice yields higher than Ashanti region for the study period, as given by the positive coefficient (0.32) of the region's estimate. The estimates for the rest of the regions in Table 4.6b reveal that all the regions have average rice yields higher than Ashanti region for the study period.

The interaction between the regions and year gives significant positive estimates, suggesting a progressively increasing trend in rice yields for each of the regions. This confirms the earlier assessment in Figure 4.2, which indicated steady improvement in rice yields for all the regions.

Table 4.6b: Fixed effects Parameter estimates for rice yields

| Rice | Estimates | Std.error | df | t-value | P-value |
|---|---|---|---|---|---|
| (INTERCEPT) | -1.0349446 | 0.14509882 | 1231 | -7.132688 | 0.0000*** |
| BRONG. AHAFO | 0.4862897 | 0.21185939 | 231 | 2.295342 | 0.0226*** |
| CENTRAL | 0.3202962 | 0.23532551 | 231 | 1.361077 | 0.0048*** |
| EASTERN | 0.5833881 | 0.20834345 | 231 | 2.800126 | 0.0055*** |
| G. ACCRA | 0.8772819 | 0.26037494 | 231 | 3.369303 | 0.0009*** |
| NORTHERN | 1.0327226 | 0.20976651 | 231 | 4.923201 | 0.0000*** |
| UPPER EAST | 1.5809686 | 0.24136448 | 231 | 6.550131 | 0.0000*** |
| UPPER WEST | 0.1551816 | 0.27012671 | 231 | 0.574477 | 0.0062*** |
| VOLTA | 0.7140924 | 0.22778364 | 231 | 3.134959 | 0.3763 |
| WESTERN | 0.2051626 | 0.23146262 | 231 | 0.886375 | 0.0019*** |
| YEAR | 0.1867735 | 0.01420502 | 1231 | 13.148423 | 0.0000*** |
| B.AHAFO:YEAR | 0.1538969 | 0.02024638 | 1231 | 7.601207 | 0.0000*** |
| CENTRAL:YEAR | 0.1284677 | 0.02409227 | 1231 | 5.332321 | 0.0000*** |
| EASTERN:YEAR | 0.0624697 | 0.02112496 | 1231 | 2.957152 | 0.0032*** |
| G.ACCRA:YEAR | 0.0739272 | 0.02582751 | 1231 | 2.862344 | 0.0043*** |
| NORTHERN:YEAR | 0.1908038 | 0.02022598 | 1231 | 9.433598 | 0.0000*** |
| UPPER EAST:YEAR | 0.1679122 | 0.02653888 | 1231 | 6.327024 | 0.0000*** |
| UPPER WEST:YEAR | 0.1647742 | 0.02581505 | 1231 | 6.382877 | 0.0000*** |
| VOLTA:YEAR | 0.0199702 | 0.02210353 | 1231 | 0.903483 | 0.3664 |
| WESTERN:YEAR | 0.1900362 | 0.02394832 | 1231 | 7.935261 | 0.0000*** |

***significant at 5%

Approximate 95% confidence intervals were obtained for the parameters. These confidence intervals are based on the asymptotic normality of the covariance parameter estimates. Since most of the confidence intervals do not contain zero, they confirm our findings for both the random and fixed effects discussed in 4.4.3 and 4.4.4. For the approximate 95% confidence intervals for the fixed effects, see Tables 4.14a and 4.14b in Appendix C.

4.5. Discussion

Research has indicated that the linear mixed model is an appropriate method for modeling longitudinal data over time (Brownie, 1993; Bruce & Schaalje, 2008). The results of our study revealed interesting findings about maize and rice yields in the country for the period under study. One such finding was the successive improvement of rice yields from 2009 to 2014 for eight of the ten regions in Ghana. Further research indicated that the implementation of the National Rice Development Strategy (NRDS) accounted for this improvement.
This is consistent with a study by Isaac (2012) of KNUST on cocoa production in six regions of Ghana, which found an increasing trend in cocoa production for five of the six regions considered. Unlike rice yields, the trend in maize yields was generally decreasing, especially for the last four years (2010-2014). This decreasing trend calls for reform in the production of maize. Contrary to the popular belief that maize, as the most produced and consumed cereal crop in Ghana, will have higher yields, the study indicated that rice yields are on average higher than maize yields in the country. It also revealed that significant differences exist in both maize and rice yields among the regions of Ghana, and that yields vary both between and within regions, with maize showing much less within-region variability than rice. Significant differences occurred in the majority of comparisons between regions for both maize and rice; only a few regions showed no difference between their yields and those of other regions.

Although the study showed that modeling cereal yields over time with MANOVA and LMM is valid, we think that modeling more crops, with variables such as rainfall and climate data together with district-level covariates, will reveal more interesting findings. This could be done with a joint model covering more crops. There is therefore a need to focus more on the production of maize in order to reverse the decreasing trend in maize yields. This means that MOFA should intensify education on the crop and provide more viable seeds and incentives to farmers who engage in its production.
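The regional trends discussed above can be read directly from the fixed effects estimates: following equations (4.4) to (4.7), the yearly slope for a region is the YEAR estimate plus that region's REGION:YEAR estimate. The short R sketch below illustrates this for maize, using estimates typed in by hand from Table 4.6a. The object names (year_ref, interaction, region_slope) are ours, introduced only for illustration, and are not part of the code used for the study.

```r
# Illustrative sketch only: estimates copied by hand from Table 4.6a (maize).
# Ashanti is the reference region, so its slope is the YEAR estimate itself.
year_ref <- 0.00451

# REGION:YEAR interaction estimates (difference in slope from Ashanti)
interaction <- c(
  "BRONG AHAFO" =  0.01065,
  "CENTRAL"     = -0.00680,
  "EASTERN"     = -0.06290,
  "G. ACCRA"    = -0.04810,
  "NORTHERN"    =  0.01296,
  "UPPER EAST"  = -0.05130,
  "UPPER WEST"  = -0.04080,
  "VOLTA"       = -0.03590,
  "WESTERN"     =  0.00415
)

# Region-specific slope = reference slope + interaction (Mt/Ha per year)
region_slope <- c("ASHANTI" = year_ref, year_ref + interaction)
print(round(region_slope, 5))

# Regions whose fitted maize trend is negative
declining <- names(region_slope)[region_slope < 0]
print(declining)
```

Under these estimates, six regions have a negative fitted slope, which is the basis for the statement that maize yields are decelerating in the majority of regions.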
The most important aspect of our study is that it has revealed the need for further investigation of cereal crop yields in the country with other methods such as joint models. With such an analysis, we will be able to determine the significance of differences among the major crops produced and consumed in Ghana as well as the evolution among the crops.

CHAPTER FIVE
CONCLUSION AND RECOMMENDATION

5.0. Introduction

This is the last chapter of the thesis. It presents the conclusion and recommendations based on the results of the study.

5.1. Conclusion

This study used data from the Statistics Research and Information Department of the Ministry of Food and Agriculture (MOFA) to analyze major cereal crop yields in Ghana. The study aimed to investigate whether significant differences exist in crop yields and to determine the evolution of crop yields between the regions of Ghana. MANOVA and Linear Mixed Models (LMM) were used to address these objectives for the yields of two major cereal crops (maize and rice) produced and consumed in Ghana. The Restricted Maximum Likelihood method was employed to select the best-fitting LMM and to estimate the random and fixed effects parameters. The diagnostic plots indicate that the selected LMM is adequate for determining the evolution of cereal crop yields.

It was found that significant differences exist in maize and rice yields across the ten (10) regions of Ghana. The differences occurred in the majority of the regional yield comparisons for both maize and rice, with only a few comparisons indicating no difference. Descriptive statistics revealed that, within the ten-year period considered for the study, the Brong Ahafo region recorded the highest average maize yield of 9.56 metric tons per hectare (Mt/Ha), and the lowest average maize and rice yields were recorded in the Upper East region (0.13 Mt/Ha and 0.68 Mt/Ha).
The highest average rice yield was recorded in the Greater Accra region (6.72 Mt/Ha). The study also confirms that maize and rice yields vary both between and within the regions, with maize yields having much less within-region variability than rice yields. It reveals a decelerating trend in the maize yields of the majority of the regions and a steadily increasing trend in rice yields in all the regions. Contrary to the popular belief that maize, as the number one cereal crop produced and consumed in Ghana, should have higher average yields, the study generally indicated that rice yields were on average higher than maize yields from 2005 to 2014.

5.2. Recommendation

Based on these findings, maize production should be given more attention, for example through a national programme to improve yields and reverse the declining trend, in the same way that rice production has been boosted by the NRDS programme. This would improve maize yields and increase credit facilities for farmers, as well as reduce poverty and hunger. MOFA is also encouraged to intensify education on the two cereals widely consumed in Ghana and to give more benefits to farmers who engage in their production, especially the youth.

We also recommend that researchers engaged in similar studies should strive to obtain data on rainfall and climate in addition to yield data. This would help to explain the significant differences in cereal yields across the regions of Ghana and to identify more interesting patterns.

We therefore conclude by recommending that joint models, which can include other factors that may influence crop yields such as rainfall and climate data, should be employed to investigate cereal crop yields in Ghana.

REFERENCE

Agyare, Agyei, W., Asare, I. K., Sogbedji, Jean Clottey, & Attuquaye, V. (2014). Challenges to maize fertilization in the forest and transition zones of Ghana.
African Journal of Agricultural Research, 9(6), 593-602.
Akaike, H. (1981). Likelihood of a model and information criteria. Journal of Econometrics, 16(1), 3-14.
Azinu, A. R. (2014). Evaluation of Hybrid Maize Varieties in Three Agro-Ecological Zones in Ghana. University of Ghana.
Bank, W. (2011). Food Price Watch April 2011. Poverty Reduction and Equity Group, Poverty Reduction and Economic Management (PREM) Network.
Bates, D., & Pinheiro, J. (2006). Mixed-effects models in S and S-PLUS: Springer Science & Business Media.
Beddington, J. (2010). Food security: contributions from science to a new and greener revolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1537), 61-71.
Box, G. E. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 317-346.
Brownie, C., Larry, D., & Tina, J. (1993). Longitudinal and Spatial Analyses Applied to Corn Yield Data from a Long-Term Rotation Trial. Institute of Statistics Mimeo Series No. 2559.
Bruce Schaalje, G. (2008). Linear mixed models, a practical guide using statistical software. Statistics in Medicine, 27(15), 2979-2980.
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261-304.
Casella, G., & Berger, R. L. (2002). Statistical inference (Vol. 2): Duxbury Pacific Grove, CA.
Diao, X., Hazell, P., & Thurlow, J. (2010). The role of agriculture in African development. World Development, 38(10), 1375-1383.
FAO. (2009). Report of the World Food Summit. Retrieved from Rome: FAO:
Faraway, J. J. (2005). Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models: CRC Press.
Foley, J. (2011). Can we feed the world & sustain the planet? Scientific American, 305(5), 60-65.
Foresight, U. (2011). The future of food and farming. Final Project Report, London, The Government Office for Science.
Garson, G. D. (2012). Testing statistical assumptions. Asheboro, NC: Statistical Associates Publishing.
Ghana Statistical Service. (2014). Gross Domestic Product 2014: Ghana Statistical Service.
GhanaWeb. (2015). Ghana to double rice production by 2018. Business News of Sat. 3 Oct 2015.
Godfray, H. C. J., Beddington, J. R., Crute, I. R., Haddad, L., Lawrence, D., & Toulmin, C. (2010). Food security: the challenge of feeding 9 billion people. Science, 327(5967), 812-818.
Gregory, P., Ingram, J. S., Andersson, R., Betts, R., Brovkin, V., & Hardy, T. (2002). Environmental consequences of alternative practices for intensifying crop production. Agriculture, Ecosystems & Environment, 88(3), 279-290.
Griggs, D., Stafford-Smith, M., Gaffney, O., Rockström, J., Öhman, M. C., & Noble, I. (2013). Policy: Sustainable development goals for people and planet. Nature, 495(7441), 305-307.
Harrar, S. W., & Bathke, A. C. (2008). Nonparametric methods for unbalanced multivariate data and many factor levels. Journal of Multivariate Analysis, 99(8), 1635-1664.
Harwood, J. (2005). Europe's Green Revolution: Peasant-Oriented Plant-Breeding in Central Europe, 1890-1945.
Hotelling, H. (1951). A Generalized t Test and Measure of Multivariate.
Ingram, J. S. I. (2011). From food production to food security: developing interdisciplinary, regional-level research: publisher not identified.
Isaac, A. A. (2012). Multiple Comparison And Random Effect Model On Cocoa Production In Ghana (From 1969/70 To 2010/11 Production Years). College of Science Department of Mathematics. A Thesis Submitted to the Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi.
Ittersum, M., & Rabbinge, R. (1997). Concepts in production ecology for analysis and quantification of agricultural input-output combinations. Field Crops Research, 52(3), 197-208.
Johnson, R. A., & Wichern, D. W. (2007). Applied multivariate statistical analysis (Vol. 4): Prentice Hall, Englewood Cliffs, NJ.
Jose, P., & Bates, D. (2000). Linear mixed-effects models: basic concepts and examples. Mixed-Effects Models in S and S-Plus, 3-56.
Kearney, J. (2010). Food consumption trends and drivers. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1554), 2793-2807.
Khatri, C., & Pillai, K. (1968). On the non-central distributions of two test criteria in multivariate analysis of variance. The Annals of Mathematical Statistics, 215-226.
Lawley, D. (1938). A generalization of Fisher's z test. Biometrika, 30(1-2), 180-187.
Littell, R. C., Pendergast, J., & Natarajan, R. (2000). Tutorial in biostatistics: modelling covariance structure in the analysis of repeated measures data. Statistics in Medicine, 19(13), 1793-1819.
Littell, R. C., Stroup, W. W., & Freund, R. J. (2002). SAS for linear models: SAS Institute.
Lobell, D. B., Cassman, K. G., & Field, C. B. (2009). Crop yield gaps: their importance, magnitudes, and causes. Annual Review of Environment and Resources, 34(1), 179.
OECD, & FAO. (2009). OECD-FAO Agricultural Outlook 2009-2018, OECD.
Oerke, E.-C., Dehne, H.-W., Schönbeck, Fritz, Weber, & Adolf. (2012). Crop production and crop protection: estimated losses in major food and cash crops: Elsevier.
Onumah, G., & Aning, A. (2009). Feasibility study towards establishment of commodities exchange in Ghana: Final Report: Undertaken on behalf of the Securities and Exchange Commission, Accra, Ghana August.
Parry, M., Rosenzweig, C., Iglesias, A., Livermore, Matthew, & Fischer, G. (2004). Effects of climate change on global food production under SRES emissions and socio-economic scenarios. Global Environmental Change, 14(1), 53-67.
Parry, M., Rosenzweig, C., & Livermore, M. (2005). Climate change, global food supply and risk of hunger. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360(1463), 2125-2138.
Pinheiro, Céline, Sousa, B., Albergaria, A., Paredes, J., Dufloth, R., . . . Baltazar, F. (2011). GLUT1 and CAIX expression profiles in breast cancer correlate with adverse prognostic factors and MCT1 overexpression. Histology and Histopathology, 26(10), 1279-1286.
Pinheiro, Jose, Bates, D., DebRoy, S., Sarkar, & Deepayan. (2007). Linear and nonlinear mixed effects models. R package version, 3, 57.
Premanandh, J. (2011). Factors affecting food security and contribution of modern technologies in food sustainability. Journal of the Science of Food and Agriculture, 91(15), 2707-2714.
Quansah, G. W. (2010). Effect of organic and inorganic fertilizers and their combinations on the growth and yield of maize in the semi-deciduous forest zone of Ghana. Department of Crop and Soil Sciences, College of Agriculture and Natural Resources, Kwame Nkrumah University of Science and Technology, Kumasi.
Rencher, A. C. (2003). Methods of multivariate analysis (Vol. 492): John Wiley & Sons.
Rencher, A. C., & Schaalje, G. B. (2008). Linear models in statistics: John Wiley & Sons.
Roberts, M. J., & Tack, J. B. (2011). A Mixed Effects Model of Crop Yields for Purposes of Premium Determination. Paper presented at the 2011 Annual Meeting, July 24-26, 2011, Pittsburgh, Pennsylvania.
Rosegrant, M. W., & Cline, S. A. (2003). Global food security: challenges and policies. Science, 302(5652), 1917-1919.
Rowhani, P., Lobell, B, D., Linderman, M., Ramankutty, & Navin. (2011). Climate variability and crop production in Tanzania. Agricultural and Forest Meteorology, 151(4), 449-460.
Roy, S. N. (1953). On a heuristic method of test construction and its use in multivariate analysis. The Annals of Mathematical Statistics, 220-238.
Sawyer, S. F. (2009). Analysis of variance: the fundamental concepts.
Journal of Manual & Manipulative Therapy, 17(2), 27E-38E.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461-464.
SRID, M. (2007). National Crop production estimates 2002-2006. Statistical Research and Information Department, Ministry of Food and Agriculture.
Tilman, D., Cassman, G, K., Matson, P. A., Naylor, R., & Polasky, S. (2002). Agricultural sustainability and intensive production practices. Nature, 418(6898), 671-677.
Timm, N. (2002). Applied multivariate analysis: methods and case studies. New York: Springer.
Tran, M. (2011). Africa can feed the world? The Guardian [online]. London.
Verma, U., Piepho, H., Hartung, K., Ogutu, J., Connell, C., & Goyal, A. (2014). Linear Mixed Modeling for Mustard Yield Prediction in Haryana State (India). Issues, 1(1).
West, B. T., Welch, K. B., & Galecki, A. T. (2014). Linear mixed models: a practical guide using statistical software: CRC Press.
Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 471-494.
Wolter, D. (2008). Ghana: Seizing New Agribusiness Opportunities.
World, B. (2009). Food Price Watch April. Poverty Reduction and Equity Group, Poverty Reduction and Economic Management (PREM) Network.
Zhang, J.-T., & Xiao, S. (2012). A note on the modified two-way MANOVA tests. Statistics & Probability Letters, 82(3), 519-527.
Zuur, A. F. (2010). AED: data files used in mixed effects models and extensions in ecology with R. R package version, 1(9).

APPENDIX

Appendix A: Test of Assumptions

Figure 4.7. Scatter plot of maize and rice yields
Figure 4.8. Boxplot of maize and rice yields
Figure 4.9. Test of normality assumption

Table 4.6c: Test of Assumptions

Bonferroni Test: rstudent = 3.6472, unadjusted p-value = 0.0005, Bonferroni p-value = 0.0402
Shapiro-Wilk normality test: W = 0.9889, p-value = 0.007226
Box's M-test for Homogeneity of Covariance Matrices: Chi-Sq (approx.) = 140.943, df = 20, p-value < 2.2e-16

Table 4.6d: Bartlett's Test of Homogeneity of variance

        Bartlett's K-square          P-value
YEAR    Maize      Rice       DF     Maize      Rice
2005     92.794    214.526    9      4.47e-16   2.20e-16
2006     61.815     76.054    9      5.98e-10   9.78e-13
2007     50.814    137.409    9      7.57e-08   2.21e-16
2008     97.238     93.671    9      2.20e-16   2.98e-16
2009    108.028    114.713    9      2.20e-16   2.20e-16
2010     62.124    109.133    9      5.21e-10   2.20e-16
2011     78.398    118.691    9      3.36e-13   2.20e-16
2012    147.176    172.373    9      2.20e-16   2.20e-16
2013     58.429    163.995    9      2.69e-09   2.20e-16
2014     62.136    172.312    9      5.19e-10   2.20e-16
Alternative hypothesis: variances are not identical

Table 4.7: Brown-Forsythe Test of Homogeneity of variance

        F-Statistic          DF       DF          P-value
YEAR    Maize     Rice       Region   Residuals   Maize      Rice
2005    4.0249     8.1244    9        136         1.38e-04   1.37e-09
2006    4.9686     7.051     9        128         9.89e-06   3.07e-08
2007    2.7202     5.1463    9        128         6.19e-03   5.97e-06
2008    5.5392     8.6752    9        128         1.97e-06   1.21e-05
2009    6.7619     9.7208    9        128         6.70e-08   3.13e-11
2010    4.939     11.4745    9        128         1.08e-05   4.68e-13
2011    6.3297     9.3142    9        128         2.19e-07   8.59e-11
2012    0.5799     9.0994    9        158         0.8122     4.95e-11
2013    2.7309     8.1734    9        160         0.00545    6.03e-10
2014    3.5979     9.0337    9        160         4.19e-04   5.57e-11
Alternative hypothesis: variances are not identical

TABLE 4.8a: INTERCORRELATION OF MAIZE YIELDS

(Intr) BRONG A CENTRAL EASTERN GREATER NORTHERN UPP EAST WEST VOLTA WESTERN YEAR
BRONG AHAFO -0.699
CENTRAL -0.595 0.416
EASTERN -0.673 0.470 0.401
GREATER ACCRA -0.546 0.381 0.325 0.368
NORTHERN -0.698 0.488 0.415 0.470 0.381
UPPER EAST -0.535 0.374 0.318 0.360 0.292 0.373
UPPER WEST -0.545 0.381 0.324 0.367 0.298 0.381 0.292
VOLTA -0.642 0.449 0.382 0.432 0.351 0.448 0.344 0.384 WESTERN -0.598 0.418 0.356 0.403 0.327 0.418 0.320 0.642 0.598 C University of Ghana http://ugspace.ug.edu.gh ASHANTI:YEAR -1.000 0.595 0.673 0.546 0.698 0.535 -0.449 -0.418 -0.699 BRONG:YEAR 0.699 -1.000 -0.416 -0.470 -0.381 -0.488 -0.374 -0.356 -0.595 CENTRAL:YEAR 0.595 -0.416 -1.000 -0.401 -0.325 -0.415 -0.318 -0.403 -0.673 EASTERN:YEAR 0.673 -0.470 -0.401 -1.000 -0.368 -0.470 -0.360 -0.327 -0.546 GREATER:YEAR 0.546 -0.381 -0.325 -0.368 -0.100 -0.381 -0.292 -0.418 -0.698 NORTHERN:YEAR 0.698 -0.488 -0.415 -0.470 -0.381 -0.100 -0.373 -0.320 -0.535 UPP EAST:YEAR 0.535 -0.374 -0.318 -0.360 -0.292 -0.373 -1.000 -0.326 -0.545 UPP WEST:YEAR 0.545 -0.381 -0.324 -0.367 -0.298 -0.381 -0.292 -0.100 -0.642 VOLTA:YEAR 0.642 -0.449 -0.382 -0.432 -0.351 -0.448 -0.344 -0.412 -0.100 WESTERN:YEAR 0.598 -0.418 -0.356 -0.403 -0.327 -0.418 -0.320 -0.100 -0.598 TABLE 4.8b: INTERCORRELATION OF RICE YIELDS (Intr) BRONG CENTRAL EASTERN GREATER NORTHERN UPP EAST WEST VOLTA WESTERN YEAR BRONG AHAFO -0.702 CENTRAL -0.590 0.414 EASTERN -0.673 0.472 0.397 GREATER -0.551 0.386 0.325 0.370 NORTHERN -0.703 0.493 0.414 0.473 0.387 UPPER EAST -0.536 0.376 0.316 0.360 0.295 0.376 UPPER WEST -0.550 0.386 0.324 0.370 0.303 0.387 0.295 VOLTA -0.643 0.451 0.379 0.432 0.354 0.451 0.344 0.354 WESTERN -0.593 0.416 0.350 0.399 0.327 0.417 0.318 0.326 0.381 ASHANTI:YEAR -1.000 0.702 0.590 0.673 0.551 0.702 0.536 0.550 0.643 0.593 BRONG A:YEAR 0.702 -1.000 -0.414 -0.472 -0.386 -0.493 -0.376 -0.451 -0.416 -0.702 CENTRAL:YEAR 0.590 -0.414 -1.000 -0.397 -0.325 -0.414 -0.316 -0.379 -0.350 -0.590 EASTERN:YEAR 0.672 -0.472 -0.396 -1.000 -0.370 -0.472 -0.360 -0.432 -0.399 -0.672 GREAT A:YEAR 0.551 -0.386 -0.325 -0.370 -0.100 -0.387 -0.295 -0.354 -0.327 -0.551 NORTHERN:YEAR 0.702 -0.493 -0.414 -0.472 -0.387 -0.100 -0.376 -0.451 -0.417 -0.702 UPP EAST:YEAR 0.535 -0.376 -0.316 -0.360 -0.295 -0.376 -1.000 -0.344 -0.317 -0.535 UPP WEST:YEAR 0.550 -0.386 -0.324 
-0.370 -0.303 -0.387 -0.295 -0.354 -0.326 -0.550 VOLTA:YEAR 0.643 -0.451 -0.379 -0.432 -0.354 -0.451 -0.344 -0.100 -0.381 -0.643 WESTERN:YEAR 0.593 -0.416 -0.350 -0.399 -0.327 -0.417 -0.318 -0.381 -0.100 -0.593 Appendix B: Test on significant differences of maize and rice yields Table 4.9: Pillai's Test Year Estimates DF Approx.F Pr(>F) 2005 1.0741 9 13.148 2.201e-16*** 2006 1.1379 9 18.772 2.211e-16*** 2007 0.9189 9 12.088 2.234e-16*** 2008 1.0420 9 11.845 2.231e-16*** 2009 0.9501 9 12.870 2.220e-16*** 2010 0.9069 9 11.800 2.242e-16*** 2011 1.1942 9 21.075 2.247e-16*** 2012 0.5287 9 6.3077 2.874e-13*** 2013 0.8559 9 13.302 2.210e-16*** D University of Ghana http://ugspace.ug.edu.gh 2014 0.9018 9 14.598 2.112e-16*** Table 4.10: Hotelling-Lawley Trace Year Estimates DF Approx.F Pr(>F) 2005 4.8065 9 26.703 2.201e-16*** 2006 2.9192 9 20.435 2.213e-16*** 2007 2.9036 9 20.325 2.231e-16*** 2008 2.1833 9 11.645 2.212e-16*** 2009 1.8931 9 13.252 2.223e-16*** 2010 1.7795 9 12.457 2.202e-16*** 2011 3.1524 9 22.067 2.240e-16*** 2012 0.5365 9 6.3713 2.062e-13*** 2013 1.5253 9 13.389 2.239e-16*** 2014 1.6510 9 14.492 2.200e-16*** Table 4.11:Roy’s Greatest Root Test Year Estimates DF Approx.F Pr(>F) 2005 4.4601 9 50.548 2.231e-16*** 2006 2.0458 9 29.096 2.233e-16*** 2007 2.6666 9 37.924 2.221e-16*** 2008 1.1822 9 12.873 2.548e-13*** 2009 1.2314 9 17.513 2.211e-16*** 2010 1.2267 9 17.446 2.213e-16*** 2011 2.0693 9 29.430 1.224e-16*** 2012 0.4989 9 8.7591 1.254e-10*** 2013 0.9222 9 16.394 2.221e-16*** 2014 0.9144 9 16.257 2.244e-16*** Table 4.12: Wilks’ Lambda Test Year Estimates DF Approx.F. 
Pr(>F) 2005 0.1360 9 19.205 2.212e-16*** 2006 0.1753 9 19.597 2.220e-16*** 2007 0.2205 9 15.942 2.214e-16*** 2008 0.2290 9 11.745 2.231e-16*** 2009 0.2697 9 13.061 2.213e-16*** 2010 0.2892 9 12.128 2.210e-16*** 2011 0.1564 9 21.570 2.226e-16*** 2012 0.7424 9 6.4341 1.490e-13*** 2013 0.3245 9 13.389 2.221e-16*** 2014 0.3008 9 14.545 2.237e-16*** E University of Ghana http://ugspace.ug.edu.gh Appendix C: Approximate 95% confidence intervals Table 4.13a 95% confidence intervals for maize yields Fixed effects: lower est. u pp er (Intercept) 1.233719566 1.371765247 1.50981093 REGIONBRONG AHAFO 0.510918851 0.711442215 0.91196558 REGIONCENTRAL 0.263163392 0.490303632 0.71744387 REGIONEASTERN 0.107238693 0.306808230 0.50637777 REGIONGREATER ACCRA -0.768529813 -0.516381752 -0.26423369 REGIONNORTHERN -0.181427607 0.016931401 0.21529041 REGIONUPPER EAST -0.596581724 -0.362592412 -0.12860310 REGIONUPPER WEST -0.404258743 -0.148988117 0.10628251 REGIONVOLTA -0.156168130 0.060506982 0.27718209 REGIONWESTERN -0.197297246 0.026169402 0.24963605 ASHANTI:YEAR -0.020172801 -0.004539020 0.01109476 REGIONBRONG AHAFO:YEAR -0.033025601 -0.010648155 0.01172929 REGIONCENTRAL:YEAR -0.019481393 0.006795677 0.03307275 REGIONEASTERN:YEAR 0.039720357 0.062942233 0.08616411 REGIONGREATER ACCRA:YEAR 0.019455597 0.048127372 0.07679915 REGIONNORTHERN:YEAR -0.009439387 0.012956791 0.03535297 REGIONUPPER EAST:YEAR 0.022047189 0.051288249 0.08052931 REGIONUPPER WEST:YEAR 0.012070049 0.040755118 0.06944019 REGIONVOLTA:YEAR 0.011513308 0.035855783 0.06019826 REGIONWESTERN:YEAR -0.030273537 -0.004150291 0.02197295 Random Effects: Level: DISTRICT lower est. uppe r sd((Intercept)) 0.2200621 0.2506516 0.2854933 Within-group standard error: lower est. upper 0.3098047 0.3223060 0.3353117 Table 4.13b 95% confidence intervals for rice yields Fixed effects lower est. 
upper (Intercept) 0.75027625 1.03494460 1.31961294 REGIONBRONG AHAFO 0.06886600 0.48628973 0.90371347 REGIONCENTRAL -0.14336250 0.32029622 0.78395494 REGIONEASTERN 0.17289168 0.58338800 0.99388431 REGIONGREATER ACCRA 0.36426868 0.87728195 1.39029522 REGIONNORTHERN 0.61942247 1.03272261 1.44602275 REGIONUPPER EAST 1.10541142 1.58096863 2.05652584 REGIONUPPER WEST -0.37704547 0.15518159 0.68740865 REGIONVOLTA 0.26529335 0.71409241 1.16289148 REGIONWESTERN -0.25088510 0.20516262 0.66121034 ASHANTI:YEAR 0.15890483 0.18677354 0.21464226 REGIONBRONG AHAFO:YEAR -0.19361817 -0.15389694 -0.11417570 REGIONCENTRAL:YEAR -0.17573418 -0.12846772 -0.08120126 F University of Ghana http://ugspace.ug.edu.gh REGIONEASTERN:YEAR -0.10391465 -0.06246973 -0.02102482 REGIONGREATER ACCRA:YEAR -0.12459805 -0.07392723 -0.02325641 REGIONNORTHERN:YEAR -0.23048497 -0.19080376 -0.15112255 REGIONUPPER EAST:YEAR -0.21997862 -0.16791217 -0.11584572 REGIONUPPER WEST:YEAR -0.21542061 -0.16477425 -0.11412789 REGIONVOLTA:YEAR -0.06333493 -0.01997017 0.02339460 REGIONWESTERN:YEAR -0.23702023 -0.19003618 -0.14305214 Random Effects: Level: DISTRICT lower est. upper sd((Intercept)) 0.5419314 0.6093177 0.685083 Within-group standard error: lower est. 
upper 0.5431437 0.5651979 0.5881477 Appendix D: Marginal Variance Covariance Matrix TABLE 4.14: MAIZE: MARGINAL VARIANCE COVARIANCE MATRIX 1 2 3 4 5 6 7 1 0.16669 0.062832 0.062832 0.062832 0.062832 0.062832 0.062832 2 0.062832 0.16669 0.062832 0.062832 0.062832 0.062832 0.062832 3 0.062832 0.062832 0.16669 0.062832 0.062832 0.062832 0.062832 4 0.062832 0.062832 0.062832 0.16669 0.062832 0.062832 0.062832 5 0.062832 0.062832 0.062832 0.062832 0.16669 0.062832 0.062832 6 0.062832 0.062832 0.062832 0.062832 0.062832 0.16669 0.062832 7 0.062832 0.062832 0.062832 0.062832 0.062832 0.062832 0.16669 Random Effects Variance Covariance Matrix (Intercept) (Intercept) 0.062832 Standard Deviations: 0.25066 TABLE 4.15: RICE: MARGINAL VARIANCE COVARIANCE MATRIX 1 2 3 4 5 6 7 1 0.6904 0.37098 0.37098 0.37098 0.37098 0.37098 0.37098 2 0.37098 0.6904 0.37098 0.37098 0.37098 0.37098 0.37098 3 0.37098 0.37098 0.6904 0.37098 0.37098 0.37098 0.37098 4 0.37098 0.37098 0.37098 0.6904 0.37098 0.37098 0.37098 5 0.37098 0.37098 0.37098 0.37098 0.6904 0.37098 0.37098 6 0.37098 0.37098 0.37098 0.37098 0.37098 0.6904 0.37098 7 0.37098 0.37098 0.37098 0.37098 0.37098 0.37098 0.6904 G University of Ghana http://ugspace.ug.edu.gh Random Effects Variance Covariance Matrix (Intercept) (Intercept) 0.37098 Standard Deviations: 0.60908 Appendix E: Pairwise Comparisons of difference in Mean Yields Table 4.16a: Pairwise Comparisons in difference of mean yields 95% Confidence 95% Confidence Differences in Interval for Interval for Mean yields Std. 
Error P-value Difference Difference (I) Region (J) Region (I-J) Maize Rice Lower Upper Lower Upper Maize Rice Maize Rice Maize Rice Bound Bound Bound Bound ASHANTI B/AHAFO -.613* .449* .039 .074 .000 .000 -.741 -.485 .208 0.6909 CENTRAL -.548* .565* .043 .082 .000 .000 -.689 -.406 .298 0.8316 EASTERN -.701* -.350* .040 .076 .000 .000 -.832 -.570 -.597 -0.1030 G/ACCRA .280* -1.400* .053 .101 .000 .000 .105 .454 -1.729 -1.0712 NORTHERN -.130* 0.090 .040 .075 .049 1.000 -.260 .000 -.155 0.3349 UPPER EAST -.009 -.469* .052 .098 1.000 0.000 -.178 .159 -.788 -0.1506 UPPER WEST -.130 .684* .052 .098 .553 .000 -.300 .039 .364 1.0040 VOLTA -.287* -.620* .042 .079 .000 .000 -.423 -.151 -.877 -0.3632 WESTERN -.035 .902* .043 .081 1.000 0.000 -.175 .106 .636 1.1675 B/AHAFO ASHANTI .613* -.449* .039 .074 .000 .000 .485 .741 -.691 -0.2079 CENTRAL .065 .115 .044 .084 1.000 1.000 -.080 .210 -.159 0.3896 EASTERN -.088 -.800* .041 .078 1.000 0.000 -.223 .047 -1.055 -0.5445 G/ACCRA .893* -1.850* .054 .103 .000 .000 .715 1.070 -2.185 -1.5146 NORTHERN .483* -.359* .041 .077 .000 .000 .349 .617 -.612 -0.1066 UPPER EAST .604* -.919* .053 .099 .000 .000 .432 .776 -1.243 -0.5938 UPPER WEST .483* 0.235 .053 .100 .000 .851 .310 .656 -.092 0.5607 VOLTA .326* -1.069* .043 .081 .000 .000 .186 .466 -1.334 -0.8049 WESTERN .578* .452* .044 .084 .000 .000 .434 .723 .179 0.7255 CENTRAL ASHANTI .548* -.565* .043 .082 .000 .000 .406 .689 -.832 -0.2979 B/AHAFO -.065 -.115 .044 .084 1.000 1.000 -.210 .080 -.390 0.1589 EASTERN -.153* -.915* .045 .085 .034 .000 -.301 -.005 -1.194 -0.6357 G/ACCRA .827* -1.965* .057 .108 .000 .000 .640 1.015 -2.319 -1.6112 NORTHERN .418* -.475* .045 .085 .000 .000 .271 .565 -.752 -0.1976 UPPER EAST .538* -1.034* .056 .105 .000 .000 .356 .721 -1.378 -0.6899 UPPER WEST .418* 0.11919 .056 .106 .000 1.000 .235 .601 -.226 0.4646 VOLTA .260* -1.185* .047 .088 .000 .000 .108 .413 -1.473 -0.8969 WESTERN .513* .337* .048 .091 .000 .009 .356 .670 .041 0.6328 EASTERN ASHANTI .701* 
.350* .040 .076 .000 .000 .570 .832 .103 0.5972
  B/AHAFO     .088     .800*    .041     .078     1.000    0.000    -.047    .223     .545     1.0545
  CENTRAL     .153*    .915*    .045     .085     .034     .000     .005     .301     .636     1.1941
  G/ACCRA     .980*    -1.050*  .055     .104     .000     .000     .801     1.160    -1.389   -0.7110
  NORTHERN    .571*    .440*    .042     .079     .000     .000     .434     .708     .182     0.6983
  UPPER EAST  .691*    -0.119   .053     .101     .000     1.000    .517     .866     -.448    0.2099
  UPPER WEST  .571*    1.034*   .054     .101     .000     .000     .396     .746     .704     1.3644
  VOLTA       .413*    -.270*   .044     .083     .000     .049     .271     .556     -.539    -0.0003
  WESTERN     .666*    1.252*   .045     .085     .000     .000     .519     .813     .974     1.5300
G/ACCRA
  ASHANTI     -.280*   1.400*   .053     .101     .000     .000     -.454    -.105    1.071    1.7294
  B/AHAFO     -.893*   1.850*   .054     .103     .000     .000     -1.070   -.715    1.515    2.1848
  CENTRAL     -.827*   1.965*   .057     .108     .000     .000     -1.015   -.640    1.611    2.3189
  EASTERN     -.980*   1.050*   .055     .104     .000     .000     -1.160   -.801    .711     1.3893
  NORTHERN    -.409*   1.490*   .055     .103     .000     .000     -.588    -.231    1.153    1.8278
  UPPER EAST  -.289*   .931*    .064     .121     .000     .000     -.498    -.080    .537     1.3254
  UPPER WEST  -.410*   2.084*   .064     .121     .000     .000     -.619    -.200    1.689    2.4797
  VOLTA       -.567*   .780*    .056     .106     .000     .000     -.750    -.384    .434     1.1266
  WESTERN     -.314*   2.302*   .057     .108     .000     .000     -.501    -.127    1.949    2.6550
NORTHERN
  ASHANTI     .130*    -0.0899  .040     .075     .049     1.000    .000     .260     -.335    0.1550
  B/AHAFO     -.483*   .359*    .041     .077     .000     .000     -.617    -.349    .107     0.6123
  CENTRAL     -.418*   .475*    .045     .085     .000     .000     -.565    -.271    .198     0.7521
  EASTERN     -.571*   -.440*   .042     .079     .000     .000     -.708    -.434    -.698    -0.1818
  G/ACCRA     .409*    -1.490*  .055     .103     .000     .000     .231     .588     -1.828   -1.1526
  UPPER EAST  .121     -.559*   .053     .100     1.000    0.000    -.053    .294     -.886    -0.2318
  UPPER WEST  .000     .594*    .053     .101     1.000    0.000    -.174    .174     .265     0.9227
  VOLTA       -.157*   -.710*   .043     .082     .013     .000     -.299    -.016    -.977    -0.4423
  WESTERN     .095     .812*    .045     .085     1.000    0.000    -.051    .241     .536     1.0880
U/EAST
  ASHANTI     .009     .469*    .052     .098     1.000    0.000    -.159    .178     .151     0.7877
  B/AHAFO     -.604*   .919*    .053     .099     .000     .000     -.776    -.432    .594     1.2433
  CENTRAL     -.538*   1.034*   .056     .105     .000     .000     -.721    -.356    .690     1.3780
  EASTERN     -.691*   0.119    .053     .101     .000     1.000    -.866    -.517    -.210    0.4480
  G/ACCRA     .289*    -.931*   .064     .121     .000     .000     .080     .498     -1.325   -0.5368
  NORTHERN    -.121    .559*    .053     .100     1.000    0.000    -.294    .053     .232     0.8864
  UPPER WEST  -.121    1.153*   .063     .118     1.000    0.000    -.326    .084     .766     1.5398
  VOLTA       -.278*   -0.151   .055     .103     .000     1.000    -.456    -.100    -.487    0.1855
  WESTERN     -.025    1.371*   .056     .105     1.000    0.000    -.207    .156     1.028    1.7142
U/WEST
  ASHANTI     .130     -.684*   .052     .098     .553     .000     -.039    .300     -1.004   -0.3640
  B/AHAFO     -.483*   -0.2346  .053     .100     .000     .851     -.656    -.310    -.561    0.0916
  CENTRAL     -.418*   -0.1192  .056     .106     .000     1.000    -.601    -.235    -.465    0.2262
  EASTERN     -.571*   -1.034*  .054     .101     .000     .000     -.746    -.396    -1.364   -0.7038
  G/ACCRA     .410*    -2.084*  .064     .121     .000     .000     .200     .619     -2.480   -1.6888
  NORTHERN    .000     -.594*   .053     .101     1.000    0.000    -.174    .174     -.923    -0.2654
  UPPER EAST  .121     -1.153*  .063     .118     1.000    0.000    -.084    .326     -1.540   -0.7665
  VOLTA       -.157    -1.304*  .055     .103     .185     .000     -.336    .022     -1.642   -0.9663
  WESTERN     .095     .218     .056     .105     1.000    1.000    -.087    .278     -.127    0.5624
VOLTA
  ASHANTI     .287*    .620*    .042     .079     .000     .000     .151     .423     .363     0.8768
  B/AHAFO     -.326*   1.069*   .043     .081     .000     .000     -.466    -.186    .805     1.3338
  CENTRAL     -.260*   1.185*   .047     .088     .000     .000     -.413    -.108    .897     1.4726
  EASTERN     -.413*   .270*    .044     .083     .000     .049     -.556    -.271    .000     0.5394
  G/ACCRA     .567*    -.780*   .056     .106     .000     .000     .384     .750     -1.127   -0.4340
  NORTHERN    .157*    .710*    .043     .082     .013     .000     .016     .299     .442     0.9775
  UPPER EAST  .278*    0.15081  .055     .103     .000     1.000    .100     .456     -.185    0.4871
  UPPER WEST  .157     1.304*   .055     .103     .185     .000     -.022    .336     .966     1.6416
  WESTERN     .253*    1.522*   .046     .088     .000     .000     .101     .404     1.235    1.8086
WESTERN
  ASHANTI     .035     -.902*   .043     .081     1.000    0.000    -.106    .175     -1.168   -0.6362
  B/AHAFO     -.578*   -.452*   .044     .084     .000     .000     -.723    -.434    -.726    -0.1794
  CENTRAL     -.513*   -.337*   .048     .091     .000     .009     -.670    -.356    -.633    -0.0413
  EASTERN     -.666*   -1.252*  .045     .085     .000     .000     -.813    -.519    -1.530   -0.9739
  G/ACCRA     .314*    -2.302*  .057     .108     .000     .000     .127     .501     -2.655   -1.9492
  NORTHERN    -.095    -.812*   .045     .085     1.000    0.000    -.241    .051     -1.088   -0.5358
  UPPER EAST  .025     -1.371*  .056     .105     1.000    0.000    -.156    .207     -1.714   -1.0279
  UPPER WEST  -.095    -.218    .056     .105     1.000    1.000    -.278    .087     -.562    0.1266
  VOLTA       -.253*   -1.522*  .046     .088     .000     .000     -.404    -.101    -1.809   -1.2351

Table 4.16b: Bonferroni Post Hoc Test for maize yields (adjusted p-values)
REGION    ASHANTI   B/A       CENTRAL   E/R       G/A       N/R       U/EAST    U/WEST    VOLTA
B/AHAFO   < 2e-16   -         -         -         -         -         -         -         -
CENTRAL   < 2e-16   1.00000   -         -         -         -         -         -         -
EASTERN   < 2e-16   1.00000   0.06824   -         -         -         -         -         -
G/ACCRA   1.10e-05  < 2e-16   < 2e-16   < 2e-16   -         -         -         -         -
NORTHERN  0.067220  < 2e-16   < 2e-16   < 2e-16   1.30e-11  -         -         -         -
U/EAST    1.000000  < 2e-16   < 2e-16   < 2e-16   0.000480  1.00000   -         -         -
U/WEST    0.716450  < 2e-16   1.10e-12  < 2e-16   2.00e-08  1.00000   1.00000   -         -
VOLTA     5.00e-10  3.60e-12  4.40e-07  < 2e-16   < 2e-16   1.20e-02  1.90e-05  0.15084   -
WESTERN   1.000000  < 2e-16   < 2e-16   < 2e-16   2.60e-06  1.00000   1.000000  1.00000   4.10e-06
Source: SRID of MOFA

Table 4.3c: Bonferroni Post Hoc Test for rice yield.
REGION    ASHANTI   B/A       CENTRAL   EASTERN   G/A       N/R       U/EAST    U/WEST    VOLTA
B/A       4.60e-08  -         -         -         -         -         -         -         -
CENTRAL   9.60e-06  1.00000   -         -         -         -         -         -         -
EASTERN   3.90e-06  < 2e-16   < 2e-16   -         -         -         -         -         -
G/ACCRA   < 2e-16   < 2e-16   < 2e-16   < 2e-16   -         -         -         -         -
NORTHERN  1.00000   7.10e-05  0.00250   1.80e-08  < 2e-16   -         -         -         -
U/EAST    0.00020   < 2e-16   4.10e-14  1.000000  < 2e-16   2.00e-06  -         -         -
U/WEST    5.10e-10  1.00000   1.00000   < 2e-16   < 2e-16   7.10e-07  < 2e-16   -         -
VOLTA     3.10e-12  < 2e-16   < 2e-16   1.00000   < 2e-16   2.50e-15  1.0000    < 2e-16   -
WESTERN   < 2e-16   0.0162    0.0139    < 2e-16   < 2e-16   < 2e-16   < 2e-16   1.0000    < 2e-16
Source: SRID of MOFA

Appendix F: R Codes used for the study

R CODES FOR MANOVA MODEL

data1 <- read.csv("mixed2.csv", header = TRUE)
data1$REGION <- factor(data1$REGION)  # treat region as a grouping factor
plot(data1)

## MANOVA model
crop_manova <- manova(cbind(MAIZE, RICE) ~ REGION, data = data1)
summary(crop_manova)

## Exploratory analysis and tests of assumptions
RESID <- resid(crop_manova)

## Boxplot of residuals
boxplot(RESID, main = "Boxplot of Residuals", ylim = c(-2, 3), col = "lightgreen")

## Linearity assumption
maize <- data1$MAIZE
rice <- data1$RICE
par(mfrow = c(1, 2))
plot(maize, main = "Scatterplot of maize")
plot(rice, main = "Scatterplot of rice")

## Univariate normality of residuals
## (uniPlot() comes from the MVN release used for the study)
library(MVN)
par(mfrow = c(2, 2))
uniPlot(RESID, type = "qqplot")
uniPlot(RESID, type = "histogram")

## Homogeneity of variance tests and plots
bartlett.test(MAIZE ~ REGION, data = data1)
bartlett.test(RICE ~ REGION, data = data1)
library(HH)
hov(MAIZE ~ REGION, data = data1)
hov(RICE ~ REGION, data = data1)
par(mfrow = c(2, 2))
hovPlot(MAIZE ~ REGION, data = data1)
hovPlot(RICE ~ REGION, data = data1)

## Test of differences in crop yields using MANOVA
summary(crop_manova, test = "Pillai")
summary(crop_manova, test = "Wilks")
summary(crop_manova, test = "Hotelling-Lawley")
summary(crop_manova, test = "Roy")
summary.aov(crop_manova)  # univariate ANOVA for each crop

## We might now move on to investigate the difference in yields across the regions
## Bonferroni post hoc test for maize
attach(data1)  # makes MAIZE, RICE and REGION visible by name
group <- as.factor(REGION)
MM <- tapply(maize, REGION, mean)  # regional mean maize yields
diff <- pairwise.t.test(data1$MAIZE, data1$REGION, p.adjust.method = "bonferroni")
diff

## Bonferroni post hoc test for rice
MM <- tapply(data1$RICE, data1$REGION, mean)  # regional mean rice yields
diff <- pairwise.t.test(data1$RICE, data1$REGION, p.adjust.method = "bonferroni")
diff

## Tukey HSD post hoc test for maize yields
par(mfrow = c(1, 1))
library(TukeyC)  # loaded in the original script; TukeyHSD() itself is in the stats package
manova_aov <- aov(MAIZE ~ REGION, data = data1)
CI <- TukeyHSD(manova_aov, "REGION")
CI
plot(CI)

## Tukey HSD post hoc test for rice yields
man_aov <- aov(RICE ~ REGION, data = data1)
CONFI <- TukeyHSD(man_aov, "REGION")
CONFI
plot(CONFI)

## Box's M test
library(biotools)
head(data1)
# boxM() takes the response columns and a grouping factor; selecting MAIZE and
# RICE by name and grouping by REGION is assumed from the MANOVA model
boxM(data1[, c("MAIZE", "RICE")], factor(data1$REGION))

R CODES FOR LINEAR MIXED MODEL

data <- read.csv("mixed.csv", header = TRUE)
str(data)
head(data)

# Exploratory analysis
summary(data)
maize <- data$MAIZE
rice <- data$RICE
par(mfrow = c(2, 2))
boxplot(maize, main = "Boxplot for Maize yield", ylab = "Maize yield in Mt/Ha", col = "orange")
qqnorm(maize, main = "Normal Q-Q Plot for Maize yields", ylab = "Maize yield in Mt/Ha")
qqline(maize)
boxplot(rice, main = "Boxplot for Rice yield", ylab = "Rice yield in Mt/Ha", col = "lightgreen")
qqnorm(rice, main = "Normal Q-Q Plot for Rice yields", ylab = "Rice yield in Mt/Ha")
qqline(rice)

REGION <- data$REGION
par(mfrow = c(2, 2))
boxplot(MAIZE ~ REGION, data = data, col = "orange", xlab = "REGION", ylab = "Maize (Mt/Ha)", ylim = c(0, 6), main = "Maize yields by Region")
boxplot(RICE ~ REGION, data = data, col = "orange", xlab = "REGION", ylab = "Rice (Mt/Ha)", ylim = c(0, 8), main = "Rice yields by Region")
boxplot(MAIZE ~ YEAR, data = data, col = "orange", xlab = "YEAR", ylab = "Maize (Mt/Ha)", ylim = c(0, 6), main = "Maize yields by Year")
boxplot(RICE ~ YEAR, data = data, col = "orange", xlab = "YEAR", ylab = "Rice (Mt/Ha)", ylim = c(0, 6), main = "Rice yields by Year")

# LMM for maize
## Random intercept for district
library(lme4)
# REGION * YEAR expands to REGION + YEAR + REGION:YEAR
fit2 <- lmer(MAIZE ~ REGION * YEAR + (1 | DISTRICT), data = data)                 # REML fit
fit21 <- lmer(MAIZE ~ REGION * YEAR + (1 | DISTRICT), data = data, REML = FALSE)  # ML fit for model comparison
# Random intercept and slope
# (1 | YEAR) adds a random intercept for each year; a random YEAR slope per
# district would instead be written (1 + YEAR | DISTRICT)
fitlope <- lmer(MAIZE ~ REGION * YEAR + (1 | YEAR) + (1 | DISTRICT), data = data, REML = FALSE)
anova(fit21, fitlope)  # likelihood ratio test between the two ML fits
summary(fitlope)

# District-specific intercepts and slopes from separate regressions
par(mfrow = c(1, 1))
hlmList <- lmList(MAIZE ~ YEAR | DISTRICT, data = data)
coefs <- coef(hlmList)
names(coefs) <- c("Intercept", "Slope")
plot(Slope ~ Intercept, data = coefs)
abline(lm(Slope ~ Intercept, data = coefs))

# LMM for rice
## Random intercept for district
library(lme4)
fit3 <- lmer(RICE ~ REGION * YEAR + (1 | DISTRICT), data = data)                 # REML fit
fit31 <- lmer(RICE ~ REGION * YEAR + (1 | DISTRICT), data = data, REML = FALSE)  # ML fit for model comparison
# Random intercept and slope (same random-effects structure as for maize)
fitlope2 <- lmer(RICE ~ REGION * YEAR + (1 | YEAR) + (1 | DISTRICT), data = data, REML = FALSE)
anova(fit31, fitlope2)
summary(fitlope2)

par(mfrow = c(1, 1))
hlmList <- lmList(RICE ~ YEAR | DISTRICT, data = data)
coefs <- coef(hlmList)
names(coefs) <- c("Intercept", "Slope")
plot(Slope ~ Intercept, data = coefs)
abline(lm(Slope ~ Intercept, data = coefs))

# Residual diagnostics for the fitted models
library(car)
par(mfrow = c(2, 1))
plot(fit2, main = "Scatter plot of Maize yield")
plot(fit3, main = "Scatter plot of Rice yield")
resid_maize <- resid(fit2)
resid_rice <- resid(fit3)
par(mfrow = c(2, 2))
hist(resid_maize, ylab = "Maize yields (Mt/Ha)", freq = FALSE, main = "Histogram of Maize residual", col = "orange", ylim = c(0.0, 1.0))
# Overlay a normal density using the maize residuals' own mean and sd
curve(dnorm(x, mean = mean(resid_maize), sd = sd(resid_maize)), add = TRUE, col = "darkblue", lwd = 2)
hist(resid_rice, ylab = "Rice yields (Mt/Ha)", freq = FALSE, main = "Histogram of Rice residual", col = "lightgreen", ylim = c(0.0, 1.0))
# Overlay a normal density using the rice residuals' own mean and sd
curve(dnorm(x, mean = mean(resid_rice), sd = sd(resid_rice)), add = TRUE, col = "darkblue", lwd = 2)
qqnorm(resid_maize, main = "Normal QQ-Plot for Residual of Maize")
qqline(resid_maize)
qqnorm(resid_rice, main = "Normal QQ-Plot for Residual of Rice")
qqline(resid_rice)