ON THE EFFICIENCY OF TREATMENT COMPARISONS IN A RANDOMISED BLOCK DESIGN

by

PAUL TWUM NKANSAH
B.Sc. (General) Education, Cape Coast.

A Thesis Submitted to the Faculty of Graduate Studies, University of Ghana, Legon, in Partial Fulfillment of the Requirements for the M.Sc. (Statistics) Degree.

Department of Mathematics
University of Ghana
Legon.

May, 1976.

University of Ghana http://ugspace.ug.edu.gh

DECLARATION

I hereby declare that the work presented in this thesis is entirely my own, and that no part thereof has been presented for a degree elsewhere.

I certify that the work has been carried out under my supervision.

To My Friends

ACKNOWLEDGEMENTS

I wish to express my gratitude to my supervisor, Dr. S.I.K. Odoom, of the Department of Mathematics, University of Ghana, Legon, for his encouragement and the many useful suggestions he gave in the course of my work.

My thanks also go to Dr. K.T. de Graft-Johnson, Dr. K. Ewusi and Mr. C.G. Bhattacharya, all of the Institute of Social, Statistical, and Economic Research, Legon, for the great interest they showed in my affairs.

I am grateful to Prof. E. Laing, of the Department of Botany, Legon, and Mr. A.N. Aryeetey, of the Agricultural Research Station, Kpong, for permission to use the data on Cowpea.

Finally, I wish to thank Mr. R.C. Nartey, of the Computer Centre, Legon, for his assistance in the computations; Mr. Kofi Ben of the same Centre, for the excellent typing; and all the junior members of the Department of Mathematics for their co-operation.

CONTENTS

ACKNOWLEDGEMENTS

CHAPTER

1 INTRODUCTION
    1.1 Randomised Block Experiment and the Analysis of Variance
    1.2 Univariate Analysis of Variance
    1.3 Multivariate Analysis of Variance
    1.4 Multiple Comparisons of Treatment Means
    1.5 Estimation of Variance Components
    1.6 The Pseudo-Factorial Type of Arrangement
    1.7 Comparison of Methods of Analysis

2 RANDOMISED BLOCK EXPERIMENT AND THE ANALYSIS OF VARIANCE
    2.1 Mathematical Models
    2.2 Assumptions Underlying the Analysis of Variance
    2.3 The "Kpong Data"

3 UNIVARIATE ANALYSIS OF VARIANCE
    3.1 Randomised Block Experiment with n Plants Per Plot Using the Average Yields Per Plot
    3.2 Randomised Block Experiment with n Plants Per Plot
    3.3 Comparisons of the Various Tests for Treatment Effects
    3.4 Randomised Block Experiment with One Observation Per Plot Under Randomization Models
    3.5 Loss of Information due to Sampling
    3.6 Analysis of the "Kpong Data"

4 MULTIVARIATE ANALYSIS OF VARIANCE
    4.1 The Multivariate Analysis of Variance Test
    4.2 Analysis of the "Kpong Data"

5 PAIRWISE MULTIPLE COMPARISONS OF MEANS
    5.1 Multiple Comparisons of Means Procedures
    5.2 The Effect of the Standard Error on Multiple Comparisons of Treatment Means
    5.3 Application of the FSD, SSD, MRT, and BET Procedures to the "Kpong Data"
    5.4 Multiple Comparisons of Treatment Means for the Various Analyses Using Duncan's MRT

6 ESTIMATION OF VARIANCE COMPONENTS
    6.1 Standard Errors and Confidence Limits
    6.2 Estimation of Variance Components by the Analysis of Variance Method
    6.3 Estimation of Variance Components by the Restricted Maximum Likelihood Method
    6.4 Estimation of Variance Components for the "Kpong Data"

7 PSEUDO-FACTORIAL ARRANGEMENT

8 CONCLUSION

CHAPTER ONE

INTRODUCTION

In agricultural field experiments involving treatment comparisons, the statistical tool usually employed is the analysis of variance, and the design most commonly used is the randomised block design. It is also the practice to grow more than one plant of the same kind on each plot, and observations are taken individually on each plant on a plot. Very often, however, the individual observations are not used in the subsequent analysis; the average yields per plot are used instead.

The purpose of this work is to examine, for an agricultural field experiment using a randomised block design with n plants per plot,
(i) the various methods of analysis,
(ii) the models underlying these methods,
(iii) the assumptions implicit in these models,
and to determine whether there is a most "efficient" way of comparing treatment effects in such an experiment.

The data used in this study were obtained from an experiment conducted at the Agricultural Research Station, University of Ghana, Kpong, Ghana, in conjunction with the Department of Botany, University of Ghana, Legon, Ghana. It formed part of a study on "Inheritance of Yield Components and their correlation with Yield in Cowpea", conducted by A.N. Aryeetey of the Agricultural Research Station, Kpong, and E. Laing of the Department of Botany, University of Ghana, Legon.

There were 24 varieties of cowpea and these were grown in a randomised block design with three blocks in October, 1971. Each plot consisted of a single ridge 10 metres long with 10 plants spaced 1 metre apart along the ridge. Ridges were spaced 75 cm.
apart and only alternate ridges were planted, so that the space between two planted ridges was 1.5 metres. Five relatively healthy plants were individually harvested from every plot which had five or more plants; no probabilistic sampling procedure was used. Some plots had fewer than five plants and were therefore not harvested. The varieties grown on such plots were not considered in this study.

Five characters were measured on each plant:
(i) number of pods per plant (x)
(ii) average number of seeds per pod (y)
(iii) average pod length in mm. (h)
(iv) 100-grain weight in grams (z)
(v) average grain weight per plant in grams (w).

The average pod length and average number of seeds per pod for each plant were based on 15 selected pods, the selection of the pods not being based on any particular criterion. From a bowl containing all the seeds of a plant, 50 healthy ones were taken and weighed. The weight obtained was multiplied by two to give the mean 100-seed weight (z) for a plant, as some plants had fewer than 100 seeds. The seeds of all the five plants on a plot were weighed to determine the average grain weight per plant (w).

In this study, the average yield per plant is defined as
\[ \theta = xyz/100 \]
instead of using w. By this definition, $\theta$ will be available for each of the five selected plants and this will make it possible to carry out two multivariate analyses of variance. Essentially,
\[ \bar{\theta} = \sum_{i=1}^{5} \theta_i / 5 \]
measures the same characteristic as w.

The data described above will henceforth be referred to as the "Kpong data".
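The definition of the yield per plant can be illustrated with a short sketch. The function names and the plant measurements below are hypothetical and are not taken from the "Kpong data"; the code only restates $\theta = xyz/100$ and the plot average over the harvested plants.

```python
from statistics import fmean


def plant_yield(pods, seeds_per_pod, hundred_grain_wt):
    """Yield per plant, theta = x*y*z/100, in grams.

    x*y is the number of seeds on the plant and z/100 is the
    weight of a single seed in grams.
    """
    return pods * seeds_per_pod * hundred_grain_wt / 100.0


def plot_mean_yield(plants):
    """Average of theta over the harvested plants of one plot."""
    return fmean(plant_yield(x, y, z) for x, y, z in plants)


# Hypothetical (x, y, z) measurements for the five plants of one plot.
example_plot = [
    (20, 10, 12.0),
    (30, 10, 10.0),
    (10, 10, 12.0),
    (25, 8, 10.0),
    (40, 10, 10.0),
]

theta_bar = plot_mean_yield(example_plot)
```

With these made-up figures the first plant has 200 seeds weighing 0.12 g each, so its yield is 24 g.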
The work is broken up into the following parts:
(i) Discussion of the models for a randomised block design and the assumptions underlying the analysis of variance;
(ii) Univariate analysis of variance for a randomised block design with n plants per plot;
(iii) Multivariate analysis of variance for a randomised block design with n plants per plot;
(iv) Multiple comparisons of treatment means;
(v) Estimation of variance components;
(vi) Pseudo-factorial arrangement;
(vii) Comparison of methods of analysis.

1.1 Randomised Block Experiment and the Analysis of Variance.

Suppose that an experiment for comparing I varieties of maize is to be conducted, and that J grains of each variety are available. One may acquire a field and divide it into IJ plots of equal size. The I varieties may then be randomly allocated to the plots in such a way that each variety is assigned to J of the plots.

The principal objection to the above design is on grounds of accuracy. In spite of the randomization employed in allocating the varieties to the plots, it is possible that one variety will fall on plots which have high fertility while another will fall on plots which have low fertility. In such a situation, there will be differences which are due to the fertility fluctuations of the plots rather than to the varieties.

A procedure which will isolate any soil fertility fluctuations from the experimental error is to stratify the field into J blocks in such a way that within each block there is as little variation as possible among the plots. Each block is then divided into I plots of equal size. The I varieties are then allocated at random to the plots subject to the constraint that each occurs once in each block. For this design, the soil fertility fluctuations are represented by differences between the blocks. It is possible then to isolate these block differences from differences due to the varieties being compared.

An experimental design of the nature described above is called a randomised block design. It is mostly used in experiments where, though there is considerable variation in the experimental area, it is possible to divide the area into blocks in such a way that within each block there is as little variation as possible. The mathematical models for a randomised block design will be discussed in Chapter 2.

Analysis of variance, which is used in analysing randomised block experiments, is a technique for estimating how much of the total variation in a set of data can be attributed to one or more sources of variation; the remainder, not attributable to any source, is classed as the residual or error variation. There are related tests of significance by which we can decide whether the sources have probably resulted in real variation or whether the apparent variation ascribed to them is only a chance happening.

The related tests of significance, and calculations of fiducial limits for estimates, often require that the errors be normally distributed and have a common variance. Furthermore, with the fixed-effects model (Chapter 2), it is generally necessary to assume that the treatment and environmental effects are additive so as to get exact tests and confidence intervals concerning the main effects.

1.2 Univariate Analysis of Variance.

The analysis of variance technique will be used to compare treatment differences in a randomised block design with n plants per plot in the following cases:
(i) Using the average yields per plot under the assumption that the yields are normally distributed (normal-theory model).
(ii) Using yields of the individual plants on a plot under the normal-theory model.
(iii) Using the average yields per plot under the randomization models.
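As a rough illustration of case (i) above, the sketch below computes an F ratio for treatments from the plot averages of a randomised block layout. It assumes the usual textbook two-way decomposition (treatments, blocks, residual) and is not necessarily the exact form of analysis developed in Chapter 3; the function name and the small data set are hypothetical.

```python
from statistics import fmean


def rbd_f_from_plot_means(plots):
    """F ratio for treatments from the plot averages.

    plots[i][j] is the list of plant yields for treatment i in block j.
    Returns (F, treatment df, error df).
    """
    # Replace the n plants on each plot by their average.
    means = [[fmean(plot) for plot in row] for row in plots]
    n_treat, n_block = len(means), len(means[0])

    grand = fmean(m for row in means for m in row)
    treat = [fmean(row) for row in means]
    block = [fmean(means[i][j] for i in range(n_treat)) for j in range(n_block)]

    # Treatment sum of squares and residual sum of squares.
    ss_treat = n_block * sum((t - grand) ** 2 for t in treat)
    ss_error = sum(
        (means[i][j] - treat[i] - block[j] + grand) ** 2
        for i in range(n_treat)
        for j in range(n_block)
    )

    df_treat = n_treat - 1
    df_error = (n_treat - 1) * (n_block - 1)
    f_ratio = (ss_treat / df_treat) / (ss_error / df_error)
    return f_ratio, df_treat, df_error


# Hypothetical yields: 3 treatments, 2 blocks, 2 plants per plot.
example = [
    [[0.0, 2.0], [2.0, 4.0]],
    [[1.0, 3.0], [5.0, 7.0]],
    [[6.0, 6.0], [5.0, 7.0]],
]
f_ratio, df1, df2 = rbd_f_from_plot_means(example)
```

For this made-up layout the treatment mean square is 8 and the residual mean square is 2, so the ratio is 4 on 2 and 2 degrees of freedom. Note that the within-plot variation among plants is discarded once the averages are formed, which is the point at issue between cases (i) and (ii).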
The F-test will be used to test for treatment differences in (i) and (ii) and as an approximate test in (iii). The tests will be compared to find out which one gives the most sensitive comparison of treatments.

These tests will be based on observations of only one variate, yield; hence the term univariate analysis of variance. The tests reduce to the comparison of two independently distributed mean squares. One of the mean squares is an unbiased estimate of the error variance, while the other is an unbiased estimate of the error variance only when the null hypothesis being tested is true, and may be called the mean square due to deviation from the hypothesis.

A description of the analysis of variance technique and the corresponding analysis-of-variance (ANOVA) tables will be given for each of the three cases in Chapter 3. These results will then be applied to the "Kpong data".

In experiments with more than one plant per plot, sampling of the plots is often resorted to. The loss of information due to sampling will be examined.

1.3 Multivariate Analysis of Variance.

In a multivariate analysis of variance, observations on more than one variate are used. This is usually done when the character of primary interest, say yield, is thought to be correlated with other variates such as size of leaf, height of plant, and length of cob. Each observation will then be a random vector consisting of measurements of these correlated variates.

The process of analysing the variances and covariances of multiple correlated variates is termed multivariate analysis of variance or analysis of dispersion. The distribution theory of the multivariate analysis of variance is analogous to that of the univariate case and the test criteria are analogous to F-tests.
The multivariate analysis of variance technique will be used to compare treatment differences in a randomised block experiment with n plants per plot in the following cases:
(i) Using yields of the individual plants on a plot
(ii) Using p correlated variates.
The results will be applied to the "Kpong data" in Chapter 4.

1.4 Multiple Comparisons of Treatment Means.

In experiments involving treatment comparisons, if the treatments have been shown to be significantly different, it is often necessary to investigate further how different they are. Given that treatments A, B, C, and D are significantly different, one may wish to know whether treatment A differs significantly from treatment B, or whether treatments A and C differ significantly from treatments B and D. Investigations of this nature lead to what is termed "multiple comparisons of means", in which the means of the treatments are compared by some special tests.

Various procedures are available for the multiple comparisons of means. These include
(i) a "protected" least significant difference (LSD) or Fisher's significant difference (FSD) due to Fisher (18);
(ii) a multiple range test (MRT) by Duncan (12);
(iii) the method of Student-Newman-Keuls (SNK) (14);
(iv) Tukey's significant difference (TSD) due to Tukey (34);
(v) Scheffé's significant difference (SSD) due to Scheffé (33);
(vi) Bayes exact test (BET) procedure by Waller (41), which is an improvement upon and extension of a Bayesian procedure developed by Duncan.

Waller and Duncan (41) have discussed the various procedures and the results they give. According to them, a difficulty with these procedures lies in differences in the principles on which they are based and considerable disparities in the results to which they can lead. They considered the multiple comparisons problem as one of many decisions. As such, a full analysis requires the consideration of many different kinds of decision errors and the prior probabilities of their occurrence. Whether it is more appropriate to use one approach rather than another depends on which comes closer to minimizing the overall average of the decision errors, each weighted by its "loss" and its relative prior probability. It was from this point of view that the BET was developed. They regarded the BET as not much different from the FSD.

Carmer and Swanson (5) used computer simulation techniques to study the Type I and Type II error rates and the correct decision rates for the procedures listed above. Their results indicate that the SSD, TSD and SNK tests are less appropriate than either the FSD with $\alpha = 0.05$, BET or MRT. In fact, they considered FSD with $\alpha = 0.05$ and BET as the best procedures available so far.

The above findings by Carmer, Swanson, Waller, and Duncan will be examined by applying the FSD, BET, MRT, and SSD procedures to the "Kpong data" in Chapter 5.

The error variances in the cases given in Sections 1.2 and 1.3 above differ from one another. Since a multiple comparison of treatment means depends on the error variance, the different error variances will give rise to different results for any given multiple comparisons of means procedure. The different results will be examined in Chapter 5 to find out the extent to which they differ.

1.5 Estimation of Variance Components.

Variance components estimation is an important aspect of agricultural experiments. Experimenters normally wish to have an idea of the variances associated with the various effects and to test hypotheses about them. The expressions containing the variance components are obtained by taking expectations of the mean squares. These expressions vary with mathematical models.
It is therefore expected that estimates of the variance components for cases (i) and (ii) of Section 1.2 may differ from one another. This will be investigated in Chapter 6.

When there are equal numbers of observations for each treatment (that is, for "balanced data"), the analysis of variance method (35) is most commonly used to estimate variance components. This choice is motivated by the fact that the estimators thus obtained are minimum variance quadratic unbiased when normality is not assumed, and are minimum variance unbiased when normality is assumed. Unfortunately, though variance components are, by definition, non-negative, the analysis of variance method can lead to negative estimates (28) which cannot be satisfactorily explained. An alternative method which avoids negative estimates, but whose estimators may be biased, is the restricted maximum likelihood method (39).

In Chapter 6, the two methods given above will be applied to cases (i) and (ii) of Section 1.2.

1.6 The Pseudo-Factorial Type of Arrangement.

With a large number of treatments to be compared in an agricultural field experiment, a randomised block design containing the whole of the treatments is unsatisfactory. The blocks are likely to be too large and it may be difficult to eliminate fertility differences efficiently.

To keep the block size small, some authors have suggested selecting one or more treatments as controls and dividing the rest into sets, each set being arranged with the controls in a number of randomised blocks. Unfortunately, as pointed out by Yates (44), this method has the disadvantage that comparisons between treatments in different sets are of lower accuracy than comparisons between treatments in the same set. There is the further disadvantage that a disproportionate number of plots is devoted to the control treatments, thus further lowering the efficiency.
To cope with the large number of treatments, and at the same time to avoid the use of excessively large blocks and controls, Yates (44) has suggested a new type of arrangement of the treatments. The arrangement is known as pseudo-factorial and is described in Chapter 7.

1.7 Comparison of Methods of Analysis.

The F-tests used to test for treatment differences in cases (i) and (ii) of Section 1.2 will be compared to determine which one (if any) is the most sensitive for detecting differences between treatments. The comparison will be made by examining the powers of the various F-tests.

Comparisons will also be made of the following:
(a) the two multivariate analyses of variance listed in Section 1.3;
(b) the various multiple comparisons of means resulting from the cases listed in Sections 1.2 and 1.3;
(c) variance components estimation under cases (i) and (ii) of Section 1.2.

The conclusions to be derived from these results, and recommendations, if any, will be given in Chapter 8.

CHAPTER TWO

RANDOMISED BLOCK EXPERIMENTS AND THE ANALYSIS OF VARIANCE

2.1 Mathematical Models.

Consider the randomised block experiment described in Section 1.1. We shall assume that there are n plants on a plot instead of one. The symbol $x_{ijk}$ will be used to denote the k-th observation on the plot belonging to the i-th row and the j-th column. Let
$\mu$ = general mean,
$\eta_{ij}$ = plot mean,
$A_i$ = mean of the i-th variety,
$B_j$ = mean of the j-th block.

The main effect of the i-th variety is defined as
\[ \alpha_i = A_i - \mu . \tag{2.1.1} \]
Similarly, the main effect of the j-th block is
\[ \beta_j = B_j - \mu . \tag{2.1.2} \]
The main effect of the i-th variety specific to the j-th block is $\eta_{ij} - B_j$. The interaction of the i-th variety with the j-th block is defined as
\[ \gamma_{ij} = (\eta_{ij} - B_j) - (A_i - \mu) = \eta_{ij} - B_j - A_i + \mu . \tag{2.1.3} \]
On substituting $A_i = \mu + \alpha_i$ and $B_j = \mu + \beta_j$ in (2.1.3), we obtain
\[ E(x_{ijk}) = \eta_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} . \]
Thus the mathematical model for the individual observations is given by
\[ x_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk} . \tag{2.1.4} \]
The $e_{ijk}$ are errors assumed to be uncorrelated with zero expectations and a common variance $\sigma_e^2$. The effects of the violation of this assumption have been studied by Cochran (6) and Scheffé (34). They found that there could be substantial biases in the standard errors of estimates, and that the t-tests might be impaired, if the errors are correlated. It has however been noted that for randomised experiments the errors may safely be regarded as uncorrelated.

I. Fixed-Effects Model.

When interest lies only in the treatments and blocks actually used in the experiment and all inferences made are restricted only to these, then the treatment effects $\alpha_i$, block effects $\beta_j$, and interaction effects $\gamma_{ij}$ in the mathematical model (2.1.4) are fixed effects. If the errors $e_{ijk}$ are assumed to be independently normal with mean zero and a common variance $\sigma_e^2$, we obtain what is termed the fixed-effects model. Under this model, and in view of equations (2.1.1), (2.1.2) and (2.1.3), we have the side conditions
\[ \sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0 . \tag{2.1.5} \]
The data available for this situation are thought of as one of the many possible sets of data (involving these same treatments and blocks) that could be derived in repetitions of the experiment, repetitions in which the $e_{ijk}$'s on each occasion would be a random sample from a population of error terms that have zero means and a common variance $\sigma_e^2$.

II. Random-Effects Model.
When interest in the experiment is unlikely to centre only on those treatments and blocks actually used in the experiment, then the $\alpha_i$'s, $\beta_j$'s and $\gamma_{ij}$'s are random effects assumed to be uncorrelated among themselves and with the errors. If, in addition, we assume that the $e_{ijk}$'s are independently normal with mean zero and a common variance $\sigma_e^2$, we obtain the random-effects model. The $\alpha_i$'s, $\beta_j$'s and $\gamma_{ij}$'s are now random samples from populations of $\alpha_i$'s, $\beta_j$'s and $\gamma_{ij}$'s respectively, with respective variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_\gamma^2$. From equations (2.1.1), (2.1.2) and (2.1.3), we observe that the means of the various effects taken over their respective populations are zero. The above conditions can be represented mathematically as
(i) $E(\alpha_i) = E(\beta_j) = E(\gamma_{ij}) = 0$;
(ii) $\mathrm{Var}(\alpha_i) = \sigma_\alpha^2$; $\mathrm{Var}(\beta_j) = \sigma_\beta^2$; $\mathrm{Var}(\gamma_{ij}) = \sigma_\gamma^2$;
(iii) $\alpha_i$, $\beta_j$, $\gamma_{ij}$ and $e_{ijk}$ are uncorrelated;
(iv) $e_{ijk} \sim N(0, \sigma_e^2)$. ..... (2.1.6)
Under these conditions we conceive of repetitions as taking random samples of treatments and blocks on each occasion.

III. Mixed-Effects Model.

When a model can be conceived of as having both fixed and random effects, we obtain a mixed-effects model. In the mathematical model (2.1.4), if the block effects, $\beta_j$, are considered random and the treatment effects, $\alpha_i$, fixed, then the interaction effects, $\gamma_{ij}$, are random and the model becomes a mixed-effects model. Mathematically, we have
(i) $\sum_i \alpha_i = 0$;
(ii) $E(\beta_j) = E(\gamma_{ij}) = 0$;
(iii) $\mathrm{Var}(\beta_j) = \sigma_\beta^2$;
(iv) $\beta_j$, $\gamma_{ij}$ and $e_{ijk}$ are uncorrelated;
(v) $e_{ijk} \sim N(0, \sigma_e^2)$. ..... (2.1.7)

The models above may be called normal-theory models (34) (or infinite models) because of the normality assumption associated with the $e_{ijk}$'s.

IV. Randomization Models.

Randomization models (4, 34) (or finite models) have the same characteristics as normal-theory models except that there is no normality assumption about the errors. The random errors are regarded as uncorrelated with expectation zero. These models take into account the randomization employed in assigning the treatments to the experimental plots. Detailed discussion of the randomization model for a randomised block experiment is given in Chapter 3.

2.2 Assumptions Underlying the Analysis of Variance.

I. Normality of Errors.

The general mathematical model for a randomised block design was given in Section 2.1 as
\[ x_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk} . \]
If the errors $e_{ijk}$ are normal with mean zero and variance $\sigma_e^2$, the observations will also be normal with mean $\mu + \alpha_i + \beta_j + \gamma_{ij}$ and variance $\sigma_e^2$ if all the effects are fixed, or with mean $\mu$ and variance $\sigma_\alpha^2 + \sigma_\beta^2 + \sigma_\gamma^2 + \sigma_e^2$ if the effects are random. Thus the assumption of normality of errors implies normality of the observations.

The theoretical distributions of the statistics upon which tests of significance and calculations of fiducial limits are based are derived under the assumption that the observations are normally distributed. The effect of the violation of the normality assumption has been studied by Box (3), Scheffé (34) and others. In their studies, Scheffé and Box showed that though the effect of violation of the normality assumption is slight on inferences involving fixed effects (tests of hypotheses about fixed main effects or interactions), it is serious on inferences involving random effects: tests for the equality of variances, and confidence intervals for error variance, variance components or ratios of variance components.

In view of the above, it has been suggested that we verify the normality assumption for any data whose analysis will involve inferences of the type given above.
When the normality assumption is violated, the data may be transformed by a suitable transformation (2). Certain transformations are able to change non-normal data to normal.

Several tests are available for testing normality. Among them are the large sample methods (26), the W-test (36), and the non-parametric goodness of fit test (26).

The large sample methods use the criteria of skewness and kurtosis, since these are two important ways by which a distribution may depart from normality. A large sample size is required for this test to be applicable.

The non-parametric goodness of fit test for normality is based on the fact that if $x_1, x_2, \ldots, x_n$ is a random sample from a distribution of the continuous type with cumulative distribution function F and the transformation
\[ Y_i = F(X_i); \quad 0 \le F(X_i) \le 1, \quad i = 1, 2, \ldots, n \]
is made, then
\[ -2 \sum_{i=1}^{n} \log Y_i \]
is distributed as $\chi^2$ with 2n degrees of freedom. As a test of normality, this test statistic tests the hypothesis that the $x_i$'s come from a normal population with mean and variance equal to those of the sample against the broad and vague class of alternatives that the $x_i$'s come from a population which differs from these specifications. Since the alternatives are undefined, little is known about the power functions of the test in such applications.

The W-test for normality, proposed by Shapiro and Wilk (36), provides an index or test statistic to evaluate the supposed normality of a complete sample. It is scale and origin invariant and hence supplies a test for the composite null hypothesis of normality. The power functions of the test have been calculated and have shown the statistic to be an effective measure of normality for small samples ($n \le 50$).

Lack of knowledge about the power functions of the non-parametric goodness of fit test makes the W-test the most plausible for testing normality of small samples ($n \le 50$).
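The non-parametric goodness of fit statistic described above can be sketched in a few lines. The function names are hypothetical. The sketch assumes that the sample variance is taken with divisor n - 1 and that the upper tail of the chi-square distribution is the probability of interest; neither choice is stated in the text, and the chi-square reference distribution is simply taken as described above.

```python
import math
from statistics import NormalDist, mean, stdev


def chi2_sf_even(t, n):
    """Upper-tail probability of chi-square with 2n degrees of freedom.

    For an even number of degrees of freedom the tail has the closed form
    exp(-t/2) * sum_{k=0}^{n-1} (t/2)^k / k!.
    """
    half = t / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return math.exp(-half) * total


def gof_normal_statistic(sample):
    """Return (-2 * sum(log F(x_i)), upper-tail chi-square probability).

    F is the normal distribution function with mean and standard
    deviation equal to those of the sample (divisor n - 1: an assumption).
    """
    fitted = NormalDist(mean(sample), stdev(sample))
    stat = -2.0 * sum(math.log(fitted.cdf(x)) for x in sample)
    return stat, chi2_sf_even(stat, len(sample))


# Tiny made-up sample, used only to exercise the functions.
stat, tail = gof_normal_statistic([-1.0, 0.0, 1.0])
```

Because nothing is specified about the alternatives, the tail probability returned here carries the limitation already noted: little is known about the power of the test.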
Since the samples provided by the "Kpong data" are small (n = 21), the W-test will accordingly be employed in testing for their normality.

The W-test for Normality (complete samples).

Let $m' = (m_1, m_2, \ldots, m_n)$ denote the vector of expected values of standard normal order statistics, and let $V = (v_{ij})$ be the corresponding n x n covariance matrix. That is, if $x_1 \le x_2 \le \ldots \le x_n$ denotes an ordered random sample of size n from a normal distribution with mean zero and variance 1, then
(i) $E(x_i) = m_i$, $i = 1, 2, \ldots, n$;
(ii) $\mathrm{Cov}(x_i, x_j) = v_{ij}$, $i, j = 1, 2, \ldots, n$. ..... (2.2.1)

Let $y' = (y_1, \ldots, y_n)$ denote a vector of ordered random observations. The objective is to derive a test for the hypothesis that this is a sample from a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. If the $y_i$ are a normal sample, they may be expressed as
\[ y_i = \mu + \sigma x_i , \quad i = 1, 2, \ldots, n . \]
It follows from the generalized least-squares theorem that the best linear unbiased estimates of $\mu$ and $\sigma$ are the quantities that minimize the quadratic form
\[ Q = (y - \mu \mathbf{1} - \sigma m)' V^{-1} (y - \mu \mathbf{1} - \sigma m) , \tag{2.2.2} \]
where $\mathbf{1}' = (1, 1, \ldots, 1)$. The estimates are
\[ \hat{\mu} = \frac{m' V^{-1} (m \mathbf{1}' - \mathbf{1} m') V^{-1} y}{\mathbf{1}' V^{-1} \mathbf{1} \; m' V^{-1} m - (\mathbf{1}' V^{-1} m)^2} , \tag{2.2.3} \]
\[ \hat{\sigma} = \frac{\mathbf{1}' V^{-1} (\mathbf{1} m' - m \mathbf{1}') V^{-1} y}{\mathbf{1}' V^{-1} \mathbf{1} \; m' V^{-1} m - (\mathbf{1}' V^{-1} m)^2} . \tag{2.2.4} \]
For symmetric distributions, $\mathbf{1}' V^{-1} m = 0$, and hence equations (2.2.3) and (2.2.4) become
\[ \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y} ; \tag{2.2.5} \]
\[ \hat{\sigma} = \frac{m' V^{-1} y}{m' V^{-1} m} . \tag{2.2.6} \]
Let
\[ S^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{2.2.7} \]
denote the usual symmetric unbiased estimate of $(n-1)\sigma^2$. Shapiro and Wilk (36) defined the W-test statistic for normality as
\[ W = \frac{R^4 \hat{\sigma}^2}{C^2 S^2} = \frac{b^2}{S^2} = \frac{(a'y)^2}{S^2} = \frac{\left( \sum_{i=1}^{n} a_i y_i \right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} , \tag{2.2.8} \]
where
\[ R^2 = m' V^{-1} m , \quad C^2 = m' V^{-1} V^{-1} m , \quad a' = \frac{m' V^{-1}}{(m' V^{-1} V^{-1} m)^{1/2}} , \quad b = \frac{R^2 \hat{\sigma}}{C} . \tag{2.2.9} \]
In practice b is computed as follows:
(i) for n even, n = 2k,
\[ b = \sum_{i=1}^{k} a_{n-i+1} (y_{n-i+1} - y_i) , \]
where the values of $a_{n-i+1}$ are tabulated by Shapiro and Wilk (36).
(ii) for n odd, n = 2k + 1,
\[ b = a_n (y_n - y_1) + \ldots + a_{k+2} (y_{k+2} - y_k) . \]
In carrying out the test, W is calculated and its value compared with the tabulated value at the required confidence probability. Small values of calculated W indicate non-normality.

This test will be applied to the "Kpong data" in Section 2.3.

II. Additivity of Treatment and Environmental Effects.

The general mathematical model for a randomised block design with one observation per plot is given by
\[ X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ij} , \tag{2.2.11} \]
where
\[ \sum_i \alpha_i = \sum_j \beta_j = \sum_i \gamma_{ij} = \sum_j \gamma_{ij} = 0 . \]
The hypothesis we would like to test is
\[ H_{AB} : \gamma_{ij} = 0 , \quad i = 1, \ldots, I , \quad j = 1, \ldots, J . \]
As pointed out earlier on, the error variance in model (2.2.11) cannot be estimated. Thus, it is not possible to test for interactions by the F-test.

Tukey (40) managed to obtain a test of $H_{AB}$ by imposing some restrictions on the $\{\gamma_{ij}\}$. These restrictions consist in assuming the $\{\gamma_{ij}\}$ to be of the form
\[ \gamma_{ij} = \lambda \alpha_i \beta_j , \tag{2.2.12} \]
where $\lambda$ is a constant. He then asserted that if we could construct a test for the hypothesis $\lambda = 0$, it would serve as a test for non-additivity. And that is what he showed.

With the above definition of $\gamma_{ij}$, the expectation of $X_{ij}$ is given by
\[ E(X_{ij}) = \mu + \alpha_i + \beta_j + \lambda \alpha_i \beta_j . \tag{2.2.13} \]
These expectations are not linear in the parameters when $\lambda \ne 0$. The least squares theory is therefore not applicable. We note, however, that since
\[ E(X_{i.} - X_{..}) = \alpha_i , \quad E(X_{.j} - X_{..}) = \beta_j , \quad E(X_{..}) = \mu , \]
where
(i) $X_{..} = \frac{1}{IJ} \sum_i \sum_j X_{ij}$ is the overall mean,
(ii) $X_{i.} = \frac{1}{J} \sum_j X_{ij}$ are the row means,
(iii) $X_{.j} = \frac{1}{I} \sum_i X_{ij}$ are the column means,
the unbiased estimators of $\alpha_i$, $\beta_j$ and $\mu$ are given by
\[ \hat{\alpha}_i = X_{i.} - X_{..} ; \quad \hat{\beta}_j = X_{.j} - X_{..} ; \quad \hat{\mu} = X_{..} . \]
Furthermore,
\[ E(X_{ij} - X_{i.} - X_{.j} + X_{..}) = \lambda \alpha_i \beta_j . \]
Hence an estimate of $\lambda$ may be written, when $\alpha_i$ and $\beta_j$ are known, by using the least squares theory on equation (2.2.13), as
\[ \lambda^* = \frac{\sum_i \sum_j \alpha_i \beta_j (X_{ij} - X_{i.} - X_{.j} + X_{..})}{\sum_i \sum_j \alpha_i^2 \beta_j^2} . \]
Substituting the estimates $\hat{\alpha}_i$ and $\hat{\beta}_j$, an estimate of $\lambda$ in terms of known quantities is obtained as
\[ \hat{\lambda} = \frac{\sum_i \sum_j (X_{i.} - X_{..})(X_{.j} - X_{..})(X_{ij} - X_{i.} - X_{.j} + X_{..})}{\sum_i (X_{i.} - X_{..})^2 \sum_j (X_{.j} - X_{..})^2} . \]
This simplifies to
\[ \hat{\lambda} = \frac{IJ \sum_i \sum_j (X_{i.} - X_{..})(X_{.j} - X_{..}) X_{ij}}{(SS_V)(SS_B)} , \tag{2.2.14} \]
where
(i) $SS_V = J \sum_i (X_{i.} - X_{..})^2$,
(ii) $SS_B = I \sum_j (X_{.j} - X_{..})^2$.

The variance of $\hat{\lambda}$, for given $\hat{\alpha}_i$ and $\hat{\beta}_j$, is given by
\[ \mathrm{Var}(\hat{\lambda}) = \frac{I^2 J^2 \sigma_e^2 \sum_i \sum_j (X_{i.} - X_{..})^2 (X_{.j} - X_{..})^2}{(SS_V)^2 (SS_B)^2} = \frac{IJ \sigma_e^2}{(SS_V)(SS_B)} . \tag{2.2.15} \]
If $\lambda = 0$,
\[ E(\hat{\lambda} \mid \hat{\alpha}_i, \hat{\beta}_j \text{ for all } i, j) = 0 . \]
Thus under $H_{AB}$,
\[ \hat{\lambda} \sim N(0, \mathrm{Var}(\hat{\lambda})) . \]
Therefore
\[ \frac{\hat{\lambda}}{\sqrt{\mathrm{Var}(\hat{\lambda})}} \sim N(0, 1) . \]
Hence
\[ G = \frac{\hat{\lambda}^2}{\mathrm{Var}(\hat{\lambda})} = \frac{\hat{\lambda}^2 (SS_V)(SS_B)}{IJ \sigma_e^2} \tag{2.2.16} \]
is distributed as chi-square (43) with one degree of freedom ($\chi^2_1$).

The "interaction sum of squares" is given by
\[ \sum_i \sum_j (\hat{\gamma}_{ij})^2 = \sum_i \sum_j (\lambda^* \alpha_i \beta_j)^2 = (\lambda^*)^2 \sum_i \alpha_i^2 \sum_j \beta_j^2 = \frac{\left[ \sum_i \sum_j \alpha_i \beta_j (X_{ij} - X_{i.} - X_{.j} + X_{..}) \right]^2}{\sum_i \alpha_i^2 \sum_j \beta_j^2} . \]
Replacing $\alpha_i$ and $\beta_j$ by their estimates $\hat{\alpha}_i$ and $\hat{\beta}_j$ gives the sum of squares due to non-additivity,
\[ SS_N = \frac{\hat{\lambda}^2 (SS_V)(SS_B)}{IJ} , \]
so that, by (2.2.16), $SS_N / \sigma_e^2 = G$ is distributed as $\chi^2_1$ under $H_{AB}$, while $(SS_E - SS_N)/\sigma_e^2$ is distributed as $\chi^2_{(I-1)(J-1)-1}$. Hence
\[ F_0 = \frac{SS_N \left[ (I-1)(J-1) - 1 \right]}{SS_E - SS_N} \tag{2.2.18} \]
has an exact F-distribution (43) with 1 and $(I-1)(J-1) - 1$ degrees of freedom, $F_{1,\,(I-1)(J-1)-1}$, if $\lambda = 0$.

Now $SS_N$ is the sum of squares due to non-additivity and is part of the error sum of squares $SS_E$. Thus
\[ SS'_E = SS_E - SS_N \]
is the new error sum of squares devoid of non-additivity.

Tukey's test for non-additivity is therefore provided by the test statistic $F_0$. The power of this test is unknown. It is, however, expected to be good against alternatives of the type (2.2.12). The analysis of variance table for carrying out the test is given in Table 2.2.1.

Table 2.2.1: Analysis of Variance Table for Testing Non-additivity in a Randomised Block Design.

Source            Degrees of Freedom    Sum of Squares    Mean Square                 F
Non-additivity    1                     SS_N              SS_N                        SS_N [(I-1)(J-1)-1] / SS'_E
Residual          (I-1)(J-1)-1          SS'_E             SS'_E / [(I-1)(J-1)-1]

The test will be applied to the "Kpong data" in Section 2.3 (II).

III. Homogeneity of Error Variances.

The methods for testing equality of means, calculating the true probabilities of Type I error, and carrying out multiple comparisons of means are derived from assumptions that include the equality of error variances. The effect on these methods when the error variances are not equal has been investigated by Horsnell (21), Eisenhart (13), and others (34, 19). Their studies indicate that the effect is serious when there are unequal numbers of observations on each plot and the I treatments have unequal sizes. The effect is, however, not very serious for equal treatment and plot sizes.

Scheffé (34) does not recommend a preliminary test of the assumption of equality of error variances because he is convinced that the effect of the violation of the assumption is not very serious. On the other hand, some authors (6, 13, 21) feel that even when there are equal numbers of observations on each plot and the treatments have equal sizes, it is a worthwhile practice to test for the equality of the error variances.

When the error variances are unequal, one may use a suitable transformation to transform the data. Sometimes, when the assumptions of normality of errors, additivity of treatment and environmental effects, and homogeneity of error variances are violated simultaneously, one transformation may remedy all the defects. In some cases, however, one defect may be remedied at the expense of another. In such a situation, one has to consider, in the light of the experiment, the defect which is more serious.

Several procedures for testing equality of error variances have been proposed. Among them are those developed by Neyman and Pearson (29), Box and Anderson (4), and Bartlett (38).
The original test of this nature was that obtained by Neyman and Pearson (29), who suggested the use of a criterion L which was the ratio of a weighted geometric mean to a weighted arithmetic mean of the mean squares from which the variances were estimated. On the assumption that variation followed the normal law, these authors
(i) gave the sampling moments of L if the hypothesis of equal sampling variances was true;
(ii) showed that in the case of large samples, $-N \log_e L$ was distributed as $\chi^2$ with $k - 1$ degrees of freedom (where N was the total number of observations and k the number of separate estimates of variance);
(iii) suggested a method of calculating approximate probability levels for L in the case of small samples.

A drawback of this test was its non-applicability to small samples without resorting to approximations.

Approaching from another angle, Bartlett (38) suggested an analogous test in which the sums of squares were weighted with their appropriate degrees of freedom instead of with the number of observations as in the Neyman-Pearson criterion. Furthermore, unlike the Neyman-Pearson criterion, where approximate probability levels of L need to be calculated in the case of small samples, Bartlett's test just requires a corrective factor (38) to make it applicable to small samples. Box (3) has shown that Bartlett's test is extremely sensitive to non-normality, and that it is essential to have the original data normally distributed before applying the test.

Making use of the permutation theory, Box and Anderson (4) developed another test for equality of error variances. Essentially, it is a modification of Bartlett's test under the consideration of permutation theory. Unlike Bartlett's test, however, it is insensitive to non-normality, and has less power when the data are normal.

A preliminary test on the "Kpong data" (Section 2.3 (I)) showed them to be normally distributed.
Also, the samples furnished by the data have small sizes. From the above discussion, therefore, Bartlett's test is the most appropriate (among those discussed in this section) for testing the equality of error variances for the "Kpong data".

Bartlett's Test for Homogeneity of Error Variances.

Let $S_i^2$ be the usual unbiased estimate of $\sigma_i^2$ based on a sum of squares having $f_i$ degrees of freedom, and let there be k of these estimates from independent sets of observations. Bartlett (38) took as his test statistic

$$M = N \log_e\!\left(\frac{\sum_{i=1}^{k} f_i S_i^2}{N}\right) - \sum_{i=1}^{k} f_i \log_e S_i^2, \tag{2.2.19}$$

where $N = \sum_{i=1}^{k} f_i$. Provided that none of the degrees of freedom $f_i$ is too small, he found that M is distributed approximately as $\chi^2$ with $k - 1$ degrees of freedom if the null hypothesis is true, that is, if the $\sigma_i^2$ $(i = 1, 2, \ldots, k)$ have a common value. For small samples, Bartlett introduced the corrective factor

$$C = 1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k} \frac{1}{f_i} - \frac{1}{N}\right) \tag{2.2.20}$$

and showed that the quantity M/C followed approximately the same $\chi^2$ distribution law.

If all the samples have equal sizes n, then $f_1 = f_2 = \cdots = f_k = n - 1$, and $N = k(n - 1)$. Equations (2.2.19) and (2.2.20) therefore become

$$M = (n-1)\left[k \log_e\!\left(\frac{\sum_{i=1}^{k} S_i^2}{k}\right) - \sum_{i=1}^{k} \log_e S_i^2\right], \tag{2.2.21}$$

$$C = 1 + \frac{k+1}{3k(n-1)}. \tag{2.2.22}$$

2.3 The "Kpong Data"

The appropriate tests for verifying the assumptions underlying the analysis of variance have been discussed in Section 2.2. We shall now investigate the assumptions for the "Kpong data".

I. Testing for Normality.

The "Kpong data" were obtained from a randomised block design with three blocks, each block consisting of 21 plots. The 21 observations from each block form a sample. Thus, there are three samples each of size 21. If there are block effects, the observations from each block may be assumed to have come from a different normal population.
In testing for normality, therefore, each sample will be considered separately. It was shown in Section 2.2 (I) that for small samples (n < 50), the W-test for normality, also discussed in that section, is the appropriate one to use. The test statistic was given as

$$W = \frac{b^2}{R^2}, \tag{2.3.1}$$

where

$$R^2 = \sum_{k=1}^{n} (y_k - \bar{y})^2 = \sum_{k=1}^{n} y_k^2 - \frac{\left(\sum_{k=1}^{n} y_k\right)^2}{n}, \tag{2.3.2}$$

and, for $n = 2k + 1$,

$$b = a_n(y_n - y_1) + \cdots + a_{k+2}(y_{k+2} - y_k). \tag{2.3.3}$$

Since the $y_k$'s are ordered random observations, the observations in each block will be arranged in ascending order before computing W. The tables necessary for computing W are presented below.

Table 2.3.1: The Observations y_k and Their Squares, y_k^2, for the 3 Blocks.

             BLOCK 1               BLOCK 2               BLOCK 3
          y_k      y_k^2        y_k      y_k^2        y_k      y_k^2
         5.56     30.914       2.83      8.009       3.62     13.104
         6.37     40.577       5.07     25.705       5.66     32.036
        13.44    180.634       6.42     41.216       6.87     47.197
        13.99    195.720      10.96    120.122       8.76     76.738
        15.23    231.953      12.85    165.123      11.77    138.533
        16.39    268.632      14.52    210.830      12.87    165.637
        19.49    379.860      14.58    212.576      16.15    260.823
        21.30    453.690      14.68    215.502      18.02    324.720
        24.58    604.106      18.44    340.034      18.40    338.560
        24.79    614.544      19.05    362.903      18.41    338.928
        26.39    696.432      22.06    486.644      18.68    348.942
        27.23    741.473      22.39    501.312      21.82    476.112
        27.58    760.656      24.67    608.609      28.93    836.945
        33.21   1102.904      30.62    937.584      29.81    888.636
        34.52   1191.630      34.61   1197.852      34.70   1204.090
        39.47   1557.881      39.84   1587.226      53.67   2880.469
        45.58   2077.536      47.01   2209.940      55.80   3113.640
        47.11   2219.352      52.59   2765.708      60.34   3640.916
        48.71   2372.664      58.20   3387.240      64.20   4121.640
        50.52   2552.270      62.63   3922.517      66.17   4378.469
        53.68   2881.542      70.32   4944.902      71.61   5127.992
Total  595.14  21155.040     584.34  24251.554     626.26  28754.127

Table 2.3.2.
The Coefficients a_{n-k+1} and the Values (y_{n-k+1} - y_k) Using the Observations from Block 1.

 k    a_{n-k+1}    (y_{n-k+1} - y_k)    a_{n-k+1}(y_{n-k+1} - y_k)
 1     0.4643           48.12                  22.342
 2     0.3185           44.15                  14.062
 3     0.2578           35.27                   9.093
 4     0.2119           33.12                   7.018
 5     0.1736           30.35                   5.269
 6     0.1399           23.08                   3.229
 7     0.1092           15.03                   1.641
 8     0.0804           11.91                   0.958
 9     0.0530            3.00                   0.159
10     0.0263            2.44                   0.064
                                           b = 63.835

Table 2.3.3: The Coefficients a_{n-k+1} and the Values (y_{n-k+1} - y_k) Using the Observations from Block 2.

 k    a_{n-k+1}    (y_{n-k+1} - y_k)    a_{n-k+1}(y_{n-k+1} - y_k)
 1     0.4643           67.49                  31.336
 2     0.3185           57.56                  18.333
 3     0.2578           51.78                  13.349
 4     0.2119           41.63                   8.821
 5     0.1736           34.16                   5.930
 6     0.1399           26.32                   3.682
 7     0.1092           20.03                   2.187
 8     0.0804           15.94                   1.282
 9     0.0530            6.23                   0.330
10     0.0263            3.34                   0.088
                                           b = 85.338

Table 2.3.4: The Coefficients a_{n-k+1} and the Values (y_{n-k+1} - y_k) Using the Observations from Block 3.

 k    a_{n-k+1}    (y_{n-k+1} - y_k)    a_{n-k+1}(y_{n-k+1} - y_k)
 1     0.4643           67.99                  31.568
 2     0.3185           60.51                  19.272
 3     0.2578           57.33                  14.780
 4     0.2119           51.58                  10.930
 5     0.1736           44.03                   7.644
 6     0.1399           40.80                   5.708
 7     0.1092           38.55                   4.210
 8     0.0804           11.79                   0.948
 9     0.0530           10.53                   0.558
10     0.0263            3.41                   0.090
                                           b = 95.708

Let us consider the observations from Block 1. From Table 2.3.1, $\sum y_k = 595.14$ and $\sum y_k^2 = 21155.04$. Hence

$$R^2 = \sum y_k^2 - \frac{(\sum y_k)^2}{n} = 21155.04 - \frac{(595.14)^2}{21} = 4288.773.$$

From Table 2.3.2, $b = 63.835$. Therefore

$$W = \frac{b^2}{R^2} = \frac{(63.835)^2}{4288.773} = 0.950.$$

For n = 21, the tabulated 5% point of W is 0.908. Since the composite null hypothesis of normality is being tested, only small values of calculated W indicate non-normality. Thus, according to the above test, the observations from Block 1 may be taken to come from a normal population, at the 5% level.

Let us consider the observations from Block 2. Tables 2.3.1 and 2.3.3 provide the following information: $\sum y_k = 584.34$; $\sum y_k^2 = 24251.554$; $b = 85.338$. Thus $R^2 = \sum y_k^2 - (\sum y_k)^2/n$
$= 24251.554 - \dfrac{(584.34)^2}{21} = 7991.877$. Therefore

$$W = \frac{b^2}{R^2} = \frac{(85.338)^2}{7991.877} = 0.911.$$

For n = 21, the tabulated 5% point of W is 0.908. Thus there is no evidence of non-normality at the 5% level.

Let us finally consider the observations from Block 3. Tables 2.3.1 and 2.3.4 provide the following information: $\sum y_k = 626.26$; $\sum y_k^2 = 28754.127$; $b = 95.708$. Thus

$$R^2 = \sum y_k^2 - \frac{(\sum y_k)^2}{n} = 28754.127 - \frac{(626.26)^2}{21} = 10077.862.$$

Therefore

$$W = \frac{b^2}{R^2} = \frac{(95.708)^2}{10077.862} = 0.909.$$

For n = 21, the tabulated 5% point of W is 0.908. Thus, there is no evidence of non-normality at the 5% level, although the margin for Block 3 is very small.

The above results indicate that the observations, and hence the errors, are normally distributed.

II. Testing for Additivity of Treatment and Environmental Effects.

It was shown in Section 2.2 (II) that the presence of non-additivity in a randomised block experiment could be tested for by using Tukey's test, provided the normality assumption was satisfied. The test carried out in Section 2.3 (I) verified the normality assumption. Accordingly, for the "Kpong data", Tukey's test will be employed to test for additivity of treatment and environmental effects.

The test was described in Section 2.2 (II) and the analysis of variance table for its performance was given in Table 2.2.1. The sum of squares for non-additivity was given as

$$SS_N = \frac{\left[\sum_i\sum_j (X_{i.} - X_{..})(X_{.j} - X_{..}) X_{ij}\right]^2}{\left(\sum_i d_i^2\right)\left(\sum_j d_j^2\right)}, \tag{2.3.4}$$

where $d_i = X_{i.} - X_{..}$ and $d_j = X_{.j} - X_{..}$. The residual for testing non-additivity was given as $SS'_E = SS_E - SS_N$, where

$$SS_E = \sum_i\sum_j (X_{ij} - X_{i.} - X_{.j} + X_{..})^2 = \sum_i\sum_j X_{ij}^2 - \frac{1}{J}\sum_i T_{i.}^2 - \frac{1}{I}\sum_j T_{.j}^2 + \frac{T^2}{IJ}. \tag{2.3.5}$$

The tables necessary for carrying out the test are presented below.

Table 2.3.5: Tukey's Test for Non-Additivity. Yields in Grams per Plot for 21 Varieties of Cowpea.
Varieties          BLOCKS              Row Totals   Row Means
              1        2        3        T_i.         X_i.
V1          26.39    22.39    21.82      70.60       23.533
V2          50.52    22.06    16.15      88.73       29.577
V3           5.56     5.07     5.66      16.29        5.430
V4          45.58    58.20    64.20     167.98       55.993
V5          39.47    34.61    34.70     108.78       36.260
V6          27.58    47.01    60.34     134.93       44.977
V7          47.11    24.67    55.80     127.58       42.527
V8          34.52    39.84    29.81     104.17       34.723
V9          27.23    30.62    18.68      76.53       25.510
V10         48.71    62.63    71.61     182.95       60.983
V11         53.68    52.59    53.67     159.94       53.313
V12         33.21    70.32    66.17     169.70       56.567
V13         24.58    12.85    18.41      55.84       18.613
V14         19.49    14.52    18.40      52.41       17.470
V15          6.37     2.83     3.62      12.82        4.273
V16         24.79    18.44    18.02      61.25       20.417
V17         13.44    19.05     6.87      39.36       13.120
V18         13.99     6.42     8.76      29.17        9.723
V19         15.23    10.96    11.77      37.96       12.653
V20         16.39    14.58    12.87      43.84       14.613
V21         21.30    14.68    28.93      64.91       21.637

Table 2.3.5 (continued)

                               BLOCKS
                          1             2             3
Column Totals T.j       595.14        584.34        626.26
Column Means X.j        28.340        27.827        29.822
d_j = X.j - X..         -0.323        -0.836         1.159
d_j^2                    0.104         0.699         1.343     Sum d_j^2 = 2.146
T.j^2               354191.620    341453.236    392201.588     Sum T.j^2 = 1087846.444

Sum_i Sum_j X_ij^2 = 74160.721;  X.. = 1805.74 / 63 = 28.663
Table 2.3.5 (continued)

Variety   d_i = X_i. - X..     T_i.^2     P_i = Sum_j X_ij d_j     d_i P_i      d_i^2
V1            -5.129          4984.360         -1.953               10.017      26.307
V2             0.914          7873.013        -16.042              -14.662       0.835
V3           -23.233           265.364          0.525              -12.197     539.772
V4            27.331         28217.280         11.031              301.488     746.984
V5             7.597         11833.088         -1.466              -11.137      57.714
V6            16.314         18206.105         21.726              354.438     266.147
V7            13.864         16276.656         28.831              399.713     192.210
V8             6.061         10851.389         -9.906              -60.040      36.736
V9            -3.153          5856.841        -12.743               40.179       9.941
V10           32.321         33470.703         14.904              481.712    1044.647
V11           24.651         25580.804          0.900               22.186     607.672
V12           27.907         28798.090          7.176              200.239     778.633
V13          -10.049          3118.106          2.655              -26.680     100.982
V14          -11.193          2746.808          2.892              -32.370     125.283
V15          -24.389           164.352         -0.228                5.561     594.823
V16           -8.245          3751.563         -2.538               20.926      67.980
V17          -15.543          1549.210        -12.305              191.257     241.585
V18          -18.939           850.889          0.267               -5.057     358.686
V19          -16.009          1440.962         -0.441                7.060     256.288
V20          -14.049          1921.946         -2.567               36.064     197.374
V21           -7.026          4213.308         14.378             -101.020      49.365

Sum d_i = 0;  Sum T_i.^2 = 211970.837;  Sum P_i = 45.096;  P = Sum d_i P_i = 1807.677;  Sum d_i^2 = 6299.964

From equation (2.3.4),

$$SS_N = \frac{P^2}{\left(\sum_i d_i^2\right)\left(\sum_j d_j^2\right)} = \frac{(1807.677)^2}{(6299.964)(2.146)} = 241.698.$$

From equation (2.3.5),

$$SS_E = \sum_i\sum_j X_{ij}^2 - \frac{1}{J}\sum_i T_{i.}^2 - \frac{1}{I}\sum_j T_{.j}^2 + \frac{T^2}{IJ}.$$

Hence

$$SS_E = 74160.721 - \frac{211970.837}{3} - \frac{1087846.444}{21} + \frac{(1805.74)^2}{21 \times 3} = 3458.657,$$

and

$$SS'_E = SS_E - SS_N = 3458.657 - 241.698 = 3216.959.$$

The analysis of variance table is given below.

Table 2.3.6: Analysis of Variance for Testing Non-additivity in a Randomised Block Design.

Source           Degrees of Freedom   Sum of Squares   Mean Square
Non-additivity           1               241.698         241.698
Residual                39              3216.959          82.486

From Table 2.3.6,

$$F_0 = \frac{241.698}{82.486} = 2.93.$$

The tabulated value of $F_{1,39}$ at the 5% level of significance is 4.09 (by linear interpolation). Thus, non-additivity is not significant and hence the treatment and environmental effects may be assumed to be additive.
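The hand computation above can be cross-checked numerically. The following is a minimal sketch in plain Python, not part of the original analysis; the function name `tukey_nonadditivity` and the list `KPONG` are illustrative names introduced here, and the yields are transcribed from Table 2.3.5. Because the sketch keeps the deviations $d_i$ and $d_j$ at full precision instead of rounding them to three decimals, its $SS_N$ and $F_0$ should come out close to, but not identical with, the values in Table 2.3.6.

```python
# Tukey's one-degree-of-freedom test for non-additivity (illustrative sketch).
# Rows are the 21 cowpea varieties, columns the 3 blocks, as in Table 2.3.5.
KPONG = [
    [26.39, 22.39, 21.82], [50.52, 22.06, 16.15], [5.56, 5.07, 5.66],
    [45.58, 58.20, 64.20], [39.47, 34.61, 34.70], [27.58, 47.01, 60.34],
    [47.11, 24.67, 55.80], [34.52, 39.84, 29.81], [27.23, 30.62, 18.68],
    [48.71, 62.63, 71.61], [53.68, 52.59, 53.67], [33.21, 70.32, 66.17],
    [24.58, 12.85, 18.41], [19.49, 14.52, 18.40], [6.37, 2.83, 3.62],
    [24.79, 18.44, 18.02], [13.44, 19.05, 6.87], [13.99, 6.42, 8.76],
    [15.23, 10.96, 11.77], [16.39, 14.58, 12.87], [21.30, 14.68, 28.93],
]


def tukey_nonadditivity(x):
    """Return (SS_N, SS_E, residual df, F_0) for an I x J table x[i][j]."""
    n_rows, n_cols = len(x), len(x[0])
    grand = sum(sum(row) for row in x) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in x]
    col_means = [sum(row[j] for row in x) / n_rows for j in range(n_cols)]
    d_row = [m - grand for m in row_means]  # d_i = X_i. - X..
    d_col = [m - grand for m in col_means]  # d_j = X.j - X..

    # P = sum_i d_i P_i, where P_i = sum_j X_ij d_j
    p = sum(
        d_row[i] * sum(x[i][j] * d_col[j] for j in range(n_cols))
        for i in range(n_rows)
    )
    ss_n = p ** 2 / (sum(d * d for d in d_row) * sum(d * d for d in d_col))

    # SS_E: squared residuals from the additive (row + column) fit
    ss_e = sum(
        (x[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )

    df_res = (n_rows - 1) * (n_cols - 1) - 1
    f0 = ss_n / ((ss_e - ss_n) / df_res)
    return ss_n, ss_e, df_res, f0


ss_n, ss_e, df_res, f0 = tukey_nonadditivity(KPONG)
print(f"SS_N = {ss_n:.3f}, SS_E = {ss_e:.3f}, df = {df_res}, F_0 = {f0:.3f}")
```

The computed $F_0$ is then compared with the tabulated $F_{1,39}$ exactly as in the text.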
If the value of F obtained from the ANOVA table were large, non-additivity would be indicated and a decision would have to be made about procedure. According to Snedecor (37), help in making the decision comes from a graph of $P_i$ as ordinate against the row mean $X_{i.}$, where the $P_i = \sum_j X_{ij} d_j$ are given in Table 2.3.5. Such a graph has been drawn for the data and is shown in Figure 2.3.1.

[Figure 2.3.1: Product $P_i = \sum_j X_{ij} d_j$ as a function of row mean $X_{i.}$. The mean of the $P_i$ is shown with confidence limits. Horizontal axis: row mean $X_{i.}$; vertical axis: product $P_i$; horizontal lines mark the product mean and the upper and lower confidence limits.]

The mean of the $P_i$, $\bar{P}$, is plotted as a horizontal line. Similar lines are drawn to show the 95% confidence interval, which is given by Snedecor as

$$\bar{P} \pm 2\left[(\text{sum of squares of deviations of column means}) \times (\text{Mean square for Residual})\right]^{1/2},$$

i.e.,

$$\bar{P} \pm 2\left[\left(\sum_j d_j^2\right) \times (\text{Mean square for Residual})\right]^{1/2},$$

where the $d_j$'s are given in Table 2.3.5. Now

$$\bar{P} = \frac{\sum_i P_i}{21} = \frac{45.096}{21} = 2.147.$$

Hence the 95% confidence limits are

$$2.147 \pm 2\left[(2.146)(82.486)\right]^{1/2}, \quad \text{i.e.} \quad 2.147 \pm 26.610.$$

These lines are shown in Figure 2.3.1. Figure 2.3.1 shows that all the points lie within the confidence limits, and this corresponds with the non-significance of Tukey's test for non-additivity. If non-additivity were significant, one or more of the plotted points would be expected to be outside the confidence lines.

III. Testing for Homogeneity of Error Variances.

In Section 2.2 (III), Bartlett's test was shown to be appropriate for testing homogeneity of error variances. A theoretical description of the test was also given in that section. The test statistic for small samples was given as M/C, where

(i) $$M = (n-1)\left[k \log_e\!\left(\frac{\sum_i S_i^2}{k}\right) - \sum_i \log_e S_i^2\right] = 2.3026\,(n-1)\left[k \log_{10}\!\left(\frac{\sum_i S_i^2}{k}\right) - \sum_i \log_{10} S_i^2\right]; \tag{2.3.6}$$

(ii) C is the corrective factor and is given by

$$C = 1 + \frac{k+1}{3k(n-1)}. \tag{2.3.7}$$

For the "Kpong data", the observations from each block are regarded as a sample. Thus, there are three samples each of size 21, and it is the variances of these samples that are to be tested for homogeneity. Using Table 2.3.1, the three variances are obtained as

$$S_1^2 = 214.439, \qquad S_2^2 = 399.594, \qquad S_3^2 = 503.893.$$

Thus

$$\log_{10} S_1^2 = 2.3312, \qquad \log_{10} S_2^2 = 2.6017, \qquad \log_{10} S_3^2 = 2.7024.$$

Hence

$$\sum_{i=1}^{3} S_i^2 = 1117.926 \qquad \text{and} \qquad \sum_{i=1}^{3} \log_{10} S_i^2 = 7.6353.$$

From equation (2.3.6),

$$M = (2.3026)(20)\left[3 \log_{10}\!\left(\frac{1117.926}{3}\right) - 7.6353\right] = 46.052\,(7.7136 - 7.6353) = 3.606.$$

From equation (2.3.7),

$$C = 1 + \frac{4}{9 \times 20} = 1.022.$$

Therefore

$$\frac{M}{C} = \frac{3.606}{1.022} = 3.53.$$

The tabulated value of $\chi^2$ (2 d.f., 5% level) is 5.99. Thus, there is no evidence of heterogeneity of error variances at the 95% confidence level.

The results of the preceding tests indicate that the three most important assumptions underlying the analysis of variance have been satisfied. The analysis of variance methods will therefore be used to analyse the data.

CHAPTER THREE

UNIVARIATE ANALYSIS OF VARIANCE

3.1 Randomised Block Experiment with n Plants Per Plot Using the Average Yields Per Plot.

Regarding the average yield of a plot as a single observation, we obtain a randomised block experiment with one observation per plot. Such an experiment has been described in Section 1.1. The appropriate mathematical model for this situation is given by

$$X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ij}. \tag{3.1.1}$$

The application of the analysis of variance technique to the above model has been discussed by Federer (14), Scheffé (34), and others (23,27).

For model (3.1.1), the total sum of squares is given by $\sum_i\sum_j (X_{ij} - X_{..})^2$.
Applying the analysis of variance technique, this total sum of squares can be broken up into three different sums of squares, each attributable to a different source. The resulting identity is

$$\sum_i\sum_j (X_{ij} - X_{..})^2 = J\sum_i (X_{i.} - X_{..})^2 + I\sum_j (X_{.j} - X_{..})^2 + \sum_i\sum_j (X_{ij} - X_{i.} - X_{.j} + X_{..})^2,$$

where
(i) $\sum_i\sum_j (X_{ij} - X_{..})^2 = SS_T$ = Total sum of squares;
(ii) $J\sum_i (X_{i.} - X_{..})^2 = SS_V$ = Treatment sum of squares;
(iii) $I\sum_j (X_{.j} - X_{..})^2 = SS_B$ = Block sum of squares;
(iv) $\sum_i\sum_j (X_{ij} - X_{i.} - X_{.j} + X_{..})^2 = SS_E$ = "Error" sum of squares.

If we let

$$\sum_j X_{ij} = T_{i.}, \qquad \sum_i X_{ij} = T_{.j}, \qquad \sum_i\sum_j X_{ij} = T,$$

we obtain the following convenient computational formulae:
(i) $J\sum_i (X_{i.} - X_{..})^2 = \dfrac{\sum_i T_{i.}^2}{J} - \dfrac{T^2}{IJ} = SS_V$;
(ii) $I\sum_j (X_{.j} - X_{..})^2 = \dfrac{\sum_j T_{.j}^2}{I} - \dfrac{T^2}{IJ} = SS_B$;
(iii) $\sum_i\sum_j (X_{ij} - X_{..})^2 = \sum_i\sum_j X_{ij}^2 - \dfrac{T^2}{IJ} = SS_T$.

The analysis-of-variance (ANOVA) table appropriate to the above situation is given in Table 3.1.1.

Table 3.1.1: ANOVA for a Randomised Block Experiment with One Observation Per Plot. Each Observation is an Average Yield.

Source of    Degrees of      Sum of Squares (SS)                                SS for Computation               Mean Square (MS)
Variation    Freedom (df)
Treatments   I-1             J Sum_i (X_i. - X..)^2 = SS_V                      Sum_i T_i.^2 / J - T^2 / IJ      SS_V / (I-1) = MS_V
Blocks       J-1             I Sum_j (X.j - X..)^2 = SS_B                       Sum_j T.j^2 / I - T^2 / IJ       SS_B / (J-1) = MS_B
"Error"      (I-1)(J-1)      Sum_i Sum_j (X_ij - X_i. - X.j + X..)^2 = SS_E     By subtraction                   SS_E / [(I-1)(J-1)] = MS_E
Total        IJ-1            Sum_i Sum_j (X_ij - X..)^2 = SS_T                  Sum_i Sum_j X_ij^2 - T^2 / IJ

With the fixed-effects model, we assume that the treatment and environmental effects are additive. That is, the $\gamma_{ij}$ in equation (3.1.1) are assumed to be zero.
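The computational formulae of Table 3.1.1 translate directly into a few lines of code. The sketch below is an illustration only: the function name `rbd_anova` and the small 3 x 2 table `toy` are hypothetical and are not taken from the "Kpong data". It builds the sums of squares from the totals $T_{i.}$, $T_{.j}$ and $T$, and obtains $SS_E$ by subtraction as in the table.

```python
def rbd_anova(x):
    """Sums of squares and mean squares for an I x J randomised block layout.

    x[i][j] is the single observation for treatment i in block j.
    """
    n_trt, n_blk = len(x), len(x[0])
    row_tot = [sum(row) for row in x]                           # T_i.
    col_tot = [sum(row[j] for row in x) for j in range(n_blk)]  # T.j
    total = sum(row_tot)                                        # T
    cf = total ** 2 / (n_trt * n_blk)                           # T^2 / IJ

    ss_v = sum(t * t for t in row_tot) / n_blk - cf             # treatments
    ss_b = sum(t * t for t in col_tot) / n_trt - cf             # blocks
    ss_t = sum(v * v for row in x for v in row) - cf            # total
    ss_e = ss_t - ss_v - ss_b                                   # by subtraction

    return {
        "SS_V": ss_v, "SS_B": ss_b, "SS_E": ss_e, "SS_T": ss_t,
        "MS_V": ss_v / (n_trt - 1),
        "MS_B": ss_b / (n_blk - 1),
        "MS_E": ss_e / ((n_trt - 1) * (n_blk - 1)),
    }


# Hypothetical 3-treatment, 2-block table used only to exercise the formulae.
toy = [[1, 2], [3, 5], [4, 9]]
for name, value in rbd_anova(toy).items():
    print(f"{name} = {value:.4f}")
```

For this toy table the treatment, block and total sums of squares work out by hand to 25, 32/3 and 40, which gives a quick check that the decomposition $SS_T = SS_V + SS_B + SS_E$ holds.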
The expected values of the mean squares in Table 3.1.1 under both the fixed-effects and random-effects models have been calculated by Scheffé (34), and they are given in Table 3.1.2.

Table 3.1.2: Expected Values of the Mean Squares in Table 3.1.1.

            Fixed-Effects ($\gamma_{ij} = 0$)                              Random-Effects ($\gamma_{ij} \neq 0$)
E(MS_V)     $\sigma_e^2/n + \frac{J}{I-1}\sum_i \alpha_i^2$                $\sigma_e^2/n + \sigma_\gamma^2 + J\sigma_\alpha^2$
E(MS_B)     $\sigma_e^2/n + \frac{I}{J-1}\sum_j \beta_j^2$                 $\sigma_e^2/n + \sigma_\gamma^2 + I\sigma_\beta^2$
E(MS_E)     $\sigma_e^2/n$                                                 $\sigma_e^2/n + \sigma_\gamma^2$

In this experiment, if the fixed-effects model is assumed, the hypotheses that we usually wish to test are

$$H_A: \alpha_i = 0 \text{ for } i = 1, 2, \ldots, I; \qquad H_B: \beta_j = 0 \text{ for } j = 1, 2, \ldots, J.$$

If the random-effects model is assumed, the hypotheses we would like to test are

$$H_A: \sigma_\alpha^2 = 0; \qquad H_B: \sigma_\beta^2 = 0.$$

Rao (32) has shown that each of the sums of squares (given in Table 3.1.1) divided by the expected value of the corresponding mean square is distributed as central or non-central $\chi^2$ with the degrees of freedom of the quadratic form in question. In particular, under the fixed-effects model,
(i) $\dfrac{SS_E}{\sigma_e^2/n} \sim \chi^2_{(I-1)(J-1)}$;
(ii) when $H_A$ is true, $\dfrac{SS_V}{\sigma_e^2/n} \sim \chi^2_{I-1}$;
(iii) $SS_E$ and $SS_V$ are independent.

Hence the statistic

$$F_{A1} = \frac{SS_V/(I-1)}{SS_E/[(I-1)(J-1)]} = \frac{MS_V}{MS_E} \tag{3.1.2}$$

has an F-distribution with $(I-1)$ and $(I-1)(J-1)$ degrees of freedom when $H_A$ is true. It may therefore be used to test the hypothesis $H_A$. Similarly, the statistic

$$F_{B1} = \frac{MS_B}{MS_E} \tag{3.1.3}$$

has an F-distribution with $(J-1)$ and $(I-1)(J-1)$ degrees of freedom when $H_B$ is true, and may be used to test the hypothesis $H_B$.

Under the random-effects model,
(i) $\dfrac{SS_E}{\sigma_e^2/n + \sigma_\gamma^2} \sim \chi^2_{(I-1)(J-1)}$;
(ii) when $H_A$ is true, $\dfrac{SS_V}{\sigma_e^2/n + \sigma_\gamma^2} \sim \chi^2_{I-1}$;
(iii) $SS_E$ and $SS_V$ are independent.

Hence the statistic

$$F_{A2} = \frac{SS_V/(I-1)}{SS_E/[(I-1)(J-1)]} = \frac{MS_V}{MS_E} \tag{3.1.4}$$

has an F-distribution with $(I-1)$ and $(I-1)(J-1)$ degrees of freedom when $H_A$ is true. It may therefore be used to test the hypothesis $H_A$.
A similar test can be constructed to test the hypothesis $H_B$.

3.2 Randomised Block Experiment with n Plants Per Plot.

In a randomised block experiment with n plants per plot, if the characteristic is observed on each of the n plants on a plot, there will be n observations per plot. A mathematical model appropriate for this situation was given in Section 2.1:

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijk}.$$

The analysis of variance technique has also been applied to this situation. The procedure is described by Federer (14), Scheffé (34) and Kempthorne (25).

For the above model, the total sum of squares can be broken up into four different sums of squares attributable to different sources. The various sums of squares and the corresponding sources are given in Table 3.2.1.

Table 3.2.1: ANOVA for a Randomised Block Experiment with n Observations Per Plot.

Source of       Degrees of    Sum of Squares (SS)                                            SS for Computation                                  Mean Square (MS)
Variation       Freedom
Treatment       I-1           Jn Sum_i (X_i.. - X...)^2 = SS_V                               Sum_i T_i..^2 / Jn - T^2 / IJn                      MS_V = SS_V / (I-1)
Blocks          J-1           In Sum_j (X.j. - X...)^2 = SS_B                                Sum_j T.j.^2 / In - T^2 / IJn                       MS_B = SS_B / (J-1)
Interaction     (I-1)(J-1)    n Sum_i Sum_j (X_ij. - X_i.. - X.j. + X...)^2 = SS_VB          By subtraction                                      MS_VB = SS_VB / [(I-1)(J-1)]
Within Plots    IJ(n-1)       Sum_i Sum_j Sum_k (X_ijk - X_ij.)^2 = SS_W                     Sum Sum Sum X_ijk^2 - Sum_i Sum_j T_ij.^2 / n       MS_W = SS_W / [IJ(n-1)]
Total           IJn-1         Sum_i Sum_j Sum_k (X_ijk - X...)^2 = SS_T                      Sum Sum Sum X_ijk^2 - T^2 / IJn

The expected values of the mean squares in Table 3.2.1 under both the fixed-effects and random-effects models have been calculated by Scheffé (34). They are given below in Table 3.2.2.
Table 3.2.2: Expected Values of the Mean Squares in Table 3.2.1.

             Fixed-Effects                                                       Random-Effects
E(MS_V)      $\sigma_e^2 + \frac{Jn}{I-1}\sum_i \alpha_i^2$                      $\sigma_e^2 + n\sigma_\gamma^2 + Jn\sigma_\alpha^2$
E(MS_B)      $\sigma_e^2 + \frac{In}{J-1}\sum_j \beta_j^2$                       $\sigma_e^2 + n\sigma_\gamma^2 + In\sigma_\beta^2$
E(MS_VB)     $\sigma_e^2 + \frac{n}{(I-1)(J-1)}\sum_i\sum_j \gamma_{ij}^2$       $\sigma_e^2 + n\sigma_\gamma^2$
E(MS_W)      $\sigma_e^2$                                                        $\sigma_e^2$

Under the fixed-effects model, the hypotheses of interest are

$$H_A: \alpha_i = 0,\; i = 1, 2, \ldots, I; \qquad H_B: \beta_j = 0,\; j = 1, 2, \ldots, J; \qquad H_{AB}: \gamma_{ij} = 0,\; i = 1, \ldots, I,\; j = 1, \ldots, J.$$

Under the random-effects model, the hypotheses of interest are

$$H_A: \sigma_\alpha^2 = 0; \qquad H_B: \sigma_\beta^2 = 0; \qquad H_{AB}: \sigma_\gamma^2 = 0.$$

It has again been shown by Rao (32) that each of the sums of squares in Table 3.2.1 divided by the expected value of the corresponding mean square is distributed as central or non-central $\chi^2$ with the degrees of freedom of the quadratic form in question. In particular, under the fixed-effects model,
(i) $\dfrac{SS_W}{\sigma_e^2} \sim \chi^2_{IJ(n-1)}$;
(ii) when $H_A$ is true, $\dfrac{SS_V}{\sigma_e^2} \sim \chi^2_{I-1}$;
(iii) $SS_W$ and $SS_V$ are independent.

Hence the statistic

$$F_{A3} = \frac{SS_V/(I-1)}{SS_W/[IJ(n-1)]} = \frac{MS_V}{MS_W} \tag{3.2.1}$$

has an F-distribution with $(I-1)$ and $IJ(n-1)$ degrees of freedom when $H_A$ is true, and may be used to test the hypothesis $H_A$. Similar test statistics can be obtained to test the hypotheses $H_B$ and $H_{AB}$.

Under the random-effects model, we have
(i) $\dfrac{SS_{VB}}{\sigma_e^2 + n\sigma_\gamma^2} \sim \chi^2_{(I-1)(J-1)}$;
(ii) when $H_A$ is true, $\dfrac{SS_V}{\sigma_e^2 + n\sigma_\gamma^2} \sim \chi^2_{I-1}$.

Hence the statistic

$$F_{A4} = \frac{SS_V/(I-1)}{SS_{VB}/[(I-1)(J-1)]} = \frac{MS_V}{MS_{VB}} \tag{3.2.2}$$

has an F-distribution with $I-1$ and $(I-1)(J-1)$ degrees of freedom when $H_A$ is true. $F_{A4}$ may therefore be used to test the hypothesis $H_A$. Similarly, the test statistics $MS_B/MS_{VB}$ and $MS_{VB}/MS_W$ may be used to test the hypotheses $H_B$ and $H_{AB}$, respectively.

When the hypothesis $H_{AB}$ is not true, treatment $V_i$, say, may be better than treatment $V_{i'}$ when averaged over all the J blocks, while for some of the blocks $V_{i'}$ may be better than $V_i$. Under this condition, any treatment differences detected should be viewed as being averaged over all the J blocks in the experiment.
For a true picture of the situation, however, it is necessary to test for differences in treatments for every block. The sum of squares with $I-1$ degrees of freedom due to treatments for the jth block is given by Rao (31) as

$$SS_{V_j} = \frac{1}{n}\sum_i T_{ij.}^2 - \frac{T_{.j.}^2}{In}. \tag{3.2.3}$$

The mean square corresponding to this is tested against $MS_{VB}$ or $MS_W$ depending on whether we are using the random-effects or fixed-effects model.

3.3 Comparisons of the Various Tests for Treatment Effects.

We noted in Section 3.1 that when the average yields per plot are used, the hypothesis of no treatment effects ($H_A$) under the fixed-effects and random-effects models could be tested by the F statistics $F_{A1}$ and $F_{A2}$ respectively, where
(i) $F_{A1} = \dfrac{MS_V}{MS_E}$ (Fixed-Effects Model);
(ii) $F_{A2} = \dfrac{MS_V}{MS_E}$ (Random-Effects Model).

We also noted in Section 3.2 that when yields of the individual plants on a plot are used, the same hypothesis under the fixed-effects and random-effects models could be tested by the F statistics $F_{A3}$ and $F_{A4}$ respectively, where
(iii) $F_{A3} = \dfrac{MS_V}{MS_W}$ (Fixed-Effects Model), where $MS_V$ is the treatments mean square given in Table 3.2.1;
(iv) $F_{A4} = \dfrac{MS_V}{MS_{VB}}$ (Random-Effects Model).

Thus, under the fixed-effects model, the hypothesis $H_A$ can be tested by using either $F_{A1}$ or $F_{A3}$. Let these two statistics be generally represented by

$$F_f = \frac{M_1}{M_2},$$

where $M_1$ and $M_2$ are treatments and error mean squares with $f_1$ and $f_2$ degrees of freedom, respectively. As pointed out in Sections 3.1 and 3.2, $F_f$ has an F-distribution with $f_1$ and $f_2$ degrees of freedom when $H_A$ is true. When $H_A$ is not true, $F_f$ has been found to have a non-central F distribution ($F'_{f_1, f_2; \delta}$) with non-centrality parameter $\delta$. Using $F_f$ to test $H_A$ at the $\alpha$ level of significance consists in rejecting $H_A$ if and only if

$$F_f > F_{\alpha; f_1, f_2},$$

where $F_{\alpha; f_1, f_2}$ is the tabular value of the F-distribution with $f_1$ and $f_2$ degrees of freedom at the $\alpha$ level of significance.
Thus the power of the test, that is, the probability of rejecting the hypothesis, is

$$\beta = \Pr\left(F'_{f_1, f_2; \delta} > F_{\alpha; f_1, f_2}\right). \tag{3.3.1}$$

The non-centrality parameter $\delta$ is calculated by

$$\delta^2 = \frac{SS_V(m_{ij})}{\sigma_e^2}, \tag{3.3.2}$$

where
(i) $m_{ij}$ = expected value of $X_{ij}$;
(ii) $SS_V(m_{ij})$ = treatment sum of squares with the $X_{ij}$'s replaced by the $m_{ij}$.

Equation (3.3.1) has been tabulated by Tang (25) for $\alpha$ = 0.01 and 0.05. He, however, chose to work in terms of $\phi$, where

$$\phi = \delta (f_1 + 1)^{-1/2}. \tag{3.3.3}$$

Pearson and Hartley (34) have prepared charts from Tang's table for $\alpha$ = 0.01 and 0.05. The charts show that the power $\beta$ increases with $f_2$ and $\phi$ for a given $f_1$. From equations (3.3.2) and (3.3.3), $\phi$ will be large when $\sigma_e^2$ is small. Thus the F-test based on $F_f$ will be powerful, and hence sensitive, if the error variance is small or is estimated with more degrees of freedom.

When $F_f = F_{A1}$, we have
(i) $f_1 = I - 1$;
(ii) $f_2 = (I-1)(J-1)$;
(iii) $\hat\sigma_e^2 = n(MS_E)$.

On the other hand, when $F_f = F_{A3}$, we have
(i) $f_1 = I - 1$;
(ii) $f_2 = IJ(n-1)$;
(iii) $\hat\sigma_e^2 = MS_W$.

Since each estimator is $\sigma_e^2$ times a $\chi^2$ variable divided by its degrees of freedom,
(i) $\mathrm{Var}[n(MS_E)] = \dfrac{2\sigma_e^4}{(I-1)(J-1)}$;
(ii) $\mathrm{Var}(MS_W) = \dfrac{2\sigma_e^4}{IJ(n-1)}$.

Thus

$$\mathrm{Var}(MS_W) < \mathrm{Var}[n(MS_E)] \quad \text{since} \quad IJ(n-1) > (I-1)(J-1).$$

That is, $MS_W$ is a more efficient estimator of $\sigma_e^2$ than $n(MS_E)$.

From the above results, we may conclude that the F-test based on $F_{A3}$ will be more sensitive than the one based on $F_{A1}$. That is, if the fixed-effects model is assumed, the F-test for treatment differences using yields of the individual plants on a plot is more sensitive than a corresponding F-test using the average yields per plot.

Under the random-effects model, the hypothesis $H_A$ can be tested by using either $F_{A2}$ or $F_{A4}$. Let these two statistics be generally represented by

$$F_r = \frac{M_3}{M_4},$$

where $M_3$ and $M_4$ are treatments and error mean squares with $f_3$ and $f_4$ degrees of freedom, respectively. When $H_A$ is true, $F_r$ has an F-distribution with $f_3$ and $f_4$ degrees of freedom.
Under this model, however, $F_r$ is distributed as $b\,F_{f_3, f_4}$ when $H_A$ is not true, where $F_{f_3, f_4}$ is a central F-variable with $f_3$ and $f_4$ degrees of freedom, and b is a constant. The test consists in rejecting $H_A$ at the $\alpha$ level of significance if

$$F_r > F_{\alpha; f_3, f_4}.$$

The power of the test is therefore given by

$$\beta = \Pr\left(F_r > F_{\alpha; f_3, f_4}\right),$$

or

$$\beta = \Pr\left(F_{f_3, f_4} > \frac{1}{b}\,F_{\alpha; f_3, f_4}\right). \tag{3.3.4}$$

When $F_r = F_{A2}$, we have
(i) $f_3 = I - 1$;
(ii) $f_4 = (I-1)(J-1)$;
(iii) $\dfrac{MS_V}{MS_E} \sim \dfrac{\sigma_e^2/n + \sigma_\gamma^2 + J\sigma_\alpha^2}{\sigma_e^2/n + \sigma_\gamma^2}\,F_{(I-1),\,(I-1)(J-1)} = \left(1 + \dfrac{Jn\sigma_\alpha^2}{\sigma_e^2 + n\sigma_\gamma^2}\right) F_{(I-1),\,(I-1)(J-1)}$.

Thus

$$b = 1 + \frac{Jn\sigma_\alpha^2}{\sigma_e^2 + n\sigma_\gamma^2}.$$

From equation (3.3.4), we obtain

$$\beta = \Pr\left(F_{(I-1),\,(I-1)(J-1)} > \frac{F_{\alpha;\,(I-1),\,(I-1)(J-1)}}{1 + \dfrac{Jn\sigma_\alpha^2}{\sigma_e^2 + n\sigma_\gamma^2}}\right). \tag{3.3.5}$$

When $F_r = F_{A4}$, we have
(i) $f_3 = I - 1$;
(ii) $f_4 = (I-1)(J-1)$;
(iii) $\dfrac{MS_V}{MS_{VB}} \sim \dfrac{\sigma_e^2 + n\sigma_\gamma^2 + Jn\sigma_\alpha^2}{\sigma_e^2 + n\sigma_\gamma^2}\,F_{(I-1),\,(I-1)(J-1)}$

[...] since the error variance, $\sigma_e^2$, is estimated more efficiently in equation (3.3.6) than in equation (3.3.5): $MS_W$ is a more efficient estimator of $\sigma_e^2$ than $n(MS_E)$.

From the above results, we may conclude that the F-test based on $F_{A4}$ will be more sensitive than a corresponding one based on $F_{A2}$ if there are no interaction effects. That is, if the random-effects model is assumed and there are no interaction effects, the F-test for treatment differences using yields of the individual plants on a plot is more sensitive than a corresponding F-test using the average yields per plot. When there are interaction effects, the two tests may have the same sensitivity.

3.4 Randomised Block Experiment with One Observation Per Plot Under Randomization Models.

As pointed out in Section 2.1 (IV), the basic difference between normal-theory models and randomization models is the normality assumption associated with the errors in the former models. The errors in randomization models are assumed only to be uncorrelated with expectation zero.
Under randomization models, exact tests of certain hypotheses are possible. These exact tests are called permutation tests (or randomization tests) (4,17,34).

I. Permutation Tests.

The rudiments of permutation tests were given by Fisher (17). Major studies which stimulated interest in this field were, however, started by Box and Anderson (4) as well as Wilk and Kempthorne (42). Permutation tests for a hypothesis were found to exist whenever the joint distribution of the observations under the hypothesis had a certain kind of symmetry (a requirement which may often be guaranteed by randomization), that is, when the distribution was invariant under a group of permutations. The tests were found to be exact for a very broad class of hypotheses, in the sense that the validity of the calculated significance level depended only on the symmetry of the distribution and not on any further assumptions such as normality of errors, homogeneity of error variances, or independence, as in normal-theory tests. Box and Anderson also found permutation tests to be robust (4).

The attractiveness of permutation tests to statisticians lies in the fact that they are exact tests under a less restrictive model, and can often be approximated by the F-test (or a slight modification of it) derived for the corresponding hypothesis under the fixed-effects normal-theory model. For this reason, the tests are often based on the same statistics that would be used in normal-theory tests.

In spite of its attractiveness, some statisticians have expressed concern about this new procedure. Yates and Barnard (4), for example, caution against the possibility of finding "ourselves in the position where two statistical tests, while applying with apparently equal appropriateness to answering apparently the same question on the basis of the same data, give different answers".
Welch (4) observed that "the more complicated the experimental situation becomes, the less heavily one can lean on permutation theory to validate the usual normal-theory tests and the more heavily one must lean on the mathematical assumptions made".

On the power of the permutation test, Box and Anderson (4) found that if the observations were really normally distributed the permutation test would necessarily be less powerful than the normal-theory test, since the latter is uniformly most powerful for the normal distribution. For a randomised block design, Hoeffding (20) showed that the permutation test has asymptotically the same power as the usual F-test against alternatives of the normal-theory model. On the latter claim, Scheffé (34) comments that what is of interest is the power of F-tests against alternatives in the randomization models. According to him, it is from such a study that we shall see whether the usual practice of using the normal-theory test as an approximation to the exact permutation test is justified.

II. Permutation Test for a Randomised Block Design.

Consider the randomised block experiment described in Section 1.1. There are $(I!)^J$ possible assignments of treatments to the plots. Box and Anderson (4) and Scheffé (34) have developed a permutation test for such an experiment. The following treatment is mainly due to Scheffé.

Let the plots in each block be numbered with $t = 1, 2, \ldots, I$. Also let $\mu_{ijt}$ be the 'true' response under the $i$th treatment on the $(j,t)$ plot. Scheffé defined the 'true' response as

$$\mu_{ijt} = \mu + \alpha_i + \beta_j + \gamma_{ij} + e_{ijt}, \qquad (3.4.1)$$

where
(i) $\mu = \mu_{...}$;
(ii) $\alpha_i = \mu_{i..} - \mu_{...}$;
(iii) $\beta_j = \mu_{.j.} - \mu_{...}$;
(iv) $\gamma_{ij} = \mu_{ij.} - \mu_{i..} - \mu_{.j.} + \mu_{...}$;
(v) $e_{ijt} = \mu_{ijt} - \mu_{ij.}$.

Thus $\sum_t e_{ijt} = 0$ for all $i, j$.

He adopted the terminology of Wilk and Kempthorne (42) by calling the quantity $e_{ijt}$ a plot error.
It is a constant and it arises from the inequality of the true responses of different plots in the same block $j$ to the same treatment $i$.

In a conceptual experiment involving a sequence of repetitions under the same conditions, Scheffé noted that the observed response $X_{ijt}$ of the $(j,t)$ plot to the $i$th treatment would differ from the conceptual 'true' response $\mu_{ijt}$ of the $(j,t)$ plot to the $i$th treatment on any particular trial by a technical error, $\varepsilon_{ijt}$ (42). Thus

$$X_{ijt} = \mu_{ijt} + \varepsilon_{ijt};$$

$\varepsilon_{ijt}$ is regarded as a random variable with $E(\varepsilon_{ijt}) = 0$ by definition of $\mu_{ijt}$ as the expectation of $X_{ijt}$.

The technical error, $\varepsilon_{ijt}$, is a measurement error: the difference between an observed value and the corresponding true value, an error caused by the measuring instrument or the observer. The randomization by which the treatments are assigned to the plots is performed in such a way that the plot errors are independent of the technical errors $\{\varepsilon_{ijt}\}$.

Denoting the observation on the $i$th treatment in the $j$th block by $X_{ij}$, Scheffé gave the mathematical model as

$$X_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \eta_{ij} + \varepsilon_{ij}, \qquad (3.4.2)$$

where
(i) $\eta_{ij} = \sum_t d_{ijt}\, e_{ijt}$ = plot error;
(ii) $\varepsilon_{ij} = \sum_t d_{ijt}\, \varepsilon_{ijt}$ = technical error;
(iii) $\{d_{ijt}\}$ are $I^2 J$ random variables taking on the values 0 and 1 ($d_{ijt} = 1$ if the $i$th treatment is assigned to the $(j,t)$ plot, and 0 otherwise).

He noted that, under the above model, the joint distribution of the observations does not have sufficient symmetry under the hypothesis $H_A$: all $\alpha_i = 0$ to permit a permutation test. However, by assuming that the treatment-block interactions and the treatment-plot interactions within a block are zero, and that the technical errors are additive, he obtained a model for which the joint distribution of the observations is symmetric under the hypothesis $H_A$. Under the hypothesis $H_A$, he gave the model as

$$X_{ij} = \mu + \beta_j + \sum_t d_{ijt}\,(\xi_{jt} + Z_{jt}), \qquad (3.4.3)$$

where
(i) $\xi_{jt}$ = plot main effect within the block $j$;
(ii) $Z_{jt}$ = technical error associated with the $(j,t)$ plot, formerly written $\varepsilon_{ijt}$ but now assumed not to depend on the treatment applied to the plot.

Permutation tests will be based on the set $G$ of permutations within blocks. The set $G$ consists of $(I!)^J$ permutations. Let $X_0$ be the observed sample of $\{X_{ij}\}$ and $S(X_0)$ the set of $(I!)^J$ samples obtained from $X_0$ by applying the set $G$ of permutations. Scheffé has shown that, given that the sample is in $S(X_0)$, all samples in $S(X_0)$ have the same conditional probability under the hypothesis $H_A$. Thus, symmetry is established.

The exact permutation test of the hypothesis will be based on the statistic

$$F_0 = \frac{(J-1)\,SS_V}{SS_E},$$

where the SS's are those defined in Section 3.1. We note that $SS_V + SS_E$ has a constant value in the set $S(X_0)$ since

$$SS_V + SS_E = SS_T - SS_B.$$

It follows that $F_0$ is a strictly increasing function of $SS_V$, and hence of the sum of squares of the treatment totals $\sum_i T_{i.}^2$. The permutation test based on $\sum_i T_{i.}^2$ would thus be equivalent to the permutation test based on $F_0$. It turns out, however, that the calculations involved in using this statistic are impracticable because of the large number of permutations involved.

Box and Anderson (4), as well as Scheffé, approximated the exact calculation by basing the permutation test on the statistic

$$U = \frac{SS_V}{SS_V + SS_E}.$$

This was found to be equivalent to basing it on $F_0$. They chose $U$ instead of $SS_V$ because it offered them a way both of approximating the exact calculation and of comparing the permutation test with the normal-theory test.
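Where the full set of $(I!)^J$ within-block permutations is too large to enumerate, the permutation distribution of $U$ can be approximated by random sampling. The following is a minimal sketch of that idea, not the procedure used in this thesis: the function names, the seed, the number of permutations, and the small data set are all hypothetical and chosen only for illustration.

```python
import random


def u_statistic(x):
    """U = SS_V / (SS_V + SS_E) for x[i][j] = response of treatment i in block j.

    Uses the identity SS_V + SS_E = SS_T - SS_B from the text.
    """
    n_trt, n_blk = len(x), len(x[0])
    grand_total = sum(sum(row) for row in x)
    cf = grand_total ** 2 / (n_trt * n_blk)
    ss_v = sum(sum(row) ** 2 for row in x) / n_blk - cf
    ss_b = sum(sum(row[j] for row in x) ** 2 for j in range(n_blk)) / n_trt - cf
    ss_t = sum(v * v for row in x for v in row) - cf
    return ss_v / (ss_t - ss_b)


def permutation_p_value(x, n_perm=2000, seed=0):
    """Monte Carlo approximation to the permutation test based on U.

    Observations are shuffled independently within each block, which is
    the set G of permutations described above.
    """
    rng = random.Random(seed)
    n_trt, n_blk = len(x), len(x[0])
    observed = u_statistic(x)
    blocks = [[x[i][j] for i in range(n_trt)] for j in range(n_blk)]
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(block, n_trt) for block in blocks]
        permuted = [[shuffled[j][i] for j in range(n_blk)] for i in range(n_trt)]
        if u_statistic(permuted) >= observed - 1e-12:
            hits += 1
    # Counting the observed arrangement itself keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 4 treatments in 3 blocks with purely additive effects.
example = [
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
    [30.0, 31.0, 32.0],
    [40.0, 41.0, 42.0],
]
u_obs, p_value = permutation_p_value(example, n_perm=2000, seed=1)
print(u_obs, p_value)
```

Because $U$ is a strictly increasing function of $F_0$, ranking the permuted samples by $U$ gives the same significance level as ranking them by $F_0$.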
By assuming that there were no technical errors, Box and Anderson obtained the expected value and variance of $U$ as

(i) $E(U) = \dfrac{1}{J}$,
(ii) $\mathrm{Var}(U) = \dfrac{2(J-1)}{J^3(I-1)}\left(1 - \dfrac{W}{J}\right)$, $\qquad (3.4.4)$

where $W$ is the square of the coefficient of variation (ratio of the variance to the square of the mean) of the block variances.

Under the various assumptions given in this section, $U$ was found to have a discrete distribution with range $0 \le U \le 1$. Box and Anderson approximated this distribution by a (continuous) $\beta$-distribution (43) with the same mean and variance.

Now if a random variable $X$ has the $\beta$-distribution with $f_1$ and $f_2$ degrees of freedom, then

(i) $E(X) = \dfrac{f_1}{f_1 + f_2}$,
(ii) $\mathrm{Var}(X) = \dfrac{2 f_1 f_2}{(f_1 + f_2)^2 (f_1 + f_2 + 2)}$. $\qquad (3.4.5)$

Equating the expected values given in equations (3.4.4) and (3.4.5), equating the variances given in the same equations, and solving for $f_1$ and $f_2$, we obtain

(i) $f_1 = \phi(I-1)$,
(ii) $f_2 = \phi(I-1)(J-1)$, $\qquad (3.4.6)$

where

$$\phi = \frac{1}{1 - W/J} - \frac{2}{J(I-1)}. \qquad (3.4.7)$$

Scheffé's approximation to the permutation test of $H_A$ based on the statistic $U$ consists in rejecting $H_A$ if the value of $U$ for the observed sample is greater than or equal to the upper $\alpha$ point of the $\beta$-distribution with the degrees of freedom $f_1$ and $f_2$ given above.

Let $X$ be a $\beta$-variable with $f_1$ and $f_2$ degrees of freedom. It has been shown that the random variable

$$\frac{f_2 X}{f_1 (1 - X)}$$

is a strictly increasing function of $X$, and has the F-distribution with $f_1$ and $f_2$ degrees of freedom. Hence, according to Scheffé, it is equivalent to reject $H_A$ if $U \ge$ the upper $\alpha$ point of the $\beta$-distribution with $f_1$ and $f_2$ degrees of freedom and to reject $H_A$ if $F_0 > F_{f_1,f_2}$ ($\alpha$ level). He therefore regarded his approximation to the permutation test of $H_A$ based on the statistic $F_0$ as being equivalent to modifying the normal-theory test by multiplying the numbers of degrees of freedom of the F-distribution by the factor $\phi$.
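As a numerical illustration of the degrees-of-freedom adjustment, the sketch below computes $W$ and the factor $\phi$ from the within-block sums of squares. It assumes the forms $\mathrm{Var}(U) = \frac{2(J-1)}{J^3(I-1)}(1 - W/J)$ and $\phi = \frac{1}{1 - W/J} - \frac{2}{J(I-1)}$, with $W$ the variance of the block variances (divisor $J-1$) over the square of their mean. The function name is hypothetical. The three input values are the per-block variety sums of squares reported later in Table 3.6.5; they are proportional to the block variances of the plot means, and $W$ does not depend on that constant of proportionality.

```python
def box_anderson_factor(block_ss, n_treatments):
    """Return (W, phi) from the within-block sums of squares.

    block_ss[j] is the sum of squares among the treatments within block j.
    W is the squared coefficient of variation of the block variances.
    """
    n_blocks = len(block_ss)
    variances = [ss / (n_treatments - 1) for ss in block_ss]
    mean_var = sum(variances) / n_blocks
    var_of_var = sum((v - mean_var) ** 2 for v in variances) / (n_blocks - 1)
    w = var_of_var / mean_var ** 2
    phi = 1.0 / (1.0 - w / n_blocks) - 2.0 / (n_blocks * (n_treatments - 1))
    return w, phi


# Within-block variety sums of squares for blocks 1, 2 and 3 (Table 3.6.5).
kpong_block_ss = [21445.88, 39958.52, 50391.67]
w, phi = box_anderson_factor(kpong_block_ss, n_treatments=21)

# Adjusted degrees of freedom for the F-test on 20 and 40 degrees of freedom.
f1_adjusted = phi * 20
f2_adjusted = phi * 40
print(round(w, 4), round(phi, 3), round(f1_adjusted, 2), round(f2_adjusted, 2))
```

With these inputs the factor comes out close to 1.02, which agrees with the adjusted degrees of freedom of about 20.4 and 40.8 used for the "Kpong data" in Section 3.6.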
Scheffé observed that if $\phi > 1$, the adjusted normal-theory test will be more sensitive than the unadjusted normal-theory test. This is seen from inspection of the F-tables. For $\alpha \le 0.1$ and $f_2 > 2$, $F_{f_1,f_2}$ ($\alpha$ level) is a decreasing function of $f_1$ and $f_2$; thus values of the statistic $F_0$ which miss significance with the unadjusted numbers of degrees of freedom may achieve it with the adjusted numbers.

Scheffé further justified the above approximation to the permutation test in the case where the technical errors are not zero but have an arbitrary distribution subject only to $E(Z_{jt}) = 0$.

In carrying out the permutation test, the factor $\phi$ may be calculated from the observations by using equation (3.4.7) with

$$W = \frac{\dfrac{1}{J-1}\left[\sum_j S_j^2 - \dfrac{\left(\sum_j S_j\right)^2}{J}\right]}{\left(\dfrac{\sum_j S_j}{J}\right)^2}, \qquad \text{where} \qquad S_j = \frac{1}{I-1}\left[\sum_i X_{ij}^2 - \frac{\left(\sum_i X_{ij}\right)^2}{I}\right]. \qquad (3.4.8)$$

Equation (3.4.8) is due to Box and Anderson (4).

3.5 Loss of Information due to Sampling.

In some agricultural field experiments using the randomised block design, there may be $N$ plants per plot from which a sample of $n$ plants is taken. The characteristic is then measured on the $n$ plants. In such experiments, it is worth considering the loss of information due to sampling. The theory has been discussed by Yates and Zacopanay (45). Other comments have been given by Immer (22) and Finney (15,16).

Whenever a plot is sampled, a loss in information relative to the information obtained from complete recording results. The fractional loss in information due to the sampling and the efficiency of sampling relative to complete recording are discussed below.

The variance of a variety mean is

$$\frac{E(MS_{VB})}{Jn}.$$

From Table 3.2.2, $E(MS_{VB}) = \sigma_e^2 + n\sigma_\gamma^2$. Hence the information on each treatment mean is

$$\frac{Jn}{\sigma_e^2 + n\sigma_\gamma^2}.$$

If the whole plot is harvested, the information on each treatment is

$$\frac{JN}{\sigma_e^2 + N\sigma_\gamma^2}.$$

Thus the efficiency of sampling relative to complete recording is

$$e = \frac{Jn}{\sigma_e^2 + n\sigma_\gamma^2} \bigg/ \frac{JN}{\sigma_e^2 + N\sigma_\gamma^2} = \frac{\sigma_\gamma^2 + \sigma_e^2/N}{\sigma_\gamma^2 + \sigma_e^2/n}. \qquad (3.5.1)$$

The loss of information, $L$, due to sampling is estimated by

$$L = 1 - e. \qquad (3.5.2)$$

3.6 Analysis of the "Kpong Data".

The "Kpong data" come from a randomised block experiment with 10 plants per plot from which a sample of 5 plants was taken. The different ways of analysing such data have been given in the preceding sections. These different procedures will be applied to the data to illustrate some of the points given in those sections.

I. The Case of Using the Average Yields Per Plot Under the Normal-Theory Models.

The analysis of a randomised block experiment with $n$ plants per plot using the average yields per plot has been discussed in Section 3.1. The calculations necessary for carrying out the analysis are given in Tables 3.6.1 and 3.6.2.

Table 3.6.1: Yields in grams per Plot for 21 Varieties of Cowpea

Varieties     Block 1      Block 2      Block 3      $T_{i.}$     $T_{i.}^2$
V1              26.39        22.39        21.82        70.60      4984.360
V2              50.52        22.06        16.15        88.73      7873.013
V3               5.56         5.07         5.66        16.29       265.364
V4              45.58        58.20        64.20       167.98     28217.280
V5              39.47        34.61        34.70       108.78     11833.088
V6              27.58        47.01        60.34       134.93     18206.105
V7              47.11        24.67        55.80       127.58     16276.656
V8              34.52        39.84        29.81       104.17     10851.389
V9              27.23        30.62        18.68        76.53      5856.841
V10             48.71        62.63        71.61       182.95     33470.703
V11             53.68        52.59        53.67       159.94     25580.804
V12             33.21        70.32        66.17       169.70     28798.090
V13             24.58        12.85        18.41        55.84      3118.106
V14             19.49        14.52        18.40        52.41      2746.808
V15              6.37         2.83         3.62        12.82       164.352
V16             24.79        18.44        18.02        61.25      3751.563
V17             13.44        19.05         6.87        39.36      1549.210
V18             13.99         6.42         8.76        29.17       850.889
V19             15.23        10.96        11.77        37.96      1440.962
V20             16.39        14.58        12.87        43.84      1921.946
V21             21.30        14.68        28.93        64.91      4213.308
$T_{.j}$       595.14       584.34       626.26      1805.74 = $T_{..}$      211970.837 = $\sum_i T_{i.}^2$
$T_{.j}^2$  354191.620   341453.236   392201.588     $\sum_j T_{.j}^2$ = 1087846.444

$\sum_i \sum_j X_{ij}^2$ = 74160.721
$$\frac{T_{..}^2}{IJ} = \frac{(1805.74)^2}{21 \times 3} = 51757.094.$$

$$SS_V = \frac{\sum_i T_{i.}^2}{J} - \frac{T_{..}^2}{IJ} = \frac{211970.837}{3} - 51757.094 = 18899.852.$$

$$SS_B = \frac{\sum_j T_{.j}^2}{I} - \frac{T_{..}^2}{IJ} = \frac{1087846.444}{21} - 51757.094 = 45.118.$$

$$SS_T = \sum_i \sum_j X_{ij}^2 - \frac{T_{..}^2}{IJ} = 74160.721 - 51757.094 = 22403.627.$$

$$SS_E = SS_T - SS_V - SS_B = 3458.657.$$

Table 3.6.2: Analysis of Variance for Table 3.6.1

Source        df         SS          MS          F
Varieties     20     18899.852     944.993     10.929
Blocks         2        45.118      22.559
Error         40      3458.657      86.466
Total         62     22403.627

From the ANOVA table, $F_{20,40}$ = 10.93. From standard tables, $F_{20,40}$ (5% level of significance) = 1.84. Thus, the varieties are significantly different at the 5% level.

II. The Case of Using Yields of the Individual Plants on a Plot Under the Normal-Theory Models.

The analysis of a randomised block experiment with $n$ plants per plot using yields of the individual plants on a plot has been discussed in Section 3.2. Given below in Tables 3.6.3 and 3.6.4 are the calculations necessary for carrying out the analysis of the "Kpong data".

Table 3.6.3: Totals of Plot Yields (in grams) for 21 Varieties of Cowpea

Varieties      Block 1       Block 2       Block 3     $T_{i..}$     $T_{i..}^2$
V1              131.97        111.92        109.13       353.02     124623.121
V2              252.61        110.31         80.74       443.66     196834.196
V3               27.78         25.37         28.28        81.43       6630.845
V4              227.92        291.01        321.02       839.95     705516.003
V5              197.35        173.03        173.51       543.89     295816.332
V6              137.92        235.06        301.72       674.70     455220.090
V7              235.56        123.35        279.02       637.93     406954.685
V8              172.58        199.21        149.06       520.85     271284.723
V9              136.15        153.09         93.43       382.67     146436.329
V10             243.56        313.13        358.06       914.75     842265.063
V11             268.41        262.94        268.33       799.68     639488.103
V12             166.03        351.59        330.84       848.46     719884.372
V13             122.90         64.23         92.06       279.19      77947.056
V14              97.44         72.59         92.01       262.04      68664.962
V15              31.87         14.14         18.12        64.13       4112.657
V16             123.96         92.20         90.12       306.28      93807.439
V17              67.21         95.24         34.34       196.79      38726.304
V18              69.94         32.10         43.77       145.81      21260.556
V19              76.16         54.78         58.86       189.80      36024.040
V20              81.95         72.89         64.35       219.19      48044.256
V21             106.48         73.43        144.64       324.55     105332.703
$T_{.j.}$      2975.75       2921.61       3131.41      9028.77 = $T_{...}$     5304393.835 = $\sum_i T_{i..}^2$
$T_{.j.}^2$ 8855088.063   8535804.992   9805728.588     $\sum_j T_{.j.}^2$ = 27196621.643

$\sum_i \sum_j \sum_k X_{ijk}^2$ = 403876.001, $\sum_i \sum_j T_{ij.}^2$ = 1854057.[...]

$$CF = \frac{T_{...}^2}{IJn} = \frac{(9028.77)^2}{21 \times 3 \times 5} = 258789.484.$$

$$SS_B = \frac{\sum_j T_{.j.}^2}{In} - CF = \frac{27196621.643}{21 \times 5} - CF = 225.960.$$

$$SS_V = \frac{\sum_i T_{i..}^2}{Jn} - CF = \frac{5304393.835}{3 \times 5} - CF = 94836.772.$$

$$SS_T = \sum_i \sum_j \sum_k X_{ijk}^2 - CF = 403876.001 - CF = 145086.517.$$

$$SS_W = \sum_i \sum_j \sum_k X_{ijk}^2 - \frac{\sum_i \sum_j T_{ij.}^2}{n} = 403876.001 - 370811.561 = 33064.440.$$

$$SS_{VB} = SS_T - SS_B - SS_V - SS_W = 16959.345.$$

Table 3.6.4: Analysis of Variance for Table 3.6.3

Source           df          SS           MS          F
Varieties        20      94836.772     4741.839
Blocks            2        225.960      112.980
Interaction      40      16959.345      423.984     3.232
Within Plots    252      33064.440      131.208
Total           314     145086.517

The F ratio for testing the interaction is

$$F_{40,252} = \frac{423.984}{131.208} = 3.232.$$

From standard tables, $F_{40,252}$ (5% level) = [...]. The interaction effects are therefore significant at the 5% level.
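The plot-mean analysis of Tables 3.6.1 and 3.6.2 can be checked mechanically. The sketch below recomputes the sums of squares from the plot means of Table 3.6.1 using the computing formulas of Section 3.1; the function and variable names are hypothetical. The block 3 entry for V7 is taken as 55.80, the value consistent with the printed row total of 127.58 and with the plot total 279.02 in Table 3.6.3.

```python
def randomised_block_anova(x):
    """One observation per plot: x[i][j] is treatment i in block j."""
    n_trt, n_blk = len(x), len(x[0])
    grand_total = sum(sum(row) for row in x)
    cf = grand_total ** 2 / (n_trt * n_blk)           # correction factor
    ss_v = sum(sum(row) ** 2 for row in x) / n_blk - cf
    ss_b = sum(sum(row[j] for row in x) ** 2 for j in range(n_blk)) / n_trt - cf
    ss_t = sum(v * v for row in x for v in row) - cf
    ss_e = ss_t - ss_v - ss_b
    df_v, df_e = n_trt - 1, (n_trt - 1) * (n_blk - 1)
    f_ratio = (ss_v / df_v) / (ss_e / df_e)
    return {"ss_v": ss_v, "ss_b": ss_b, "ss_e": ss_e, "ss_t": ss_t, "f": f_ratio}


# Plot means (grams) for varieties V1..V21 in blocks 1, 2, 3 (Table 3.6.1).
kpong_means = [
    [26.39, 22.39, 21.82], [50.52, 22.06, 16.15], [5.56, 5.07, 5.66],
    [45.58, 58.20, 64.20], [39.47, 34.61, 34.70], [27.58, 47.01, 60.34],
    [47.11, 24.67, 55.80], [34.52, 39.84, 29.81], [27.23, 30.62, 18.68],
    [48.71, 62.63, 71.61], [53.68, 52.59, 53.67], [33.21, 70.32, 66.17],
    [24.58, 12.85, 18.41], [19.49, 14.52, 18.40], [6.37, 2.83, 3.62],
    [24.79, 18.44, 18.02], [13.44, 19.05, 6.87], [13.99, 6.42, 8.76],
    [15.23, 10.96, 11.77], [16.39, 14.58, 12.87], [21.30, 14.68, 28.93],
]

anova = randomised_block_anova(kpong_means)
for name in ("ss_v", "ss_b", "ss_e", "ss_t", "f"):
    print(name, round(anova[name], 3))
```

The results agree with Table 3.6.2 to within rounding of the plot means.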
In view of the reasons given in Section 3.2, we have to test for variety effects for each block. The variety sum of squares for each block is given by equation (3.2.3). Under the random-effects model, the variety mean square is tested against $MS_{VB}$, while under the fixed-effects model it is tested against $MS_W$. The various mean squares and the corresponding F values for the three blocks are given in Table 3.6.5. $F_r$ and $F_f$ denote the F ratios under the random-effects and fixed-effects models, respectively.

Table 3.6.5: Variety Mean Squares and the Corresponding F Values for Blocks 1, 2 and 3

Blocks     df      $SS_V$       $MS_V$      $F_f$     $F_r$
1          20     21445.88     1072.294      8.17      2.53
2          20     39958.52     1997.926     15.24      4.71
3          20     50391.67     2519.584     19.21      5.94

From standard tables,
(i) $F_{20,40}$ (5% level) = 1.84;
(ii) $F_{20,252}$ (5% level) = 1.52.

Thus, for each block, and under both the fixed-effects and random-effects models, the varieties are significantly different at the 5% level.

Using yields of the individual plants on a plot, the interaction effects are found to be significant. Using the average yields per plot, however, the interaction effects are not significant. The mathematical models for the two situations are, respectively, [...]

Thus, the various effects are the same in both models. In case (ii) we assume that $\gamma_{ij} = 0$ (explained in Section 2.2 (II)). We then use Tukey's test to justify this assumption. As we noted in Section 2.2 (II), Tukey's test imposes restrictions on the $\{\gamma_{ij}\}$ and, furthermore, the power of the test is unknown. Since the test for interactions in case (i) does not impose any restrictions on the $\{\gamma_{ij}\}$, it is possible for this test to detect interaction effects otherwise not detectable by Tukey's test. It is also possible that Tukey's test is not powerful enough to detect small interaction effects.
These arguments may explain why Tukey's test failed to detect interaction effects in the "Kpong data".

From ANOVA Tables 3.6.2 and 3.6.4, we have $MS_W$ = 131.208, $MS_{VB}$ = 423.984, $n(MS_E)$ = 432.330. Theoretically, whether there are interaction effects or not, $MS_{VB}$ and $n(MS_E)$ estimate the same thing, namely [...]

Table 3.6.6

Source        df     $\phi \times$ df        SS          MS          F
Varieties     20        20.40           18899.852     926.463     10.929
Blocks         2         2.04              45.118      22.117
Error         40        40.80            3458.657      84.771
Total         62        63.24           22403.627

From the ANOVA table, $F_{20.4,40.8}$ = 10.93. From standard tables, $F_{20.4,40.8}$ (5% level) = 1.81. Thus the varieties are significantly different at the 5% level.

Let $F_N$ and $F_P$ be the F statistics for the normal-theory and permutation tests, respectively, when the average yields per plot are used. Let $F_N'$ and $F_P'$ be the corresponding tabular values. From Tables 3.6.2 and 3.6.6, $F_N$ = 10.93 and $F_P$ = 10.93. The corresponding tabular values at the 5% level are $F_N'$ = 1.84 and $F_P'$ = 1.81. Thus, the F-tests based on $F_N$ and $F_P$ give the same verdict. Though $F_P'$ is smaller than $F_N'$ (this is expected in view of the deductions given in Section 3.4 (II)), the value of $F_N$ (= $F_P$) is so large that the reduction in the tabular value is of no practical importance. For the "Kpong data", therefore, the permutation test for variety differences gives the same verdict as the normal-theory test.

IV. Loss of Information due to Sampling.

The importance of examining the loss of information due to sampling for experiments involving sampling has been discussed in Section 3.5. The theoretical treatment given in that section gave the efficiency of sampling relative to complete recording as

$$e = \frac{\sigma_\gamma^2 + \sigma_e^2/N}{\sigma_\gamma^2 + \sigma_e^2/n}.$$

For the "Kpong data", the restricted maximum likelihood estimates of $\sigma_e^2$ and $\sigma_\gamma^2$ have been calculated in Chapter 6 and are given as

$$\hat{\sigma}_e^2 = 131.208, \qquad \hat{\sigma}_\gamma^2 = 55.593.$$

Substituting these values in the above equation, with $N$ = 10 and $n$ = 5, we obtain

$$e = \frac{55.593 + 131.208/10}{55.593 + 131.208/5} = \frac{68.7138}{81.8346} = 0.840.$$

From equation (3.5.2), the loss of information, $L$, due to sampling is

$$L = 1 - e = 1 - 0.84 = 0.16.$$

Thus, the loss of information is 16%, which is relatively small.

In experiments where sampling is involved, it is possible to choose a sample size and a number of replications that will minimize the error variance by making use of the results of a previous similar experiment or a pilot experiment. The theory leading to such a sample size and number of replications, often referred to as optimum conditions of sampling, has been discussed by Kempthorne (25).

CHAPTER FOUR

MULTIVARIATE ANALYSIS OF VARIANCE

4.1 The Multivariate Analysis of Variance Test.

In the univariate analysis of variance described in Chapter 3, the test statistics for testing the hypothesis of no treatment effects, $H_A$, were of the form

$$F = \frac{SS_V/q}{SS_E/n_e},$$

where
(i) $SS_V$ = treatment sum of squares with $q$ degrees of freedom;
(ii) $SS_E$ = error sum of squares with $n_e$ degrees of freedom.

We noted that since $SS_E$ is distributed as $\sigma_e^2 \chi^2$ with $n_e$ degrees of freedom, and, under the null hypothesis $H_A$, $SS_V$ is distributed as $\sigma_e^2 \chi^2$ with $q$ degrees of freedom, the statistic $F$ has an F-distribution with $q$ and $n_e$ degrees of freedom.

It has been shown (43) that the likelihood ratio criterion for the same hypothesis $H_A$ is

$$\lambda = \left(\frac{SS_E}{SS_E + SS_V}\right)^{N/2}, \qquad (4.1.1)$$

where $N$ is the total number of observations.

In the multivariate analysis of variance, the observations $X_{ij}$ are column vector variables with $p$ components. They are assumed to be normally and independently distributed with common covariance matrix $\Sigma$. The expected value of $X_{ij}$ is assumed to be

$$E(X_{ij}) = \mu + \alpha_i + \beta_j, \qquad (4.1.2)$$

where $\mu$, the $\alpha_i$'s and the $\beta_j$'s are column vectors.
Under the above situation, the error sum of squares is shown (1) to be

$$SS_E = \sum_i \sum_j (X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X}_{..})(X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X}_{..})'. \qquad (4.1.3)$$

Similarly, the treatment sum of squares is

$$SS_V = J \sum_i (\bar{X}_{i.} - \bar{X}_{..})(\bar{X}_{i.} - \bar{X}_{..})'. \qquad (4.1.4)$$

Anderson (1) and Wilks (43) have shown that $SS_E$ has the Wishart distribution, $W(p, n_e, \Sigma)$, and, under the null hypothesis $H_A$, $SS_V$ has the Wishart distribution, $W(p, q, \Sigma)$. Furthermore, $SS_E$ and $SS_V$ are independent. They then showed that the statistic

$$\Lambda = \frac{|SS_E|}{|SS_E + SS_V|} \qquad (4.1.5)$$

has the $U_{p,q,n_e}$ distribution (1) under the null hypothesis $H_A$.

The multivariate test statistic analogous to (4.1.1) for the same hypothesis is shown (1) to be (4.1.5).

It has been shown (1,31) that for
(i) $p \le 2$, any $q$, and
(ii) $q \le 2$, any $p$,
the statistic $\Lambda$ is exact for testing the null hypothesis $H_A$. On the other hand, if $p > 2$ and $q > 2$, the appropriate test statistic to use is

$$Q = -m \log_e \Lambda, \qquad (4.1.6)$$

where
(i) $m = n - \dfrac{p + q + 1}{2}$;
(ii) $n$ = total degrees of freedom.

If $m$ is large ($m > 30$), the first approximation consists in using $Q$ as $\chi^2$ with $pq$ degrees of freedom. If $m$ is not large, one may use the first approximation suggested by Bartlett (31). The approximation gives the test statistic as

$$\frac{1 - \Lambda^{1/s}}{\Lambda^{1/s}} \cdot \frac{ms + 2\lambda}{pq} \quad [...]$$

[...] $X_{ij}$ are the yields of the five plants on a plot. The values of the row totals ($X_{i.}'$), column totals ($X_{.j}'$), and overall total ($X_{..}'$) are given in Table 4.2.1.
Table 4.2.1: Values of $X_{i.}'$, $X_{.j}'$, and $X_{..}'$

Varieties                      $X_{i.}'$
V1         83.12      63.46      70.57      62.20      73.67
V2         96.41      88.34      69.85      55.12     103.94
V3         11.93      31.44      16.21      12.95      18.90
V4        173.57     154.06     237.73     157.76     116.83
V5        122.03      82.77     154.78     108.37      75.94
V6        137.27     105.35     142.25     166.99     122.84
V7        152.88     129.07     161.80     128.85      65.33
V8        128.72     108.59     100.06      88.40      95.08
V9         91.56      89.60      89.90      42.99      68.62
V10       238.85     178.16     140.21     147.39     210.14
V11       174.02     194.44     125.04     131.67     174.51
V12       196.52     141.14     194.91     133.14     182.75
V13        51.26      75.24      63.73      56.04      32.92
V14        53.52      42.34      59.77      41.99      65.22
V15         9.94      10.91      18.46      15.57       9.25
V16        56.37      82.73      63.33      57.15      46.72
V17        31.51      34.91      34.87      30.48      65.02
V18        26.99      26.72      28.61      39.58      23.91
V19        38.04      34.80      43.13      37.97      35.86
V20        44.38      37.38      49.37      39.40      48.66
V21        63.48      75.69      85.46      51.95      47.97

Blocks                         $X_{.j}'$
1         579.55     675.79     552.40     544.40     678.80
2         629.66     669.03     640.28     565.11     580.39
3         822.18     614.59     691.79     561.84     598.57

$X_{..}'$ = (2031.39, 1959.41, 1884.47, 1671.21, 1857.76)

$\sum_i \sum_j X_{ij} X_{ij}'$ =
    100472.73    80828.21    86656.28    75583.81    79792.64
     80828.21    75790.35    71752.14    63296.61    66440.59
     86656.28    71752.14    91070.21    69634.35    69581.10
     75583.81    63296.61    69634.35    64070.64    61522.69
     79792.64    66440.59    69581.10    61522.69    72470.62

$\dfrac{1}{J}\sum_i X_{i.} X_{i.}'$ =
     90806.51    77297.73    84405.46    71123.84    74975.23
     77297.73    68202.96    72195.76    61028.35    64121.91
     84405.46    72195.76    84669.01    68473.45    67682.95
     71123.84    61028.35    68473.45    58484.14    58332.55
     74975.23    64121.91    67682.95    58332.55    65159.55

$\dfrac{1}{I}\sum_j X_{.j} X_{.j}'$ =
     67063.31    62772.31    61527.50    53961.25    59570.42
     62772.31    61048.28    58420.94    51961.03    57852.35
     61527.50    58420.94    56841.76    50054.89    55269.78
     53961.25    51961.03    50054.89    44344.39    49225.15
     59570.42    57852.35    55269.78    49225.15    55043.21

$\dfrac{X_{..} X_{..}'}{IJ}$ =
     65500.71    63179.76    60763.38    53886.96    59902.13
     63179.76    60941.07    58610.30    51977.54    57779.57
     60763.38    58610.30    56368.68    49989.60    55569.71
     53886.96    51977.54    49989.60    44332.42    49281.06
     59902.13    57779.57    55569.71    49281.06    54782.08

From equation (4.1.3),

$$SS_E = \sum_i \sum_j X_{ij} X_{ij}' - \frac{1}{J}\sum_i X_{i.} X_{i.}' - \frac{1}{I}\sum_j X_{.j} X_{.j}' + \frac{X_{..} X_{..}'}{IJ}$$

$SS_E$ =
      8103.62     3937.93     1486.68     4385.68     5149.11
      3937.93     7480.18     -254.26     2284.76     2245.89
      1486.68     -254.26     5928.12     1095.60     2198.08
      4385.68     2284.76     1095.60     5574.54     3246.04
      5149.11     2245.89     2198.08     3246.04     7049.93

From equation (4.1.4),

$$SS_V = \frac{1}{J}\sum_i X_{i.} X_{i.}' - \frac{X_{..} X_{..}'}{IJ}$$

$SS_V$ =
     25305.80    14117.96    23642.08    17236.87    15073.09
     14117.96     7261.89    13585.45     9050.81     6342.34
     23642.08    13585.45    28300.32    18483.84    12113.22
     17236.87     9050.81    18483.84    14151.72     9051.49
     15073.09     6342.34    12113.22     9051.49    10377.47

Hence

$SS_E + SS_V$ =
     33409.42    18055.90    25128.76    21622.55    20222.21
     18055.90    14742.07    13331.18    11335.58     8588.23
     25128.76    13331.18    34228.45    19579.45    14311.31
     21622.55    11335.58    19579.45    19726.26    12297.54
     20222.21     8588.23    14311.31    12297.54    17427.40

Now $|SS_E| = 0.27 \times 10^{19}$ and $|SS_E + SS_V| = 0.49 \times 10^{20}$.

Therefore

$$\Lambda = \frac{|SS_E|}{|SS_E + SS_V|} = \frac{0.27 \times 10^{19}}{0.49 \times 10^{20}} = 5.51 \times 10^{-2}.$$

Hence

$$\log_e \Lambda = \log_e 5.51 + \log_e 10^{-2} = 1.7066 + \bar{5}.3945 = \bar{3}.1014 = -2.8986.$$

From equation (4.1.6),

$$Q = -m \log_e \Lambda = m \times 2.8986.$$

Now

$$m = n - \frac{p + q + 1}{2} = 62 - \frac{5 + 20 + 1}{2} = 49.$$

Therefore

$$Q = 49 \times 2.8986 = 142.03.$$

Since $m$ = 49 is large, the $\chi^2$ approximation is appropriate for the test. From standard tables, $\chi^2_{100}$ (5% level) = 124.3. Thus, $Q$ is significant, and hence the varieties are significantly different at the 5% level.

II. The Case of Using the Five Correlated Variates x, y, h, z, $\theta$.

Under this situation, the five correlated variates are x, y, h, z, $\theta$. The row totals ($X_{i.}'$), column totals ($X_{.j}'$), and overall total ($X_{..}'$) are given in Table 4.2.2.
Table 4.2.2: Values of $X_{i.}'$, $X_{.j}'$, and $X_{..}'$

Varieties                      $X_{i.}'$
V1        478.00      42.75      45.00      33.00      70.60
V2        567.65      45.12      46.40      37.60      88.73
V3        359.36      19.88      12.00      60.20      16.29
V4        608.76      45.14      54.80      60.60     167.98
V5        527.77      44.90      57.60      37.40     108.78
V6        390.97      39.07      88.80      33.80     134.93
V7        527.28      39.35      65.80      42.90     127.58
V8        509.92      37.06      51.20      49.30     104.17
V9        465.80      29.31      38.60      56.50      76.53
V10       469.93      43.74      85.60      43.50     182.95
V11       446.16      45.38      78.80      40.20     159.94
V12       414.27      41.12      86.60      42.10     169.70
V13       447.33      34.45      40.60      35.50      55.84
V14       355.19      27.47      29.80      58.50      52.41
V15       376.47      23.79      13.80      36.50      12.82
V16       502.58      37.66      33.60      44.00      61.25
V17       438.37      20.65      25.80      62.80      39.36
V18       348.03      22.01      25.20      46.90      29.17
V19       394.75      32.57      34.80      30.30      37.96
V20       486.44      29.77      23.00      57.80      43.84
V21       466.70      33.44      37.40      46.60      64.91

Blocks                         $X_{.j}'$
1        3236.01     249.39     333.00     310.40     595.14
2        3207.68     244.70     311.80     315.30     584.34
3        3138.12     240.54     330.40     330.30     626.26

$X_{..}'$ = (9581.81, 734.63, 975.20, 956.00, 1805.74)

$\sum_i \sum_j X_{ij} X_{ij}'$ =
    1492786.75    114639.57    151793.96    145539.65    287600.18
     114639.57      9085.10     12497.05     10871.36     23720.98
     151793.96     12497.05     19465.37     14210.61     37225.89
     145539.65     10871.36     14210.61     15252.36     26992.21
     287600.18     23720.98     37225.89     26992.21     74160.42

$\dfrac{1}{J}\sum_i X_{i.} X_{i.}'$ =
    1490855.50    114495.90    151324.03    145562.00    286427.25
     114495.90      9061.59     12417.16     10872.97     23529.39
     151324.03     12417.16     18766.94     14196.38     35718.58
     145562.00     10872.97     14196.38     15202.95     26893.41
     286427.25     23529.39     35718.58     26893.41     70656.85

$\dfrac{1}{I}\sum_j X_{.j} X_{.j}'$ =
    1457560.00    111751.93    148313.34    145350.43    274549.12
     111751.93      8568.23     11372.31     11143.56     21050.01
     148313.34     11372.31     15108.20     14800.23     27966.42
     145350.43     11143.56     14800.23     14517.15     27420.35
     274549.12     21050.01     27966.42     27420.35     51802.19

$\dfrac{X_{..} X_{..}'}{IJ}$ =
    1457318.25    111731.46    148320.31    145400.15    274638.93
     111731.46      8566.36     11371.60     11147.71     21056.35
     148320.31     11371.60     15095.47     14798.27     27951.70
     145400.15     11147.71     14798.27     14506.92     27401.38
     274638.93     21056.35     27951.70     27401.38     51757.09

From equation (4.1.3),

$SS_E$ =
       1689.50       123.20       476.90        27.37      1262.75
        123.20        21.64        79.17         2.53       197.93
        476.90        79.17       685.69        12.27      1492.59
         27.37         2.53        12.27        39.17        79.83
       1262.75       197.93      1492.59        79.83      3458.46

From equation (4.1.4),

$$SS_V = \frac{1}{J}\sum_i X_{i.} X_{i.}' - \frac{X_{..} X_{..}'}{IJ}$$

$SS_V$ =
      33537.25      2764.43      3003.71       161.84     11788.31
       2764.43       495.22      1045.55      -274.74      2473.03
       3003.71      1045.55      3671.46      -601.88      7766.87
        161.84      -274.74      -601.88       696.03      -507.96
      11788.31      2473.03      7766.87      -507.96     18899.76

Hence

$SS_E + SS_V$ =
      35226.75      2887.64      3480.62       189.21     13051.06
       2887.64       516.86      1124.73      -272.20      2670.96
       3480.62      1124.73      4357.16      -589.61      9259.47
        189.21      -272.20      -589.61       735.21      -428.13
      13051.06      2670.96      9259.47      -428.13     22358.22

Now $|SS_E| = 0.28 \times 10^{11}$ and $|SS_E + SS_V| = 0.41 \times 10^{16}$.

Therefore

$$\Lambda = \frac{|SS_E|}{|SS_E + SS_V|} = \frac{0.28 \times 10^{11}}{0.41 \times 10^{16}} = 6.83 \times 10^{-6}.$$

Hence

$$\log_e \Lambda = \log_e 6.83 + \log_e 10^{-6} = 1.9213 + \overline{14}.1845 = \overline{12}.1058 = -11.8942.$$

From equation (4.1.6),

$$Q = -m \log_e \Lambda = m \times 11.8942,$$

where

$$m = n - \frac{p + q + 1}{2} = 62 - \frac{5 + 20 + 1}{2} = 49.$$

Therefore

$$Q = 49 \times 11.8942 = 582.82.$$

Since $m$ = 49 is large, the $\chi^2$ approximation is appropriate for the test. From standard tables, $\chi^2_{100}$ (5% level) = 124.3. Thus, the varieties are significantly different at the 5% level.

The two tests given in Sections 4.2(I) and 4.2(II) show the varieties to be significantly different.
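The two $\chi^2$ statistics can be reproduced from the reported determinants. The sketch below is only an arithmetic check: it takes the determinants as given (to the two significant figures printed above) rather than recomputing them from the matrices, and the function name is hypothetical.

```python
import math


def chi_square_from_wilks(det_error, det_total, total_df, p, q):
    """Q = -m log_e(Lambda) with m = n - (p + q + 1)/2, as in equation (4.1.6).

    Returns (Lambda, m, Q, degrees of freedom pq).
    """
    wilks_lambda = det_error / det_total
    m = total_df - (p + q + 1) / 2
    return wilks_lambda, m, -m * math.log(wilks_lambda), p * q


# Case I: yields of the five individual plants on a plot.
lam_1, m_1, q_1, df_1 = chi_square_from_wilks(0.27e19, 0.49e20, 62, 5, 20)

# Case II: the five correlated variates.
lam_2, m_2, q_2, df_2 = chi_square_from_wilks(0.28e11, 0.41e16, 62, 5, 20)

print(round(q_1, 2), round(q_2, 2), df_1, round(q_2 / q_1, 2))
```

Both statistics are referred to $\chi^2$ with $pq$ = 100 degrees of freedom, and their ratio is a little over 4.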
We note, however, that the x value for the case of the correlated variates x, y, z, h, and 0 is more than 4 times the corresponding value for the case of the yields of the individual plants on a plot. This is an indication that the correlated variates x, y, z, h, and 0 do discriminate between the varieties better and may therefore be a better criterion for comparing the varieties. University of Ghana http://ugspace.ug.edu.gh CHAPTER FI'VE PAIRWISE MULTIPLE COMPARISONS OF MEANS 5.1 Multiple Comparisons of Means Procedures. Six multiple comparisons of means procedures - FSD, MRT, SNK, TSD, SSD, and BET - have been listed in Section 1.4. We shall in this section define these various procedures. I . The Studentized Range . Let y^, i=l>.... I> he I uncorrelated equally replica­ ted treatment means. Let y^, i=l, I, be unbiased estimates 2 2of the means, each distributed as N(y^,b a ) where b is a known positive constant. Furthermore, let S 2 be an independent mean square estimate of a with f degrees of freedom. Suppose the means to be ordered, and denote them by H - ^2 5 .... - yI * To any pair y^ and £ , i fs2 2 ( i i ) — r ^ X £ , fs2(iii) (ur - P^) and — are independent (43). a University of Ghana http://ugspace.ug.edu.gh 111 . It has been shown (34) that the random variable Cur ~ P-;) fc 2 -j i yr ~ q = _ I i_ /(-il • i ) 2 = r--- x- u a o 2 f b s has the distribution of the studentized range (43) denoted by q (=, I, f), where * is the upper percentage point and I is the number of means included in the range. For the randomised block experiment described in Section 1.1, distributed as N(y., a /J), where a is the variance of X - . withl 11 estimates of the treatment means are given by X^., and each is 2 2 ' rnfi r « E E -1 2 S as its estimate. They are equally replicated and assumed to be uncorrelated. Hence from the above discussion X . - X- . q = — ----- .................. (5.1.1) S/J2 has the distribution of the studentized range, qf", I, f). 11. 
Fisher's Significant Difference (FSD).

The earliest multiple comparisons of means procedure was that of calculating an estimated standard error of the difference between two means and comparing the observed difference with it, using the appropriate number of degrees of freedom in the t-distribution (43). We thus set up a least significant difference (LSD) and compare each of the $\tfrac{1}{2}I(I-1)$ observed differences with it. The critical value is given (14) by

$$\mathrm{LSD} = t(\alpha, f)\,S_d, \qquad (5.1.2)$$

where $t(\alpha, f)$ is the tabular value of Student's t at the chosen Type I error rate $\alpha$ for the number of degrees of freedom f associated with the standard error, $S_d$.

A modification of the LSD by Fisher (18) led to what is termed Fisher's Significant Difference (FSD). The modification consists in performing the LSD only after obtaining a significant observed F ratio of the treatments mean square to the error mean square. He computed the critical value of the FSD as

$$\mathrm{FSD} = \mathrm{LSD} = t(\alpha, f)\,S_d \qquad (5.1.3)$$

if the observed F value is significant, or $\mathrm{FSD} = \infty$ if the observed F value is not significant.

III. Tukey's Significant Difference (TSD).

Utilizing the distribution of the studentized range - equation (5.1.1) - Tukey (34) showed that, of the $\tfrac{1}{2}I(I-1)$ differences of the type $(\bar{X}_{r.} - \bar{X}_{i.})$, the probability is $1 - \alpha$ that all these differences simultaneously satisfy

$$(\bar{X}_{r.} - \bar{X}_{i.}) - TS \le \mu_r - \mu_i \le (\bar{X}_{r.} - \bar{X}_{i.}) + TS,$$

where the constant T is

$$T = J^{-1/2}\,q(\alpha, I, f).$$

As a multiple comparisons of means procedure, Tukey regarded the estimated differences $(\bar{X}_{r.} - \bar{X}_{i.})$ as being significantly different from zero at the $\alpha$ level if

$$\bar{X}_{r.} - \bar{X}_{i.} > TS = S J^{-1/2}\,q(\alpha, I, f).$$

Let $S_d$ be the standard error of the difference between any two means. Then

$$S_d = \left(\frac{2S^2}{J}\right)^{1/2}, \quad \text{or} \quad \frac{S_d}{\sqrt{2}} = \frac{S}{\sqrt{J}}.$$

Hence the probability is $\alpha$ that

$$(\bar{X}_{r.} - \bar{X}_{i.}) > q(\alpha, I, f)\,\frac{S_d}{\sqrt{2}}. \qquad (5.1.4)$$

Tukey adjudged the difference $(\bar{X}_{r.} - \bar{X}_{i.})$
significantly different from zero (and the extreme means $\bar{X}_{r.}$ and $\bar{X}_{i.}$ therefore regarded as from different populations) if equation (5.1.4) holds. His critical value, denoted by TSD, is therefore given by

$$\mathrm{TSD} = q(\alpha, I, f)\,\frac{S_d}{\sqrt{2}}. \qquad (5.1.5)$$

In practice, $\bar{X}_{I.}$ is successively compared with $\bar{X}_{1.}$, $\bar{X}_{2.}$, etc., until a non-significant difference is obtained. If $(\bar{X}_{I.} - \bar{X}_{1.})$ is non-significant the test ends there; otherwise all means $\bar{X}_{1.}, \bar{X}_{2.}, \ldots$ found to be significantly different from $\bar{X}_{I.}$ are tested against $\bar{X}_{I-1.}$, again in succession starting from $\bar{X}_{1.}$. The procedure is continued until no further differences are significant.

The TSD is equivalent to the FSD if $\alpha$ is the same for both and I = 2, since

$$t(\alpha, f)\,S_d = q(\alpha, 2, f)\,\frac{S_d}{\sqrt{2}}.$$

If I > 2, the TSD value will be larger than the FSD value. The TSD utilizes the upper percentage points of the studentized range as found in (30).

IV. Student-Newman-Keuls (SNK).

Instead of using the fixed critical value of the studentized range based on the I means, Newman and Keuls (26) suggested testing the difference $(\bar{X}_{r.} - \bar{X}_{i.})$ against $S_d/\sqrt{2}$ times the studentized range based on (r - i + 1) means. Their procedure adjudges a pair of treatment means significantly different only if every subset of adjacent treatment means containing that pair is significantly different by the studentized range test suggested by them. This test takes account of the number of means in the subset. From equation (5.1.4), we obtain the critical value as

$$\mathrm{SNK}_v = q(\alpha, v, f)\,\frac{S_d}{\sqrt{2}} \qquad (5.1.6)$$

for $v = 2, 3, \ldots, I$. Thus, $\mathrm{SNK}_2$ equals the FSD value, $\mathrm{SNK}_I$ equals the TSD value, and, for intermediate values of v, $\mathrm{SNK}_v$ is intermediate to the FSD and TSD. The computational procedure is the same as that for the TSD, except that the critical value now varies (I - 1 critical values in all) in the component tests instead of being fixed as previously.
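The equivalence of the TSD and the FSD for two means (and hence of the SNK critical value for v = 2 and the FSD) rests on the identity q(α, 2, f) = √2 t(α, f). The following Python sketch checks it numerically. The tabular values t(0.05, 40) = 2.021 and q(0.05, 2, 40) = 2.858 are assumed standard table entries that are not quoted in the text, and the standard error is an arbitrary illustrative number.

```python
import math

t_tab = 2.021   # assumed: two-sided 5% point of Student's t, 40 d.f.
q_tab = 2.858   # assumed: 5% point of the studentized range, 2 means, 40 d.f.
s_d = 1.0       # illustrative standard error of a difference

fsd = t_tab * s_d                            # equation (5.1.3)
tsd_two_means = q_tab * s_d / math.sqrt(2)   # equation (5.1.5) with I = 2

print(f"FSD = {fsd:.3f}")
print(f"TSD (I = 2) = {tsd_two_means:.3f}")
```

Up to the rounding of the tables, the two critical values coincide, which is why the procedures only begin to differ when more than two means are involved.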
The SNK also utilizes the upper percentage points of the studentized range as found in (30).

V. Duncan's Multiple Range Test (MRT).

Duncan's MRT (12) is essentially a modification of the SNK procedure in which each difference $(\bar{X}_{r.} - \bar{X}_{i.})$ is tested against the $100(1 - \alpha_{r-i+1})$ per cent point of the studentized range based on (r - i + 1) means. $\alpha_{r-i+1}$ depends on (r - i) through the relation (26)

$$\alpha_{r-i+1} = 1 - (1 - \alpha)^{r-i}.$$

With the above modification, Duncan showed that the probabilities of error are redistributed among the components of the test, falling as the means compared get closer together in the ordering. He computed the critical values as

$$\mathrm{MRT}_v = q(\alpha_v, v, f)\,\frac{S_d}{\sqrt{2}} \qquad (5.1.7)$$

for $v = 2, 3, \ldots, I$. Except for v = 2, the MRT values are all larger than the FSD, but smaller than the TSD, and smaller than the corresponding $\mathrm{SNK}_v$ values. Modified studentized ranges in line with the modification in the Type I error rate $\alpha$, and calculated by Duncan (12), are used for the MRT.

VI. Scheffé's Significant Difference (SSD).

Suppose that the F-test of the hypothesis $H_0$: all $\alpha_i = 0$ at the $\alpha$ level of significance has rejected $H_0$. Suppose further that the F-test has I - 1 and $f_e$ degrees of freedom for treatments and error, respectively. Let $\mu_i$, $i = 1, \ldots, I$, be the treatment means with estimates $\bar{y}_i$. Scheffé (33) has devised a method, based on the F distribution, of judging all contrasts among the means $\mu_i$.

By defining

$$C = \sum_i \lambda_i \mu_i, \qquad \sum_i \lambda_i = 0,$$

as any contrast with $\hat{C}$ as its estimate and $S_c^2$ its estimated variance, he showed (33) that for the totality of contrasts, no matter what the true values of the C's, the probability is $1 - \alpha$ that they all satisfy

$$\hat{C} - F_s S_c \le C \le \hat{C} + F_s S_c,$$

where
(i) $F_s^2 = (I-1)\,F(\alpha, I-1, f_e)$;
(ii) $F(\alpha, I-1, f_e)$ is the $\alpha$ upper point of the F distribution with I - 1 and $f_e$ degrees of freedom.
As a pairwise multiple comparisons of means procedure, Scheffé regarded the estimated contrast $\hat{C}$ as being significantly different from zero at the $\alpha$ level if

$$|\hat{C}| > F_s S_c.$$

If the contrasts are the $\tfrac{1}{2}I(I-1)$ differences of the type $(\bar{X}_{r.} - \bar{X}_{i.})$, the treatment means $\bar{X}_{i.}$ and $\bar{X}_{r.}$ will be adjudged significantly different at the $\alpha$ level if

$$\bar{X}_{r.} - \bar{X}_{i.} > F_s S_d,$$

where in this case $S_c = S_d$. The critical value of Scheffé's method is therefore given by

$$\mathrm{SSD} = \left[(I-1)\,F(\alpha, I-1, f_e)\right]^{1/2} S_d. \qquad (5.1.8)$$

According to Scheffé, the SSD, unlike the methods so far discussed, is insensitive to violations of the assumptions of normality of errors and equality of error variances, and does not require that the treatments be equally replicated. He, however, gave a warning that if all the $\bar{X}_{i.}$ have the same variance, and all pairs $\bar{X}_{i.}$, $\bar{X}_{r.}$ have the same covariance, and the only contrasts of interest are the $\tfrac{1}{2}I(I-1)$ differences $(\bar{X}_{r.} - \bar{X}_{i.})$, the method of Tukey (TSD) should be used in preference to the SSD, because the confidence intervals will then be shorter.

VII. Bayes Exact Test (BET).

Duncan (11) and others have investigated Bayesian approaches to the multiple comparisons problem from which they have developed procedures which directly incorporate the observed F value into the critical value. In lieu of choosing a significance level $\alpha$, a measure of the relative seriousness of a Type I error to a Type II error is selected. Waller and Duncan (41) improved upon the earlier work of Duncan and the result was the Bayes exact test (BET). An outline of their work is presented below.

As usual, we consider the $\tfrac{1}{2}I(I-1)$ contrasts of the form $(\bar{X}_{r.} - \bar{X}_{i.})$. Waller and Duncan (41) defined a three-decision problem Q(i,r) where for each pair of estimated means $(\bar{X}_{r.}, \bar{X}_{i.})$ we choose one of the three decisions
(i) $d_1^{ir}$: $\bar{X}_{r.}$ is significantly larger than $\bar{X}_{i.}$;
(ii) $d_2^{ir}$: $\bar{X}_{r.}$ is not significantly different from $\bar{X}_{i.}$;
(iii) $d_3^{ir}$: $\bar{X}_{r.}$ is significantly smaller than $\bar{X}_{i.}$.

The multiple comparisons problem was then defined as that of solving the three-decision problem Q(i,r) for all the $\tfrac{1}{2}I(I-1)$ combinations of $(\bar{X}_{r.}, \bar{X}_{i.})$ considered simultaneously.

They formulated a linear additive loss model and defined two positive constants measuring the seriousness of the Type I and Type II errors, respectively. They then obtained the loss of any multiple comparisons decision. This loss was found to be the sum of the linear losses for the component decisions. Noting that the multiple comparisons problem was invariant with respect to changes in location, they concerned themselves only with solutions depending on the treatment means through the (I - 1) Helmert comparisons (41)

$$C_1 = \frac{\bar{X}_{1.} - \bar{X}_{2.}}{\sqrt{2}}, \quad C_2 = \frac{\bar{X}_{1.} + \bar{X}_{2.} - 2\bar{X}_{3.}}{\sqrt{6}}, \quad \ldots, \quad C_{I-1} = \frac{\bar{X}_{1.} + \ldots + \bar{X}_{I-1.} - (I-1)\bar{X}_{I.}}{\left[I(I-1)\right]^{1/2}}.$$

In the theoretical discussions, they considered a two-decision problem P(i,r) and showed that the derivation of the overall solution of the decision problem was equivalent to solving the two-decision problems P(i,r). From this angle they obtained a Bayes solution $t(k, F, I-1, f_e)$ for the critical equation they obtained. $t(k, F, I-1, f_e)$ is the minimum average risk t value for the chosen Type I to Type II error weight ratio k, the observed F value, and the degrees of freedom (I - 1) and $f_e$ for treatments and error, respectively.

Applying their results to the three-decision problem Q(i,r), they obtained the following three-decision component rules:
(i) $\bar{X}_{r.}$ is significantly larger than $\bar{X}_{i.}$ if $\bar{X}_{r.} - \bar{X}_{i.} > \mathrm{BET}$;
(ii) $\bar{X}_{r.}$ is not significantly different from $\bar{X}_{i.}$ if $|\bar{X}_{r.} - \bar{X}_{i.}| < \mathrm{BET}$;
(iii) $\bar{X}_{r.}$ is significantly smaller than $\bar{X}_{i.}$ if $\bar{X}_{r.} - \bar{X}_{i.} < -\mathrm{BET}$;
where BET denotes the critical value (the k-ratio least significant difference), and is given by

$$\mathrm{BET} = t(k, F, I-1, f_e)\,S_d. \qquad (5.1.9)$$

Fairly extensive tables of minimum risk t values have been published by the authors (41).

Examining the properties of the BET procedure, Carmer and Swanson (5) observed that large observed F ratios result in small critical values similar in magnitude to the FSD, while small F values (F < 2.5) result in considerably larger critical values. According to them, therefore, BET should avoid Type I errors when the observed F value indicates little or no variation among the means, and should avoid Type II errors when the F value indicates considerable variation among the means.

5.2 The Effect of the Standard Error on Multiple Comparisons of Treatment Means.

From Section 5.1, we observe that each of the multiple comparisons of means procedures utilizes the standard error of the difference between any two means, $S_d$. We shall in this section investigate how the different standard errors from the various analyses given in Chapters 3 and 4 affect multiple comparisons of treatment means.

We shall concern ourselves only with the case where the treatments are fixed. If the treatments are random, no meaningful multiple comparisons of their means can be made, for the treatments used in the experiment are then a random sample from a population of treatments. On the other hand, if the treatments are random but we wish to compare the treatments for a given sample, then the treatments under this condition are fixed. We can therefore perform multiple comparisons of the treatment means, where any inferences made are strictly confined to the treatments in the sample.

Let $\bar{X}_{s.}$ and $\bar{X}_{r.}$ be any two treatment means when the average yields per plot are used. Then, under the fixed-effects model,
(i) $\mathrm{Var}(\bar{X}_{s.}) = \mathrm{Var}(\bar{X}_{r.})$
$= \sigma_e^2/Jn$;
(ii) $\mathrm{Cov}(\bar{X}_{s.}, \bar{X}_{r.}) = 0$.
Therefore

$$\mathrm{Var}(\bar{X}_{s.} - \bar{X}_{r.}) = S_{d1}^2 = \frac{2\sigma_e^2}{Jn}. \qquad (5.2.1)$$

Let $\bar{X}_{s..}$ and $\bar{X}_{r..}$ be any two treatment means when yields of the individual plants on a plot are used. Then, under the fixed-effects model,
(i) $\mathrm{Var}(\bar{X}_{s..}) = \mathrm{Var}(\bar{X}_{r..}) = \sigma_e^2/Jn$;
(ii) $\mathrm{Cov}(\bar{X}_{s..}, \bar{X}_{r..}) = 0$.
Therefore

$$\mathrm{Var}(\bar{X}_{s..} - \bar{X}_{r..}) = S_{d2}^2 = \frac{2\sigma_e^2}{Jn}. \qquad (5.2.2)$$

From equations (5.2.1) and (5.2.2), $S_{d1}^2 = S_{d2}^2$. However, $S_{d1}^2$ is estimated by $2n(MS_E)/Jn$ and $S_{d2}^2$ is estimated by $2(MS_W)/Jn$. We found in Chapter 3 that $MS_W$ is a more efficient estimator of $\sigma_e^2$ than $n(MS_E)$. Hence, it is expected that $S_{d1} \ge S_{d2}$. Thus, a multiple comparisons procedure based on $S_{d2}$ will have shorter confidence intervals for the differences than if it is based on $S_{d1}$.

An approximation to the permutation test requires the degrees of freedom of the F-test to be multiplied by a factor (defined in Section 3.4). If this factor is greater than 1, the error mean square, and hence the standard error of the difference between any two treatment means, will be smaller than the corresponding one under the normal-theory test. A multiple comparisons of treatment means under the permutation test may therefore have shorter confidence intervals for the differences than a corresponding one under the normal-theory test.

5.3 Application of the FSD, SSD, MRT, and BET Procedures to the "Kpong Data".

The FSD, MRT and BET procedures have theoretically been shown (5,41) to be more sensitive than the TSD, SSD and SNK; the sensitivity being related to the lengths of the intervals for contrasts. We shall, in this section, apply the FSD, MRT, BET, and SSD procedures to the "Kpong data" (using the average yields per plot) to illustrate the theoretical findings.
The varietal means in descending order are:

    V10     V12     V4      V11     V6      V7      V5
    60.98   56.57   55.99   53.31   44.98   42.53   36.26

    V8      V2      V9      V1      V21     V16     V13
    34.72   29.58   25.51   23.53   21.64   20.42   18.61

    V14     V20     V17     V19     V18     V3      V15
    17.47   14.61   13.12   12.65   9.72    5.43    4.27

There are 21 varieties and hence 210 varietal mean differences.

I. Fisher's Significant Difference (FSD).

From equation (5.1.3), the critical value of the FSD test is given by

$$\mathrm{FSD} = t(\alpha, f)\,S_d, \quad \text{where } S_d = \left(2 \times \frac{MS_E}{J}\right)^{1/2}.$$

Making use of ANOVA Table 3.6.2 and choosing $\alpha = 0.05$, we obtain

$$\mathrm{FSD} = t(0.05, 40)\left(2 \times \frac{86.466}{3}\right)^{1/2} = 2.02\left(\frac{172.932}{3}\right)^{1/2} = 15.34.$$

With this critical value there are 117 significant differences and the varietal means may be partitioned into 11 groups; the members of each group are not significantly different. A graphical array of the means with the groups indicated by brackets is shown in Figure 5.3.1.

II. Scheffé's Significant Difference (SSD).

From equation (5.1.8), the critical value of the SSD is given by

$$\mathrm{SSD} = \left[(I-1)\,F(\alpha, I-1, f_e)\right]^{1/2} S_d.$$

From ANOVA Table 3.6.2 and choosing $\alpha = 0.05$, we obtain

$$\mathrm{SSD} = \left[20\,F(0.05, 20, 40)\right]^{1/2}\left(2 \times \frac{86.466}{3}\right)^{1/2} = (20 \times 1.84)^{1/2}\left(\frac{172.932}{3}\right)^{1/2} = 46.06.$$

With this critical value, there are 14 significant differences and the varietal means may be partitioned into 4 groups. A graphical array of the means with the groups indicated by brackets is shown in Figure 5.3.2.

III. Duncan's Multiple Range Test (MRT).

From equation (5.1.7), the I - 1 critical values of the MRT are given by

$$\mathrm{MRT}_v = q(\alpha_v, v, f)\,\frac{S_d}{\sqrt{2}}; \quad v = 2, 3, \ldots, I.$$

Choosing $\alpha = 0.05$ and making use of ANOVA Table 3.6.2, we obtain

$$\mathrm{MRT}_v = q(\alpha_v, v, 40)\left(\frac{86.466}{3}\right)^{1/2}.$$

The resulting 20 critical values are given below.

    MRT21 = 18.63   MRT17 = 18.52   MRT13 = 18.28   MRT9 = 17.88   MRT5 = 17.02
    MRT20 = 18.63   MRT16 = 18.47   MRT12 = 18.20   MRT8 = 17.72   MRT4 = 16.64
    MRT19 = 18.60   MRT15 = 18.41   MRT11 = 18.09   MRT7 = 17.56   MRT3 = 16.16
    MRT18 = 18.58   MRT14 = 18.36   MRT10 = 17.98   MRT6 = 17.29   MRT2 = 15.35

With these critical values there are 112 significant differences giving rise to 10 subsets of the means. A graphical array of the means with the subsets indicated by brackets is shown in Figure 5.3.3.

IV. Bayes Exact Test (BET).

From equation (5.1.9), the critical value of the BET is given by

$$\mathrm{BET} = t(k, F, I-1, f_e)\,S_d.$$

Using ANOVA Table 3.6.2 and Table A2 (41), we obtain

$$\mathrm{BET} = t(100, 10.929, 20, 40)\left(2 \times \frac{86.466}{3}\right)^{1/2}.$$

By linearly interpolating with respect to F, the critical value is given by

$$\mathrm{BET} = 1.859\left(2 \times \frac{86.466}{3}\right)^{1/2} = 14.11.$$

With this critical value there are 112 significant differences. The resulting 11 groups of means are shown in Figure 5.3.4.

Theoretically, of all the four procedures discussed above, it is the BET which avoids Type I errors when there is little or no variation among the means, and avoids Type II errors when the F value indicates considerable variation among the means. It may therefore be used as a criterion for comparing the other tests.

The observed F value of 10.93, as compared with the 5% level tabular value of 1.84, indicates considerable variation among the means. The results of the BET, shown in Figure 5.3.4, reflect this fact.

The FSD results, shown in Figure 5.3.1, are almost the same as those of BET. In each case, there are eleven groups with almost the same composition. In effect, four of the groups have the same elements while the rest differ only by about one or two elements. These results are in agreement with the findings of Carmer, Swanson, Waller, and Duncan mentioned in Section 1.4.

Duncan's MRT results, shown in Figure 5.3.3, are somewhat different, though not markedly so, from those of BET and FSD.
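The single-valued critical values of this section (FSD, SSD and BET) can be re-derived with a few lines of Python. This is a sketch for checking the arithmetic only: the function and variable names are illustrative, the tabular multipliers are those quoted in the text, and the pair count simply compares every absolute difference with one fixed critical value, which matches the FSD and SSD procedures but not the stepwise MRT.

```python
import math
from itertools import combinations

# Varietal means (average yields per plot) as listed in the text.
means = [60.98, 56.57, 55.99, 53.31, 44.98, 42.53, 36.26,
         34.72, 29.58, 25.51, 23.53, 21.64, 20.42, 18.61,
         17.47, 14.61, 13.12, 12.65, 9.72, 5.43, 4.27]

ms_error = 86.466   # error mean square used in the text
n_blocks = 3        # J
s_d = math.sqrt(2 * ms_error / n_blocks)

# Tabular multipliers quoted in the text.
fsd = 2.02 * s_d                    # t(0.05, 40)
ssd = math.sqrt(20 * 1.84) * s_d    # [(I-1) F(0.05, 20, 40)]^(1/2)
bet = 1.859 * s_d                   # t(100, 10.929, 20, 40)


def count_significant(values, critical):
    """Count pairs whose absolute difference exceeds a fixed critical value."""
    return sum(abs(a - b) > critical for a, b in combinations(values, 2))


print(f"S_d = {s_d:.3f}")
print(f"FSD = {fsd:.2f}, significant pairs = {count_significant(means, fsd)}")
print(f"SSD = {ssd:.2f}, significant pairs = {count_significant(means, ssd)}")
print(f"BET = {bet:.2f}")
```

With the means and multipliers as given, the sketch reproduces the FSD and SSD critical values and their counts of 117 and 14 significant differences.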
The heterogeneous nature of the means is reflected but not to the same degree as found with BET. In this case, there are 10 groups with sizes larger than those of BET and FSD.

The results of SSD, shown in Figure 5.3.2, are markedly different from any of the tests discussed above. There are only four groups. This gives an impression that the means are not different, contrary to the verdict of the F-test that the varieties are significantly different. The SSD is therefore less appropriate than the BET, FSD, or MRT. This is in agreement with the findings of Carmer and Swanson (5) stated in Section 1.4.

FIG. 5.3.1: GRAPHICAL ARRAY OF 21 VARIETAL MEANS INDICATED BY THE VARIETAL SYMBOLS. THE BRACKETING IS POSSIBLE AT THE END OF FSD TEST. THE MEANS WITHIN A BRACKET ARE SAID NOT TO DIFFER, BUT MEANS NOT BRACKETED TOGETHER ARE ASSERTED TO BE DIFFERENT.

    V15 V3 V18 V19 V17 V20 V14 V13 V16 V21 V1 V9 V2 V8 V5 V7 V6 V11 V4 V12 V10

FIG. 5.3.2: GRAPHICAL ARRAY OF 21 VARIETAL MEANS. THE BRACKETING IS POSSIBLE AT THE END OF SSD TEST. MEANS BRACKETED TOGETHER ARE NOT DIFFERENT.

...

$$\mathrm{MRT}_v = q(\alpha_v, v, 40)\left(\frac{\ldots}{3}\right)^{1/2}.$$

The individual critical values are given below.

    MRT21 = 26.17   MRT16 = 25.95   MRT11 = 25.42   MRT6 = 24.29
    MRT20 = 26.17   MRT15 = 25.87   MRT10 = 25.27   MRT5 = 23.91
    MRT19 = 26.14   MRT14 = 25.80   MRT9  = 25.12   MRT4 = 23.38
    MRT18 = 26.10   MRT13 = 25.68   MRT8  = 24.89   MRT3 = 22.70
    MRT17 = 26.02   MRT12 = 25.75   MRT7  = 24.66   MRT2 = 21.57

With these critical values there are 71 significant differences. The resulting grouping of the means is shown in Figure 5.4.2.

(ii) The Case of Using the Correlated Variates x, y, h, z and θ.

In this case, the varietal means are defined by the average of the five correlated variates x, y, h, z and θ. As a result, the means are different from those given in Section 5.3. They are given below in descending order.

    V4      V10     V7      V2      V5      V11     V12
    187.46  165.14  160.58  157.10  155.29  154.10  150.76

    V8      V6      V16     V1      V9      V21     V20
    150.13  137.51  135.82  133.87  133.36  129.81  128.17

    V13     V17     V19     V14     V18     V3      V15
    122.74  117.40  106.08  104.67  94.26   93.55   92.68

From Section 4.2 (II), the error sum of squares is given by

$$SS_E = \begin{pmatrix} 1689.50 & 123.20 & 476.90 & 27.37 & 1262.75 \\ 123.20 & 21.64 & 79.17 & 2.53 & 197.92 \\ 476.90 & 79.17 & 685.69 & 12.27 & 1492.59 \\ 27.37 & 2.53 & 12.27 & 39.17 & 79.83 \\ 1262.75 & 197.92 & 1492.59 & 79.83 & 3458.46 \end{pmatrix}$$

By the same argument given in Section 5.4 (IV) (i), the error mean square is given by

$$S_E^2 = \frac{1689.50 + 21.64 + 685.69 + 39.17 + 3458.46}{40 \times 5} = 29.472.$$

The critical values of Duncan's MRT are given by

$$\mathrm{MRT}_v = q(\alpha_v, v, 40)\left(\frac{29.472}{3}\right)^{1/2}.$$

The individual critical values are given below.

    MRT21 = 10.88   MRT16 = 10.78   MRT11 = 10.56   MRT6 = 10.09
    MRT20 = 10.88   MRT15 = 10.75   MRT10 = 10.50   MRT5 = 9.94
    MRT19 = 10.86   MRT14 = 10.72   MRT9  = 10.44   MRT4 = 9.72
    MRT18 = 10.84   MRT13 = 10.67   MRT8  = 10.34   MRT3 = 9.43
    MRT17 = 10.81   MRT12 = 10.63   MRT7  = 10.25   MRT2 = 8.96

With these critical values there are 170 significant differences. The resulting grouping of the means is shown in Figure 5.4.3.

FIG. 5.4.1: GRAPHICAL ARRAY OF 21 VARIETY MEANS. THE BRACKETING IS THE RESULT OF DUNCAN'S MULTIPLE RANGE TEST (MRT). MEANS BRACKETED TOGETHER ARE NOT DIFFERENT. (The case of using yields of the individual plants on a plot with within plots mean square as error)

FIG. 5.4.2: GRAPHICAL ARRAY OF 21 VARIETAL MEANS. THE BRACKETING IS POSSIBLE AT THE END OF DUNCAN'S MULTIPLE RANGE TEST (MRT). MEANS BRACKETED TOGETHER ARE NOT DIFFERENT. (The case of Multivariate Analysis of Variance using yields of the individual plants on a plot)

FIG. 5.4.3: GRAPHICAL ARRAY OF 21 VARIETAL MEANS OBTAINED FROM THE CORRELATED VARIABLES. THE BRACKETING IS THE RESULT OF DUNCAN'S MULTIPLE RANGE TEST (MRT). MEANS BRACKETED TOGETHER ARE NOT DIFFERENT. (The case of Multivariate Analysis of Variance using the correlated variables)

There is considerable variation in the results shown in Figures 5.4.1, 5.4.2 and 5.4.3 even though the same test was employed. The variation is due to the different standard errors used in each case.

Duncan's MRT produced 10 groups when the average yields per plot were used. Using yields of the individual plants on a plot, the MRT produced 12 groups. Both cases show evidence of heterogeneous means but the latter one is more pronounced. This is to be expected in view of the deductions given in Section 5.2.
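Since the differences between the figures are attributed to the standard errors, it may help to re-trace how the error mean square for the correlated variates enters the MRT critical values. The Python sketch below pools the diagonal of the error sum of squares matrix over 40 x 5 degrees of freedom, as done in the text, and forms the multiplier of the studentized range. The names are illustrative, and the value q = 2.858 for two means is an assumed entry of Duncan's 5% table that is not quoted in the text.

```python
import math

# Diagonal of SS_E for the five correlated variates (Section 4.2 (II)).
sse_diagonal = [1689.50, 21.64, 685.69, 39.17, 3458.46]
error_df = 40      # error degrees of freedom per variate
n_variates = 5
n_blocks = 3       # J

s2_error = sum(sse_diagonal) / (error_df * n_variates)
scale = math.sqrt(s2_error / n_blocks)   # multiplies q(alpha_v, v, 40)

# Assumed tabular value for v = 2 means and 40 d.f. (Duncan's 5% table).
mrt_2 = 2.858 * scale

print(f"S_E^2 = {s2_error:.3f}")
print(f"scale = {scale:.4f}")
print(f"MRT_2 = {mrt_2:.2f}")
```

The smaller this multiplier, the smaller every critical value becomes, which is consistent with the much larger number of significant differences reported for the correlated variates.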
The MRT applied to the multivariate analysis of variance using yields of the individual plants on a plot produced 8 groups. This gives an impression of homogeneous means contrary to the verdicts of the $\chi^2$ and F tests.

For the multivariate analysis of variance using the correlated variates x, y, h, z, and θ, the MRT also produced 8 groups (Figure 5.4.3). The nature of these 8 groups is, however, different from all the other cases considered earlier. They do not overlap to the extent found in the other cases; in fact, three of the groups are disjoint. Thus, the heterogeneous nature of the means is reflected.

The MRT applied to the permutation test produced the same results as the case of the average yields per plot under the fixed-effects model. This is to be expected since, because only three blocks were used, ...

... has the F-distribution with $f_1$ and $f_2$ degrees of freedom. Davies' (10) method consists in finding approximate confidence limits ...

In the F tables, the values of $f_1$ or $f_2$ equal to infinity correspond to the case where one variance is known exactly. The confidence limits for $\sigma_1^2$ are obtained by multiplying those for the ratio $\sigma_1^2/\sigma_2^2$ by the known value of $\sigma_2^2$. If $\sigma_2^2 = 1$, the confidence limits of the ratio are those of $\sigma_1^2$. To find the limits for a single variance, therefore, we find the percentage points corresponding to $f_2 = \infty$ and $f_1$ = number of degrees of freedom on which the estimate of $\sigma_1^2$ is based.

Let
(i) $L_1(\alpha)$ = the multiplier for the lower confidence limit of the ratio of two variances based on $f_1$ and $f_2$ degrees of freedom for probability $\alpha$;
(ii) $L_2(\alpha)$ = corresponding ...

... $\hat{\sigma}_\alpha^2$ and $\hat{\sigma}_\beta^2$ to do the comparison. Since $\hat{\sigma}_\alpha^2$ and $\hat{\sigma}_\beta^2$ are similar, we will only consider $\hat{\sigma}_\alpha^2$.
From equation (6.2.1),

$$\mathrm{Var}(\hat{\sigma}_\alpha^2) = D_2 = \frac{1}{J^2}\left[\mathrm{Var}(MS_V) + \mathrm{Var}(MS_E)\right] = \frac{2}{J^2n^2(I-1)}\left[(\sigma_e^2 + n\sigma_\gamma^2 + Jn\sigma_\alpha^2)^2 + \frac{(\sigma_e^2 + n\sigma_\gamma^2)^2}{J-1}\right].$$

From equation (6.2.3),

$$\mathrm{Var}(\hat{\sigma}_\alpha^2) = D_4 = \frac{1}{J^2n^2}\left[\mathrm{Var}(MS_V) + \mathrm{Var}(MS_I)\right] = \frac{2}{J^2n^2(I-1)}\left[(\sigma_e^2 + n\sigma_\gamma^2 + Jn\sigma_\alpha^2)^2 + \frac{(\sigma_e^2 + n\sigma_\gamma^2)^2}{J-1}\right].$$

Thus $D_2 = D_4$. That is, the efficiency with which $\sigma_\alpha^2$ is estimated is the same whether one uses the average yields or yields of the individual plants on a plot.

6.3 Estimation of Variance Components by the Restricted Maximum Likelihood Method.

We remarked in Section 6.2 that the analysis of variance method for estimating variance components could lead to negative estimates, and that this objectionable situation led to the search for other methods. One of the methods which evolved, and which is appropriate for the randomised block experiment described in Section 1.1, is based on a restricted maximum likelihood principle. It is called the restricted maximum likelihood method and is due to Thompson (39). Generally, it consists of maximizing the joint likelihood of that portion of the set of sufficient statistics which is location invariant.

I. The Restricted Maximum Likelihood Method.

To begin with, Thompson (39) considered a maximization problem. He defined the following:
(i) w and $\hat{w}$ are p-dimensional column vectors, where $\hat{w}$ is the restricted maximum likelihood estimate of w;
(ii) A is a non-singular p x p matrix with $A'$ as its transpose;
(iii) g(w) is a function of p variables;
(iv) $G_i(w) = \partial g(w)/\partial w_i$, $i = 1, 2, \ldots, p$;
(v) G(w) is the column vector with elements $G_1(w), G_2(w), \ldots, G_p(w)$;
(vi) $b_i$ and $a_i$ are the i-th rows of $A^{-1}$ and $A'$ respectively.

He then set about the maximization problem by considering the following theorem from the theory of non-linear programming.
Theorem 1: If g(w) is a differentiable function at $\hat{w}$, then a necessary set of conditions that it has a relative maximum at $\hat{w}$ subject to the constraints $A^{-1}w \ge 0$ is
(i) $A^{-1}\hat{w} \ge 0$;
(ii) $A'G(\hat{w}) \le 0$;
(iii) $\hat{w}'G(\hat{w}) = 0$. (6.3.1)

An alternative form of the theorem is: for each i, $i = 1, 2, \ldots, p$, either
(i) $b_i\hat{w} = 0$ and $a_iG(\hat{w}) \le 0$, or
(ii) $b_i\hat{w} > 0$ and $a_iG(\hat{w}) = 0$. (6.3.2)

He referred to the set of all points w satisfying $A^{-1}w \ge 0$ as the admissible region.

Thompson studied the conditions under which there will be a unique solution to the relations (6.3.1) or (6.3.2) by considering a property of strictly concave functions and their tangent planes. In effect, he came out with the following theorem and its corollary.

Theorem 2: If g(w) is a strictly concave function having gradient vector G(w) and if $\hat{w}$ satisfies (6.3.1), then $g(\hat{w}) > g(w)$ whenever $A^{-1}w \ge 0$ and $w \ne \hat{w}$.

Corollary: If g(w) is a strictly concave function having gradient vector G(w), then the solution to (6.3.1) or (6.3.2) is unique.

He next considered a general theory for multiple classifications.
For the function g(w) - equation (6.3.3) (6.3.4) Hence condition (6.3.1) (iii) becomes ( 6 . 3 . 5 ) Thompson applied the above results to two special cases - randomised block experiment with one observation per plot and n observations per plot. Discussed below are the two special cases with observations defined as average yields per plot in one case and yields of individual plants on a plot in the other case. II. The Case of Using Average Yields Per Plot> The ANOVA table for this situation is given by Table 3.1.1. The expected values of the mean squares are given in Table 3,1.2 and is reproduced below in Table 6,3,1, University of Ghana http://ugspace.ug.edu.gh 158. Table 6.3.1 £ Expected Values of Mean Squares of Table 3,1,1 Under the Random-Effects Model. Treatments and Environmental Effects .are assumed to be Additive, Source Mean Square Expected Mean Square ^Treatments ... . _ < II 2 U3 2 EO E v)l - w 3 - ”4 * Jo. ; Blocks msb = m2 2 e (hsb) - » 2 - ^ ; \ 2 Error nri l n E 2 6CnrIm = wx Thus , where E (M) = w (i) M A 2 > r A m 2 ; (ii) w = | w 2 ’ (iii) a2 = 2 d N V w„ V 3 / \ 1 a e /n\ l ° t JO a 1 0 °\ f 1 0 1 1 0 * > (v) A- 1 = - 1 H 1 0 i V- 1 0 The conditions (6.3.1) then become , £2 (W2 ~ V , f3(w3 ‘ M3). (ij U I —x + , w. - 2 W2 (ii) wx < w2 ; (ii) w2 > M2 ; (iii) w-j 1 w-^ ; (ii i)1 w, * M'3 » .(6.3.6) University of Ghana http://ugspace.ug.edu.gh 159. According to Thompson, the necessary conditions for maximum likelihood estimates are that, for the conditions given in (6.3.6), equality must hold in at least one equation of each pair; further, if M^ > 0 , a maximum likelihood estimate cannot occur at w^ = 0. From these considerations, he obtain­ ed four possibilities. The solutions under the various conditions are given in Table 6.3.2 below. These solutions are found to be mutually exclusive and hence the maximum likelihood estimates are unique. He denoted Mr ^ to mean the mean square obtained by pooling Mr , ,M^. That is (f M . f i M, ) ^ r r +.... 
+ k kJM (£r +____+ £k) (6.3.7) Table 6.3.2 * Estimates of Variance Components of Table 6.3.1 Under Four Possible Conditions. CCNDITICNS a.2 /% /n * B2 J^ cc2 (i) < M2 , M1 < M3 Ml «2 -«L m3 - m l (ii) M l ^ M2 , M2 * ^ 2 0 (iii) V - M3, M3 1 M 13 M2, M ^ > M3 and either or M123 0 0 University of Ghana http://ugspace.ug.edu.gh 160. III. The Case of Using Yields of the Individual Plants on a Plot*____________________________ The ANOVA table for this situation is given by Table 3.2.1. The expected values of the mean squares (under the random-effects model) are given in Table 3.2.2 and is reproduced below in Table 6.3.3. Table 6.3.3 ; Expected Values of Mean Squares of Table 3.2.1 Under the Random-Effects Model * Source Mean Square Expected Mean Square Treatments s T«> E 0By) = w4 = ae2 + na/ + Jna*2 Blocks = M3 2 2 2 E(MSg) = = ae + na^ + Ino^ Interaction nro i c E C ^ ) = w2 = oe2 + v a f Within Plots -- E(MSW) = wx = ae2 Thus, E(M) = w = Act > f « \ where (i) M = M, V M, •, (ii) w = V w. w„ w. w„ (iii) a2 = University of Ghana http://ugspace.ug.edu.gh 161 Civ) A = A 1 1 >1 Cv) , - 1 The conditions given by equation (6.3.1) now become C i ) w : > 0 ; f 1 Cw1. - M ^ f 2 Cw2 - M2 ) _ £ 3 (i ) > .) = ; ES ( i S > . S l i j u - ------------ -^------ + w Wo T w„ w. C i i ) wi < w 2 • y C i i ) ' j . V a C i i i ) * W2 - * 3 V 9 C i i i ) ' w 3 > r ) a Civ) W2 1 A • C i v ) ' s T A 1 .... (6.3.8) In this case, Thompson obtained eight possibilities depending on whether equality does or does not hold in (ii) , (iii) and (iv) of equation (6.3.8). Table 6.3.4 gives the various solu­ tions and again they are found to be mutually exclusive showing that the maximum likelihood estimates are unique. The notation is that of equation (6.3.7). University of Ghana http://ugspace.ug.edu.gh 161. Table 6.3.4 * Estimates of the Variance Components of Table 6.3.3 Under Eight Possible Conditions,. 
$$\begin{array}{lllll}
\text{Conditions} & \hat\sigma_e^2 & n\hat\sigma_\pi^2 & In\hat\sigma_\beta^2 & Jn\hat\sigma_\alpha^2 \\ \hline
\text{(i)}\ M_1 < M_2,\ M_2 < M_3,\ M_2 < M_4 & M_1 & M_2 - M_1 & M_3 - M_2 & M_4 - M_2 \\
\text{(ii)}\ M_1 < M_{23},\ M_2 \ge M_3,\ M_{23} < M_4 & M_1 & M_{23} - M_1 & 0 & M_4 - M_{23} \\
\text{(iii)}\ M_1 < M_{24},\ M_2 \ge M_4,\ M_{24} < M_3 & M_1 & M_{24} - M_1 & M_3 - M_{24} & 0 \\
\text{(iv)}\ M_1 \ge M_2,\ M_{12} < M_3,\ M_{12} < M_4 & M_{12} & 0 & M_3 - M_{12} & M_4 - M_{12}
\end{array}$$

[Rows (v) to (viii) of Table 6.3.4 and the text that follows the table are not legible in the source. The text resumes as follows.]

$$\hat\sigma_\alpha^2 = \frac{M_4 - M_{23}}{Jn}, \qquad \text{where } M_{23} = \frac{f_2 M_2 + f_3 M_3}{f_2 + f_3}.$$

The variances of these estimators are given in Table 6.3.5.

Table 6.3.5: Variances of $\hat\sigma_e^2$ and $\hat\sigma_\alpha^2$ (Restricted Maximum Likelihood Method). Treatments and Environmental Effects are Assumed to be Additive in Both Cases.

$$\begin{array}{lll}
 & \operatorname{Var}(\hat\sigma_e^2) & \operatorname{Var}(\hat\sigma_\alpha^2) \\ \hline
\text{Average Yields Per Plot} & D_5 = \dfrac{2\sigma_e^4}{I(J-1)} & D_6 = \dfrac{2}{J^2 n^2}\left[\dfrac{(\sigma_e^2 + Jn\sigma_\alpha^2)^2}{I-1} + \dfrac{\sigma_e^4}{I(J-1)}\right] \\[3ex]
\text{Yields of Individual Plants on a Plot} & D_7 = \dfrac{2\sigma_e^4}{IJ(n-1)} & D_8 = \dfrac{2}{J^2 n^2}\left[\dfrac{(\sigma_e^2 + Jn\sigma_\alpha^2)^2}{I-1} + \dfrac{\sigma_e^4}{I(J-1)}\right]
\end{array}$$

From Table 6.3.5, $D_7 < D_5$ and $D_8 = D_6$. This situation is the same as that obtained in Section 6.2 (III). Accordingly, the inferences made in that section hold good for this section.

V. Comparison of the Analysis of Variance and Restricted Maximum Likelihood Methods

Examination of the estimates provided by the analysis of variance and restricted maximum likelihood methods indicates that the two methods give identical results when the former gives positive estimates. For example, under condition (i) of Tables 6.3.2 and 6.3.4, the analysis of variance method gives positive estimates which are the same as those obtained by the restricted maximum likelihood method. The two methods differ, however, when the analysis of variance method gives negative estimates. What the restricted maximum likelihood method does in this situation is to assign the value zero to the otherwise negative estimates and simultaneously to adjust the values of the other estimates by pooling some of the mean squares.

We shall examine one of the conditions under which the analysis of variance method gives negative estimates. One such condition is (ii) of Tables 6.3.2 and 6.3.4.
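The zero-and-pool behaviour described above can be made concrete for the three-mean-square case of Table 6.3.2. The following is a minimal sketch in Python, not Thompson's procedure as published: the function names `pooled` and `rml_three_mean_squares` and the input values are hypothetical, and the branches simply transcribe conditions (i) to (iv) of Table 6.3.2, with $M_1$, $M_2$, $M_3$ the error, block and treatment mean squares and $f_1$, $f_2$, $f_3$ their degrees of freedom.

```python
def pooled(mean_squares, dfs):
    """Pooled mean square of equation (6.3.7): df-weighted average."""
    return sum(f * m for m, f in zip(mean_squares, dfs)) / sum(dfs)


def rml_three_mean_squares(m1, m2, m3, f1, f2, f3):
    """Estimates of (sigma_e^2/n, I*sigma_beta^2, J*sigma_alpha^2).

    m1, m2, m3 are the error, block and treatment mean squares.
    Returns the label of the condition that applied and the estimates.
    """
    m12 = pooled((m1, m2), (f1, f2))
    m13 = pooled((m1, m3), (f1, f3))
    m123 = pooled((m1, m2, m3), (f1, f2, f3))

    if m1 < m2 and m1 < m3:            # (i): no estimate would be negative
        return "i", (m1, m2 - m1, m3 - m1)
    if m1 >= m2 and m12 < m3:          # (ii): block component set to zero
        return "ii", (m12, 0.0, m3 - m12)
    if m1 >= m3 and m13 < m2:          # (iii): treatment component set to zero
        return "iii", (m13, m2 - m13, 0.0)
    return "iv", (m123, 0.0, 0.0)      # (iv): both components set to zero


# Hypothetical mean squares and degrees of freedom, for illustration only.
condition, estimates = rml_three_mean_squares(10.0, 4.0, 50.0, 8, 2, 4)
print(condition, estimates)
```

With these hypothetical inputs the block mean square is smaller than the error mean square, so the analysis of variance estimate of the block component would be negative; the sketch sets it to zero and replaces $M_1$ by the pooled value $M_{12}$, which is condition (ii).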
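(a) The Case of Using Average Yields Per Plot

We shall assume that the treatments and environmental effects are additive. Let $\hat\sigma_{e1a}^2$, $\hat\sigma_{\alpha 1a}^2$ and $\hat\sigma_{\beta 1a}^2$ be the analysis of variance estimators of $\sigma_e^2/n$, $\sigma_\alpha^2$ and $\sigma_\beta^2$ when the average yields per plot are used. Then

$$\text{(i)}\quad \operatorname{Var}(\hat\sigma_{e1a}^2) = \frac{2\sigma_e^4}{n^2 (I-1)(J-1)};$$

$$\text{(ii)}\quad \operatorname{Var}(\hat\sigma_{\alpha 1a}^2) = \frac{2}{J^2 n^2}\left[\frac{(\sigma_e^2 + Jn\sigma_\alpha^2)^2}{I-1} + \frac{\sigma_e^4}{(I-1)(J-1)}\right].$$

Let $\hat\sigma_{e1r}^2$, $\hat\sigma_{\alpha 1r}^2$ and $\hat\sigma_{\beta 1r}^2$ be the restricted maximum likelihood estimators of $\sigma_e^2/n$, $\sigma_\alpha^2$ and $\sigma_\beta^2$ when the average yields per plot are used. Then under condition (ii),

$$\hat\sigma_{e1r}^2 = M_{12}, \qquad \hat\sigma_{\beta 1r}^2 = 0, \qquad \hat\sigma_{\alpha 1r}^2 = \frac{M_3 - M_{12}}{J}.$$

Their respective variances are

$$\text{(i)}\quad \operatorname{Var}(\hat\sigma_{e1r}^2) = \frac{2\sigma_e^4}{n^2 I(J-1)};$$

$$\text{(ii)}\quad \operatorname{Var}(\hat\sigma_{\alpha 1r}^2) = \frac{2}{J^2 n^2}\left[\frac{(\sigma_e^2 + Jn\sigma_\alpha^2)^2}{I-1} + \frac{\sigma_e^4}{I(J-1)}\right].$$

Comparing the four variances given above, we observe that

$$\text{(i)}\ \operatorname{Var}(\hat\sigma_{e1r}^2) < \operatorname{Var}(\hat\sigma_{e1a}^2); \qquad \text{(ii)}\ \operatorname{Var}(\hat\sigma_{\alpha 1r}^2) < \operatorname{Var}(\hat\sigma_{\alpha 1a}^2). \tag{6.3.9}$$

[The opening of case (b), on using yields of the individual plants on a plot, is not legible in the source. The text resumes with the restricted maximum likelihood estimators under condition (ii).]

$$\text{(i)}\ \hat\sigma_{e2r}^2 = M_1; \qquad \text{(ii)}\ \hat\sigma_{\alpha 2r}^2 = \frac{M_4 - M_{23}}{Jn}; \qquad \text{(iii)}\ \hat\sigma_{\pi 2r}^2 = \frac{M_{23} - M_1}{n}.$$

Their respective variances are

$$\text{(i)}\quad \operatorname{Var}(\hat\sigma_{e2r}^2) = \frac{2\sigma_e^4}{IJ(n-1)};$$

$$\text{(ii)}\quad \operatorname{Var}(\hat\sigma_{\alpha 2r}^2) = \frac{2}{J^2 n^2}\left[\frac{(\sigma_e^2 + n\sigma_\pi^2 + Jn\sigma_\alpha^2)^2}{I-1} + \frac{(\sigma_e^2 + n\sigma_\pi^2)^2}{I(J-1)}\right];$$

$$\text{(iii)}\quad \operatorname{Var}(\hat\sigma_{\pi 2r}^2) = \frac{2}{n^2}\left[\frac{(\sigma_e^2 + n\sigma_\pi^2)^2}{I(J-1)} + \frac{\sigma_e^4}{IJ(n-1)}\right].$$

Comparing the six variances given above, we see that

$$\text{(i)}\ \operatorname{Var}(\hat\sigma_{e2r}^2) = \operatorname{Var}(\hat\sigma_{e2a}^2); \qquad \text{(ii)}\ \operatorname{Var}(\hat\sigma_{\alpha 2r}^2) < \operatorname{Var}(\hat\sigma_{\alpha 2a}^2); \qquad \text{(iii)}\ \operatorname{Var}(\hat\sigma_{\pi 2r}^2) < \operatorname{Var}(\hat\sigma_{\pi 2a}^2). \tag{6.3.10}$$

The results shown in equations (6.3.9) and (6.3.10) indicate that the restricted maximum likelihood estimators are more efficient than those of the analysis of variance. Thompson has, however, shown that the estimators may be biased due to the pooling of the mean squares. Nevertheless, the restricted maximum likelihood method is more plausible than the analysis of variance method when the latter gives negative estimates. Biased estimates are more tolerable than negative estimates of essentially positive parameters, since no suitable explanation can be offered for the negative estimates.

6.4.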
Estimation of Variance Components for the "Kpong Data"

Assuming the random-effects model, the variance components and their estimators for the cases of using the average yields per plot and yields of the individual plants on a plot, employing the analysis of variance and restricted maximum likelihood methods, are those given in Sections 6.2 and 6.3.

I. The Analysis of Variance Method

Employing the analysis of variance method, the estimators using the average yields per plot are those given by equations (6.2.1) and (6.2.2). From Table 3.6.2 therefore, we obtain

(i) $\hat\sigma_e^2/5 = 86.466$;
(ii) $\hat\sigma_\alpha^2 = \dfrac{944.993 - 86.466}{3} = 286.176$;
(iii) $\hat\sigma_\beta^2 = \dfrac{22.559 - 86.466}{21} = -3.043$.

Making use of equations (6.1.1), (6.1.2), and (6.1.3), the standard errors of the estimates and the 95% confidence intervals for the variance components are:

(i) $58.105 \le \sigma_e^2/5 \le 141.631$;
(ii) $\mathrm{S.E.}(\hat\sigma_\alpha^2) = 95.183$, $125.527 \le \sigma_\alpha^2 \le 689.373$;
(iii) $\mathrm{S.E.}(\hat\sigma_\beta^2) = 1.117$, $-3.849 \le \sigma_\beta^2 \le 38.248$.

When yields of the individual plants on a plot are used, the estimators provided by the analysis of variance method are those given by equations (6.2.3) and (6.2.4). From Table 3.6.4 therefore, we obtain

(i) $\hat\sigma_e^2 = 131.208$;
(ii) $\hat\sigma_\beta^2 = \dfrac{112.980 - 423.984}{21 \times 5} = -2.962$;
(iii) $\hat\sigma_\alpha^2 = \dfrac{4741.839 - 423.984}{3 \times 5} = 287.857$;
(iv) $\hat\sigma_\pi^2 = \dfrac{423.984 - 131.208}{5} = 58.555$.

From equations (6.1.1), (6.1.2), and (6.1.3), the standard errors of the estimates and the 95% confidence intervals for the true parameters are:

(i) $\mathrm{S.E.}(\hat\sigma_e^2) = 11.643$; the confidence limits coincide with its value;
(ii) $\mathrm{S.E.}(\hat\sigma_\beta^2) = 1.355$, $-3.772 \le \sigma_\beta^2 \le 38.397$;
(iii) $\mathrm{S.E.}(\hat\sigma_\alpha^2) = 95.514$, $126.634 \le \sigma_\alpha^2 \le 692.494$;
(iv) $\mathrm{S.E.}(\hat\sigma_\pi^2) = 18.650$, $30.742 \le \sigma_\pi^2 \le 112.654$.

We observe from the above that the estimates of $\sigma_\beta^2$ are negative. A remark on the possibility of such a situation occurring was made in Section 6.2 (II).
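The analysis of variance estimates above are differences of mean squares divided by known constants, so the arithmetic can be checked directly. The following is a minimal sketch in Python under stated assumptions: the function and key names are illustrative and not from the thesis, the mean squares are those quoted in this section from Tables 3.6.2 and 3.6.4, and the values I = 21 treatments, J = 3 blocks and n = 5 plants per plot are inferred from the divisors 21 x 5 and 3 x 5 used in the text.

```python
I, J, n = 21, 3, 5  # treatments, blocks, plants per plot (inferred from the text)


def anova_average_yields(m_error, m_blocks, m_treat, I, J):
    """ANOVA estimates when the average yield per plot is the observation."""
    return {
        "sigma_e2_over_n": m_error,
        "sigma_beta2": (m_blocks - m_error) / I,
        "sigma_alpha2": (m_treat - m_error) / J,
    }


def anova_individual_plants(m_within, m_inter, m_blocks, m_treat, I, J, n):
    """ANOVA estimates when individual plant yields are the observations."""
    return {
        "sigma_e2": m_within,
        "interaction": (m_inter - m_within) / n,
        "sigma_beta2": (m_blocks - m_inter) / (I * n),
        "sigma_alpha2": (m_treat - m_inter) / (J * n),
    }


avg = anova_average_yields(86.466, 22.559, 944.993, I, J)
ind = anova_individual_plants(131.208, 423.984, 112.980, 4741.839, I, J, n)

for label, estimates in (("average yields", avg), ("individual plants", ind)):
    for name, value in estimates.items():
        print(f"{label:18s} {name:16s} {value:10.3f}")
```

In both cases the block mean square is smaller than the mean square it is compared with, which is why the estimate of the block component comes out negative.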
Estimates of $\sigma_\alpha^2$ and $\sigma_\beta^2$ using yields of the individual plants on a plot are slightly bigger than the corresponding ones when the average yields per plot are used. The same is not true for $\sigma_e^2$: the value of $\hat\sigma_e^2$ for the latter case is more than three times the corresponding value for the former case. This has been explained in Section 3.6 (II).

II. The Restricted Maximum Likelihood Method

The estimators provided by the restricted maximum likelihood method using the average yields per plot are those given in Table 6.3.2. From Table 3.6.2, $M_1 = 86.466$; $M_2 = 22.559$; $M_3 = 944.993$. This is condition (ii) of Table 6.3.2. Hence, the estimates are:

(i) $\hat\sigma_e^2/n = M_{12} = \dfrac{f_1 M_1 + f_2 M_2}{f_1 + f_2}$. Hence $\hat\sigma_e^2/5 = \dfrac{40 \times 86.466 + 2 \times 22.559}{40 + 2} = 83.375$;
(ii) $I\hat\sigma_\beta^2 = 0$. Hence $\hat\sigma_\beta^2 = 0$;
(iii) $J\hat\sigma_\alpha^2 = M_3 - M_{12}$. Hence $\hat\sigma_\alpha^2 = \dfrac{M_3 - M_{12}}{J} = \dfrac{944.993 - 83.375}{3} = 287.206$.

The standard errors and 95% confidence intervals are:

(i) $\mathrm{S.E.}(\hat\sigma_e^2/5) = 17.776$, $56.445 \le \sigma_e^2/5 \le 135.318$;
(ii) $\mathrm{S.E.}(\hat\sigma_\alpha^2) = 95.169$, $127.502 \le \sigma_\alpha^2 \le 688.533$.

When yields of the individual plants on a plot are used, the estimators provided by the restricted maximum likelihood method are those given in Table 6.3.4. From Table 3.6.4, $M_1 = 131.208$; $M_2 = 423.984$; $M_3 = 112.980$; $M_4 = 4741.839$. This is condition (ii) of Table 6.3.4. Hence the estimates are:

(i) $\hat\sigma_e^2 = M_1 = 131.208$;
(ii) $n\hat\sigma_\pi^2 = M_{23} - M_1$, where $M_{23} = \dfrac{f_2 M_2 + f_3 M_3}{f_2 + f_3} = \dfrac{40 \times 423.984 + 2 \times 112.980}{40 + 2} = 409.174$. Hence $\hat\sigma_\pi^2 = \dfrac{409.174 - 131.208}{5} = 55.593$;
(iii) $In\hat\sigma_\beta^2 = 0$. Hence $\hat\sigma_\beta^2 = 0$;
(iv) $Jn\hat\sigma_\alpha^2 = M_4 - M_{23}$. Hence $\hat\sigma_\alpha^2 = \dfrac{4741.839 - 409.174}{3 \times 5} = 288.844$.

The standard errors and 95% confidence intervals are:

(i) $\mathrm{S.E.}(\hat\sigma_e^2) = 11.643$; the confidence limits coincide with its value;
(ii) $\mathrm{S.E.}(\hat\sigma_\pi^2) = 17.602$, $29.161 \le \sigma_\pi^2 \le 106.576$;
(iii) $\mathrm{S.E.}(\hat\sigma_\alpha^2) = 95.492$, $128.570 \le \sigma_\alpha^2 \le 691.584$.

Estimates of $\sigma_\alpha^2$ and $\sigma_e^2$ for the cases of average yields per plot and yields of individual plants on a plot follow the same trend as obtained in Section 6.4 (I).

With the notation of Section 5.4, we have

(i) $\hat\sigma_{e1a}^2 = 86.466$, $\hat\sigma_{\alpha 1a}^2 = 286.176$, $\hat\sigma_{\beta 1a}^2 = -3.043$;
(ii) $\hat\sigma_{e1r}^2 = 83.375$, $\hat\sigma_{\alpha 1r}^2 = 287.206$, $\hat\sigma_{\beta 1r}^2 = 0$;
(iii) $\hat\sigma_{e2a}^2 = 131.208$, $\hat\sigma_{\alpha 2a}^2 = 287.857$, $\hat\sigma_{\beta 2a}^2 = -2.962$, $\hat\sigma_{\pi 2a}^2 = 58.555$;
(iv) $\hat\sigma_{e2r}^2 = 131.208$, $\hat\sigma_{\alpha 2r}^2 = 288.844$, $\hat\sigma_{\beta 2r}^2 = 0$, $\hat\sigma_{\pi 2r}^2 = 55.593$.

Thus the restricted maximum likelihood method has eliminated the objectionable negative estimates without causing any major changes in the other estimates as compared with those obtained by the analysis of variance method.

CHAPTER SEVEN

PSEUDO-FACTORIAL ARRANGEMENT

The pseudo-factorial arrangement, proposed by Yates (44), is a way of ensuring increased efficiency in an agricultural field experiment with considerable soil fertility fluctuations. It increases the efficiency by avoiding the use of excessively large blocks while at the same time avoiding the use of some of the treatments as controls.

In this type of arrangement, the treatments are divided into sets for comparison in more than one way, the sets of each division being so arranged that they cut across those of all the other divisions. Thus 64 treatments, numbered 00 - 63, may be divided into sets of 8 in two ways, the first group of 8 sets consisting of treatments 00 - 07, 08 - 15, 16 - 23, etc., and the second group of 8 sets consisting of treatments 00, 08, 16, ..., 48, 56; 01, 09, 17, ..., 49, 57; 02, 10, 18, ..., 50, 58; etc. Each set of 8 can be arranged in the field in the form of one or more randomised blocks of 8 plots each, or in the form of an 8 x 8 Latin square, according to the number of replications that are feasible.

Pseudo-Factorial Arrangement in Two Equal Groups of Sets.
According to Yates, two equal groups of sets are possible if the number of treatments is a perfect square, say $p^2$. He defined the following:

$X_{ij}$ = mean yield of the treatment in the $i$th set of the first group and the $j$th set of the second group over the replicates of the first group of sets;

$Y_{ij}$ = mean yield of the treatment in the $i$th set of the first group and the $j$th set of the second group over the replicates of the second group of sets.

With the above notation, the two groups may be set up as in Table 7.1.

Table 7.1: Mean Yields and Marginal Means of Each Group of Sets.

FIRST GROUP

$$\begin{array}{c|cccc|c}
\text{Set} & 1 & 2 & \cdots & p & \text{Mean} \\ \hline
 & X_{11} & X_{21} & \cdots & X_{p1} & X_{\cdot 1} \\
 & X_{12} & X_{22} & \cdots & X_{p2} & X_{\cdot 2} \\
 & \vdots & \vdots & & \vdots & \vdots \\
 & X_{1p} & X_{2p} & \cdots & X_{pp} & X_{\cdot p} \\ \hline
\text{Mean} & X_{1\cdot} & X_{2\cdot} & \cdots & X_{p\cdot} & X_{\cdot\cdot}
\end{array}$$

SECOND GROUP

$$\begin{array}{c|cccc|c}
\text{Set} & & & & & \text{Mean} \\ \hline
1 & Y_{11} & Y_{21} & \cdots & Y_{p1} & Y_{\cdot 1} \\
2 & Y_{12} & Y_{22} & \cdots & Y_{p2} & Y_{\cdot 2} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
p & Y_{1p} & Y_{2p} & \cdots & Y_{pp} & Y_{\cdot p} \\ \hline
\text{Mean} & Y_{1\cdot} & Y_{2\cdot} & \cdots & Y_{p\cdot} & Y_{\cdot\cdot}
\end{array}$$

According to Yates, such an experimental arrangement in two equal groups of sets is analogous to a factorial experiment (7,14) involving two factors (each with $p$ values) in which the main effects of one factor are confounded in one half (the first group) of the replications and the main effects of the other factor are confounded in the other half (the second group). He obtained the adjusted treatment means as

$$t_{ij} = \tfrac{1}{2}\left(X_{ij} + Y_{ij} - X_{i\cdot} + X_{\cdot j} + Y_{i\cdot} - Y_{\cdot j}\right).$$

Differences between the adjusted values $t_{ij}$ will give efficient estimates of the treatment differences freed from block effects. He obtained the variance of the difference of two such adjusted means.

The analysis of variance table for such an experiment has been given by Yates. For the case of $r$ randomised blocks, he partitioned the degrees of freedom in the analysis of variance in the way shown in Table 7.2.
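The adjusted treatment means can be computed directly from the two $p \times p$ tables of Table 7.1. The following is a minimal sketch in Python, assuming the reading $t_{ij} = \tfrac{1}{2}(X_{ij} + Y_{ij} - X_{i\cdot} + X_{\cdot j} + Y_{i\cdot} - Y_{\cdot j})$ of the formula; the function name `adjusted_means` and the small data set are hypothetical. `X[i][j]` and `Y[i][j]` hold the mean yields with the first index for the set of the first group and the second index for the set of the second group.

```python
def adjusted_means(X, Y):
    """Adjusted treatment means t[i][j] for two equal groups of sets."""
    p = len(X)
    x_i = [sum(X[i]) / p for i in range(p)]                        # X_i.
    x_j = [sum(X[i][j] for i in range(p)) / p for j in range(p)]   # X_.j
    y_i = [sum(Y[i]) / p for i in range(p)]                        # Y_i.
    y_j = [sum(Y[i][j] for i in range(p)) / p for j in range(p)]   # Y_.j
    return [
        [0.5 * (X[i][j] + Y[i][j] - x_i[i] + x_j[j] + y_i[i] - y_j[j])
         for j in range(p)]
        for i in range(p)
    ]


# Hypothetical example with p = 2: true treatment values plus set (block)
# effects b[i] in the first group and c[j] in the second group.
true = [[1.0, 2.0], [3.0, 5.0]]
b = [10.0, -4.0]
c = [2.0, 7.0]
X = [[true[i][j] + b[i] for j in range(2)] for i in range(2)]
Y = [[true[i][j] + c[j] for j in range(2)] for i in range(2)]

t = adjusted_means(X, Y)
for i in range(2):
    for j in range(2):
        # Differences of adjusted means against differences of true values.
        print(i, j, t[i][j] - t[0][0], true[i][j] - true[0][0])
```

In this noise-free illustration the set effects cancel from every difference of adjusted means, which is the property the text describes as being freed from block effects.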
Table 7.2: Partition of Degrees of Freedom for a Pseudo-Factorial Arrangement in Two Groups of Sets in r Randomised Blocks.

$$\begin{array}{llll}
\text{Source} & & \text{df} & \text{Total df} \\ \hline
\text{BLOCKS} & \text{Between Groups} & 1 & \\
 & \text{Between Sets, Group I} & p - 1 & \\
 & \text{Between Sets, Group II} & p - 1 & \\
 & \text{Within Sets, Group I} & p(\tfrac{1}{2}r - 1) & \\
 & \text{Within Sets, Group II} & p(\tfrac{1}{2}r - 1) & pr - 1 \\ \hline
\text{TREATMENTS} & \text{First factor} & p - 1 & \\
 & \text{Second factor} & p - 1 & \\
 & \text{Interactions} & (p - 1)^2 & p^2 - 1 \\ \hline
\text{ERROR} & \text{Between Groups} & (p - 1)^2 & \\
 & \text{Within Group I} & p(p - 1)(\tfrac{1}{2}r - 1) & \\
 & \text{Within Group II} & p(p - 1)(\tfrac{1}{2}r - 1) & (p - 1)(pr - p - 1) \\ \hline
\text{TOTAL} & & & rp^2 - 1
\end{array}$$

He obtained the various sums of squares as follows:

First factor:

$$M = \tfrac{1}{2} r \left( p \sum_i Y_{i\cdot}^2 - p^2 Y_{\cdot\cdot}^2 \right);$$

Second factor:

$$N = \tfrac{1}{2} r \left( p \sum_j X_{\cdot j}^2 - p^2 X_{\cdot\cdot}^2 \right);$$

Interactions:

$$S = \tfrac{1}{4} r \left[ \sum_i \sum_j (X_{ij} + Y_{ij})^2 - p \sum_i (X_{i\cdot} + Y_{i\cdot})^2 - p \sum_j (X_{\cdot j} + Y_{\cdot j})^2 + p^2 (X_{\cdot\cdot} + Y_{\cdot\cdot})^2 \right];$$

Sum of Squares for Sets and Groups:

$$R = \tfrac{1}{2} r \left[ p \left( \sum_i X_{i\cdot}^2 + \sum_j Y_{\cdot j}^2 \right) - \tfrac{1}{2} p^2 (X_{\cdot\cdot} + Y_{\cdot\cdot})^2 \right].$$

Put

$$H = M + N + S + R = \tfrac{1}{4} r \left[ \sum_i \sum_j (X_{ij} + Y_{ij})^2 + p \sum_i (X_{i\cdot} - Y_{i\cdot})^2 + p \sum_j (X_{\cdot j} - Y_{\cdot j})^2 - p^2 (X_{\cdot\cdot} - Y_{\cdot\cdot})^2 - p^2 (X_{\cdot\cdot} + Y_{\cdot\cdot})^2 \right].$$

Total Sum of Squares between all X and Y:

$$K = \tfrac{1}{2} r \left[ \sum_i \sum_j X_{ij}^2 + \sum_i \sum_j Y_{ij}^2 - \tfrac{1}{2} p^2 (X_{\cdot\cdot} + Y_{\cdot\cdot})^2 \right].$$

Hence,

Treatment Sum of Squares: $SS_T = H - R$;

Sum of Squares for Error: $SS_E = K - H$.

To test for treatment differences, the mean squares for treatments and error are compared by means of the z-test.

The cases of two-dimensional pseudo-factorial arrangements in three equal groups of sets forming a Latin square, three-dimensional pseudo-factorial arrangements in two unequal groups of sets, and three-dimensional pseudo-factorial arrangements in three unequal groups of sets have also been discussed by Yates. From the efficiencies of the various arrangements relative to that of ordinary randomised blocks, he found that pseudo-factorial arrangements are the most efficient when there are considerable soil fertility fluctuations.
Furthermore, three-dimensional arrangements are likely to be most advantageous with a larger number of treatments and in cases in which there are sufficient replications for the randomised blocks to be replaced by Latin squares.

CHAPTER EIGHT

CONCLUSION

This thesis has shown that, for an agricultural field experiment involving treatment comparisons, if one uses a randomised block design with n plants per plot, the following results hold:

I. Under both the fixed-effects and random-effects models, analysing the experiment using yields of the individual plants on a plot is more efficient than a corresponding analysis using the average yields per plot. In particular, because the error variance is efficiently estimated,

(i) the F test for treatment differences is more sensitive;

(ii) the standard error of the difference between any two treatment means is smaller, and thus a corresponding multiple comparison of the treatment means is more efficient.

II. When there are three or more blocks, an F test for treatment differences under randomization models is more sensitive than the corresponding F test under normal-theory models.

III. As far as testing of treatment differences and multiple comparisons of means are concerned, a multivariate analysis of variance using measurements of some correlated variates may produce better results than a corresponding analysis using yields of the individual plants on a plot. In fact, of all the multiple comparisons of means treated in this thesis, the one resulting from the correlated variates was the least ambiguous.

This thesis has also confirmed the finding that, of all the multiple comparisons of means procedures considered, Bayes exact test (BET) and Fisher's significant difference (FSD) are the best. A reasonable alternative to either of these two is Duncan's multiple range test (MRT).
We saw further that, in estimating variance components, the restricted maximum likelihood estimators are more efficient than those of the analysis of variance. They may, however, produce biased estimates. In spite of this, we noted that they are preferable to those of the analysis of variance since they avoid negative estimates.

We also identified a type of arrangement, called pseudo-factorial, which could be used in conjunction with the randomised block and Latin square designs to increase efficiency when there are many treatments and considerable soil fertility fluctuations.

With regard to the "Kpong data", if no variance components estimation is required, the most efficient method of comparing treatments is to use measurements of the correlated variates to perform a multivariate analysis of variance. In the subsequent multiple comparisons of means, Duncan's multiple range test (MRT) is the appropriate procedure to use. But if variance components estimation is required, then one may use yields of the individual plants on a plot. In this case, Bayes exact test (BET) and Fisher's significant difference (FSD) are the best procedures to use in the multiple comparisons of means.

Generally, for an agricultural field experiment involving treatment comparisons,

(a) one may use a randomised block design with N plants per plot if the treatments are not many. Of the N plants on a plot, if it is necessary to take a sample of n plants (n < N), the selection must be random. Efforts should be made to use the optimum sampling size and the optimum number of replications. In this situation, the interaction mean square is the appropriate error mean square to use in the analysis. In analysing the experiment,

(i) one may take measurements of some correlated variates to perform a multivariate analysis of variance. This approach is advantageous to those experimenters who will go on further to do discriminant and canonical analysis.
It is not necessary to measure the actual yield as long as the yield is a function of the correlated variates.

(ii) If it is not possible to take measurements of some correlated variates, one may use yields of the individual plants on a plot to perform a univariate analysis of variance.

(iii) If, in using yields of the individual plants on a plot, the F test for treatment differences just misses significance, one may use the average yields per plot to perform a permutation test, provided there are enough blocks.

(b) If there are many treatments and there is evidence of high soil fertility fluctuations, the pseudo-factorial type of arrangement may be employed in conjunction with the randomised block design. The average yields per plot may then be used with the formulae given by Yates (44). In such situations, the blocking should be done efficiently.

(c) A multiple comparison of means may be performed by using either Bayes exact test (BET) or Fisher's significant difference (FSD) with α = 0.05. Where neither of these is applicable, Duncan's multiple range test (MRT) will be a suitable substitute.

(d) If variance components estimation is required, the restricted maximum likelihood method is quite appropriate.

REFERENCES

1. Anderson, T.W.: An Introduction to Multivariate Statistical Analysis; Wiley, New York, 1958.
2. Bartlett, M.S.: The Use of Transformations; Biometrics, 3, 39-52, 1947.
3. Box, G.E.P.: Non-Normality and Tests on Variances; Biometrika, Vol.40, 318-335.
4. Box, G.E.P. and Anderson, S.L.: Permutation Theory in the Derivation of Robust Criteria and the Study of Departures from Assumption; Journal of Royal Statistical Society, Series B, Vol.17, 1-34.
5. Carmer, S.G. and Swanson, M.R.: Evaluation of Ten Pairwise Multiple Comparison Procedures by Monte Carlo Methods; Journal of the American Statistical Association, March 1973, Vol.68, No.341.
6. Cochran, W.G.: Some Consequences When the Assumptions for the Analysis of Variance are not Satisfied; Biometrics, 3, 1-21, 1947.
7. Cochran, W.G. and Cox, G.M.: Experimental Designs; Wiley, New York, 1957 (2nd Edition).
8. Crump, L.S.: The Estimation of Variance Components in Analysis of Variance; Biometrics, 2, 1946.
9. Crump, L.S.: The Present Status of Variance Component Analysis; Biometrics, 7, 1951.
10. Davies, O.L.: Statistical Methods in Research and Production; Oliver and Boyd, London, 1961.
11. Duncan, D.B.: Bayes Rule for a Common Multiple Comparisons Problem and Related Student-t Problems; Annals of Mathematical Statistics, 32, 1961.
12. Duncan, D.B.: Multiple-Range and Multiple-F Tests; Biometrics, 11, 1955.
13. Eisenhart, C.: The Assumptions Underlying the Analysis of Variance; Biometrics, 3, 1947.
14. Federer, W.T.: Experimental Design; MacMillan, New York, 1955.
15. Finney, D.J.: An Introduction to the Theory of Experimental Design; University of Chicago Press, Chicago, 1960.
16. Finney, D.J.: The Statistician and the Planning of Field Experiments; Journal of Royal Statistical Society, A, 119, 1956.
17. Fisher, R.A.: Statistical Methods for Research Workers; First Edition, Oliver and Boyd, Edinburgh.
18. Fisher, R.A.: The Design of Experiments; 8th Edition, London, Oliver and Boyd, 1935-1966.
19. Graybill, F.: Variance Heterogeneity in Randomised Block Design; Biometrics, 10, 1954.
20. Hoeffding, W.: The Large Sample Power of Tests based on Permutations of Observations; Annals of Mathematical Statistics, Vol.23, 169-192.
21. Horsnell, G.: The Effect of Unequal Group Variances on the F-test for Homogeneity of Group Means; Biometrika, Vol.40, 1953.
22. Immer, F.R.: Some Uses of Statistical Methods in Plant Breeding; Biometrics, 1, 1945.
23. Johnson, N.L.: Alternative Systems in the Analysis of Variance; Biometrika, 35, 80-87, 1948.
24. Keith, H.: Methods of Multivariate Analysis; University of London Press, 1968.
25. Kempthorne, O.: The Design and Analysis of Experiments; Wiley, New York, 1952.
26. Kendall, M.G. and Stuart, A.: The Advanced Theory of Statistics; Vols 2 and 3, Charles Griffin and Co., London, W.C.2.
27. Li, C.C.: Introduction to Experimental Statistics; McGraw-Hill Inc.
28. Nelder, J.A.: The Interpretation of Negative Components of Variance; Biometrika, 41, 544-548.
29. Neyman, J. and Pearson, E.S.: Joint Statistical Papers; Cambridge University Press, 1967, London, N.W.1.
30. Pearson, E.S. and Hartley, H.O.: Biometrika Tables for Statisticians; Vol.1, Cambridge University Press, 1954.
31. Rao, C.R.: Advanced Statistical Methods in Biometric Research; Wiley, New York, 1952.
32. Rao, C.R.: Linear Statistical Inference and Its Applications; Wiley, New York, 1965.
33. Scheffé, H.: A Method for Judging All Contrasts in the Analysis of Variance; Biometrika, 40, 1953.
34. Scheffé, H.: The Analysis of Variance; Wiley, New York, 1959.
35. Searle, S.R.: Topics in Variance Component Estimation; Biometrics, 27, No.1, 1971.
36. Shapiro, S.S. and Wilk, M.B.: An Analysis of Variance Test for Normality (Complete Samples); Biometrika, 52, 591-611.
37. Snedecor, G.W. and Cochran, W.G.: Statistical Methods; Iowa State College Press, 5th Edition, 1956.
38. Thompson, C.M. and Merrington, M.: Tables for Testing the Homogeneity of a Set of Estimated Variances; Biometrika, 33, 1943-1946.
39. Thompson, W.A.: The Problem of Negative Estimates of Variance Components; Annals of Mathematical Statistics, 33, 273-289.
40. Tukey, J.W.: One Degree of Freedom for Non-Additivity; Biometrics, 5, 1949.
41. Waller, R.A. and Duncan, D.B.: A Bayes Rule for the Symmetric Multiple Comparisons Problem; Journal of American Statistical Association, 64, 1969.
42. Wilk, M.B. and Kempthorne, O.: Fixed, Mixed and Random Models in the Analysis of Variance; Journal of American Statistical Association, Vol.50, 1144-1167.
43. Wilks, S.S.: Mathematical Statistics; Wiley, New York, 1962.
44. Yates, F.: Experimental Design (Selected Papers); Charles Griffin and Co. Ltd., London, 1970.
45. Yates, F. and Zacopanay: The Estimation of the Efficiency of Sampling, with Special Reference to Sampling for Yield in Cereal Experiments; Journal of Agricultural Science, 25, 545-577, 1935.