GENETIC ANALYSIS OF HOST PLANT RESISTANCE TO THE CASSAVA MOSAIC DISEASE

By
Yvonne Rosaline Naa Norle Lokko
BSc (Hons.) Botany with Zoology, University of Ghana
MSc Plant Biotechnology, University of London

This thesis is submitted to the University of Ghana, Legon in partial fulfilment of the requirements for the award of the Ph.D degree in Crop Science

April 2002

Abstract

The genetic control of resistance to the cassava mosaic disease (CMD) in some African landraces, their relationship with the widely used resistant genetic stock, clone 58308, and molecular markers associated with resistance to CMD in the landraces were evaluated in this study. The F1 progenies of 54 cassava crosses and their parents (clone 58308, six improved clones and 15 African landraces) were evaluated in two genetic experiments in three environments in Nigeria. Genetic analysis revealed that additive gene effects were more important in the genetics of resistance to CMD among the landraces, while both additive and non-additive gene effects were important for clone 58308. The segregating F1 crosses exhibited varying levels of resistance and susceptibility to CMD, suggesting polygenic inheritance. Resistant phenotypes were detected in crosses involving susceptible parents, which suggests that resistance to CMD in the clones studied is due to recessive genes, and susceptible phenotypes in crosses involving resistant parents suggested that the resistance genes are non-allelic. Positive transgressive segregants were also detected in some crosses. The number of effective factors for resistance to CMD ranged from two to seven and was contributed by both parents in a cross. Significant differences in the mean distribution of F1 progeny disease severity scores further revealed allelic differences between three improved clones and some landraces, and that expression of resistance is influenced by the nature of the female parent.
Bulk segregant analysis (BSA) showed that an SSR marker, SSR30-180, was associated with CMD resistance in a cross between the susceptible clone TMS I30555 and the resistant landrace TME7. Linkage analysis revealed that SSR30-180, which was donated by TME7, was 14.9 cM from a putative CMD resistance locus, CMDRL. Marker-trait association detected by regression analysis further showed that the markers SSR30-180, E-ACC/M-CTC-225, SSR119-A and SSR119 accounted for 57.41%, 22.5%, 32.24% and 37.83% of the total phenotypic variation, respectively, for resistance to CMD in the mapping population. The results further showed differences between TME7 and six out of the 15 resistant clones used as parents and checks in the genetic experiments with respect to SSR30-180, suggesting that different alleles are involved in resistance to CMD. However, two susceptible clones, TME31 and TME117, had the marker, which suggests that SSR30-180 is not tightly linked to resistance; thus there is a need to saturate the map with more markers. These findings are useful in the selection of cassava parental clones for resistance breeding.

Declaration

We declare that this work was carried out by Ms. Yvonne Lokko, of the University of Ghana, Ghana, in collaboration with the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria under our supervision.

Ms. Yvonne R.N. Lokko
Student
Department of Crop Science
University of Ghana

Dr. S.K. Offei
Molecular Plant Virologist
Department of Crop Science
University of Ghana

Dr. Eric Danquah
Geneticist
Department of Crop Science
University of Ghana

Dr. A.G.O. Dixon
Plant Breeder
Crop Improvement Division
International Institute of Tropical Agriculture
Ibadan, Nigeria

Acknowledgements

This work was supported by the Rockefeller Foundation grant RF 97007 #28 and 2000 FS 137 and managed by the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria under its Graduate Research Fellowship programme.
First of all, I thank God my heavenly father, for the opportunity I had to pursue a PhD degree. I thank God for the strength he gave me, for the perseverance and successful completion of the thesis. I am grateful to Dr. John Amuasi, Executive Secretary of the Ghana Atomic Energy and Dr. George Klu, Director of the Biotechnology and Nuclear Agriculture Research Institute, Accra, Ghana for granting me study leave to pursue the programme. I am extremely grateful to Drs. Robert Asiedu and Alfred Dixon of IITA who encouraged me to submit the project proposal. My sincere gratitude goes to Dr. Margaret Quin for her encouragement and assistance in ensuring that I did come to Nigeria to begin work on the project. Many thanks to Dr. Richard Jefferson who was instrumental in obtaining an extension of the Rockefeller award. I wish to express my profound gratitude to my supervisors Drs. S.K. Offei and Eric Danquah of the University of Ghana, and Alfred Dixon of IITA. I thank them for their advice and assistance throughout the programme. I am very grateful to Dr. Dixon, for providing the experimental materials for the work and his guidance in executing the experiments. My sincere gratitude goes to Dr. Kwaku Agyemang for his assistance with the genetic analyses and also to Dr. Sagary Nokoe for his advice on statistical analysis. Dr. Melaku Gedil was a Godsend at the right time. I am extremely grateful for his help, encouragement and partial supervision of the project. I am grateful for the help given by Dr. Martin Fregene of the International Centre for Tropical Agriculture CIAT, Colombia. I am extremely grateful to all staff of the Cassava Breeding Programme of IITA who were always very helpful. Special thanks go to Mr. Gbenga Akinwale, Ibrahim and Kinsley of the cassava breeding programme for their assistance in data collection. Thanks to Paul Ilona, Peter, C.C. Okonkwo and Willie for setting up the field experiments. 
I am grateful for the assistance given by Richardson Okechukwu, Fisayo Kolade, Esther, Nkechi and Mrs. Karen Lawal. My sincere gratitude also goes to Dr. Chris Okafor, Manager, Individual Training, IITA, and all the staff of the IITA training programme for their help. Many thanks to Mrs Chinyere Woods, for all her friendship and help in preparing the manuscript. I am extremely grateful for the editorial assistance given by Mrs. Umelo. Profound gratitude to my friends Adebola Raji and Ida Yovo-Badejoko. We shared our joys and our tears, our excitements and our anxieties. I am thankful for the friendship I shared with my colleagues, Ego Uzokwe, Biodun Cladius-Cole, Francis Ogbe, Joseph Onyeka, Elizabeth Okai, Charles Kwoseh, Chiedozie Egesi, Ayo Salami, and all the members of the International Association of Research Scholars and Fellows, IITA. My sincere gratitude goes to Drs Asafo-Agyei, Kormawa and Larbi of IITA/ILRI in Nigeria, and to my friends and colleagues of the Biotechnology and Nuclear Agriculture Research Institute, Accra, Ghana, for their encouragement and well wishes. Finally, I remain indebted to my father for all the sacrifices he made for me. I am thankful for my brothers, my sister and my sisters-in-law. Their love and prayers were invaluable.

Dedication

This thesis is dedicated to my father Dr. R.B. Lokko, my brothers and sister, and the loving memory of my mother, Mrs. Mercy Lokko.

Table of Contents

Abstract
Declaration
Acknowledgements
Dedication
Table of Contents
List of Tables
List of Plates
List of Figures
Chapter One Introduction
Chapter Two Literature Review
2.1 Importance of Cassava and production constraints
2.2 Viruses affecting cassava
2.3 Cassava mosaic disease
2.3.1 Characteristics of cassava mosaic viruses
2.3.2 Effects of CMD on cassava
2.3.3 Control
2.4 Host plant resistance to the cassava mosaic disease (CMD)
2.4.1 Mechanism of resistance
2.5 Genetics of cassava
2.5.1 Genome analysis
2.5.2 Cytogenetics
2.5.3 Hybridisation and genetic diversity
2.6 Quantitative genetics in cassava breeding
2.6.1 Mating designs
2.6.2 Heterosis
2.6.3 Number of genes and complementarity among genes affecting quantitative traits
2.6.4 Heritability
2.6.5 Genetic correlation studies
2.7 Biometrical analysis
2.7.1 Analysis of mixed models in genetic studies
2.7.2 The diallel analysis
2.7.3 The number of effective factors
2.8 Molecular marker technology and its application in cassava breeding and genetics
2.8.1 Isozyme markers
2.8.2 DNA markers
2.8.3 Restriction Fragment Length Polymorphism (RFLP)
2.8.4 Random Amplified Polymorphic DNA (RAPD)
2.8.5 Variable number of tandem repeats (VNTR)
2.8.6 Microsatellite or simple sequence repeats (SSRs)
2.8.7 Amplified fragment length polymorphism (AFLP)
2.9 Determination of molecular markers associated with traits of interest
2.9.1 Bulk segregant analysis
2.9.2 Linkage mapping
2.9.3 Quantitative trait loci (QTL)
2.9.4 Screening for resistance gene analogs
2.9.5 Sequence tagged sites (STS)
2.9.6 Type of population
Chapter Three Materials and Methods
3.1 Analysis of variance and combining ability analyses
3.1.1 Genetic Material
3.1.2 Experimental design
3.1.3 Assessment of CMD
3.1.4 Data analyses
3.1.4.1 3X18 North Carolina Design II (NCD II) analysis
3.1.4.2 4X4 Diallel analysis
3.1.4.3 Relative importance of general and specific combining ability
3.1.4.4 Heritability
3.1.4.5 Heterosis
3.2 Genetic relationships among the various sources of resistance to CMD
3.2.1 Area under the disease progress curve (AUDPC) of the sources of resistance to CMD
3.2.2 Principal component analysis (PCA) of sources of resistance to CMD
3.2.3 Gene complementarity among various sources of resistance to CMD based on the distribution of the F1 disease severity scores
3.2.3.1 Transgressive segregation for resistance to CMD
3.2.3.2 Cochran-Mantel-Haenszel
3.2.3.3 Number of effective factors (NE) contributing to resistance to CMD
3.3 Determination of DNA markers associated with resistance to CMD
3.3.1 Genetic material and experimental procedure
3.3.2 Resistance screening and selection of plants for bulk segregant analysis
3.3.3 DNA extraction
3.3.4 DNA quantification
3.3.4.1 Agarose gel electrophoresis
3.3.4.2 Estimation of DNA concentration and purity
3.3.5 Determination of CMD virus strain causing symptoms in the mapping population
3.3.6 Bulk segregant analysis
3.3.6.1 Random amplified polymorphic DNA (RAPD) analysis
3.3.6.2 Amplified fragment length polymorphism (AFLP) analysis
Ligation of adapters
Selective amplification reactions
3.3.6.3 Simple Sequence Repeats (SSR) analysis
3.3.6.4 Sequencing gel electrophoresis
3.3.6.5 Silver staining
3.3.6.6 Gel analysis, marker scoring and nomenclature
3.3.7 DNA markers associated with CMD resistance in F1 population
3.3.7.1 Genetic linkage mapping
3.3.7.3 Quantitative trait loci (QTL) analysis
3.3.8 DNA markers associated with CMD resistance in various sources of resistance
Chapter Four Results
4.1 Analysis of variance and combining ability analyses
4.1.1 3X18 North Carolina Design II mating scheme
4.1.2 4X4 Diallel mating scheme
4.1.3 Combining ability effects of 3X18 NCD II mating scheme
4.1.4 Combining ability and maternal effects in the 4X4 Diallel mating scheme
4.1.5 Heterosis for resistance to CMD in F1 crosses
4.2 Genetic relationships among the various sources of resistance to CMD
4.2.1 Relationships among sources of resistance based on area under the disease progress curve (AUDPC) for CMD
4.2.2 Relationships among sources of resistance based on principal component analysis (PCA) of CMD responses
Fig 4.1: Plot of first principal component versus second principal component analysis of parents and checks across environments
4.2.3 Complementarity of genes for resistance to CMD among various sources of resistance
4.2.3.1 Frequency distribution of CMD severity scores in F1 progenies of various sources of resistance
4.2.3.2 Transgressive segregants in F1 progenies of various sources of resistance
4.2.3.3 Cochran-Mantel-Haenszel (CMH) statistic test of association
4.2.3.4 Number of effective factors
4.3 Determination of DNA markers associated with resistance to CMD
4.3.1 Resistance screening and selection of plants for bulk segregant analysis
4.3.2 Genomic DNA quality and concentration
4.3.3 Cassava mosaic virus strains in mapping population
4.5 Association of DNA markers with CMD resistance in F1 mapping population
4.5.1 Genetic Linkage analysis
Fig. 4.2 Linkage group-X of the linkage map showing the location of the resistance to CMD, defined as the CMDRL
4.5.2 QTL analysis
4.6 Association of DNA marker SSR30-180 with CMD resistance in other cassava clones
Chapter Five Discussion
Chapter Six Conclusion
References
Appendices

List of Tables

Table 3.1 Description of cassava clones used as parents in 4X4 diallel and 3X18 North Carolina Design II matings and checks
Table 3.2 Standard five point scale scoring systems for CMD (IITA, 1990)
Table 3.3 Nucleotide sequences of DNA primers used in polymerase chain reaction for the detection of cassava mosaic begomovirus strains in F1 mapping population
Table 3.4 Nucleotide sequences of AFLP adapters, primers and primer extensions
Table 3.5 Description of cassava SSR loci and their primer pairs used in genetic linkage analysis
Table 4.1 Analysis of variance for CMD severity at 12 weeks after planting (WAP) among all genotypes in 3X18 NCD II mating scheme, evaluated in three environments
Table 4.2 Analysis of variance for CMD severity at 12 weeks after planting (WAP) among all genotypes in 4X4 diallel mating scheme, evaluated in three environments
Table 4.3 Mean* CMD severity scores and GCA effects for CMD symptom severity of parents in the 3X18 NCD II mating scheme, evaluated in three environments
Table 4.4 Mean CMD severity scores and SCA effects for CMD symptom severity of crosses in 3X18 design II crosses in environment with significant variation for SCA
Table 4.5 Mean, GCA and maternal (MAT) effects for CMD symptom severity of parents in 4X4 diallel crosses in environments with significant variation for their effects
Table 4.6 Mean CMD severity scores and SCA effects for CMD symptom severity of crosses in 4X4 diallel in environments with significant variation for their effects
Table 4.7 Area under the disease progress curve (AUDPC) of cassava clones based on mean CMD severity scores and their ranks for CMD resistance
Table 4.8 Spearman's rank correlation coefficient for AUDPC values for shoot-tip and whole plant CMD symptom severity
Table 4.9 Eigenvectors and eigenvalues as a proportion of total variance of the first three principal components for the CMD resistance responses of the 23 parental and check clones
Table 4.10 Frequency distribution of F1 disease severity scores (based on mean disease severity scores of progenies) in an environment and mean number of progenies (N) in crosses involving susceptible parents
Table 4.11 Frequency distribution of F1 disease severity scores (based on mean disease severity scores of progenies) in an environment and mean number of progenies (N) in crosses involving resistant parents
Table 4.12 Frequency distribution of F1 disease severity scores (based on mean disease severity scores of progenies) in an environment and mean number of progenies (N) in crosses involving resistant and susceptible parents
Table 4.13 Frequency of positive transgressive segregants in percentages (PTS) with disease severity scores at least one score better than the better parent in a cross in three environments
Table 4.14 Chi-square values for differences in the mean distribution of disease severity scores of F1 progenies in crosses involving resistant male parents when crossed with each of the female parents I30001 (resistant to CMD), I30555 (susceptible to CMD) and I30572 (resistant to CMD)
Table 4.15 Chi-square values for differences in the mean distribution of disease severity scores of F1 progenies in crosses involving resistant and susceptible male parents when crossed with each of the female parents I30001 (resistant to CMD), I30555 (susceptible to CMD) and I30572 (resistant to CMD)
Table 4.16 Chi-square values for differences in the mean distribution of disease severity scores of F1 progenies in crosses involving susceptible male parents when crossed with each of the female parents I30001 (resistant to CMD), I30555 (susceptible to CMD) and I30572 (resistant to CMD)
Table 4.17 Number of effective factors (NE), number of effective factors contributed by better parent (NBP) and number of effective factors contributed by poorer parent (NPP) in F1 crosses between resistant parents
Table 4.18 Number of effective factors (NE), number of effective factors contributed by better parent (NBP) and number of effective factors contributed by poorer parent (NPP) in F1 crosses between resistant and susceptible parents
Table 4.19 Number of effective factors (NE), number of effective factors contributed by better parent (NBP) and number of effective factors contributed by poorer parent (NPP) in F1 crosses between susceptible parents
Table 4.20 Reaction of F1 plants in mapping population to CMD severity in their shoot tips (MST), whole plant (MWP), mean severity of first 10 leaves plant (MLS), total mean response presence or absence of ACMV virus and their classification (Class)
Table 4.21 Average number of bands, level of polymorphism among parents (TMS I30555, TME7) and bulks with different AFLP systems
Table 4.22 Segregation analysis for conformity to 1:1 and 3:1 segregation ratio of polymorphic DNA markers scored in mapping population contributed by P1 I30555
Table 4.23 Segregation analysis for conformity to 1:1 segregation ratio of polymorphic DNA markers scored in mapping population contributed by P2 TME7
Table 4.24 Molecular markers mapped on fourteen linkage groups of cassava genome
Table 4.25 Simple linear correlation coefficient between markers and CMD responses
Table 4.26 Association between markers SSR30-180 and E-ACC/M-CTC-225 with all CMD resistance traits (All traits), mean shoot-tip severity (MST), mean whole plant severity (MWP), and mean leaf severity (MLS) in mapping population

List of Plates

Plate 3.1 Cassava leaves, with varying levels of cassava mosaic disease severity
Plate 4.1 Amplification products of section of mapping population showing the presence of ACMV with two ACMV specific primers
Plate 4.2 Amplification profile generated by some RAPD primers on parents TMS I30555 and TME7, resistant bulk and susceptible bulk
Plate 4.3 Amplification products of SSR primer 30 in parents, bulks and bulk members
Plate 4.4 A cross section of AFLP products of parents TMS I30555 and TME7 and progenies
Plate 4.5 Section of AFLP products for primer P-ACC/T-CTA
Plate 4.6 Example of multiplexing PCR products of three SSR primers, I, SSR6, II, SSR70, III, SSR7
Plate 4.7 Amplification product of SSR 30 on 23 cassava clones

List of Figures

Fig 4.1: Plot of first principal component versus second principal component analysis of parents and checks across environments
Fig. 4.2 Linkage group-X of the linkage map showing the location of the resistance to CMD, defined as the CMDRL

Chapter One
Introduction

The cassava mosaic virus disease (CMD) is considered the most devastating disease of cassava in Africa (Jennings, 1977; Geddes, 1990). It is caused by the cassava mosaic geminiviruses, transmitted by the whitefly (Bemisia tabaci Genn.) and spread through propagation of infected vegetative propagules. The cassava mosaic disease causes severe yield losses ranging from 20 to 95%, and the effect of the disease is most severe when plants are infected at the early crop stage (Terry and Hahn, 1980; Otim-Nape et al., 1994a; Thresh et al., 1994a). It is estimated that total crop yield losses due to CMD in Africa are about $2 billion per annum (IITA, 1997). The most effective means of controlling CMD is by cultivating resistant genotypes (Thresh et al., 1994b).

In tackling the early disease challenges in Africa, resistance to CMD was obtained from a cross between cassava and its relative Manihot glaziovii Muller von Argau (Nichols, 1947). Progenies from the third backcross generation were evaluated in different African countries and clone 58308, which had edible storage roots and resistance to the disease, was selected in Nigeria. Clone 58308 has been the main source of resistance used by breeders for several decades and has resulted in the selection of several improved cultivated cassava genotypes of the Tropical Manihot Selection (TMS) series, with resistance to CMD (IITA, 1973; Hahn et al., 1989). Some of the genotypes which are widely cultivated in Africa include TMS I30001, TMS I30572, TMS I4(2)1425, TMS I60142 and TMS I90257.

Despite the progress made in resistance breeding, the disease is still a threat to cassava production.
Recent disease epidemics in East Africa, caused by a recombinant strain of two of the cassava begomoviruses (Zhou et al., 1997), resulted in losses of 2.2 billion tons in storage root yield, amounting to US $440 million (Thresh et al., 1997). Furthermore, exotic germplasm from South America is generally not resistant to CMD and succumbs to the disease under field conditions in Africa (Porto et al., 1994). Thus, much of the germplasm introduced from the crop's centre of origin to broaden the genetic base in Africa has been lost (Jennings and Hershey, 1985). There is therefore a need to increase the levels of resistance within the cassava genepool (Thresh et al., 1994a). To ensure that durable resistance is maintained within the African cassava germplasm, additional sources of resistance with a wider genetic base will be required to diversify resistance to the disease. This would allow the accumulation of desirable gene combinations that would be difficult for the pathogens to circumvent in the long term.

Cassava is open pollinated and in the field hybridises with itself and related wild species, creating greater variability for different traits. Silvestre and Arraudeaus (1983) noted that African farmers sometimes take cuttings from spontaneous seedlings for their subsequent planting (cited in Lefevre and Charrier, 1993). These new genotypes, referred to over time as landraces, provide a wealth of new genes for desirable traits and form part of the crop’s genetic resource (Gulick et al., 1983; Hershey, 1987). Some of these African landraces, the Tropical Manihot Esculenta series (TME), have been identified to be resistant to CMD (Raji, 1995) and could serve as new sources of resistance.

However, selection of parents to be incorporated in a breeding programme cannot be based on their performance alone. Inheritance of traits depends on the type of gene action that may be involved and the extent of diversity among the parents.
Understanding the mode of gene action and inheritance in the various sources of resistance is important in determining the appropriate breeding strategy to adopt. To efficiently utilise new sources of resistance to CMD in breeding programmes and to diversify resistance when transferred into elite material, it is essential to compare the various resistant genotypes with each other to determine if the loci for resistance are similar, and how their effects complement each other to enhance resistance. Establishing genetic relationships between the different sources of resistance would facilitate the choice of parents for developing new resistant cultivars. In addition, by combining different genes that relate to different sources of resistance, epistatic interaction may be identified such that higher levels of resistance can be developed to protect the crop.

Resistance to CMD derived from clone 58308 has been reported to be polygenic, recessive and inherited largely in an additive manner (Hahn and Howland, 1972). Polygenic inheritance was also reported for some of the African landraces (Lokko et al., 1998). Hahn et al. (1980a) deduced from this that resistance to the disease must be attributed to the combined action of a number of loci which are linked on a chromosome or a set of chromosomes. Population improvement and recurrent selection were therefore recommended in breeding for resistance to the disease (Hahn et al., 1979). Interestingly, a major dominant gene responsible for resistance in an African landrace TME3 (known as 2nd Agric in Nigeria) has been reported (IITA, 2000; Akano et al., 2002). Further progress in understanding CMD resistance could be made by identifying the individual gene or genes affecting resistance and determining their position on the chromosome map. These resistance genes could then be transferred to susceptible clones using conventional crossing coupled with marker assisted selection (MAS) and by genetic transformation.
The expression of CMD in different cassava genotypes is known to be influenced by genotype x environment interaction (GXE) (Fargette et al., 1994). Significant GXE effect on a trait reduces the reliability of the phenotype as an indicator of the genotype (Kang, 1998). To obtain more precise genetic information on resistance to CMD, experiments are conducted in different environments (i.e. locations x years). This would demonstrate the GXE effect on the expression of the trait.

The detection of multiple genes for virus resistance using segregation analysis alone is not efficient because of the differences due to genotype by environment interaction (McMullen and Louie, 1989). Molecular markers, which are not affected by environmental conditions and are insensitive to gene interactions, allow geneticists to locate and follow the numerous interacting genes that determine a complex trait, as well as to tag traits controlled by single genes (Botstein et al., 1980). The two main strategies used to identify molecular markers associated with traits of interest are genetic linkage mapping and bulk segregant analysis (Tanksley et al., 1989; Michelmore et al., 1991; Giovannoni et al., 1991).

Genetic linkage mapping as a tool for localising and isolating both simple and complex traits such as resistance to CMD can provide a more direct method for selecting desirable genes via their linkage to easily detectable molecular markers (Tanksley et al., 1989). Once the trait is identified and mapped, MAS could be used to verify its introduction into new genotypes developed in a breeding programme. MAS can reduce breeding population sizes, continuous recurrent testing, and the time required in developing a superior line by selecting the desired genotype early in the breeding programme. Bulk segregant analysis (BSA) is a rapid method for identification of markers in specific regions of the genome (Michelmore et al., 1991; Giovannoni et al., 1991).
The technique has the potential of detecting unique regions in the genome associated with the trait of interest. DNA samples are pooled from individuals with extreme phenotypes for a particular trait, or marker allele(s), and are then screened to determine molecular markers associated with the trait of interest. A marker of interest can then be converted into a sequence characterised amplified region (SCAR) marker, for the rapid screening of the trait in other populations.

A genetic linkage map of cassava has been constructed at the Centro Internacional de Agricultura Tropical (CIAT) in Colombia (Fregene et al., 1997). A fully saturated map would allow cassava breeders and geneticists to identify quantitative trait loci (QTLs) and clone the genes associated with resistance to CMD. Another map that is being developed at the International Institute of Tropical Agriculture (IITA) in Nigeria, to identify genes for resistance to CMD, could be used together with the CIAT map in developing a consensus map of cassava. A consensus genetic linkage map would be of tremendous importance to cassava research in that it would allow researchers to (i) identify informative markers to use in experiments, (ii) identify QTL of economic traits, (iii) generate a genomic database of cassava and (iv) carry out comparative mapping with other species. To facilitate saturating the maps with molecular markers associated with CMD resistance, more segregating populations need to be developed and evaluated in different environments to accumulate quantitative data on resistance to CMD. Bulk segregant analysis (BSA) and linkage analysis could then be used to place the markers on the maps.
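The pooling step of bulk segregant analysis described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the plant identifiers, severity scores, marker names and the convention that a low score means resistant are all hypothetical assumptions made for illustration.

```python
def make_bulks(scores, bulk_size):
    """Pick the plants with the most extreme phenotypes for the two DNA bulks.

    scores: dict mapping plant ID -> mean disease severity score
            (assumption: lower score = more resistant).
    Returns (resistant_bulk, susceptible_bulk) as lists of plant IDs.
    """
    if bulk_size < 1 or 2 * bulk_size > len(scores):
        raise ValueError("bulk_size must allow two non-overlapping bulks")
    ranked = sorted(scores, key=scores.get)  # lowest severity first
    return ranked[:bulk_size], ranked[-bulk_size:]


def polymorphic_between_bulks(bands):
    """Keep markers whose band is present in one bulk and absent in the other.

    bands: dict mapping marker name -> (present_in_resistant_bulk,
                                        present_in_susceptible_bulk).
    """
    return [m for m, (in_res, in_sus) in bands.items() if in_res != in_sus]


# Hypothetical progeny scores and hypothetical marker screening results.
progeny_scores = {"p1": 1.0, "p2": 4.5, "p3": 1.5, "p4": 5.0, "p5": 3.0, "p6": 2.0}
resistant_bulk, susceptible_bulk = make_bulks(progeny_scores, bulk_size=2)

screen = {
    "marker_A": (True, False),   # band only in the resistant bulk
    "marker_B": (True, True),    # band in both bulks: not informative
    "marker_C": (False, True),   # band only in the susceptible bulk
}
candidates = polymorphic_between_bulks(screen)

print(resistant_bulk, susceptible_bulk, candidates)
```

Markers that differ between the bulks are only candidates; their linkage to the trait still has to be confirmed on the individual plants of the population.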
The objectives of this study were therefore to:
i) evaluate the relative importance of general and specific combining ability, estimate heterosis and heritability, and determine maternal effects for resistance to CMD;
ii) evaluate the genetic relationships among the various sources of resistance to CMD;
iii) identify molecular markers linked to genes for resistance to CMD.

Chapter Two

Literature Review

2.1 Importance of Cassava and production constraints

The cassava crop is a perennial shrub belonging to the Euphorbiaceae family. It was introduced into Africa and Asia from South America by Portuguese travellers in the fifteenth century (Jennings and Hershey, 1985; Allem, 1994). Cassava is cultivated mainly for its storage roots, which are a major staple in Africa and serve as a foreign exchange earner in many countries (Dorosh, 1988; Cock, 1985b). The storage roots are consumed directly after boiling or processed into various food products (Allem and Hahn, 1991). The storage roots are also processed into starch for industrial purposes, flour for confectionery and animal feed. The peels also serve as fodder for livestock (Bokanga, 1998; Mandal, 1993). The leaves, which contain appreciable amounts of vitamins, minerals and proteins, are consumed as green vegetables in many parts of Africa, particularly in Zaire, Sierra Leone, Liberia, Guinea and Cameroon (Lancaster and Brooks, 1983; West et al., 1988; Bokanga, 1994).

The crop can grow in poor soils on marginal lands where other crops cannot grow and requires minimal fertiliser, pesticides and rainfall (FAO, 2000; Cock and Howeler, 1978). It can be harvested any time from 6 to 24 months after planting, and can be left in the ground as a food reserve for household food security in times of famine, drought and war (FAO, 2000; Best and Henry, 1994; Cock, 1985a).
Global cassava production in 1999 was over 160 million tons, with Africa contributing over 50% of the global production, cultivated on about 75 million hectares of farmland (Thresh et al., 1994b; FAO, 2000).

Cassava is cultivated in diverse agroecologies in Africa including the coastal belts, the rain forest, the moist and dry savannah, as well as the Sahel regions (Allem and Hahn, 1991; IITA 1998). It has been estimated that in Africa, the annual yield of 6.1 t/ha under farmers’ conditions falls short of the potential of 30.5-51 t/ha (Hahn et al., 1989; FAO, 2000). This shortfall in production is attributed to yield losses due to diseases and pests, and to civil strife in some African countries, which disrupts farming activities (FAO, 1997). The major diseases that affect the crop in Africa are the cassava mosaic disease (CMD), cassava bacterial blight (CBB), and the cassava anthracnose disease (CAD). The cassava green mite (CGM) is also an important pest of the crop (Theberge et al., 1985; Yaninek et al., 1989). Yield losses in storage roots caused by any one of these diseases or pests in isolation are estimated at over 80% (Yaninek et al., 1989; Lozano, 1992; Thresh et al., 1997; Otim-Nape et al., 1994a).

2.2 Viruses affecting cassava

Cassava is affected by a number of viruses belonging to several genera and families. The viruses of major economic importance are the cassava mosaic geminiviruses (CMG), which are begomoviruses belonging to the Geminiviridae; the cassava common mosaic virus (CsCMV), which is a potexvirus; the cassava vein mosaic virus (CsVMV), belonging to the Caulimoviridae family; and the cassava frog skin virus (CsFSV), which belongs to the Reoviridae family (Bock and Harrison, 1985; Fauquet and Fargette, 1990; de Kochko et al., 1998; Calvert et al., 1996).
The CsCMV, CsVMV and CsFSV are important in South America, while the cassava mosaic geminiviruses are important in India and in Africa (de Kochko et al., 1998; Calvert et al., 1996; Harrison et al., 1997b). The cassava brown streak virus (CBSV), belonging to the family Potyviridae, genus Ipomovirus, has recently been identified as a major problem in East Africa (Msikita et al., 2000).

2.3 Cassava mosaic disease

The cassava mosaic disease was first reported in East Africa in 1894 (Jennings, 1994). It is now prevalent in many parts of Africa, India and Sri Lanka (Thresh et al., 1994a). The disease is caused by the cassava begomoviruses and transmitted by the whitefly (Bemisia tabaci Genn), which feeds on young cassava leaves. The disease is spread through propagation of infected vegetative propagules and infected plants serve as a source of inoculum for further whitefly spread (Storey and Nichols, 1938; Theberge et al., 1985). The cassava mosaic disease has been rated as the most important vector-borne disease of any crop in Africa (Geddes, 1990).

2.3.1 Characteristics of cassava mosaic viruses

The cassava begomoviruses are composed of two circular single-stranded DNA particles belonging to the family Geminiviridae (Harrison and Robinson, 1988; Hong et al., 1993). There are five distinct cassava mosaic geminiviruses (CMG), which have been identified based on their nucleotide sequences. These include the African cassava mosaic virus (ACMV), the East African cassava mosaic virus (EACMV), the Indian cassava mosaic virus (ICMV), the South African cassava mosaic virus (SACMV), and the Uganda variant of the East African cassava mosaic virus (EACMV-Ugv) (Zhou et al., 1997; Berrie et al., 1998). ACMV and EACMV have been reported in West Africa, East Africa, and some parts of Southern Africa, while the ICMV occurs in India and Sri Lanka (Harrison et al., 1997a; Ogbe et al., 1999; Offei et al., 1999).
The Ugandan variant has been reported only in East Africa and SACMV has been reported only in Southern Africa. Recently, another strain of EACMV was reported in Cameroon (Fondong et al., 2000). The viruses can be detected and distinguished serologically by their reaction with specific monoclonal antibodies (Mabs) raised against them and by the polymerase chain reaction (PCR) using primers based on the differences in the nucleotide sequences of their genome (Thomas et al., 1986; Zhou et al., 1997). Nucleotide sequence analysis and serological reactions suggest that Ugv and SACMV are variants of EACMV and ACMV (Zhou et al., 1997; Berrie et al., 1998). Biological differences, such as ease of mechanical transmission into Nicotiana benthamiana and optimum temperature for replication, may also be used to distinguish between the viruses (Robinson et al., 1984; Hong et al., 1993).

2.3.2 Effects of CMD on cassava

The distinct symptom of the disease is mosaic (Rossel et al., 1992). The symptoms first appear in young leaves and shoot apices after infection. When the whiteflies feed on the plant, the virus enters and moves down rapidly to the base of the stem. It is then systemically transferred into side branches and shoot apices, causing characteristic disease symptoms (Storey and Nichols, 1938; Jos et al., 1984; Rossel et al., 1992). Infected stem cuttings also sprout into infected plants due to the upward translocation of the virus particles. The internal anatomy of cassava plants and of the test plant Nicotiana benthamiana infected with cassava mosaic virus has been shown to be altered, with changes in phloem parenchyma and sieve tube cells (Horvat and Verhoyen, 1981; Adejare and Coutts, 1982). Reduced stem girth, plant height and petiole length of infected plants have been attributed to infection (Thankappan and Chacko, 1976). In some cases, the infected plants become stunted (Kaiser and Teemba, 1979).
Recent studies have shown that the severe stunting seen in most such plants may be due to mixed infection, involving both ACMV and EACMV or ACMV and UgV (Harrison et al., 1997b; Fondong et al., 2000).

Adverse effects of the disease on the physiology of plants have also been studied. These include a reduction in the plant photosynthetic leaf area that affects the plant’s metabolism (Vuylsteike, 1981; Nian et al., 1976). Vuylsteike (1981) described the adverse effects on the leaves and metabolism as an increase in chlorophyllase production in the leaves, and an increase in amylase in the storage roots, which adversely affects the accumulation of starch in the storage roots. Thus at harvest, storage root yield, quality and starch content are reduced. The quality and nutrient content of the leaves are also reduced due to the reduction in leaf sizes, mosaic pattern and physiological changes (Alagianagalingan and Ramakrishaman, 1970; Nian et al., 1976).

Various authors have shown a positive correlation between symptom severity and yield reduction (Hahn et al., 1980a; Otim-Nape et al., 1994a). Fauquet et al. (1987a) observed that varietal responses to the disease under the same growth conditions also vary. Yield losses sustained in different cassava genotypes are reported to range from 20 to 95% (Thankappan and Chacko, 1976; Terry and Hahn, 1980; Seif, 1982; Fargette et al., 1988). The effect of CMD on the yield of storage roots, however, depends on the variety and age of the plant (Thresh et al., 1994a). Losses are more severe when plants are infected during their early growth stage, and plants infected through infected clonal material sustain greater storage root yield losses than those infected by whiteflies (Terry and Hahn, 1980; Fargette and Vie, 1988). Fargette et al. (1988) estimated that yield losses from CMD of up to 30 million tons per annum, out of a production of 51 million tons, are incurred in Cote d’Ivoire.
An epidemic of CMD during the 1990s in Uganda caused over 2.2 billion tons storage root losses that amounted to US $440 million (Thresh et al., 1997). It is estimated that total crop yield losses due to CMD in Africa are about $2 billion per annum (IITA, 1997).

2.3.3 Control

Various methods have been suggested for the control of CMD. Since infected planting material is the primary source of the viruses, most of these methods are based on phytosanitation, cultural practices and the use of resistant material (Thresh et al., 1998b; Fauquet and Fargette, 1990; Bock, 1994).

Phytosanitation includes the use of virus-free planting material derived from thermotherapy and meristem culture, roguing out diseased plants from within crop stands, removing diseased cassava plants around an intended planting site, and exploiting the phenomenon of reversion (Thresh et al., 1998a). In thermotherapy, the plants are grown under temperatures of 37°C and 39°C for 28 to 105 days, to reduce the virus concentration and incidence of the disease in some genotypes (Kaiser and Teemba, 1979; Kaiser and Louie, 1982). In some genotypes, symptoms reappear when heat treatment is stopped, while others can remain healthy for 4 to 5 months after heat treatment (Gibson, 1994). Virus-free cassava plants are produced through meristem-tip culture (Kaiser and Teemba, 1979; Adejare and Coutts, 1981; Ng et al., 1992). The use of heat therapy before meristem-tip culturing gives a higher number of virus-free plants and also allows the use of larger pieces (0.5 - 0.8 mm) of the meristematic dome to regenerate disease-free plants (Adejare and Coutts, 1981).

Roguing is recommended at the early stage of growth before the infected plants serve as inoculum for further spread (Thresh and Otim-Nape, 1994).
This practice, however, is not effective in reducing CMD under high disease pressure and among genotypes that do not have substantial resistance to virus infection (Thresh et al., 1998b; Fargette et al., 1990). Fargette et al. (1990) observed that CMD spread between fields was more important than within fields. Thresh et al. (1998b) deduced from this that roguing could be effective only if practised by farmers throughout whole localities. In the 1950s in East Africa, for instance, roguing as a control measure was effective with strict enforcement by the local authorities (Thresh et al., 1998b; Thresh et al., 1994b).

Reversion, whereby infected cuttings produce disease-free plants in the subsequent generation, has also been exploited to select and produce healthy cuttings for CMD epidemiological studies in Cote d’Ivoire (Fargette et al., 1985; Fargette et al., 1988). Fargette and Vie (1995), using this method, showed that although the healthy plant may become infected in the course of the growth period, these plants have less yield loss than plants that were raised from infected material.

The recommended cultural practices to control CMD are targeted at time of planting and stem types. Cuttings obtained from the lower portions of the main stem have been found to sprout into more infected plants than cuttings from the upper portion of the main stem and stem branches (Njock et al., 1994). Ogbe et al. (1996), comparing cuttings from 6 to 8 month-old mother-plants with those from 10 to 14 month-old mother-plants, observed more infected plants in the former. It has, therefore, been recommended that cuttings from the upper portion of mature plants be used for planting to control the disease. Another strategy to control CMD is to avoid planting when the whitefly populations are high. In the rainforest and transition forest agroecologies of Nigeria, for example, the major peak of whitefly population is between April and June.
This is followed by a low population of the insect between July and August and a minor peak during September to December (Leuschner, 1978). Fargette et al. (1994) demonstrated that plants are generally more susceptible to secondary infection by whiteflies during the first 8-12 weeks. Terry and Hahn (1980) also showed that CMD infection at two months greatly reduced yield of storage roots. The results of Terry and Hahn (1980) also showed that losses due to late infections were lower in plants derived from clean planting material than in plants from infected cuttings. Fauquet et al. (1988) suggested the use of low CMD incidence or low pressure sites to produce virus-free planting material for distribution and as experimental material.

Plant resistance is the most important means to control virus diseases and the resultant crop losses due to them (Fraser, 1990). Resistant cassava genotypes tend to have lower CMD spread with mild to moderate symptoms (Jennings, 1960; Otim-Nape et al., 1998), and the resistant genotypes also have higher storage root yield when compared with the susceptible genotypes (Terry and Hahn, 1980). Resistant varieties were used to control the spread of CMD in Madagascar in the 1920s (Cours-Darne, 1968). More recently, in the 1990s, improved resistant cassava varieties were used when there was an outbreak of UgV in Uganda (Otim-Nape et al., 1994b).

2.4 Host plant resistance to the cassava mosaic disease (CMD)

Breeding for resistance to CMD started in the 1920s in East Africa and Madagascar (Jennings, 1994). Initially several cassava varieties were screened and the resistant ones released to farmers to stop the spread of the disease (Cours-Darne, 1968). There was, however, the need for higher levels of resistance. Storey and Nichols (1938) developed an interspecific cross between M. esculenta and M. glaziovii Muller von Argau, which resulted in hybrids that had non-tuberous roots and showed mild and transient symptoms.
After three backcrosses to cassava, they were able to restore root quality and retain resistance to CMD (Nichols, 1947). Intercrosses between the third backcross selections produced a number of promising clones with good quality roots and enhanced resistance to the virus (Jennings, 1994). Open pollinated seeds from these clones were distributed to cassava breeders in Africa, including Nigeria, where clone 58308 was selected (Hahn et al., 1989; Jennings, 1994). The clone 58308 has poor root yield but has maintained resistance to CMD. It has high parental value for CMD resistance and has been used as the main source of resistance in IITA’s breeding programme, resulting in several improved cultivars of the Tropical Manihot Selection (TMS) series, with resistance to CMD (IITA, 1973; Hahn et al., 1989).

Resistance from this source is known to be polygenic, recessive, and additively inherited with a heritability of over 60% (Hahn et al., 1980a; Hahn et al., 1989). Therefore, population improvement and recurrent selection are recommended in breeding for resistance to the disease (Hahn et al., 1989). This would accumulate desirable gene combinations that would be difficult for the pathogens to circumvent in the long term.

Mutation breeding, somaclonal variation and genetic engineering have been suggested as additional tools in breeding for resistance to the cassava mosaic virus disease (Ahiabu et al., 1995; Ahloowalia, 1995; Maluszynski et al., 1995; Fauquet and Beachy, 1991). These methods, however, require skills in mutant induction and selection, generating somaclones and transgenics, and an efficient regeneration system to avoid chimeras and undesirable mutants. Although no genes conferring resistance to CMD have been cloned, transformation protocols for cassava have been established (Schopke et al., 1997). The coat protein sequences of the virus have been characterised (Hong et al., 1993) and could be used to generate gene constructs for transformation.
The defective interfering (DI) DNA of the ACMV-B particle has been shown to induce amelioration of symptoms in transgenic N. benthamiana plants and the gene was passed on when the transgenics were used as a source of inoculum (Stanley et al., 1990). This could also be adapted in cassava transformation, for resistance to CMD.

2.4.1 Mechanism of resistance

The mechanism of resistance to CMD is not fully understood and no immunity to CMD has been found within the Manihot species. Resistance to CMD is either by resistance to virus infection or resistance to the vector, B. tabaci (Rossel et al., 1992). Cassava genotypes which are resistant to virus infection show mild symptoms, which may be restricted to some shoots, and have higher storage root yield when compared to the susceptible genotypes (Jennings, 1976; Otim-Nape et al., 1998; Fargette et al., 1996; Terry and Hahn, 1980). They recover from symptoms with age (Njock et al., 1996) and contain low virus concentrations (Fargette et al., 1996). The ability of resistant genotypes to localise or restrict the movement of the virus has been referred to as non-systemicity (Thresh et al., 1998b; Njock, 1994). Infected cuttings of resistant genotypes are also able to sprout into healthy plants in subsequent generations. This phenomenon is known as reversion (Fauquet et al., 1987b). It has been observed that the extent of reversion is higher when plants are infected by the whitefly early in their growth cycle than when infection is due to virus already present in the mother cuttings used for planting or when infection by the whitefly is delayed until later in the plant’s growth cycle (Thresh et al., 1998b). Simulation models have postulated that virus infection of resistant genotypes could be maintained at an equilibrium below 50% by the combined effect of reversion and the use of virus-free planting material (Fargette et al., 1994; Fargette and Vie, 1995).

Resistance to the whitefly vector has also been established.
Fauquet et al. (1987b) observed that there was a clearly pronounced resistance to the vector. Fargette et al. (1996) also reported that resistant and susceptible cassava genotypes display a wide range of responses to whitefly population. Legg (1994) also observed variation in the suitability of some cassava genotypes as hosts of B. tabaci. In contrast, Hahn et al. (1980b) observed similar numbers of whiteflies on resistant and susceptible genotypes and inferred that resistance to the vector was unlikely.

Resistance to CMD is assessed by a number of components of resistance. These include field incidence (percentage of infected plants); vector resistance (number of adult whiteflies per plant); inoculation resistance; virus resistance (virus content estimated by ELISA); virus diffusion resistance (development of symptoms with time); symptom intensity and non-systemicity incidence of young growing tips or shoot tips (Fargette et al., 1996; Changa, 1992). Fargette et al. (1996) demonstrated a significant correlation between symptom severity and ACMV titre among resistant genotypes in two separate trials. Resistance to virus multiplication and movement were the major responses to ACMV detected among resistant cassava landraces and improved cultivars (Ogbe, 2001).

2.5 Genetics of cassava

Cassava is a highly heterozygous species (Kawano, 1978; Lefevre and Charrier, 1993). When selfed, it exhibits inbreeding depression, while outcrossing results in wide segregation of most traits (Kawano, 1978). The F1 is, therefore, genetically equivalent to an F2 with respect to loci common to parents (Magoon and Krishnan, 1977). It is propagated using stem cuttings to reproduce the genotype, since propagation via seeds results in trait losses in the genotype due to the highly heterozygous nature (Kawano, 1980). Apomixis, whereby seeds are produced without undergoing the sexual process, has also been reported (Grattapagalia et al., 1996).
Cultivated cassava genotypes generally have erratic flowering (Bonierbale et al., 1997a). Development of crosses for genetic studies has to some extent been difficult because of some levels of sterility, low seed set per pollination, and seed sterility that exist in cassava (Kawano, 1980; Hahn et al., 1979). Nevertheless, intraspecific and interspecific crosses have been made for genetic studies and genotype improvement. Some of the well documented genetic studies in cassava include studies on cytogenetics of cassava and its wild relatives (Bai, 1987; Fregene, 1996), genome analysis and diversity studies (Awoleye et al., 1994; Grattapagalia et al., 1996; Lefevre and Charrier, 1993), hybridisation (Haysom et al., 1994; Asiedu et al., 1992; Magoon 1967; IITA, 1988; Bai et al., 1993; Wanyera, 1993) and the mode of inheritance of some morphological traits (Hahn et al., 1989; Byrne, 1987; Rajendran, 1989; Amma et al., 1995).

2.5.1 Genome analysis

Cassava generally has a diploid genome (2n=36). However, some authors have described it as being an allotetraploid. Since the breeding behaviour of allotetraploids is similar to that of diploids (Wricke and Weber, 1986), this is not certain. Although most cassava genotypes studied are diploid, spontaneous polyploids such as triploids (3x) and tetraploids (4x) of some genotypes have been reported (IITA, 1980; Hahn et al., 1989). The nucleic acid content of diploid cassava is 1.67 picogram per nucleus, that is, 772 mega base pairs in the haploid genome (Awoleye et al., 1994). Lefevre and Charrier (1993) confirmed the heterozygous nature of the genome of cassava and its relatives in their study on isozyme diversity among Manihot germplasm. Their studies also confirmed the diploid genome based on the inheritance of the isozyme markers. Similar studies on the inheritance of molecular markers also indicated a diploid genome (Fregene et al., 1994).

2.5.2 Cytogenetics

Fregene (1996) reviewed the work of several authors who postulated that cassava is a segmental allotetraploid with basic chromosome number x = 9. Jos and Nair (1979), however, observed regular 18 bivalents during meiosis in several cassava genotypes, suggesting a diploid genome with 2n=2x=36 chromosomes. They also detected the presence of pachytene abnormalities such as deletions, duplications, inversion and irregular pairing in cassava clones that were partially pollen-sterile, and concluded that this was due to hybridisation (Jos and Nair, 1979). Normal pairing at meiosis was observed in different independent studies of interspecific F1 crosses, between cassava and M. glaziovii and between cassava and arborescent cassava, which confirmed that interspecific hybridisation occurs within the genus (Magoon et al., 1969; IITA, 1988; Bai, 1982; Wanyera, 1993).

2.5.3 Hybridisation and genetic diversity

There is no genetic barrier to hybridisation within the genus Manihot (Haysom et al., 1994; Asiedu et al., 1992; Magoon, 1967; Bai, 1982; Wanyera, 1993). Spontaneous hybrids between cassava and other Manihot species have been reported to occur naturally in Africa and in Brazil (Nassar, 1994). In earlier hybridisation studies in the late 1930s, Doughty suggested that tree cassava is a natural hybrid between cassava and M. glaziovii (Fregene, 1996). Doughty’s suggestion was later confirmed by other workers who observed normal pairing at meiosis in an F1 between cassava and M. glaziovii and in an F1 between cassava and arborescent cassava (Magoon, 1967; IITA, 1988; Bai, 1982; Wanyera, 1993). Resistance to the cassava mosaic and the cassava bacterial blight diseases was derived from an interspecific cross between cassava and M. glaziovii (Nichols, 1947; Jennings, 1977; Hahn et al., 1989).
The extent of genetic diversity in the crop may be attributed to the occasional use of seeds in propagation, since cassava would combine easily with itself and wild relatives. Silvestre and Arraudeaus (1983) noted that African farmers sometimes take the cuttings from spontaneous seedlings for their subsequent planting (cited in Lefevre and Charrier, 1993). These new genotypes, referred to as the landraces, provide a wealth of allelic genes for some traits and form part of the crop’s genetic resource, which could be included in breeding programmes for the improvement of the crop (Gulick et al., 1983; Hershey, 1987). Some of these African landraces, collected in West Africa, have been identified to be resistant to CMD (Raji, 1995; Mignouna and Dixon, 1997). The incorporation of the African landraces into the cassava breeding programme at IITA was hindered for several years due to the fact that many of them exhibit shy flowering or no flowering at all. It was discovered in 1990 that these landraces flowered at Ubiaja, in Edo State, Nigeria, and this has encouraged the development of several crosses for genetic improvement and for genetic studies (IITA, 1990).

2.6 Quantitative genetics in cassava breeding

2.6.1 Mating designs

Most traits studied in cassava are polygenic (Byrne, 1987; Rajendran, 1989; Amma et al., 1995; Hahn et al., 1989). Variation in polygenic traits is attributed to quantitative trait loci (QTLs). Quantitative traits in plants and animals are studied using a variety of genetic models and designs. The analysis of mating designs allows genetic analysis to be carried out after one generation and provides tests of the adequacy of the model (Mather and Jinks, 1982). The diallel and the North Carolina design II (NCD II) mating designs provide genetic interpretations including combining abilities and the inheritance of quantitative traits (Kang, 1994). The concept of general and specific combining ability was first defined by Sprague and Tatum (1942).
Knowledge of the relative importance of general combining ability (GCA) and specific combining ability (SCA), which represent two major modes of gene action for quantitative traits, is essential in formulating an efficient breeding strategy. The GCA of a line refers to the average value of the line based on its performance when crossed with other lines. It is due largely to additive gene effects. SCA is the deviation of a cross from the average GCA of the parent lines and is due to non-additive gene effects (Sprague and Tatum, 1942; Falconer and Mackay, 1997). The theory of the diallel was first developed by Jinks and Hayman (1953). Hayman (1954a; b) gave some assumptions which should be satisfied before a meaningful interpretation of the analysis is made. These assumptions are that:
i, the parents are homozygous,
ii, there is normal diploid segregation,
iii, there are no maternal effects,
iv, genes are independently distributed among the parents, and
v, there is no allelic interaction or no epistasis.
These assumptions have been shown to be unrealistic (Kempthorne, 1956; Matzinger and Kempthorne, 1956; Gilbert, 1973; Feyt, 1976). Kempthorne (1956) considered that the assumption concerning the independent distribution of genes in the parents is not reliable because it implies that the presence or absence of an allele at a particular locus is statistically independent of the presence or absence of an allele at any other locus. Failure of the assumption is a result of linkage of genes. Gilbert (1973) suggested that, because of the biochemical pathways involved in the expression of traits, the assumption of no epistasis cannot be justified. Feyt (1976) showed that genes at n loci cannot be independent unless a minimum of 2^n parents are used in a diallel cross. In other words, if four gene loci condition a trait, these loci cannot be independent unless 2^4 = 16 parents are used in a cross (Kang, 1994). This further shows that this assumption is unrealistic.
Matzinger and Kempthorne (1956) considered varying degrees of inbreeding to be an acceptable substitute for complete homozygosity of the parents. The North Carolina design II (NCD II) mating scheme is a cross-classification design that was first proposed by Comstock and Robinson (1948). It differs from the diallel in that different sets of parents are used as males and females. Thus, it accommodates more parents in determining combining abilities than a diallel and provides the same sort of genetic information (Hallauer and Miranda, 1988). The main effects of males and females are equivalent to GCA and the female x male interaction is equivalent to SCA. Both the diallel and NCD II mating designs have been used to obtain genetic information on morphological and agronomic traits of importance in cassava (Hahn et al., 1989; Rajendran, 1989; Amma et al., 1995). Rajendran (1989) reported additive gene action for storage root yield and non-additive gene action for the yield components (harvest index, storage root number and storage root weight). Amma et al. (1995) reported that the root quality traits, starch content, dry matter and cyanide (HCN) content, are predominantly non-additive. Low HCN is also recessive (Kawano, 1978; Amma et al., 1995). Resistance to CMD derived from clone 58308 has been reported to be polygenic, recessive and inherited largely in an additive manner (Hahn et al., 1980a, 1989; IITA, 1973; Msabaha, 1983). Recently, the presence of a major dominant resistance gene, designated CMD2, was reported in the African landrace TME3 (known locally in Nigeria as “2nd agric”) (Akano et al., 2002; IITA, 2000). Resistance to CBB, which was also derived from the 58308 source of resistance, is inherited largely in an additive manner. However, some degree of non-additive effect is also involved (Hahn et al., 1980a; 1989; Umemura and Kawano, 1983; Jennings, 1977).

2.6.2 Heterosis
Heterosis is defined as the amount by which the mean of an F1 exceeds its better parent or midparent (Kang, 1994).
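The definition of heterosis above can be illustrated with a short calculation. This is a minimal sketch only: the function name, the percentage scaling and the numerical values are assumptions made for illustration and are not taken from any study cited here. The `higher_is_better` flag is also an assumption, included because for a disease severity score the "better" parent would be the one with the lower mean.

```python
def heterosis(f1_mean, parent1_mean, parent2_mean, higher_is_better=True):
    """Return (mid-parent heterosis %, better-parent heterosis %).

    All inputs are trait means measured on the same scale.
    """
    mid_parent = (parent1_mean + parent2_mean) / 2.0
    # The better parent is the higher-scoring one for traits such as yield,
    # and the lower-scoring one for traits such as disease severity.
    if higher_is_better:
        better_parent = max(parent1_mean, parent2_mean)
    else:
        better_parent = min(parent1_mean, parent2_mean)
    mid_parent_heterosis = 100.0 * (f1_mean - mid_parent) / mid_parent
    better_parent_heterosis = 100.0 * (f1_mean - better_parent) / better_parent
    return mid_parent_heterosis, better_parent_heterosis


# Hypothetical yield-like trait: parents average 8 and 10, the F1 averages 12.
mph, bph = heterosis(12.0, 8.0, 10.0)
print(round(mph, 2), round(bph, 2))
```

For a trait where lower values are desirable, a negative value returned with `higher_is_better=False` would indicate an F1 that outperforms the reference parent.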
Heterosis occurs when directional dominance exists, or there is allelic interaction or overdominance at some or all alleles, and when the parents differ in gene frequency. However, the genetic basis of heterosis is not quite clear (Kang, 1994; Falconer and Mackay, 1997). Evidence from maize and other crops suggests that the genetic basis of heterosis is partial or complete dominance (Hallauer and Miranda, 1988; Brummer, 1999). Overdominance arising from dominant alleles in repulsion at linked loci, and epistasis, may also play a role in heterosis expression (Xiao et al., 1995; Yu et al., 1998). The condition that differences in allelic frequencies should exist for heterosis to be expressed suggests that crosses between diverse populations would result in heterosis. However, highly divergent populations or certain cases of multiple allele populations may prevent heterosis expression (Moll et al., 1962; Bonierbale et al., 1993). The amount of heterosis exhibited in crosses between local and exotic cultivars of a crop is an important criterion in selecting cultivars to be used in a cross (Kang, 1994). To maximise the potential of heterosis for a trait in a breeding programme, crosses among the various populations, lines or clones need to be evaluated. If the progenies express heterosis, the two parents are assigned to different heterotic groups (Hallauer and Miranda, 1988; Brummer, 1999). The extent of heterosis for most traits in cassava has not been studied extensively. Amma et al. (1995) detected positive and significant high parent heterosis for dry matter content and starch in a diallel cross involving inbred lines. They also detected negative and significant heterosis for HCN. Heterosis for resistance to CMD was observed in crosses involving some improved clones and resistant African landraces (Lokko et al., 1998).

2.6.3.
Number of genes and complementarity among genes affecting quantitative traits
In Mendelian genetics, the number of genes is determined by observing the segregation ratios. In quantitative genetics, however, effective factors are estimated to determine whether a few or many genes condition the quantitative trait. This is because quantitative genetics is based on the infinitesimal model, in which a large number of loci (polygenes), all with small effects, affect the trait (Lynch and Walsh, 1998). Environmental effects usually modify these multiple genetic factors. The most important assumption for estimating effective factors is additivity of gene action. Molecular markers or biometrical methods can be used to estimate effective factors for quantitative traits (Lynch and Walsh, 1998; Zeng et al., 1990). Most of the biometrical estimating techniques do not estimate the exact number of genes but rather the number of effective factors (Wright, 1968; Lande, 1981; Cockerham, 1986; Zeng et al., 1990; Zeng, 1992). This is because the genes affecting a quantitative trait do not all have the same effect, and the total response of a trait depends primarily on the number of loci which affect it (Falconer and Mackay, 1997). The biometrical methods for estimating effective factors are based on the method first developed by Castle (1921) and Wright (1968). Using the Castle-Wright formula, the number of effective factors in theory cannot exceed the number of independent segregating chromosome segments, which is the haploid number. Since there are usually one or two recombination events per chromosome in eukaryotes, the maximum number of effective factors can be 2 to 3 times the haploid chromosome number. However, each segment can contain many genes (Lynch and Walsh, 1998). The number of effective factors affecting resistance to CMD has not previously been reported. Hahn et al.
(1980a) inferred from the polygenic nature of resistance to CMD that resistance must be attributed to the combined action of a number of loci which are linked on a chromosome or a set of chromosomes of a genome or genomes. They suggested, however, that since cassava is genetically heterozygous and probably an allotetraploid, a study of the genetic mechanism of resistance to CMD is complicated and difficult. The main purpose of testing gene complementarity is to examine allelic relationships between different sources of a trait in order to determine whether the genes are allelic among themselves. Progenies of crosses between parents differing for a quantitative trait would exhibit phenotypes ranging between those of the parents. The nature of the progeny’s distribution is determined by the number of genes which account for the differences between the parents, and by the environment (Lynch and Walsh, 1998). Gene complementarity among sources of resistance to diseases and pests has been reported in oats, soybean and sorghum (Fox et al., 1997; Wang et al., 1998; Dixon et al., 1991).

2.6.4. Heritability
Heritability is defined as the percentage of total phenotypic variability for a trait that is due to the genes and their interaction with other genes (Kang, 1994). If the environmental effect on the trait were negligible, then heritability would be 100%. Heritability is an indication of the ease with which a trait can be transferred to the progeny. The value is not a constant and is defined for a particular trait in a population at a particular time. Variation in the estimate may, therefore, occur from population to population or within a population from time to time, due mainly to changes in the additive genetic variance.
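As a worked illustration of heritability as a ratio of variance components, the sketch below computes it both from the total genetic variance (broad sense) and from the additive variance alone (narrow sense). The function names and the variance values are hypothetical, and the sketch assumes that phenotypic variance is simply the sum of additive, non-additive and environmental variance, with no genotype x environment interaction.

```python
def phenotypic_variance(var_additive, var_nonadditive, var_environment):
    """Total phenotypic variance under a purely additive partition of components."""
    return var_additive + var_nonadditive + var_environment


def broad_sense_heritability(var_additive, var_nonadditive, var_environment):
    """Ratio of total genetic variance to phenotypic variance."""
    total = phenotypic_variance(var_additive, var_nonadditive, var_environment)
    return (var_additive + var_nonadditive) / total


def narrow_sense_heritability(var_additive, var_nonadditive, var_environment):
    """Ratio of additive genetic variance to phenotypic variance."""
    total = phenotypic_variance(var_additive, var_nonadditive, var_environment)
    return var_additive / total


# Hypothetical variance components (arbitrary units).
components = dict(var_additive=4.0, var_nonadditive=2.0, var_environment=4.0)
print(broad_sense_heritability(**components))   # 0.6
print(narrow_sense_heritability(**components))  # 0.4
```

With the environmental component set to zero the broad sense value becomes 1, which corresponds to the 100% case described above.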
Two forms of heritability are useful in breeding: heritability in the broad sense, which is the ratio of total genetic variation to phenotypic variation, and heritability in the narrow sense, which is the fraction of the phenotypic variation that is due to additive genetic variation (Falconer and Mackay, 1997). From a seven-parent diallel cross, a heritability of 60% was estimated for resistance to CMD (Hahn and Howland, 1972; IITA, 1973). In related studies, selected clones from 10,000 F1 crosses made by crossing 58308 with three other cassava clones gave narrow sense heritabilities of 35% and 30% for resistance to CMD, estimated by parent-offspring regression for two separate readings (Hahn et al., 1980a). Broad sense heritabilities of 48% and 68% for resistance to CMD, estimated from variance components of half-sib families from unreplicated trials of 952 clones in two planting seasons, were also reported. Similar high heritability estimates were obtained for CBB from these data sets (Hahn et al., 1980b). Based on these results, resistance to CMD and CBB is reported to be highly heritable (Hahn et al., 1989). Amma et al. (1995) reported high heritability (76.81%) for low HCN, based on narrow sense heritability, which was similar to an earlier report on the heritability of low HCN by Kawano (1978). Dixon et al. (1994), on the other hand, reported that heritability for low HCN can vary from 0 to 50% depending on the composition of genotypes evaluated and the number of years, locations and seasons. Iglesias and Hershey (1991) observed that starch and dry matter content were highly heritable traits based on broad sense heritability. Amma et al. (1995), however, reported low heritability (3.25% for dry matter and 7.58% for starch content) based on narrow sense heritability.

2.6.5 Genetic correlation studies
The phenotypic values of different traits are often found to be correlated.
This may be because genes which control the expression of one quantitative trait are likely to influence the expression of others. Genetic correlation or covariance between traits can be attributed mainly to pleiotropy, which is the influence of a single gene or locus on more than one trait, or to gene linkage, which is a cause of transient correlation (Falconer and Mackay, 1997). Resistance to CMD and CBB is reported to be genetically correlated (IITA, 1973; Hahn, 1978; Hahn et al., 1980b). Hahn et al. (1980b) obtained phenotypic (r2=0.423) and genotypic (r2=0.899) correlations between resistance to CMD and CBB in half-sib families and a genetic correlation (r2=0.689) in 925 clones. They attributed the correlation to genetic linkage, quoting in support the observation by Magoon et al. (1969) of random transmission of parental chromosomes in Manihot interspecific hybrids. Using canonical analyses, Jennings (1977) observed that the joint resistance was inconsistent in a population of both M. glaziovii and cassava genotypes from the 58308 stock. Jennings proposed the alternative hypothesis of a pleiotropic effect (Jennings, 1977; Jennings and Hershey, 1985). This reported linkage between resistance to CBB and CMD could have positive implications in breeding for resistance to the two diseases of cassava, in that selection for one would lead to progress in the other.

2.7 Biometrical analysis

2.7.1 Analysis of mixed models in genetic studies
For most genetic experiments, genetic material is evaluated over a set of years and locations or environments to obtain precise genetic information. In such analyses, the genetic materials are considered fixed and the year and location or environmental effects are considered random (McIntosh, 1983; Zhang and Kang, 1997). For random effects models, inferences are based on variances, while in a fixed effects model the inferences are based on means (Littell et al., 1996; Kang, 1994).
Mixed model analyses allow the proper analysis of experimental data involving both random and fixed effects (Littell et al., 1996). The analysis of variance (ANOVA), general linear models (GLM) and variance components (VARCOMP) procedures available in the Statistical Analysis System (SAS) handle random, fixed and mixed model analyses with appropriate specifications. The GLM and ANOVA procedures in SAS use the method of least squares to fit general linear models. ANOVA handles balanced data only, while GLM handles unbalanced data and also allows the estimation of least squares means. Least squares means are the class means that would be expected for a balanced design with all the covariates at their mean values (SAS, 1999). Least squares means are thus suitable for estimating genetic effects such as combining abilities and heterosis in a mating experiment. It has been argued that most analyses involving fixed effects models may indeed be mixed models, and that the conventional least squares estimates in the analysis of variance may not lead to the 'best' estimates and may not be as informative as they should be (SAS, 1999). Henderson's mixed model best linear unbiased predictors (BLUPs) are estimated by the MIXED procedure in SAS (SAS, 1997; Littell et al., 1996). The VARCOMP procedure in SAS computes estimates of the variance components in a general linear model. It is designed to handle models that have random effects. The minimum variance/norm quadratic unbiased estimate (MIVQUE0) method produces estimates which are locally the best quadratic unbiased estimates, given that the true ratio of each component to the residual error component is zero (SAS, 1997; SAS, 1999).
The technique is similar to the Type I method, which equates each mean square involving the random effects to its expected mean square, but the random effects in MIVQUE0 are adjusted for the fixed effects. With fixed genetic effects in the analysis of data from either a diallel or an NCD II experiment, it is not appropriate to draw inferences about hypothetical populations (Ross et al., 1983). However, with qualifications, important quantitative information, such as the ratio of additive variance to total genetic variance, can be drawn from such studies that may aid in the improvement of the crop for the trait (Ross et al., 1983; Pedersen et al., 1983; Rooney et al., 1997). The ratio of additive variance to total genetic variance in the population gives an indication of the relative importance of GCA in predicting progeny performance. The closer this ratio is to 1, the greater the chances of predicting progeny performance based on GCA (Baker, 1978; Kang, 1994). If the genetic model assumed is a fixed effects model, the analogous mean square components of the fixed effects can be estimated to compute the ratios (Baker, 1978; Wricke and Weber, 1986; Ross et al., 1983; Kang, 1994). The MIVQUE0 procedure is, therefore, suitable for estimating the analogous mean square components.

2.7.2. The diallel analysis
A number of genetic models and statistical procedures exist for analysing and interpreting the results of diallel crosses. Depending on the objectives and availability of materials for an experiment, reciprocal crosses could be included in a diallel analysis. The methods used for the analysis of a diallel cross include:
i, Hayman’s method of analysis of variance for the genetic parameters ‘a’, ‘b’, ‘c’ and ‘d’ (Hayman, 1954a),
ii, Combining ability analysis, which partitions combining ability into general combining ability (GCA) and specific combining ability (SCA) (Griffing, 1956b; Kempthorne and Curnow, 1961; Pooni et al., 1984; Singh and Chaudhary, 1985),
iii, Gardener and Eberhart’s analysis II, which estimates the variety and heterosis effects ‘vj’, ‘hj’, ‘h’ and ‘sij’ (Gardener and Eberhart, 1966).
The genetic parameter ‘a’ of Hayman’s method refers to additive gene effect and is equivalent to GCA, while ‘b’, which refers to dominance gene effect, is equivalent to SCA. The parameter ‘c’ refers to maternal effects and ‘d’ refers to reciprocal differences not attributable to ‘c’. The SCA is further partitioned into ‘b1’ (directional dominance, which compares the mean of the F1 and the mid-parent value), ‘b2’ (directional dominance which quantifies gene distribution over arrays) and ‘b3’ (discrepancy in reciprocals due to dominance). Using the Hayman analysis of variance, significance of the b effects allows further analysis of the data using the graphical analysis procedure, Wr/Vr graphs (Mather and Jinks, 1982). Here, Wr is the parent-offspring covariance of an array and Vr is the array variance, an array being the crosses in which a particular parent is involved (Singh and Chaudhary, 1985). In the absence of non-allelic interaction and with independent distribution of genes among parents, Wr is related to Vr by a regression slope. The distance between the origin and the intercept of the regression line on the Wr axis provides a measure of the degree of dominance. The order of the array points along the regression line gives an indication of dominant and recessive genes among the parents (Singh and Chaudhary, 1985). The theories underlying the analysis of diallel crosses in relation to combining ability were later explained by Griffing (1956a; b). Griffing (1956a) described four methods of diallel crossing depending on the material used. Method (1) includes selfed parents, F1s and reciprocals. Method (2) has the selfed parents and F1s only. Method (3) involves the F1s and reciprocals, and method (4) involves the F1s only.
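To make the combining ability partition concrete for the simplest case, method (4) with F1s only, the sketch below estimates GCA and SCA effects from a table of cross means. It is an illustration under stated assumptions: the estimators are the fixed effects formulas commonly given in textbooks for Griffing's method (4), the function name `griffing_method4` is hypothetical, and the cross means are invented. It is not the analysis carried out in this study.

```python
def griffing_method4(cross_means):
    """Estimate GCA and SCA effects from a p x p symmetric table of F1 means.

    cross_means[i][j] is the mean of the cross between parents i and j.
    Diagonal entries (selfed parents) are ignored; at least 3 parents are needed.
    """
    p = len(cross_means)
    if p < 3:
        raise ValueError("method (4) needs at least three parents")
    # X_i. : total of all crosses involving parent i.
    totals = [sum(cross_means[i][j] for j in range(p) if j != i) for i in range(p)]
    # X.. : total over the p(p-1)/2 distinct crosses (each cross counted once).
    grand_total = sum(totals) / 2.0
    gca = [(p * totals[i] - 2.0 * grand_total) / (p * (p - 2)) for i in range(p)]
    sca = {}
    for i in range(p):
        for j in range(i + 1, p):
            sca[(i, j)] = (cross_means[i][j]
                           - (totals[i] + totals[j]) / (p - 2)
                           + 2.0 * grand_total / ((p - 1) * (p - 2)))
    return gca, sca


# Hypothetical F1 means for four parents (diagonal unused).
means = [
    [0.0, 2.0, 3.0, 4.0],
    [2.0, 0.0, 3.0, 5.0],
    [3.0, 3.0, 0.0, 4.0],
    [4.0, 5.0, 4.0, 0.0],
]
gca, sca = griffing_method4(means)
print(gca)          # GCA effects sum to zero
print(sca[(0, 1)])  # SCA effect of the cross between parents 0 and 1
```

Under these estimators each cross mean equals the overall mean of the crosses plus the two parental GCA effects plus the SCA effect of that cross, which is a convenient check on the arithmetic.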
In Griffing’s approaches to the analysis of diallel crosses, variance components due to general and specific combining ability are estimated and then translated into genetic components such as additive and dominance variance. The analysis may be based on the fixed effects model I or the random effects model II. The procedure for the analysis of variance is the same for the two models except for the expectations of mean squares (Griffing, 1956b; Pooni et al., 1984; Singh and Chaudhary, 1985). The first part of the analysis consists of testing the null hypothesis that there are no genotypic differences among the F1 crosses, parents and reciprocals. Significant differences among the genotypes are a prerequisite for the combining ability analysis (Singh and Chaudhary, 1985). The average heterosis ‘h’ in the Gardener and Eberhart analysis II is equivalent to the parent versus cross orthogonal contrast in a traditional analysis of variance, and the specific heterosis ‘sij’ is equivalent to SCA (Hallauer and Miranda, 1988; Pooni et al., 1984). The Gardener and Eberhart analysis II also gives two estimates for the parents, the variety heterosis ‘hj’ and the variety mean square ‘vj’, which include information on the performance of the varieties themselves and of the varieties within crosses. These two estimates are not equivalent to either the parent mean square in a traditional analysis of variance or the GCA of Griffing’s methods (Hallauer and Miranda, 1988; Pooni et al., 1984). In the analysis of variance of genetic data, an orthogonal contrast between the parents and crosses sums of squares is used as an estimate of average heterosis in a mating experiment (Hallauer and Miranda, 1988). The various methods of analysis of diallels were evaluated by Arunachalam (1976), who concluded that the various assumptions made for the Hayman (1954a) and the Gardener and Eberhart (1966) methods were generally not fulfilled in practical situations with experimental material.
He concluded that, for testing parents and F1 materials, the combining ability analysis gives the best estimates of the gene action in parent and/or hybrid combinations.

2.7.3 The number of effective factors
The most widely used method for estimating the number of effective factors is the Castle-Wright formula (Castle, 1921; Wright, 1968), which utilises information on the phenotypic means and variances of the parent lines and subsequent generations (F1, F2, B1, B2, etc.). The Castle-Wright formula is based on the assumptions that:
i. all segregating loci contribute equally to the trait,
ii. no linkage exists among loci affecting the trait,
iii. there is no dominance, or the degree and direction of dominance of plus factors is similar for all loci,
iv. all plus factors are contributed by one parent and all minus factors by the other,
v. no epistasis occurs among alleles at contributing loci, and
vi. environmental and genotypic variances are independent and combine additively to give total variability.
Since Castle (1921) and Wright (1968) developed their formula for inbred lines, a number of models have been proposed for the estimation of effective factors in genetically variable populations, which are mainly modifications of the model given by Castle and Wright (Lawrence and Frey, 1976; Lande, 1981; Cockerham, 1986; Zeng et al., 1990; Zeng, 1992; Fenster and Ritland, 1994). Fenster and Ritland (1994) suggested that, since the degree of effective dominance can change among segregating generations due to epistasis, assumption iii) would not be valid, and proposed a formula for estimating effective factors which incorporated dominance based on the segregating generations. Lawrence and Frey (1976) also pointed out that if assumption iv) for the Castle-Wright formula is not met, then the numerator would be small and the number of factors would be underestimated.
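The classical form of the Castle-Wright estimator can be sketched as follows. This is a hedged illustration: it assumes the inbred-line version in which the squared difference between the parental means is divided by eight times the segregation variance, with the segregation variance taken as the F2 variance minus the F1 variance. The function name and the numbers are hypothetical, and none of the modified estimators cited above is implemented.

```python
def castle_wright(mean_p1, mean_p2, var_f1, var_f2):
    """Classical Castle-Wright estimate of the number of effective factors.

    Assumes inbred parents, so the F1 variance is purely environmental and
    the segregation variance can be taken as var_f2 - var_f1.
    """
    segregation_variance = var_f2 - var_f1
    if segregation_variance <= 0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * segregation_variance)


# Hypothetical values: parents differ by 8 units, segregation variance is 2.
print(castle_wright(mean_p1=10.0, mean_p2=2.0, var_f1=1.0, var_f2=3.0))  # 4.0
```

The numerator shows directly why the estimate falls when the parents do not represent the phenotypic extremes: a smaller difference between parental means gives a smaller estimate for the same segregation variance.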
The range of the F1 segregates, instead of the parental range, was suggested as the numerator in the formula for estimating effective factors. They further argued that the range of the segregating population was a better estimate than the parental range if the parents did not represent genotypic extremes. The range of the segregating population is also appropriate when there is non-additive variance involved in the inheritance of the trait (Dixon et al., 1991; Lawrence and Frey, 1976; Fenster and Ritland, 1994). Lande (1981) also gave a modification of the Castle-Wright formula for use with genetically variable populations. Cockerham (1986) and Zeng (1992) separately proposed methods for correcting sampling error. Cockerham proposed using the experimental variance as a correction factor for the squared difference between the means of the two parents, while Zeng (1992) proposed using least squares to estimate the genetic variance, together with a modified estimator. Multize and Barker (1985) showed that the estimated number of effective factors increased as heritability and the probability of type I error increased. There was, however, a downward bias in the estimated number of effective factors when heritability decreased, when the probability of type I error was lowered and when the type II error increased. Zeng et al. (1990) observed that the Wright method is of little value in estimating the number of loci affecting a quantitative trait, since linkage would prevent the estimate from exceeding the number of chromosomes. The sampling variance may also prevent one from concluding that a large number of loci are involved. They, therefore, proposed that the alternative is to estimate the number of genes using molecular markers.

2.8 Molecular marker technology and its application in cassava breeding and genetics
Molecular markers, which include biochemical (isozymes and storage proteins) and DNA markers, exist in every genotype.
They occur in large numbers and their expression is independent of phenotypic value; as such, they are powerful tools of genetic research (Soller and Beckmann, 1983; Beckmann and Soller, 1986). They allow geneticists and plant breeders to locate and follow the numerous interacting genes that determine a complex trait, as well as to tag those controlled by single genes. Molecular markers offer many advantages over conventional phenotypic markers. They are i) developmentally stable, ii) detectable in all tissues, iii) unaffected by environmental conditions, and iv) virtually insensitive to epistatic or pleiotropic effects. They provide a choice of codominant markers, which can be identified in heterozygotes, or dominant markers, which are identified as present or absent subclasses (Botstein et al., 1980; Williams et al., 1990). Molecular markers have made an immense contribution to cassava breeding and genetics. The areas covered include the development of genetic maps (Fregene et al., 1997), the assessment of genetic diversity (Bonierbale et al., 1997b; Lefevre and Charrier, 1993; Beeching et al., 1993; Mignouna and Dixon, 1997; Sanchez et al., 1999; Fregene et al., 2000), taxonomical studies (Second et al., 1997), understanding the phylogenetic relationships in the genus (Calvalho et al., 1993; Roa et al., 1997; Oslen and Schaal, 1999; Roa et al., 2000), confirmation of ploidy (Lefevre and Charrier, 1993; Fregene et al., 1994) and cultivar identification (Ocampo et al., 1993; Wanyera, 1993; Laminski et al., 1997).

2.8.1 Isozyme markers
Isoenzymes or isozymes of a gene product refer to any two distinguishable proteins that catalyse the biochemical reaction of the gene. Their usefulness as molecular markers depends on the rarity of the allozyme variant (group of isozymes that catalyses the same gene product), how tightly linked they are to a gene and how difficult a direct screening method is.
The main limitation of isozyme markers is that only a few gene products can be revealed. Nevertheless, isozymes have been used in cassava breeding and genetics (Wanyera, 1993; Lefevre and Charrier, 1993; Ocampo et al., 1993; Laminski et al., 1997). Wanyera (1993) demonstrated the usefulness of isoenzymes in confirming true hybrids in a cross between M. glaziovii and M. esculenta. Lefevre and Charrier (1993) detected genetic diversity among several cassava clones with isozyme markers. The study also confirmed that cassava is a true diploid based on the inheritance of the markers. Ocampo et al. (1993) used the esterase isozyme to fingerprint the cassava germplasm collection held at CIAT. This isozyme was also used in fingerprinting South African elite cultivars (Laminski et al., 1997). Fregene et al. (1997) placed three isozyme markers on the cassava genetic linkage map developed at CIAT. Isozyme markers were also used to develop a procedure for identifying varieties of cassava (Hussain et al., 1987; Ramirez et al., 1987).

2.8.2 DNA markers
A DNA marker is basically a small region of DNA showing sequence polymorphism in different individuals within a species (Liu, 1997). The markers are based on the enormous variation or polymorphism in the DNA sequences of organisms. DNA markers eliminate the limitations of genome investigations using morphological and isozyme markers, such as gene expression and environmental interaction, heritability, and low map resolution (Vogel et al., 1996). DNA-based marker systems can also be used for indirect selection of tagged loci affecting qualitative or quantitative traits and to monitor loci during introgression or selection programmes, thus reducing the number of backcross generations (Baird et al., 1996). Many DNA fingerprinting techniques have been developed to generate markers and are generally based on one of two strategies: classical, hybridisation-based fingerprinting and PCR-based fingerprinting.
Classical, hybridisation-based fingerprinting, which was first developed for use in human genetics and is now used as a general method of genetic analysis, involves the cutting of genomic DNA with restriction enzymes followed by electrophoretic separation of the DNA fragments. The restriction fragment length polymorphism (RFLP) generated is then detected with probes targeted to specific regions of the genome, after Southern blot transfer to a nylon membrane (Soller and Beckmann, 1983). PCR-based fingerprinting involves the amplification of particular DNA sequences using specific or arbitrary primers and a thermostable polymerase. Amplification products are separated by electrophoresis and detected by staining with ethidium bromide on agarose gels, with silver on polyacrylamide gels, or by the use of radioactive or fluorescent labelled primers in the amplification reaction (Ehrlich et al., 1989). Techniques in this category include random amplified polymorphic DNA (RAPD) (Williams et al., 1990), DNA amplification fingerprinting (DAF) (Caetano-Anolles et al., 1991), arbitrarily primed-PCR (AP-PCR) (Welsh and McClelland, 1990), microsatellites (Hamada et al., 1982) and amplified fragment length polymorphism (AFLP) (Vos et al., 1995). DNA markers vary in their level of polymorphism and the amount of information generated. The most informative DNA markers are characterised by a high polymorphism information content (PIC), indicating a relatively large number of alleles with similar frequencies at each locus (Botstein et al., 1980). Polymorphism may be due to single site alterations caused by mutations which abolish or create a restriction site, or to insertions, deletions or inversions between two restriction sites. In this case, the level of polymorphism is low. Polymorphism may also be due to variable numbers of tandem repeats (VNTRs), resulting in a marker system with high levels of polymorphism which can detect variation between closely related individuals.
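The relationship between PIC, the number of alleles and the evenness of their frequencies can be illustrated with the formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below is illustrative only; the function name and the allele frequencies are hypothetical.

```python
def polymorphism_information_content(allele_freqs):
    """PIC of a locus from its allele frequencies (Botstein et al., 1980 form)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    sum_sq = sum(p * p for p in allele_freqs)
    # Correction term summed over all unordered pairs of distinct alleles.
    n = len(allele_freqs)
    pair_term = sum(2.0 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
                    for i in range(n) for j in range(i + 1, n))
    return 1.0 - sum_sq - pair_term


# Hypothetical loci: more alleles at similar frequencies give a higher PIC.
print(polymorphism_information_content([0.5, 0.5]))                # 0.375
print(polymorphism_information_content([0.25, 0.25, 0.25, 0.25]))  # 0.703125
print(polymorphism_information_content([0.9, 0.1]))
```

A locus with one very common allele scores much lower than one with the same number of alleles at equal frequencies, which is the sense in which highly polymorphic marker systems are more informative.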
Markers that detect single site alterations include RFLPs, RAPDs and AFLPs, while minisatellite and microsatellite markers detect VNTRs.

2.8.3 Restriction Fragment Length Polymorphism (RFLP)
RFLP is a powerful tool in genetic mapping. RFLP markers are cloned DNA segments that can be used to reveal base pair changes or rearrangements in homologous DNA sequences (Beckmann and Soller, 1986). They are codominant markers and scoring is simple because of the low copy number. However, the use of RFLP markers requires many probes, which in turn requires that a complementary DNA (cDNA) or genomic library of the species already exists. The use of RFLP markers has led to the rapid construction of linkage maps in many plant species and the placement of genes that control qualitative traits on these maps. Single locus disease resistance genes were the first category of genes to be mapped extensively using RFLPs. This was partly because most characterised disease resistances are controlled by alleles at a single locus and most are dominant genes that are easy to assay. RFLPs have also been used to study and identify markers that are tightly linked to genes controlling quantitative traits (Tanksley, 1993; Young, 1996). In RFLP mapping, regions of interest are detected based on their map positions by analysing flanking markers when the defined intervals are small. RFLP markers have contributed to DNA marker technology in cassava. Angel et al. (1993) initiated work on a detailed genetic map of cassava for tagging agronomically important traits and for cloning cassava genes. Currently the map has 239 RFLP markers (Fregene et al., 2001). Beeching et al. (1993) assessed the genetic diversity of several accessions of cassava with RFLPs. Recently, Oslen and Schaal (1999) demonstrated the geographic origin of cassava from its relative, M. esculenta subspecies flabellifolia, based on sequence variation of the single copy nuclear gene glyceraldehyde 3-phosphate dehydrogenase (G3pdh).
Despite the potential of RFLP markers in breeding, there are limitations. The technique is laborious and intensive and does not generally offer the opportunity to target tight linkages to specific loci because of the specific probe used. In addition, RFLPs cannot detect all the differences that exist between nucleotide sequences, only those that occur at restriction sites. The value of the markers comprising an RFLP map depends on the degree to which such markers can be used in populations other than the mapping population. This is a function of the degree of genotypic diversity in the primary gene pools (i.e. the mapping population) but also depends on the usefulness of the probes used. Combinations of RFLP and RAPD markers are also commonly used in the construction of linkage maps for many plant species (Fregene et al., 1994; Zhang et al., 1996). Calvalho et al. (1993) used RFLP of mtDNA and RAPDs to identify phylogenetic relationships in the Manihot genus.

2.8.4 Random Amplified Polymorphic DNA (RAPD)

RAPD primers are usually 10 nucleotides long and are used to amplify unknown regions of a genome. The short sequences of the primers make possible a multitude of primer binding sites throughout the genome. Efficient amplification of DNA fragments may occur when two primer binding sites occur in close proximity (Burrow and Blake, 1998). RAPD markers are dominant since the polymorphism is detected as a failure of one allele to amplify due to mutations in the primer-binding site. RAPD markers are the most extensively used markers in cassava biotechnology, especially in the determination of phylogenetic relationships and genetic diversity in Manihot species (Marmey et al., 1994; Laminski et al., 1997; Schaal et al., 1997). They are simple and rapid, and only small amounts of DNA are required. Gomez et al. (1996) subjected 328 RAPD markers to linkage analysis in cassava. Of these, 30 were placed on the cassava linkage map at CIAT (Fregene et al., 1997).
Currently, the map has 80 RAPD markers (Fregene et al., 2001). Marmey et al. (1994) demonstrated genetic diversity among African cassava accessions using RAPD markers. Mignouna and Dixon (1997) observed that several African landraces with varying levels of resistance to CMD were genetically different from the genetic stock 58308 based on RAPD analysis. DAF and AP-PCR are basically variations of the RAPD technique. DAF differs from RAPD in that the primers are short (5-8 nucleotides long) and the fragments are separated on polyacrylamide gels to give better resolution of the DNA fragments. DAF and AP-PCR have not been reported in cassava.

2.8.5 Variable number of tandem repeats (VNTR)

The two classes of VNTR markers are the minisatellites, with repeat units of about 10-45 base pairs, and the microsatellites or simple sequence repeats (SSRs), with repeat units of 2-4 nucleotides (Jeffrey et al., 1985). Eukaryotic genomes contain extended repetitive elements, many of which are mobile, such as the retrotransposons, which are flanked by long terminal direct repeats (LTRs) (Rhode, 1996). The presence or absence of flanking LTRs, which occur in high copy numbers in the genome, has been characterised in a number of plant species and has been the basis for establishing a PCR-based marker system known as inverse sequence-tagged repeats (ISTR).

2.8.6 Microsatellite or simple sequence repeats (SSRs)

Microsatellites or simple sequence repeats (SSRs) are well distributed within the genomes of eukaryotes (Morgante and Olivieri, 1993). They are flanked by unique sequences that serve as templates for specific primers to amplify the SSR allele via the polymerase chain reaction. Microsatellite polymorphisms, which can be detected by autoradiography, fluorescence labelling, silver staining and ethidium bromide staining, have been used to complement both the RFLP and PCR-based markers.
SSRs have a high level of allelic diversity as a result of the variable number of repeat units within their structure, and this makes them valuable as genetic markers (Hamada et al., 1982; Morgante and Olivieri, 1993). They are often multiallelic, can be multiplexed and automated for high throughput genotyping, and are convenient to exchange between laboratories (Powell et al., 1996b; Chen et al., 1997). They are especially suited for distinguishing between closely related genotypes. Although the procedure for obtaining microsatellites is laborious and expensive, their conversion to PCR markers allows the screening of large numbers of alleles for defined loci. The markers are codominant and are, therefore, a suitable option for mapping and molecular characterisation in cassava (Agyare-Tabbi et al., 1997; Chavarriaga-Aguirre et al., 1998). Chavarriaga-Aguirre et al. (1998) isolated and characterised 14 highly heterozygous GA-rich microsatellites in cassava. A total of 521 accessions from the cassava core collection at CIAT were successfully screened with four SSR loci developed into PCR-based markers, to determine genetic diversity (Chavarriaga-Aguirre et al., 1999). Cassava microsatellites were also used to assess genetic diversity among cassava accessions and to assess genetic relationships between cassava and its wild relatives (Roa et al., 2000). Recently, Mba et al. (2001) developed and characterised 172 new SSR markers and placed 36 of these on the cassava linkage map.

2.8.7 Amplified fragment length polymorphism (AFLP)

The amplified fragment length polymorphism (AFLP) technology, which is based on a combination of restriction digestion and PCR, was developed by Vos et al. (1995). It involves the selective amplification by PCR of a subset of restriction fragments of a genomic sample. DNA is digested with two restriction enzymes, a frequent cutter and a rare cutter.
Double-stranded DNA adapters are then ligated to the ends of the DNA fragments to generate template DNA for amplification. Thus, the sequence of the adapters and the adjacent restriction site serve as primer binding sites for subsequent amplification of the restriction fragments by PCR. Selective nucleotides extending into the restriction fragments are added to the 3' ends of the PCR primers such that only subsets of the restriction fragments are recognised. Only restriction fragments in which the nucleotides flanking the restriction site match the selective nucleotides will be amplified. The subset of amplified fragments is then analysed by denaturing polyacrylamide gel electrophoresis to generate the fingerprint. The frequent cutter generates fragments with the optimal size range for separation on denaturing gels and the number of fragments to be amplified is reduced by using the rare cutter, since only the rare cutter/frequent cutter fragments are amplified (Vos et al., 1995). In most AFLP studies, genomic DNA is digested with the enzyme combination EcoR1/Mse1, which is suitable for species with AT (adenine and thymine) rich genomes (Vos et al., 1995; Janessen et al., 1996). Other enzyme combinations such as Pst1/Mse1 and HindII/Mse1 are useful for genomes with 40-50% GC (guanine and cytosine) content, while Apa1/Taq1 is used for GC rich genomes (Wong et al., 1999). AFLP markers have high multiplex ratios, since several genomic fragments are analysed in one assay, and require no prior sequence data (Rusell et al., 1997; Powell et al., 1996a). They are generally dominant markers based on the presence or absence of a particular product. However, with appropriate software, such as that from Keygene (Wageningen, the Netherlands), they can be scored as codominant markers. The main advantage of AFLP technology over other marker systems is that variability can be assessed at a large number of independent loci and data can be obtained quickly and reproducibly (Sanchez et al., 1999).
AFLP markers have been extensively used in cassava molecular marker technology (Bonierbale et al., 1997b; Second et al., 1997; Roa et al., 1997; Sanchez et al., 1999; Fregene et al., 2000). Fregene et al. (2000) were able to detect a considerable number of duplicates in some African landraces with AFLP markers. They also detected a unique AFLP fragment among the African landraces, which was associated with branching pattern. Sanchez et al. (1999) established genetic similarities among cassava accessions resistant to the CBB disease. AFLP markers were also used to demonstrate genetic relationships among different cassava accessions and other Manihot species (Bonierbale et al., 1997b; Second et al., 1997; Roa et al., 1997).

2.9 Determination of molecular markers associated with traits of interest

The strategies available to facilitate detection of molecular markers associated with specific traits of interest include bulk segregant analysis (Michelmore et al., 1991; Giovannoni et al., 1991), linkage mapping (Tanksley et al., 1989) and QTL analysis (Dudley, 1993; Tanksley, 1993). For markers associated with disease resistance genes, degenerate primers based on isolated candidate resistance genes can be used to screen for resistance gene analogs (Kanazin et al., 1996; Collins et al., 1998).

2.9.1 Bulk segregant analysis

Bulk segregant analysis (BSA) is a rapid method for identification of markers in specific regions of the genome (Michelmore et al., 1991; Giovannoni et al., 1991). The method is especially useful when there are no other markers available within the region of interest. It involves comparing two pooled DNA samples of individuals from a segregating population. The two DNA pools, which contrast for a trait (e.g. resistant and susceptible) or for a marker allele in a previously mapped population, are analysed to identify markers that distinguish them. The individuals in each bulk are identical for the trait or gene of interest but are arbitrary for other traits.
The genomic region of interest is therefore studied against a randomised genetic background of unlinked loci. Markers that are polymorphic between the pools will be genetically linked to the loci determining the trait used to make up the pools. Giovannoni et al. (1991) demonstrated the use of DNA pooling to saturate specific chromosomal regions with additional markers, based on DNA pools from an existing tomato mapping population. Kiehne and Neale (1998) adapted this strategy in the outcrossing species loblolly pine, and identified nine new RAPD markers linked to a chromosomal interval containing the previously mapped QTL for wood specific gravity. Reamon-Buttner et al. (1998) detected nine AFLP markers tightly linked to the sex locus of the dioecious plant Asparagus officinalis L., which is important to plant breeding for developing sex-specific PCR primers. Bulk segregant analysis has also been explored in tagging molecular markers linked to major disease and pest resistance genes (Michelmore et al., 1991; Haley et al., 1993; Miklas et al., 1993; Garcia et al., 1996; Seo et al., 1997; Fritz et al., 1999; Correa et al., 2000). Michelmore et al. (1991) were the first to report three RAPD markers linked to major disease resistance genes using contrasting DNA bulks composed of F2 individuals of known genotype. Akano et al. (2002) recently used BSA to identify an SSR marker linked to the CMD resistance gene, CMD2. Molecular markers linked to two major rust resistance genes in common bean were detected using introgression and bulk segregant analysis (Miklas et al., 1993; Haley et al., 1993). The bean studies, however, revealed that a major limiting factor in the application of dominant markers linked in coupling with dominant alleles, or in repulsion with recessive resistant alleles, is the amplification of DNA fragments of equal size in susceptible germplasm (Miklas et al., 1996).
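The pooling logic of BSA can be sketched in a few lines of Python. This is only an illustration under stated assumptions: markers are scored as dominant (band present or absent), a band is taken to appear in a bulk if at least one pooled plant carries it, and the marker names and scores are hypothetical rather than data from any study cited here.

```python
def pooled_bands(individual_scores):
    """Bands expected in a DNA bulk.

    individual_scores: list of dicts mapping marker name -> True/False
    (band present/absent) for each plant in the bulk. With dominant
    markers, a band shows in the pool if any pooled plant carries it.
    """
    bands = set()
    for scores in individual_scores:
        bands.update(marker for marker, present in scores.items() if present)
    return bands


def candidate_markers(resistant_bulk, susceptible_bulk):
    """Markers polymorphic between the two bulks (candidates for linkage)."""
    r = pooled_bands(resistant_bulk)
    s = pooled_bands(susceptible_bulk)
    return {"resistant_only": sorted(r - s), "susceptible_only": sorted(s - r)}


# Hypothetical scores for three markers in two small bulks.
resistant = [{"M1": True, "M2": True, "M3": False},
             {"M1": True, "M2": False, "M3": True}]
susceptible = [{"M1": False, "M2": True, "M3": True},
               {"M1": False, "M2": False, "M3": True}]
print(candidate_markers(resistant, susceptible))
```

Markers M2 and M3 appear in both bulks because they segregate at random with respect to the trait, whereas M1 is confined to the resistant bulk and would be taken forward for confirmation on individual plants.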
Sequencing results of a RAPD marker which distinguished between resistance and susceptibility to the Russian wheat aphid (Fritz et al., 1999) revealed a deletion in the susceptible parent relative to the resistant parent. Markers of interest are, therefore, usually developed into sequence characterised amplified regions (SCARs), which are codominant markers for resistance screening (Garcia et al., 1996; Seo et al., 1997; Fritz et al., 1999; Correa et al., 2000).

2.9.2 Linkage mapping

Linkage maps are based on recombination frequencies. The percentage of a segregating progeny that are recombinants for a pair of linked loci is the recombination frequency. The recombination frequency gives an estimate of the distance between two loci on a chromosome, on the assumption that the probability of crossover is proportional to the distance between loci. Construction of genetic linkage maps requires that the most appropriate mapping population(s) is available. Pairwise recombination frequencies of these populations are calculated from genetic data such as DNA marker scores, linkage groups are established and map distances estimated, and then map order is determined (Staub et al., 1996). Different marker systems may be included in the analysis of the population. Computer packages such as Linkage 1 (Suiter et al., 1983), GMendel (Holloway and Knapp, 1994), Mapmaker (Lander and Botstein, 1986; Lander et al., 1987), MapManager (Manly and Cudmore, 1996) and JoinMap (Stam, 1993) have been developed to aid in the analysis of genetic data for map construction. These programmes use data obtained from segregating populations to estimate recombination frequencies that are then used to determine the linear arrangement of genetic markers by minimising recombination events. Genetic linkage maps can provide a more direct method for selecting desirable genes via their linkage to easily detectable molecular markers (Tanksley et al., 1989).
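As a worked illustration of the recombination frequency described in section 2.9.2, the Python sketch below converts a count of recombinant progeny into a recombination fraction and then into a map distance in centimorgans. The Haldane and Kosambi mapping functions are standard textbook formulas included here as an assumption; the text above does not state which function the cited packages apply, and the progeny counts are hypothetical.

```python
import math


def recombination_fraction(n_recombinant, n_total):
    """Proportion of progeny that are recombinant for a pair of loci."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("counts must satisfy 0 <= n_recombinant <= n_total")
    return n_recombinant / n_total


def haldane_cm(r):
    """Haldane map distance (cM); assumes no crossover interference."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance (cM); allows for some interference."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical cross: 10 recombinants among 100 progeny.
r = recombination_fraction(10, 100)
print(r * 100, "% recombination")
print(round(haldane_cm(r), 2), "cM (Haldane)")
print(round(kosambi_cm(r), 2), "cM (Kosambi)")
```

For small recombination fractions both functions stay close to the simple proportional rule (1% recombination is about 1 cM); they diverge as the fraction approaches 0.5, where loci behave as unlinked.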
Once a trait is identified and mapped, marker-assisted selection (MAS) could be used for early identification of the trait. This would reduce breeding population sizes, continuous recurrent testing and the time required to develop a superior line. The cassava genetic linkage map developed at CIAT (Fregene et al., 1997) currently has five known genes placed on it, including two CMD resistance genes, CMD1 and CMD2, and a number of quantitative trait loci (QTL) associated with some linkage groups (Fregene et al., 2001).

2.9.3 Quantitative trait loci (QTL)

Molecular markers are being utilised in studying quantitative trait loci (QTLs) and have been suggested as tools to enhance the efficiency of selecting quantitative traits in breeding programmes (Dudley, 1993; Young, 1996). The theory of QTL mapping was first described in 1923 by Sax and later elaborated by Thoday in 1961, who suggested that the segregation of simply inherited monogenes could be used to detect linked QTLs (Thoday, 1961). Young (1996) suggested that, in modern-day QTL mapping, defined sequences of DNA act as the linked monogenic markers used to detect QTLs. Mapping QTLs with molecular markers requires the analysis of large segregating populations (Tanksley, 1993). A population derived from distinctively different parents (e.g. resistant and susceptible) is analysed with the marker to test for linkage of markers to a QTL. This is based on its position on the map from flanking markers. The population is then classified as homozygous or heterozygous for the parental marker alleles and compared statistically with phenotypic values to determine linkage of the molecular marker with the QTL. One of the first published reports of QTL mapping with DNA markers involved fruit size, pH and soluble solids in tomato (Paterson et al., 1988). QTL mapping for quantitative disease resistance has been reported in a number of plant species (Young, 1996). Wang et al.
(1994) mapped QTLs affecting resistance to the rice blast fungus in a population consisting of recombinant inbred lines. QTLs associated with resistance to potato late blight were studied in an F1 cross between two highly heterozygous parents, where the phenotypic disease responses were compared with DNA marker data (Leonards-Schippers et al., 1994). The results indicated that resistance to the pathogen is a cumulative phenotype resulting from a small number of race-specific and race-nonspecific resistance loci. Other QTL mapping studies on quantitative disease resistance include resistance to leaf spot fungus of maize, resistance to the soybean cyst nematode, and tomato bacterial wilt (Bubeck et al., 1993; Danesh et al., 1994; Concibido et al., 1994). Recently, QTLs associated with dry matter yield and dry matter percentage have been linked to linkage group D of the cassava framework map (Fregene et al., 2001). A number of computer programs have been developed for QTL analysis; these include MAPMAKER/QTL, QTLSTAT, LINKAGE, MAPQTL, QGENE, Map Manager QT and PGRI (Liu, 1998).

2.9.4 Screening for resistance gene analogs

Genes conferring resistance to several diseases caused by a wide range of pathogens, including viruses, fungi, bacteria and nematodes, have been isolated using map-based cloning or insertional mutagenesis (Michelmore, 1995; Staskawicz et al., 1995). Most of the cloned resistance genes have been found to have common structural motifs known as leucine-rich repeats (LRR), which are the specificity determinants (Thomas et al., 1997; Wang et al., 1998). Resistance genes containing leucine-rich repeats can be grouped into the transmembrane R proteins, which lack nucleotide binding sites (NBS), and the cytoplasmically located R proteins with NBS (Barker et al., 1997; Ori et al., 1997).
The presence of these conserved sequences among R genes has provided an opportunity for isolating new resistance genes or resistance gene analogs (RGAs) by PCR, using degenerate primers designed from the conserved sequences (Kanazin et al., 1996; Collins et al., 1998; Gentzbittel et al., 1998). The cloning and analysis of these resistance gene analogs have opened new avenues of research on plant disease and produced markers tightly linked to resistance genes. Gentzbittel et al. (1998) isolated RGAs in sunflower using NBS and protein kinase conserved motifs. They further mapped three of the clones to linkage group 1, where downy mildew resistance genes are located. RGAs were also mapped to four downy mildew R gene clusters in lettuce (Shen et al., 1998).

2.9.5 Sequence tagged sites (STS)

A recent addition to the tools for fine physical mapping of genes of interest is sequence tagged sites (STS). An STS is a short DNA sequence, generally between 100 and 500 bp, that is easily recognisable and occurs only once in the chromosome or genome studied (Brown, 1999). STSs are derived from expressed sequence tags (ESTs), which are obtained from cDNA clones; from simple sequence length polymorphisms (SSLPs); or from random genomic sequences. To date, six ESTs have been placed on the cassava genetic map (Fregene et al., 2001). Single nucleotide polymorphisms (SNPs) are another group of site-specific markers; they are the most common type of DNA sequence variation and occur every 100 to 300 base pairs (Lohmann et al., 2000). Due to their high frequencies, the chances are high that disease resistance or susceptibility would be caused by or closely associated with specific SNPs.

2.9.6 Type of population

The choice of an appropriate mapping population depends on the type of marker system employed (Tanksley et al., 1988). Generally, F2s and backcross progenies are used in the construction of linkage maps, QTL analysis, and BSA for homozygous species.
In a heterozygous species, the segregating F1 is used (Liu, 1998; Pillay and Kenny, 1996; Fregene et al., 1997). However, this should be an informative cross between two divergent parents that are heterozygous for the markers (Falconer and Mackay, 1997). The cassava genetic map at CIAT is based on an F1 population of two geographically divergent parents. The female parent, TMS I30572, has resistance derived from clone 58308, while the South American male parent has no resistance to CMD (Fregene et al., 1997). Maximum genetic information is obtained from a completely classified F2 or segregating F1 population using a codominant marker system. Information from a dominant marker system can be equivalent to a completely classified F2 population if progeny tests (i.e. F3 or F2, BC) are used to identify heterozygous F2 individuals (Staub et al., 1996). Progeny testing is required when the phenotypes do not consistently reflect genotype or where trait expression is controlled by QTLs. Recombinant inbred lines (RIL) are suitable for QTL mapping in that they consist of many distinct lines, each a mosaic of different homozygous segments from the original parents. Dominant markers also supply as much information as codominant markers in RIL (i.e. an array of genetically related lines; usually F8 or more), doubled haploid, or backcross populations in coupling phase because in these populations all loci are homozygous, or nearly so (Burr and Burr, 1991; Liu, 1997). Other types of population used in the identification of markers tightly linked to disease resistance genes are the near-isogenic lines (NILs) (Muehlbauer et al., 1988; Young et al., 1988). A donor parent (DP) carrying the gene of interest is repeatedly backcrossed to a recurrent parent (RP). Selection for the desired gene and recovery of the RP phenotype are continued until the newly developed line is theoretically nearly isogenic with the RP, except for the chromosome segment containing the target gene.
Linkage between a molecular marker and the target gene can be assessed by determining the marker genotype of the RP, its various NIL derivatives, and their corresponding DPs (Melchinger, 1990). Although the development of NILs through backcrossing for economically important genes is costly and time consuming (Kelly, 1995), the use of such NILs has permitted the successful tagging of three disease resistance genes in common bean (Haley et al., 1994; Miklas et al., 1996). An inter-gene-pool crossing technique was used successfully to tag two rust resistance genes in common bean (Haley et al., 1993; Miklas et al., 1993), and in tomato, interspecific populations have been applied in the mapping of some disease resistance genes (Paterson et al., 1990).

Chapter Three

Materials and Methods

The study was carried out in the experimental fields and at the Cellular and Molecular Technology Laboratories (CMTL) of the International Institute of Tropical Agriculture (IITA) in Ibadan, Nigeria.

3.1 Analysis of variance and combining ability analyses

3.1.1 Genetic Material

Diverse cassava genotypes with varying levels of resistance and susceptibility to CMD, including the resistant genetic stock 58308, six improved clones and 15 African landraces, and the progenies from their F1 crosses were used in the study. The cassava genotypes TMS I30572 (moderately resistant), TMS 91/02324 (resistant to CMD), the resistant landrace TME1 (Antiota) and the susceptible landrace TME117 (Isunikankiyan) were included as checks. The parental and check genotypes are described in Table 3.1.

3.1.2 Experimental design

Seeds of the F1 crosses, parental and check genotypes were provided by the cassava breeding unit at IITA for the study. The seeds were obtained from two mating designs. One was a 4X4 diallel mating scheme in all possible combinations including selfs and reciprocals.
It involved the resistant genetic stock 58308 and three improved clones, TMS I30001, TMS I30572 (resistant to CMD), and TMS I30555 (susceptible to CMD). The other mating design was a 3X18 North Carolina design II (NCD II) with TMS I30001, TMS I30572, and TMS I30555 as the female parents; three resistant improved clones and the 15 landraces were the male parents.

Table 3.1 Description of cassava clones used as parents in 4X4 diallel and 3X18 North Carolina Design II matings and checks

| Clone | Pedigree information, local name and origin | Genetic experiment: Diallel | Genetic experiment: Design II |
|---|---|---|---|
| TMS I30001 | Pedigree information lost | Parent 1 | Female 1 |
| TMS I30555 | 58308 X Oyarugba dudu | Parent 2 | Female 2 |
| TMS I30572* | 58308 X Branca de Santa Caterina (OP)+ | Parent 3 | Female 3 |
| TMS I60142 | KR685 OP | | Male 1 |
| TMS I90257 | 58308 X Oyarugba dudu | | Male 2 |
| TMS I4(2)1425 | 58308 X Oyarugba fufun | | Male 3 |
| 58308 | M. esculenta X M. glaziovii | Parent 4 | |
| 91/02324* | TME1 OP | | |
| TME1* | Antiota (Ondo, Ondo State, Nigeria) | | Male 4 |
| TME2 | Odungbo (Opeji, Ogun State, Nigeria) | | Male 5 |
| TME4 | Atu (Iwo, Kwara State, Nigeria) | | Male 6 |
| TME5 | Bagiwawa (New Busa, Niger State, Nigeria) | | Male 7 |
| TME6 | Lapai-1 (Lapai, Niger State, Nigeria) | | Male 8 |
| TME7 | Oko-Iyawo (New Lapai, Niger State, Nigeria) | | Male 9 |
| TME8 | Amala (Ireuekpen, Edo State, Nigeria) | | Male 10 |
| TME9 | Olekanga (Ogbomosho, Oyo State, Nigeria) | | Male 11 |
| TME10 | Orente (Ogbomosho, Oyo State, Nigeria) | | Male 12 |
| TME11 | Igueeba (Warri, Delta State, Nigeria) | | Male 13 |
| TME12 | Tokunbo (Ibadan, Oyo State, Nigeria) | | Male 14 |
| TME14 | Abbey Ife (Abbey-Ife, Osun State, Nigeria) | | Male 15 |
| TME31 | Bakince-Iri (Bahago, Sokoto State, Nigeria) | | Male 16 |
| TME41 | Danbusa (Kanji, Niger State, Nigeria) | | Male 17 |
| TME117* | Isunikankiyan (Ibadan, Oyo State, Nigeria) | | Male 18 |

*Clone used as check
+OP = open pollinated

In 1997, the seedlings from these crosses were established in a seedling nursery in Mokwa, Niger State, Nigeria, to produce woody cuttings for the experiments. Mokwa is in the Southern Guinea Savanna with a ferric luvisol soil type.
It is generally a low disease pressure site for CMD. The genetic material was then established in two experiments at Mokwa and Ibadan (Oyo State, Nigeria) during the 1998 growing season and in Ibadan during the 1999 growing season. Ibadan is in the forest-savanna transition with a ferric luvisol soil type. The physical and chemical properties of the soil and the rainfall pattern of the two locations during the two growing seasons are described in Appendix I. In Mokwa, the two experiments together with their parents and three checks (TMS I30572, TME1, and TME117) were evaluated, while in Ibadan, the two experiments, their parents and four checks (91/02324, TMS I30572, TME1, and TME117) were evaluated. The design for each experiment was a randomised complete block with two replications. Plants were established using 20 cm long cuttings. Plants were spaced 0.5 m apart in rows (ridges 30 cm high and 10 m long) which were 1 m apart, giving a plant population of 20,000 plants per hectare. Due to differential seed set of the parents, the number of progenies varied for the different crosses. The number of progenies obtained from the seedling nursery ranged from 16 to 300 in the diallel experiment and 52 to 934 in the NCD II experiment. To ensure the survival of each F1 genotype, two cuttings per genotype were planted in each replicate. The second cutting was removed at 6 weeks after planting (WAP) when the plants were established. Twenty cuttings of each parental or check genotype were planted in each replicate with the same spacing as the crosses. The experiments were evaluated for CMD under rain-fed conditions. No fertiliser or herbicide was applied during the course of the experiments. Hand weeding was done when necessary.

3.1.3 Assessment of CMD

Individual plants in each F1 cross and 10 plants each of the parents and checks were assessed for their reaction to CMD under natural infection by whiteflies.
Assessment was based on their phenotypic expression of severity of symptoms using the standard five point scoring scale system for CMD (IITA, 1990). Disease severity scores of the whole plants and shoot tips (from the shoot apex to the first fully expanded leaf) were assessed at 6, 12 and 20 WAP in 1998 and at 6 and 12 WAP in 1999. Table 3.2 describes the five point scoring scale and an illustration of the disease severity scores on cassava leaves is presented in Plate 3.1.

3.1.4 Data analyses

Data collected from the three environments where the experiments were conducted were analysed using the statistical analysis system (SAS) package (SAS, 1999). An environment was defined as the particular location and year in which the experiments were conducted. Thus, the three environments were Mokwa 1998, Ibadan 1998, and Ibadan 1999. The weighted means for incidence and disease severity of CMD in shoot apices and whole plants of each of the checks, parents, and crosses at the different scoring dates were determined from the data collected and used for the genetic analysis. Incidence was determined as the percentage of diseased plants or percentage of plants with diseased shoot tips. The weighted average disease severity score of a parent, check, or cross was defined as the summation of the product of the frequency count and the value of the disease severity class, divided by the total number of plants evaluated, and was determined for each replicate. If the weighted average disease severity score of a parent, check, or F1 progeny cross was less than or equal to 2, it was classified as resistant; otherwise it was classified as susceptible.

Table 3.2 Standard five point scale scoring system for CMD (IITA, 1990)

| Disease score | Symptom | Status |
|---|---|---|
| 1 | No obvious symptoms | Highly resistant (HR) |
| 2 | Mild chlorotic patterns or mild leaf distortion at the base | Resistant (R) |
| 3 | Strong mosaic on the entire leaf, distortions of leaves | Moderately susceptible (MS) |
| 4 | Severe mosaic distortion, reduction of leaf lamina with about 2/3 of the leaves affected | Susceptible (S) |
| 5 | Severe mosaic, severe distortion of leaves, stunting of entire plant and about 4/5 of leaves affected | Highly susceptible (HS) |

Plate 3.1 Cassava leaves with varying levels of cassava mosaic disease severity. CMD 1, No obvious symptoms. CMD 2, Mild chlorotic pattern, mild leaf distortion at the base. CMD 3, Mosaic on the entire leaf, distortion of leaf. CMD 4, Severe mosaic distortion, reduction of leaf lamina. CMD 5, Severe mosaic, severe distortion of leaf and severe reduction of leaf lamina.

A simple phenotypic correlation analysis performed on the average CMD incidence and disease severity scores of all genotypes in the two experiments showed a consistent significant linear relationship between incidence and severity at the different scoring dates across environments and in the individual environments (Appendix II). Genetic analyses of the F1 crosses were based on CMD severity at 12 WAP and the assessment of genetic relationships among the parents was based on all the CMD severity and incidence responses. The analyses of variance for each experiment were performed on the individual environments and a combined analysis of variance was performed with data from the three environments. The analyses were based on mixed models with the genotypes considered fixed, and replicate, environment, and genotype by environment interaction (GXE) considered as random effects. Appropriate F-tests were performed according to the expected mean squares, based on the procedures described by McIntosh (1983) for mixed model analysis. In the individual environment analyses, the genotypic components were tested with the pooled error.
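The weighted average disease severity score and the resistant/susceptible classification defined in section 3.1.4 can be sketched in Python as follows. This is a minimal illustration and not the SAS code used in the study; the function names and the example counts are hypothetical, and treating every plant with a score above 1 as diseased when computing incidence is an assumption based on score 1 meaning "no obvious symptoms".

```python
def weighted_severity(class_counts):
    """Weighted average severity for one genotype in one replicate.

    class_counts: dict mapping disease score (1-5) -> number of plants.
    Returns sum(score * count) / total number of plants evaluated.
    """
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("no plants evaluated")
    return sum(score * n for score, n in class_counts.items()) / total


def incidence_percent(class_counts):
    """Percentage of diseased plants (assumed here: any score above 1)."""
    total = sum(class_counts.values())
    diseased = sum(n for score, n in class_counts.items() if score > 1)
    return 100.0 * diseased / total


def classify(mean_severity):
    """Resistant if the weighted average score is <= 2, else susceptible."""
    return "resistant" if mean_severity <= 2 else "susceptible"


# Hypothetical cross: 5 plants scored 1, 3 scored 2, 2 scored 3.
counts = {1: 5, 2: 3, 3: 2}
score = weighted_severity(counts)
print(score, incidence_percent(counts), classify(score))
```

In this example the weighted score is (5x1 + 3x2 + 2x3)/10 = 1.7, so the cross would be classified as resistant even though half of its plants show some symptoms.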
For the combined analyses, the environment was tested with the replicates nested within environments, Env(Rep) mean square, and the genotypic components were tested with their respective genotype by environment (GXE) interactions. The GXE effects were tested with the pooled error.

3.1.4.1 3X18 North Carolina Design II (NCD II) analysis

The general linear model (GLM) procedure in SAS, which uses the method of least squares to fit general linear models, was used for the analyses of variance (SAS, 1999). Mean squares were calculated from Type III sums of squares. Genotypes were partitioned into the variation due to test genotypes (parents and crosses) and checks. Test genotypes were then partitioned into parents and crosses. The parents were further partitioned into female, male, resistant and susceptible parents, and one degree of freedom orthogonal contrasts (female versus male parents, and resistant versus susceptible parents) were made to test for significant variation between the different sets of parents. A one degree of freedom orthogonal contrast between parents and crosses was also used to test the significance of average midparent heterosis. Crosses were further partitioned into variation due to the GCA (additive) effect of males, the GCA effect of females and variation due to the SCA (non-additive) effect (male and female interaction). The general linear model used for the analysis of the crosses in the NCD II for one environment was as follows:

$$y_{ijk} = \mu + m_i + f_j + mf_{ij} + b_k + e_{ijk} \qquad (1)$$

In this model, $y_{ijk}$ was the observed response to CMD; $\mu$ the general mean; $m_i$ the GCA effect of the ith male; $f_j$ the GCA effect of the jth female; $mf_{ij}$ the SCA effect of the ijth cross; $b_k$ the effect of the kth replicate; and $e_{ijk}$ the error associated with each observation (Singh and Chaudhary, 1985). The analysis of variance for the combined analysis across environments was performed as given for an individual environment.
The GXE interactions, however, were further partitioned into their components. The general linear model assumed for the NCD II population for the combined analysis across environments was:

$$ y_{ijkl} = \mu + m_i + f_j + mf_{ij} + l_k + b_{l(k)} + ml_{ik} + fl_{jk} + mfl_{ijk} + e_{ijkl} \qquad (2) $$

Here, $y_{ijkl}$ was the observed response to CMD across the three environments; $\mu$, $m_i$, $f_j$ and $mf_{ij}$ were as described in equation (1) for an individual environment. The $l_k$ effect was the kth environment. The $b_{l(k)}$ effect was the lth replicate nested in the kth environment. The $ml_{ik}$ effect was the ith male in the kth environment, and $fl_{jk}$ was the effect of the jth female in the kth environment. The $mfl_{ijk}$ effect was the ijth cross in the kth environment and $e_{ijkl}$ the residual associated with each observation (Beil and Aitkins, 1967).

Combining ability estimates were determined for individual environments and across environments using least square means. The GCA of a clone was estimated as

$$ g_i = \bar{X}_{i.} - \bar{X}_{..} \qquad (3) $$

and the SCA of a cross was estimated as

$$ s_{ij} = \bar{X}_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X}_{..} \qquad (4) $$

Here $\bar{X}_{ij}$ was an F1 mean, $\bar{X}_{i.}$ was the mean of crosses of the ith clone, $\bar{X}_{.j}$ was the mean of crosses of the jth clone and $\bar{X}_{..}$ was the mean of all the F1 crosses (Beil and Aitkins, 1967). The standard errors of these effects were calculated using the methods described by Singh and Chaudhary (1985) for a single environment and Cox and Frey (1984) for the combined environment. The SE for GCA was

$$ SE_{GCA} = \sqrt{\frac{MSE\,(m-1)}{rfm}} \quad \text{for males in a single environment, or} \qquad (5) $$

$$ SE_{GCA} = \sqrt{\frac{MSE\,(f-1)}{rfm}} \quad \text{for females in a single environment, or} \qquad (6) $$

$$ SE_{GCA} = \sqrt{\frac{MS_{ml}\,(m-1)}{mfrl}} \quad \text{for males in the combined environment, or} \qquad (7) $$

$$ SE_{GCA} = \sqrt{\frac{MS_{fl}\,(f-1)}{mfrl}} \quad \text{for females in the combined environment.} \qquad (8) $$

The SE for SCA was

$$ SE_{SCA} = \sqrt{\frac{MSE\,(m-1)(f-1)}{rfm}} \quad \text{for a single environment, or} \qquad (9) $$

$$ SE_{SCA} = \sqrt{\frac{MS_{mfl}\,(m-1)(f-1)}{mfrl}} \quad \text{for the combined environment.} \qquad (10) $$

In equations (5) to (10), $MSE$ was the mean square error of the crosses, $m$ and $f$ were the number of males or females, $r$ the number of replicates and $rl$ the number of replicates and environments. $MS_{ml}$ and $MS_{fl}$ were the mean squares of the male or female by environment interactions and $MS_{mfl}$ was the mean square of the male by female by environment interaction. The GCA and SCA effects were tested for significance in two-tailed tests as described by Cox and Frey (1984) and the LSD was determined as

$$ LSD_{GCA} = t_{(df,\,0.05)} \times SE_{GCA} \qquad (11) $$

and

$$ LSD_{SCA} = t_{(df,\,0.05)} \times SE_{SCA} \qquad (12) $$

(Singh and Chaudhary, 1985). Negative and significant effects were considered as contributing to resistance, and positive and significant effects were considered as contributing to susceptibility.

3.1.4.2 4X4 Diallel analysis

Genotypes were partitioned into the variation due to test genotypes (parents and crosses) and checks using the GLM procedure in SAS (SAS, 1999). Test genotypes were further partitioned into parents and crosses. A one degree of freedom orthogonal contrast between parents and crosses was used to test for the presence of significant average midparent heterosis. The genetic analysis of the crosses was based on Griffing's method 1, in which the analysis is based on data from a complete diallel (crosses, selfs and reciprocals), and on model 1, which is valid when the genotypes are fixed (Griffing, 1956b). Reciprocal effects were further partitioned into maternal and non-maternal effects (Cockerham and Weir, 1977). The analysis was performed on individual environments, and a combined analysis was performed over all environments, using the diallel-SAS programmes written by Kang (1994) for a single environment and Zhang and Kang (1997) for the combined environment.
The general linear model for an individual environment was:

$$ Y_{ijk} = \mu + g_i + g_j + s_{ij} + r_{ij} + b_k + e_{ijk} \qquad (13) $$

In this model, $Y_{ijk}$ was the observed response to CMD; $\mu$ was the general mean; $g_i$ the general combining ability of the ith parent; $g_j$ the general combining ability of the jth parent; $s_{ij}$ the specific combining ability associated with the ijth cross; and $r_{ij}$ the reciprocal effect associated with the ijth cross. The $b_k$ effect was the effect of the kth replicate and $e_{ijk}$ the residual effect associated with each observation (Kang, 1994). The reciprocal effect $r_{ij}$ was further partitioned into the maternal effect of the ith parent $m_i$, the maternal effect of the jth parent $m_j$ and the non-maternal effect $n_{ij}$ of the ijth cross (Cockerham and Weir, 1977).

The general linear model for the combined analysis was:

$$ Y_{ijkl} = \mu + g_i + g_j + s_{ij} + r_{ij} + l_k + b_{l(k)} + gl_{ik} + gl_{jk} + sl_{ijk} + rl_{ijk} + e_{ijkl} \qquad (14) $$

In this model, $Y_{ijkl}$ was the observed response to CMD across the three environments; $\mu$, $g_i$, $g_j$, $s_{ij}$ and $r_{ij}$ and its partitions $m_i$, $m_j$ and $n_{ij}$ were as described in equation (13) for the individual environment analysis. The $l_k$ effect was the kth environment and $b_{l(k)}$ the effect of the lth replicate nested in the kth environment. The $gl_{ik}$ effect was the general combining ability of the ith parent in the kth environment; the $gl_{jk}$ effect was the general combining ability of the jth parent in the kth environment; and the $sl_{ijk}$ effect was the specific combining ability associated with the ijth cross in the kth environment. The $rl_{ijk}$ effect was the reciprocal effect associated with the ijth cross in the kth environment and $e_{ijkl}$ the residual effect associated with each observation (Kang et al., 1999). The interaction of the reciprocal effects with the environment was also further partitioned into the maternal by environment interaction ($ml_{ik}$ and $ml_{jk}$) and the non-maternal by environment interaction ($nl_{ijk}$).
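The terms of the single-environment diallel model can be illustrated with a small numerical sketch. The code below is not the diallel-SAS programme used in this study; it is a minimal Python illustration, assuming a balanced table of mean scores for a complete diallel (Griffing's method 1, fixed effects) and the standard least-squares estimators for that design. The function name, the orientation of the table (rows as female parents) and the scores are hypothetical, and the partition of reciprocal effects into maternal and non-maternal effects is not shown.

```python
import numpy as np

def griffing_method1_effects(x):
    """Estimate GCA, SCA and reciprocal effects from a p x p table of means.

    x[i, j] is the mean score of the cross with parent i as female and
    parent j as male (assumed orientation); the diagonal holds the selfs.
    """
    x = np.asarray(x, dtype=float)
    p = x.shape[0]
    if x.shape != (p, p):
        raise ValueError("a complete diallel needs a square p x p table")
    total = x.sum()
    margin = x.sum(axis=1) + x.sum(axis=0)   # X_i. + X_.i for each parent
    mu = total / p**2                        # general mean
    gca = margin / (2 * p) - mu              # g_i
    sca = (0.5 * (x + x.T)                   # s_ij (symmetric)
           - (margin[:, None] + margin[None, :]) / (2 * p)
           + mu)
    rec = 0.5 * (x - x.T)                    # r_ij (antisymmetric)
    return mu, gca, sca, rec

# Hypothetical mean CMD severity scores for a 4 x 4 diallel.
scores = np.array([
    [1.2, 1.8, 2.5, 3.0],
    [1.6, 2.0, 2.9, 3.4],
    [2.7, 3.1, 3.5, 4.0],
    [2.8, 3.6, 4.2, 4.5],
])
mu, gca, sca, rec = griffing_method1_effects(scores)
print("GCA effects:", np.round(gca, 3))
```

In this sketch a negative GCA effect marks a parent whose crosses have lower (more resistant) mean scores than the general mean, which matches the sign convention used for the combining ability effects in this chapter.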
Estimates of GCA, SCA, reciprocal, maternal and non-maternal effects of the crosses were generated by the SAS programme with least square means. The computational formulae which generated the estimates were based on Cockerham's statistical model (1963) for Griffing's methods, which was reviewed by Zhang and Kang (1997).

3.1.4.3 Relative importance of general and specific combining ability

The relative importance of GCA and SCA in predicting progeny performance can be determined from the ratio of the additive variance to total genetic variance. Since the genotypes were considered fixed, estimation of variance components was not appropriate. The analogous mean square components of the fixed effects were therefore estimated to compute the ratios (Baker, 1978; Kang, 1994). The mean square components of the fixed GCA effects of males, females, and the SCA effect of the female by male interaction were estimated from the variance components of the random effects associated with these fixed effects using the MIVQUE0 option of the VARCOMP procedure in SAS. For the NCD II the ratio was estimated from the relationship

$$ \frac{GCA}{GCA + SCA} \qquad (15) $$

Here, GCA was the sum of the GCA mean square components due to female and the GCA mean square components due to male. SCA was the mean square component due to the female by male interaction (Cukador-Olemedo et al., 1997). For the diallel analyses, the ratio was estimated from the relationship

$$ \frac{2\,GCA}{2\,GCA + SCA} \qquad (16) $$

where GCA and SCA are the mean square components of the GCA and SCA effects respectively (Kang, 1994). Mean square components of the GCA and SCA effects in the diallel were also estimated from the expected mean squares as described by Singh and Chaudhary (1985). The GCA component ($\hat{\theta}_{GCA}$) and SCA component ($\hat{\theta}_{SCA}$) in one environment were estimated as described by Singh and Chaudhary (1985) as follows:

$$ \hat{\theta}_{GCA} = \frac{MS_{GCA} - MS_e}{2r} \qquad (17) $$

and

$$ \hat{\theta}_{SCA} = \frac{MS_{SCA} - MS_e}{r} \qquad (18) $$

where $MS_{GCA}$, $MS_{SCA}$ and $MS_e$ were the mean squares for GCA, SCA and error, and $r$ was the number of replicates. For the combined analysis, the GCA component ($\hat{\theta}_{GCA}$) and SCA component ($\hat{\theta}_{SCA}$) were estimated as described by Zhang and Kang (1997) as follows:

$$ \hat{\theta}_{GCA} = \frac{MS_{GCA} - MS_{GCA \times E}}{2re} \qquad (19) $$

and

$$ \hat{\theta}_{SCA} = \frac{MS_{SCA} - MS_{SCA \times E}}{re} \qquad (20) $$

where $MS_{GCA}$, $MS_{SCA}$ and $r$ were as described for an individual environment, and $MS_{GCA \times E}$, $MS_{SCA \times E}$ and $e$ were the mean squares for GCA X E and SCA X E and the number of environments respectively.

Pearson correlation was performed on the parent means and GCA effects to compare per se performance of the parents with their GCA effects and their topcross performance in the NCD II experiment.

3.1.4.4 Heritability

Heritability across environments and the exact confidence intervals were estimated using the methods described by Knapp et al. (1985), based on progeny means. Heritability across environments for each experiment was therefore estimated as

$$ H = 1 - \frac{MS_{CXE}}{MS_{cross}} \qquad (21) $$

where $MS_{cross}$ and $MS_{CXE}$ were the mean squares of the crosses and of the crosses by environment interaction across environments respectively. The exact 90% ($1-\alpha = 0.90$) confidence limits were determined as

$$ 1 - \left[\frac{MS_{cross}}{MS_{CXE}}\,F_{0.05:\,df_1,\,df_2}\right]^{-1} = \text{upper limit} \qquad (22) $$

$$ 1 - \left[\frac{MS_{cross}}{MS_{CXE}}\,F_{0.95:\,df_1,\,df_2}\right]^{-1} = \text{lower limit} \qquad (23) $$

where $F_{0.05:\,df_1,\,df_2}$ and $F_{0.95:\,df_1,\,df_2}$ were the tabulated values of F at 0.05 and 0.95 for the appropriate degrees of freedom in each experiment, and $MS_{cross}$ and $MS_{CXE}$ were as described above.

3.1.4.5 Heterosis

Least square means were used to estimate average heterosis in each experiment and in the individual crosses. Heterosis of a cross was determined as the difference between the mean of a cross and the mean of the two parents (mid-parent), or the mean of the better parent (high parent), expressed as a percentage of the parent mean. Significance of average mid-parent heterosis was estimated by a one degree of freedom orthogonal contrast (parent versus cross) from the analysis of variance of the test genotypes. Significance of heterosis in an individual cross was determined using a least significant difference (LSD) between the parent and cross mean (Dixon et al., 1990; equation 24), where $t_{(df,\,0.05)}$ was the tabular value of t at the 0.05 level of significance with the appropriate degrees of freedom, $r$ was the number of replicates and environments, $a$ and $b$ were the number of experimental units in a cross and parental mean respectively, and $MS_e$ was the pooled error mean square. Since the average disease severity scores per replicate were used for the GLM analysis to generate the least square means used in the estimation of heterosis, the number of experimental units in the F1 or parent means was 2.

3.2 Genetic relationships among the various sources of resistance to CMD

The genetic relationship between the different sources of resistance was assessed based on the phenotypic similarities of the parental responses to CMD and the segregation patterns of the F1 progenies with respect to their response to CMD. The experimental procedure and CMD assessment were as described in section 3.1 above.

3.2.1 Area under the disease progress curve (AUDPC) of the sources of resistance to CMD

The mean CMD severity scores in the shoot tip and whole plant at 6 and 12 or 20 WAP of the parental and check clones, which represent the sources of resistance, were used to estimate their area under the disease progress curve (AUDPC) values in each environment and across environments. AUDPC was calculated using the method described by Jeger and Viljanen-Rollinson (2001), which allows estimates of AUDPC with values from two assessments, in the equation

$$ AUDPC = T + \frac{1}{r}\ln\left(\frac{y_0}{y_T}\right) \qquad (25) $$

Here, $y_0$ was the CMD value at 6 WAP, and $T$ was 12 or 20 weeks. For the Ibadan 1999 and Mokwa 1998 environments, $y_T$ was the CMD value at 12 WAP, while for Ibadan 1998 and across environments it was the value at 20 WAP, and $r$ was the rate.
The rate $r$ was determined as

$$ r = \frac{1}{T}\left[\ln\left(\frac{y_T}{1-y_T}\right) - \ln\left(\frac{y_0}{1-y_0}\right)\right] \qquad (26) $$

AUDPC values were then subjected to the RANK procedure in SAS (SAS, 1999) to assign the lowest rank value to the lowest (most resistant) AUDPC values. Spearman's correlation (SAS, 1999) was performed on the assigned ranks to test the association of the AUDPC values across environments.

3.2.2 Principal component analysis (PCA) of sources of resistance to CMD

Principal component analysis (PCA) was performed on the mean CMD incidence and severity scores of the various sources of resistance for the different scoring dates in the three environments, using the correlation matrix of the means (SAS, 1999). A scatterplot of the PCA scores of the first and second principal components was generated to explore the genetic similarities among the parental clones. The aim was to group the clones according to their CMD responses.

3.2.3 Gene complementarity among various sources of resistance to CMD based on the distribution of the F1 disease severity scores

3.2.3.1 Transgressive segregation for resistance to CMD

The mean disease severity scores of individual F1 progenies of the various sources of resistance at 12 WAP (largest variation for CMD severity) in an environment were used to construct a frequency table of the distribution of disease severity scores in the three environments. The percentage of positive transgressive segregants in the F1 cross was determined from the distribution of disease severity scores. A positive transgressive segregant was defined as an F1 progeny with a disease severity score at least one unit lower than the average disease severity score of the better parent in a cross. Positive transgressive segregants were determined for each cross in each environment and across environments.
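The definition of a positive transgressive segregant given above can be expressed as a short calculation. The following Python sketch is an illustration only, not the procedure used to build the frequency tables in this study; the function name and the example scores are hypothetical, and it assumes that lower severity scores indicate greater resistance, as in the 1 to 5 scale used here.

```python
def percent_positive_transgressive(progeny_scores, parent_means):
    """Percentage of F1 progenies scoring at least one unit below the better parent.

    progeny_scores: mean severity score of each F1 progeny in a cross.
    parent_means:   mean severity scores of the two parents of that cross.
    """
    if not progeny_scores:
        raise ValueError("at least one progeny score is required")
    better_parent = min(parent_means)   # lower score = more resistant parent
    threshold = better_parent - 1.0     # "at least one score lower"
    n_transgressive = sum(1 for s in progeny_scores if s <= threshold)
    return 100.0 * n_transgressive / len(progeny_scores)

# Hypothetical cross: parents averaging 2.5 and 4.0, five F1 progenies.
example = percent_positive_transgressive([1.0, 1.5, 2.0, 3.0, 4.0], (2.5, 4.0))
print(example)
```

With these made-up values the threshold is 1.5, so the two progenies scoring 1.0 and 1.5 count as positive transgressive segregants.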
3.2.3.2 Cochran-Mantel-Haenszel

The Cochran-Mantel-Haenszel ANOVA statistic (SAS, 1999) was used to determine significant differences in the mean distribution of the F1 disease severity scores among the various sources of resistance and to test for reciprocal differences.

3.2.3.3 Number of effective factors (NE) contributing to resistance to CMD

The minimum number of independently segregating effective factor pairs responsible for resistance to CMD was estimated in the segregating F1 populations using a modification of the Castle-Wright formula (Castle, 1922; Wright, 1968),

$$ NE = \frac{(P_1 - P_2)^2}{8\,(Var_{F2} - Var_E)} \qquad (27) $$

where $NE$ was the number of effective factors by which the parents in a cross differ; $P_1$ and $P_2$ were the mean values of the better parent and poorer parent respectively; $Var_{F2}$ the variance of the segregating population; and $Var_E$ the environmental variance estimated from the uniform population. The assumptions for an unbiased estimate using this formula are that:

i. all segregating loci contribute equally to the trait,
ii. no linkage exists among loci affecting the trait,
iii. there is no dominance, or the degree and direction of dominance of plus factors are similar for all loci,
iv. all plus factors are contributed by one parent and all minus factors by the other parent,
v. no epistasis occurs among alleles at contributing loci, and
vi. environmental and genotypic variances are independent and combine additively to give total variability.

Lawrence and Frey (1976) suggested the use of the range of the F2 population in the estimation of effective factors when the parents did not represent the genotypic extremes for the segregating loci and when the variance of the F2 contained some non-additive components. Since cassava is a heterozygous crop, the F1 is genetically equivalent to an F2 of a homozygous crop (Magoon, 1967; Liu, 1998).
Hence the formula used to estimate the minimum number of effective factors NE was

$$ NE = \frac{R^2}{8\,\sigma_g^2} \qquad (28) $$

where $R$ was the range of the F1 segregates in a cross and $\sigma_g^2$ the genetic variance (Lawrence and Frey, 1976). The genetic variance $\sigma_g^2$ was estimated from the linear function of the observed phenotypic variance and the environmental variance, which was estimated from the homozygous population (Falconer and Mackay, 1997). In this study, the cassava parents are homogenous vegetatively propagated cuttings and no two plants of a single cloned genotype have the same phenotypic effect. The mean estimate of the variances of the two parents in a cross was, therefore, used as the environmental variance of the homozygous population (Wricke and Weber, 1986; Falconer and Mackay, 1997). The genetic variance was therefore estimated as in equation (29), where $Var_g$ was the genetic variance; $Var_{F1}$ was the F1 variance; $Var_{P1}$ was the variance of the better parent $P_1$; $Var_{P2}$ was the variance of the worse parent $P_2$; and $n$ the number of observations in a parental mean.

The number of factors affecting the CMD resistance trait contributed by each parent in a cross, in each environment and across environments, was estimated by using the procedure of Lawrence and Frey, which is an extension of the Castle-Wright formula. Where little or no genetic variance was obtained, the number of effective factors was not estimated. The number of favourable factors in the poorer parents was calculated as in equation (30) (Dixon et al., 1991), where $n_p$ was the number of plus (favourable) factors contributed by the poorer parent, $\bar{X}_h$ the mean score of the best F1 progeny of a cross, $\bar{X}_b$ the mean score of the better parent of a cross, $\bar{X}_p$ the mean score of the poorer parent of the cross, and $\bar{X}_n$ the mean score of the poorest F1 progeny of a cross. The number of favourable factors in the better parent ($n_b$) was obtained by subtracting $n_p$ from $NE$, and because the number of factors estimated was the minimum, the number was approximated to the next larger integer.

3.3.
Determination of DNA markers associated with resistance to CMD

Bulk segregant analysis (BSA) and linkage mapping were used to determine molecular markers linked to resistance to CMD in a resistant landrace using RAPD, SSR, and AFLP marker systems. The BSA was carried out in three stages. In the initial screening, parental DNA and bulked DNA were screened with 142 RAPD primers, 150 SSR, and 128 AFLP primer pairs. Primers that were polymorphic between the parents and the two bulks were selected for the second level screening. In the secondary screening, parents, bulks and the members of each bulk were screened with the selected primers. Primers that detected consistent differences between the resistant and susceptible DNA samples were then selected for the tertiary screening, in which the entire mapping population and the 23 cassava clones used in the diallel and NCD II studies were screened.

3.3.1 Genetic material and experimental procedure

The mapping population consisted of 69 F1 progenies from a cross between the improved clone TMS I30555 (susceptible to CMD) and the resistant landrace TME 7 (Oko-Iyawo). The F1 progenies were developed and evaluated in the NCD II experiments described in sections 3.1.1 and 3.1.3 above. In March 2000, cuttings of the progenies and parents were planted in nursery beds and watered twice a week to produce young leaves for DNA extraction. The 23 cassava genotypes used as parents and checks in the diallel and NCD II experiments (Table 3.1) were included to confirm the potential of a DNA marker associated with CMD resistance as a molecular marker for resistance to CMD in the landraces.

3.3.2 Resistance screening and selection of plants for bulk segregant analysis

The plants were evaluated for symptom severity as described in section 3.1.3, at 6, 12, 20, and 50 WAP during the 1998/99 season and at 6 and 12 WAP during the 1999 season in Ibadan.
Disease severity scores of whole plants and shoot tips (from the shoot apex to the first fully expanded leaf) were assessed. At 12 WAP during the 1999 growing season, the first 10 leaves of each plant were evaluated for symptom severity. In March 2000, cuttings of the plants were made in a nursery bed to generate young leaves for DNA extraction. At 3 WAP, the shoot tips were excised for DNA extraction, and to enhance symptom severity. The plants were then assessed twice a week until the twelfth week for a final confirmation of resistance or susceptibility. The mean disease severity scores based on shoot tip, whole plant, and the first 10 leaves for the individual plants were determined from the data collected. Ten highly resistant (HR) F1 genotypes and ten highly susceptible (HS) genotypes of the mapping population were selected randomly to make up the two bulks for BSA.

3.3.3 DNA extraction

The DNA extraction kit DNeasy™ from Qiagen was used for DNA extraction. The extraction was done according to the manufacturer's instructions (Qiagen, 1999). Briefly, 100 mg of fresh leaves from the shoot apices was collected into 1 ml eppendorf tubes and ground to a fine powder in liquid nitrogen. The extraction buffer (400 µl buffer AP1) plus 4 µl of RNase (100 mg/ml) was added. The slurry was mixed with the pestle used to grind the sample, and the sample was incubated at 65°C for 10 minutes. The tubes were inverted three times during the incubation period to facilitate lysis of the cells. After the incubation, 130 µl of buffer AP2, which precipitates detergents, proteins and polysaccharides, was added, mixed, and incubated on ice for 5 minutes, followed by centrifugation at 14,000 rpm for 5 minutes. The lysate was then applied to a QIAshredder spin column and centrifuged for 2 minutes at 14,000 rpm to remove precipitates and cell debris.
The flow-through was collected and transferred to a fresh tube, and 225 µl of buffer AP3 and 450 µl of ethanol were added and mixed gently to precipitate the DNA. The solution was then applied to the DNeasy mini spin column and centrifuged for 1 minute at 8,000 rpm to collect the DNA onto the matrix of the spin column. The column was then placed in a collection tube and washed twice with 500 µl buffer AW plus ethanol. The first wash was at 8,000 rpm for 1 minute and the second wash at 14,000 rpm for 2 minutes. The columns were allowed to dry at room temperature for about 30 minutes, then were placed in 1.5 ml eppendorf tubes. Pre-warmed eluting buffer (AE) was applied at 100 µl to each tube, incubated for 5 minutes at room temperature, and centrifuged for 1 minute at 8,000 rpm to elute the DNA. The elution step was repeated to give a total volume of 200 µl. After extraction, the DNA quality was assessed on a 0.8% agarose gel in 0.5X tris borate EDTA (TBE) buffer (Appendix II). The DNA concentration was estimated with a Hoefer TKO 100 mini fluorometer and the purity checked by spectrophotometry.

3.3.4 DNA quantification

3.3.4.1 Agarose gel electrophoresis

Agarose solution of 0.8% (w/v) concentration was prepared by weighing 1.6 g of gel-grade agarose (Sigma) into 200 ml of 0.5X TBE buffer. The mixture was boiled in a microwave oven and allowed to cool to 50-55°C. For every 100 ml of agarose solution, 5 µl of ethidium bromide solution (10 mg/ml) was added. The solution was then poured into a horizontal gel electrophoresis tray (Owl E-D20), which was mounted in a gel casting tray and fitted with two 50-tooth combs (2 mm wide). After the gel had solidified, the tray was removed from the gel caster and placed in its electrophoresis tank. The tank was filled with 900 to 1,000 ml of 0.5X TBE buffer, up to about 2 mm above the top of the gel, and then the combs were removed carefully from the moulded wells.
The extracted DNA samples were diluted (1:10) in tris EDTA (TE) buffer (Appendix III) and 10 µl aliquots mixed with 2.5 µl of loading dye (Appendix IV). Three concentrations of Arabidopsis DNA were loaded with the samples as checks. The Arabidopsis DNA (100 ng/µl, Gibco) was also diluted at 1:10, and 5 µl, 10 µl and 15 µl aliquots were mixed with 1/3 volume of loading dye and loaded alongside the samples. The gel was run at a constant voltage (40 V) until the marker dye was 3/4 down the length of the gel. The gel was then visualised under UV light on a transilluminator and documented with the aid of Grab-It gel documentation computer software. The fluorescence intensities of the DNA samples were compared with those of the standards to give an indication of their concentration. The DNA samples were also checked for smearing. The presence of smearing indicated that the DNA had sheared during extraction; such a sample was discarded and fresh DNA of that sample extracted.

3.3.4.2 Estimation of DNA concentration and purity

The Hoefer TKO 100 mini fluorometer was used to estimate the concentration of the isolated DNA samples. It allows direct estimations of DNA concentrations in ng/µl by incorporating a dye, Hoechst 33258, that binds DNA only. The fluorescence of Hoechst 33258 in the presence of DNA depends on the AT content (Hoefer Scientific, 1992). Prior to sample assay, the instrument was first switched on for 30 minutes to stabilise its readings. A standard DNA, 100 ng of tomato or Arabidopsis DNA (Gibco BRL), was used to calibrate the instrument before samples were analysed. Each DNA sample (2 µl) was analysed in 2 ml of freshly prepared assay buffer (Appendix III). Spectrophotometry readings at OD260 nm and OD280 nm were also taken to estimate the purity of the isolated DNA. The ratio OD260 nm/OD280 nm provides an estimate of the purity of the isolated DNA, which should be between 1.8 and 2.0 (Sambrook et al., 1989).
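The purity criterion described above amounts to a simple ratio check. The following Python sketch is illustrative only; the function names and the absorbance readings are hypothetical, and the acceptable range of 1.8 to 2.0 is the one quoted from Sambrook et al. (1989).

```python
def purity_ratio(od260, od280):
    """Return the OD260/OD280 absorbance ratio of a DNA sample."""
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    return od260 / od280

def is_acceptably_pure(od260, od280, low=1.8, high=2.0):
    """True when the OD260/OD280 ratio falls within the accepted range."""
    return low <= purity_ratio(od260, od280) <= high

# Hypothetical readings for two samples.
print(is_acceptably_pure(0.95, 0.50))   # ratio of about 1.9
print(is_acceptably_pure(0.80, 0.50))   # ratio of about 1.6
```

A sample falling outside the range would be a candidate for re-extraction, in the same way that sheared samples were re-extracted after the gel check.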
3.3.5 Determination of CMD virus strain causing symptoms in the mapping population

The virus strain(s) causing the symptoms in the population was examined using the PCR method described by Zhou et al. (1997). DNA samples of the mapping population were tested with primers designed to identify the African cassava mosaic virus (ACMV), the East African cassava mosaic virus (EACMV) and both EACMV and the Uganda variant (UgV) of EACMV (Table 3.3). Each 25 µl reaction was made up of 2 µl DNA (1:40 dilution), 2.5 µl 10X PCR reaction buffer, 2.5 µl MgCl2 (25 mM stock), 2.5 µl 5% Tween-20, 2.0 µl dNTPs (dATP, dCTP, dGTP and dTTP; 2.5 mM/µl stock), 1 µl each of the forward and reverse primers (5.0 pM/µl), 1 U Thermus aquaticus (Taq) polymerase and sterile ultra-pure water. The PCR reactions were performed in a 96-well thermal cycler, the PTC-200 DNA Engine (MJ Research Inc., Watertown, Massachusetts, USA). The reaction cycles were as reported by Zhou et al. (1997). The first cycle consisted of 1 minute denaturing at 94°C, 2 minutes primer annealing at 52°C, and 3 minutes extension at 72°C. This was followed by 35 cycles of 1 minute at 94°C, 1 minute at 52°C, and 1.33 minutes at 72°C. The final cycle consisted of 5 minutes at 72°C. PCR products were analysed on a 1.2% agarose gel in TBE as described in section 3.3.4.1, alongside a 1 kb DNA ladder (Gibco BRL, Gaithersburg, Md, USA).
Table 3.3 Nucleotide sequences of DNA primers used in polymerase chain reaction for the detection of cassava mosaic begomovirus strains in the F1 mapping population

Primer pair   Primer type      Forward and reverse primer   Sequence
1             ACMV specific    ACMV-F1                      5'-TTC AGT TAT CAG GGC TCG TAA-3'
                               ACMV-R1                      5'-GAG TGC AAG TTG ACT CAT GA-3'
2             ACMV specific    ACMV-AL1/F                   5'-GCG GAA TCC CTA ACA TTA TC-3'
                               ACMV-ARO/R                   5'-GCT CGT ATG TAT CCT CTA AGG CCT G-3'
3             EACMV specific   UV-AL1/F1                    5'-TGT CTT CTG GGA CTT GTG TG-3'
                               EACMV-CP/R                   5'-ACT CTA TGR GTA ATR CCY GA-3'
4             EACMV or UgV     EACMV UV-AL1/F1              5'-TGT CTT CTG GGA CTT GTG TG-3'
                               EACMV UV-AL1/R1              5'-AAC CTA TCC CCG ATG CTC AT-3'
5             UgV              UV-AL1/F1                    5'-TGT CTT CTG GGA CTT GTG TG-3'
                               ACMV-CP/R3                   5'-TGC CTC CTG ATG ATT ATA TGT C-3'

3.3.6 Bulk segregant analysis

DNA samples of bulks, parents and individual F1s were standardised to 10 ng/µl. For the AFLP analysis, bulks were prepared from successfully digested, ligated, and preamplified templates.

3.3.6.1 Random amplified polymorphic DNA (RAPD) analysis

Equal aliquots (10 µl) of each of the 10 DNA samples constituting a bulk group were combined in an eppendorf tube to make up the bulk DNA sample. A total of 142 RAPD primers (Operon, Alameda) were used to screen the parentals and bulks. Each 25 µl reaction contained 25 ng DNA, 2.5 µl of the primer (2 pmol/µl), 2.5 µl of dNTPs (2.5 mM stock), 2.5 µl MgCl2 (25 mM/µl stock), 2.5 µl 10X Taq polymerase buffer (Promega), 2.5 µl of 5% Tween-20, 2.5 µl DNA (10 ng/µl), and 1 U Taq polymerase enzyme (Promega) in sterile ultra-pure water. PCR was performed in the Perkin Elmer GeneAmp thermocycler. The cycling profile used for amplification consisted of a 3 minute initial denaturation at 94°C, followed by 45 cycles of 94°C for 1 minute, 35°C for 1 minute, and 72°C for 2 minutes, and a final extension for 7 minutes at 72°C; the products were then stored at 4°C until they were used. The PCR products were analysed on a 2% agarose gel in TBE as described in sections 3.3.4.1 and 3.3.5.
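The standardisation of DNA samples to 10 ng/µl mentioned in section 3.3.6 follows the usual dilution relationship C1V1 = C2V2. The Python sketch below is an illustration only; the function name, the stock concentration and the final volume of 100 µl are assumptions made for the example and are not values reported in this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=10.0, final_volume_ul=100.0):
    """Volumes of stock DNA and diluent needed to reach a target concentration.

    Uses C1 * V1 = C2 * V2. The 100 µl default final volume is an assumed
    value for illustration.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    diluent_volume = final_volume_ul - stock_volume
    return stock_volume, diluent_volume

# Hypothetical sample measured at 50 ng/µl on the fluorometer.
stock_ul, buffer_ul = dilution_volumes(50.0)
print(stock_ul, buffer_ul)
```

For the hypothetical 50 ng/µl sample, 20 µl of stock would be made up to 100 µl with 80 µl of diluent.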
3.3.6.2 Amplified fragment length polymorphism (AFLP) analysis

AFLP analysis was performed with three rare cutting restriction enzymes (6 base pair recognition sites), EcoR1, Apa1 and Pst1, and two frequent cutting restriction enzymes (4 base pair recognition sites), Mse1 and Taq1, to generate EcoR1/Mse1, Apa1/Taq1, and Pst1/Taq1 DNA fragments. The DNA fragments were ligated onto their respective adapters and used as templates for plus 1 and plus 3 amplification reactions. The sequences of the adapters, core primer sequences and the plus 1 or plus 3 extensions for the different restriction enzymes are given in Table 3.4.

Table 3.4 Nucleotide sequences of AFLP adapters, primers and primer extensions

Adapters
Enzyme   Adapter sequences
Mse1     5' GAC GAT GAG TCC TGA G 3'
         5' TAC TCA GGA CTC AT 3'
Taq1     5' GACGATGAGTCCTGAC 3'
         5' CGG TCA GGA CTC AT 3'
Apa1     5' TCGTAGACTGCGTACA GGCC 3'
         5' TGTACGCAGTCTAC 3'
EcoR1    5' CTC GTA GAC TGC GTA CC 3'
         5' AAT TGG TAC GCA GTC 3'
Pst1     5' CTCGTAGACTGCGTACATGCA 3'
         5' TGTACGCAGTCTAC 3'

Primers
Primer      Core sequence                    +1     +3
Mse1 + E    5'-GATGAGTCCTGAGTAA + E-3'       C      CAA, CAC, CAG, CAT, CTA, CTC, CTG, CTT
Taq1 + E    5'-CGATGAGTCCTGACCGA + E-3'      A, C   CAC, CAG, CAT, CTA, CTG, CTT, GAC, GTC
Apa1 + E    5'-GACTGCGTACAGGCCC + E-3'       A, C   ATT, CTA, CTG, GCA, TTG
EcoR1 + E   5'-AGACTGCGTACCAATTC + E-3'      A      AAC, AAG, ACA, ACT, ACC, ACG, AGC, AGG
Pst1 + E    5'-GACTGCGTACATGCAG + E-3'       A      ACA, ACC, ACT, AGC
E = +1 or +3 nucleotide extension

For the EcoR1/Mse1 analysis, the AFLP Analysis System I kit purchased from Life Technologies (Gibco BRL, Gaithersburg, Md, USA) was used. This kit is designed for use with plants having genomes ranging in size from 5.3 x 10^8 to 6.3 x 10^9 bp and is suitable for cassava, which has a genome size of 772 mega base-pairs (7.7 x 10^8) in the haploid genome (Awoleye et al., 1994).
For the Apa1/Taq1 and Pst1/Taq1 analyses, restriction enzymes, buffers and oligonucleotides were purchased separately from different companies (Roche and Pharmacia) and the reactions performed according to the methods described by Vos et al. (1995) and Wong et al. (1999).

Restriction of genomic DNA

EcoR1/Mse1
The digestion mixes were set up in sterile 200 µl PCR tubes. Each 25 µl reaction was made up of 5 µl of a 5X reaction buffer, 250 ng of each of the DNA samples, 2.5 µl of EcoR1 and Mse1 enzyme mix (1 U/µl) and AFLP-grade water. The tubes were flicked gently to mix the reaction mix, then spun down with a mini strip-tube centrifuge to collect the contents. The digestion reaction was performed with the PTC-200 DNA Engine at 37°C for three hours. The restriction enzymes were then inactivated at 70°C for 15 minutes, allowed to cool at room temperature, and then placed on ice prior to the ligation reaction.

Apa1/Taq1 and Pst1/Taq1
Each 50 µl restriction reaction contained 500 ng DNA, 5 µl 10X reaction buffer for the specific enzymes, 0.05 mM DTT, 10 ng BSA and 5 U each of enzyme. The Taq1 digestion, at 65°C for three hours, preceded the Apa1 or Pst1 digestion, after which the Apa1 or Pst1 enzyme together with its incubation buffer was added and digestion continued at 37°C overnight. All reactions were performed with the PTC-200 DNA Engine.

Ligation of adapters

EcoR1/Mse1
Adapter ligation solution (24 µl) plus 1 µl T4 DNA ligase (1 U) was added to each digest, giving a total reaction volume of 50 µl. The tubes were mixed gently at room temperature, centrifuged briefly to collect the contents, and incubated at 20°C in the PCR machine for 2 hours. After the ligation, a 10 µl aliquot of each sample was checked on a 1% agarose gel to ensure complete digestion, and the samples were then stored at -20°C. The digestion and ligation were repeated for samples that showed evidence of incomplete digestion.
Apa1/Taq1 and Pst1/Taq1
Prior to ligation, the adapters were prepared by adding equimolar amounts of their respective reverse and forward single strands in sterile ultra-pure water to the required working concentration. The double stranded adapters were verified on 1% agarose gel before use. A 10 µl ligation reaction mix, made up of 5 pmol of either the Apa1 or Pst1 adapter and 50 pmol of the Taq1 adapter, 1 µl 10X One-Phor-All buffer (Pharmacia), 0.05 mM DTT, 10 ng BSA, 12 mM ATP, 1 U T4 DNA ligase and 2.5 U each of the restriction enzymes, was added to each 50 µl restriction reaction. The reactions were then incubated at 37°C for three hours and the enzymes inactivated at 85°C for 30 minutes. The samples were also checked on agarose gel, as described for the EcoR1/Mse1 reactions, to ensure complete digestion, and stored at -20°C.

Preamplification reactions

EcoR1/Mse1
The preamplification reaction was performed with 5 µl of the undiluted restriction ligation, 40 µl of preamplification mix, 5 µl 10X PCR buffer plus MgCl2 (both supplied with the kit) and 1 U Taq DNA polymerase (Roche). The use of undiluted template for preamplification, or passing the ligation mix through a column, is known to improve the quality of selective amplification (Chavarriaga-Aguirre et al., 1999). The 51 µl reactions were covered with 10 µl of mineral oil and the preamplification was performed for 20 cycles at 94°C for 30 seconds, 56°C for 60 seconds, and 72°C for 60 seconds in the PTC-200 DNA Engine. At the end of the reaction, the block temperature was maintained at 4°C until the samples were removed from the PCR machine. After the PCR reaction, 10 µl aliquots were run on 3% agarose gel to ensure amplification. The resistant and susceptible DNA bulks were then prepared by combining 10 µl each of the preamplified DNAs of the members of each bulk.
The DNA templates for the selective amplification were then prepared as 1:20 dilutions of the preamplified DNA in TE buffer (supplied with the kit).

Apa1/Taq1 and Pst1/Taq1

The Apa1/Taq1 fragments were preamplified with either Apa1+C/Taq1+A primers, Apa1+A/Taq1+C primers, or Apa1+C/Taq1+C primers. The Pst1/Taq1 fragments were amplified with Pst1+A/Taq1+C primers. Each 50 µl reaction contained 5 µl of the restriction/ligation reaction, 75 pmol each of the Pst1+1 or Apa1+1 and Taq1+1 primers, 2 µl dNTPs (5 mM stock, Promega), 5 µl 10X PCR buffer (Promega), 35 mM MgCl2 (25 mM stock, Promega) and 1 U of Taq polymerase. The preamplification PCR profile for the Apa1/Taq1 reactions was 94°C for 2 minutes, followed by 35 cycles of 94°C for 30 seconds, 50°C for 30 seconds, and 72°C for 60 seconds. At the end of the reaction, the block temperature was maintained at 4°C until the samples were removed from the PCR machine. The Pst1/Taq1 preamplification was performed with the profile used for the EcoR1/Mse1 reactions. The PCR reactions were then checked on agarose, diluted, and used to prepare bulk DNA as described for the EcoR1/Mse1 reactions.

Selective amplification reactions

EcoR1/Mse1

Eight Mse1 and eight EcoR1 primers, each carrying a three-nucleotide (+3) extension (Table 3.4), were used in all 64 possible combinations to screen the parents and bulked DNA. Each PCR reaction contained 2.5 µl of the 1:20 dilution, 2.0 µl of 10X PCR buffer (with 15 mM MgCl2), 0.6 µl MgCl2 (25 mM stock), 0.5 µl EcoR1 primer (10 ng/µl), 4.5 µl of Mse1 primer mix (6.7 ng/µl with 0.9 mM dNTPs) and 0.5 U Taq polymerase (Roche). The selective amplification was carried out for one cycle at 94°C for 30 seconds, 65°C for 30 seconds, and 72°C for 60 seconds. This was followed by 12 cycles in which the annealing temperature was lowered by 0.7°C per cycle, giving a touchdown phase of 13 cycles, and then by 23 cycles at 94°C for 30 seconds, 56°C for 30 seconds, and 72°C for 60 seconds.
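
The touchdown profile described above can be written out cycle by cycle. The short Python sketch below is only an illustration of the stated numbers (65°C start, 0.7°C decrement, 13 touchdown cycles, then 23 cycles at 56°C); the function name and its parameters are hypothetical and are not part of the thermocycler software used in the study.

```python
def touchdown_annealing_schedule(start_c=65.0, step_c=0.7, touchdown_cycles=13,
                                 final_c=56.0, final_cycles=23):
    """Return the annealing temperature (in deg C) of every cycle, in order.

    Defaults follow the EcoR1/Mse1 selective amplification profile in the text;
    the function itself is an illustrative helper, not the authors' code.
    """
    # Cycle 1 anneals at start_c; each later touchdown cycle is step_c cooler.
    touchdown = [round(start_c - step_c * i, 1) for i in range(touchdown_cycles)]
    # The remaining cycles all anneal at the fixed final temperature.
    plateau = [final_c] * final_cycles
    return touchdown + plateau


schedule = touchdown_annealing_schedule()
print(len(schedule))   # total number of amplification cycles
print(schedule[:13])   # annealing temperatures of the touchdown phase
```

Under these assumptions the last touchdown cycle anneals slightly above the 56°C plateau, which is consistent with the stated switch to a fixed annealing temperature for the remaining cycles.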
At the end of the reaction, the block temperature was maintained at 4°C until the samples were removed from the PCR machine.

Apa1/Taq1 and Pst1/Taq1

The Apa1+C/Taq1+A templates of the parents and bulks were screened with all possible combinations of the Apa1+3 and Taq1+3 primers (Table 3.4). The Apa1+A/Taq1+C templates were screened with six of the primer combinations, involving Apa1+ATT and Taq1+CAC, Taq1+CAG, Taq1+CAT, Taq1+CTA, Taq1+CTG or Taq1+CTT. The Apa1+C/Taq1+C templates were screened with 12 of the primer combinations, involving Apa1+CTA or Apa1+CTG and Taq1+CAC, Taq1+CAG, Taq1+CAT, Taq1+CTA, Taq1+CTG or Taq1+CTT. The Pst1+A/Taq1+C templates were screened with 24 of the primer combinations involving Pst1+ACA or Pst1+ACC and Taq1+CAC, Taq1+CAG, Taq1+CAT, Taq1+CTA, Taq1+CTG or Taq1+CTT. Each 20 µl PCR reaction mix contained 2.5 µl of the 1:20 diluted preamplification reaction, 2.0 µl 10X PCR buffer, 2.0 µl MgCl2 (25 mM stock), 1 µl Apa1+3 or Pst1+3 primer (5 ng/µl stock), 1 µl Taq1 primer (30 ng/µl stock), 0.8 µl dNTPs (5 mM stock) and 0.5 U Taq polymerase (Promega). The selective amplification for the Apa1/Taq1 templates was carried out for one cycle at 94°C for 60 seconds, 65°C for 30 seconds, and 72°C for 60 seconds. This was followed by 12 cycles in which the annealing temperature was lowered by 0.7°C per cycle, giving a touchdown phase of 13 cycles, and then by 23 cycles at 94°C for 30 seconds, 56°C for 30 seconds, and 72°C for 60 seconds. At the end of the reaction, the block temperature was maintained at 4°C until the samples were removed from the PCR machine. The profile for the Pst1/Taq1 reactions was the same as that used for the EcoR1/Mse1 reactions. The PCR products were analysed immediately or stored at -20°C for a few days before analysis.

3.3.6.3 Simple Sequence Repeats (SSR) analysis

A total of 186 previously reported SSR markers (Chavarriaga-Aguirre et al., 1998; Mba et al., 2001) were used to screen the parents and the two bulks.
Each 25 µl reaction contained 2.5 µl DNA (10 ng/µl), 2.5 µl each of the forward and reverse primers (2.0 µM each), 1 µl dNTPs (5 mM stock), 2.5 µl 10X PCR buffer, 1 or 1.5 µl MgCl2 (25 mM) and 0.75-0.8 units of Taq polymerase. The PCR profile involved an initial denaturation for 5 minutes at 94°C. The first cycle then had a denaturation at 94°C for 30 seconds, annealing at 65°C for 1 minute, and extension at 72°C for 1 minute. This was followed by 10 cycles that were similar to the first cycle, except that the annealing temperature was lowered by 0.7°C per cycle. Thereafter, the annealing temperature was maintained at 55°C for 1 minute for 25 cycles. The final extension was at 72°C for 7 minutes. At the end of the reaction, the block temperature was maintained at 4°C until the samples were removed from the PCR machine.

3.3.6.4 Sequencing gel electrophoresis

AFLP selective amplification products and SSR amplification products were analysed on 6% polyacrylamide sequencing gels (Appendix IV), using either an Owl S2S sequencing apparatus with a gel size of 35 X 45 cm or a Gibco Life Technologies Model S2 with a gel size of 31.0 X 38.5 cm. Prior to sequencing gel electrophoresis, a pair of glass plates (a long and a short plate) was washed with soap, followed by several rinses with tap water, then a 10-30 minute rinse in deionised water with agitation. When the plates were dry, they were cleaned with absolute ethanol. One plate was treated with 400 µl of a siliconizing agent (Appendix IV) for 5 minutes, then the excess was wiped off gently with about 2 ml ethanol. The other plate was also swabbed with alcohol, then treated with 1 ml of a gel binding solution (Appendix IV). The plates were then assembled on a gel caster with a pair of 0.4 mm spacers placed between them and clamped along their vertical sides. For the 31.0 X 38.5 cm gel, about 70 ml of the 6% polyacrylamide gel solution (Appendix IV) was poured between the two plates, and for the 35 X 45 cm gel about 90 ml was used.
When the gel had filled the plates, the flat surface of a 0.4 mm shark-tooth comb was inserted and clamped, and the gel was left to polymerise for at least 1 hour. When the gel had polymerised, the clamps and comb were removed; the gel cassette was washed with deionised water and mounted on the appropriate sequencing gel apparatus, and the upper and lower trays of the tank were filled with about 500 ml of 1X TBE. With a 50 ml syringe, the moulded well was washed with the running buffer to remove gel pieces and excess urea. The tank was closed and the gel prewarmed at 60 W for at least 30 minutes, until the temperature of the gel was 50°C. After prewarming, the wells were flushed again with buffer to get rid of urea, and then the combs were fixed in the wells. Thereafter, the samples were loaded (within 15 minutes). Prior to loading, 1/3 volume of formamide denaturation dye (Appendix IV) was added to each PCR product, and the mixture was denatured for 3 minutes at 95°C and immediately placed on ice. About 8 µl of denatured sample was applied to the combs. A 50 bp marker (Promega) with 16 DNA fragments ranging from 50 to 800 bp in exactly 50 bp increments was loaded alongside the samples. For the 31.0 X 38.5 cm gel, 48 samples including the marker were loaded per gel, and 72 for the 35 X 45 cm gel. The electrophoresis was continued at 60 W with the gel temperature at 50°C for 1.5-2 hours for the 31.0 X 38.5 cm gel, and 2.5-3 hours for the 35 X 45 cm gel.

3.3.6.5 Silver staining

The AFLP bands were visualised using the commercially available silver staining kit (Promega part No. Q4132). The silver staining procedure involves three stages. A fixing stage removes electrophoresis buffer and urea from the gel and prevents diffusion of small extension products. This is followed by staining the gel in a solution of AgNO3 (0.1%).
It is then developed in a developing solution made of Na2CO3 (3%) in the presence of formaldehyde and sodium thiosulphate, in which the silver ions are reduced to metallic silver (Promega, 2000). The procedure was carried out according to the manufacturer's instructions. Reagents used for silver staining are listed in Appendix I. After electrophoresis, the combs and spacers were removed and the two plates separated. The plate bearing the gel was placed (gel side up) in a plastic tray, enough fixer was added to completely cover the gel, and the tray was agitated gently for 20 minutes. This was followed by three washes in deionised water for 3 minutes each. The plate was then transferred to another tray containing enough stainer to completely cover the gel and agitated for 30 minutes on a shaker. The gel was then washed briefly (5 to 10 seconds) in deionised water to remove excess silver, and transferred to a tray containing cold developer. The gel was developed until the bands were visible, and then transferred back into the fixer for 2 minutes to stop the reaction. It was then washed twice in deionised water, placed in an upright position, and allowed to dry.

3.3.6.6 Gel analysis, marker scoring and nomenclature

Agarose gels were documented under UV light with the Grabit gel documentation software and then scored with the Gelworks advanced documentation system. Polyacrylamide gels were also documented directly from the glass plates with the Grabit gel documentation software or by scanning. The sequencing gels were scored directly on the glass plate with the aid of a light box. For the primary screening of the parents and bulks, the total number of bands and the number of polymorphic bands, i.e., bands clearly present in one parent or bulk and absent in the other, were recorded. In genotyping the mapping population, only the polymorphic bands were scored.
Each lane containing DNA from a specific F1 plant was scored for the presence or absence of polymorphic bands based on its similarity to the parents. When the polymorphic band was present in TMS I30555 (P1) and absent in TME7 (P2), the marker was scored as '1' for F1 plants that possessed the marker band, while those F1 plants that lacked the marker band were scored as '2'. Conversely, when the polymorphic band was absent in P1 but present in P2, the marker was scored as '1' for F1 plants that lacked the band and '2' for F1 individuals that possessed the band. A score of '1' therefore always denoted the P1 pattern and a score of '2' the P2 pattern. Faint bands that could not be scored with confidence were treated as missing data and were assigned scores of '0'. When the NCD II and diallel parental and check clones were scored, polymorphic markers were scored as '1' for present or '0' for absent.

The marker sizes were estimated by extrapolation from the migration distance of the 50 bp ladder for the AFLP and SSR gels and from the 1 kb marker for the RAPD gels. Markers were named according to the primer from which the marker was obtained and its relative molecular weight. For example, AFLP marker E-ACC/M-CTC-250 indicated that the marker was amplified with the primer combination EcoR1+ACC and Mse1+CTC and that its molecular weight relative to the 50 bp marker was 250 bp. SSR markers were identified by their assigned lab number and their expected molecular weight. When more than one allele was present, the alleles were identified with letters.

3.3.7 DNA markers associated with CMD resistance in the F1 population

3.3.7.1 Genetic linkage mapping

After the initial screening for the BSA, primers that were polymorphic between the bulks and parents were used to genotype the entire mapping population. The EcoR1/Mse1 +3 primer combinations (E-ACC/M-CTC, E-AAC/M-CAT, E-ACA/M-CAC, E-AAG/M-CTC, E-ACA/M-CAT, E-AAG/M-CTG, E-ACT/M-CAG, E-ACC/M-CAT, and E-ACC/M-CTC) were used to genotype the parents and F1 population.
Three Pst1/Taq1 primer combinations (P-ACC/T-CTA, P-ACT/T-CTT and P-ACC/T-CAG) and 15 SSR primer pairs (Table 3.5) were also used to genotype the parents and F1 population, to generate data for linkage analysis. PCR reactions, gel electrophoresis, gel development and documentation were done as described in sections 3.3.6.4, 3.3.6.5 and 3.3.6.6 above. For the SSR analysis, PCR products of the mapping population were multiplexed in sets of three per gel according to their expected product sizes in base pairs (bp), with the smallest product loaded first and the largest loaded last.

The CMD severity classes of the progenies were treated as a putative phenotypic locus, designated CMDRL, and included in the linkage analysis with the DNA markers so that the relationship between resistance to CMD and the markers could be identified on the map. Plants classified as resistant (CMD scores of 1 and 2) were treated as identical to the resistant parent and assigned a score of 2, while those classified as susceptible (CMD scores 3 to 5) were assigned a score of 1.

The markers were subjected to chi-square analysis to test their conformity to a 1:1 inheritance ratio, in order to identify single-dose restriction fragments (SDRFs). SDRFs are DNA markers that are present in one parent and absent in the other and segregate in a 1:1 ratio (absence:presence) in the progeny. They represent the segregation equivalent of an allele at a heterozygous locus in a diploid or an allopolyploid genome, or a simplex allele in an autopolyploid (Wu et al., 1992). SDRFs are suitable for linkage analysis in an F1 population when a number of unique segregating polymorphisms (heterozygosity) are present and meiosis is normal in either or both parents. This approach results in two separate linkage maps, one based on the male parent and one on the female parent (Fregene et al., 1997; Williams, 1998).
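
The chi-square test of a segregation ratio described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in the study: the function names are hypothetical, the band counts are invented for demonstration, and the 0.05 significance level (critical value 3.841 for one degree of freedom) is an assumption, since the level used for the segregation tests is not stated in the text.

```python
# Critical chi-square value for 1 degree of freedom at alpha = 0.05 (assumed level).
CHI2_CRIT_DF1_P05 = 3.841


def chi_square_segregation(present, absent, ratio=(1, 1)):
    """Goodness-of-fit chi-square for band presence:absence against `ratio`."""
    total = present + absent
    expected_present = total * ratio[0] / sum(ratio)
    expected_absent = total * ratio[1] / sum(ratio)
    return ((present - expected_present) ** 2 / expected_present
            + (absent - expected_absent) ** 2 / expected_absent)


def fits_ratio(present, absent, ratio=(1, 1), critical=CHI2_CRIT_DF1_P05):
    """True when the observed counts do not deviate significantly from `ratio`."""
    return chi_square_segregation(present, absent, ratio) <= critical


# Hypothetical counts for two markers in a population of 100 F1 plants.
print(fits_ratio(52, 48))   # close to 1:1, so it behaves like a single-dose marker
print(fits_ratio(70, 30))   # deviates from 1:1
```

The same function accepts other expected ratios through the `ratio` argument, for example `ratio=(3, 1)` for presence:absence.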
The remaining markers, which did not conform to the 1:1 ratio, were then tested for their conformity to a 3:1 ratio, which is expected for the segregation of double-dose markers on homoeologous chromosomes of an allo- or autopolyploid (Wu et al., 1992). Double-dose markers represent the duplex (double simplex) condition at a heterozygous locus.

Different strategies for linkage analysis were tested. These were: generating separate female and male maps, which is valid for heterozygous species (Fregene et al., 1997; Williams, 1998); generating a single combined map; and including inverted scores of the female and male mapping data before linkage analysis (Fregene et al., 1997), which was tested for both the separate and the combined maps. The data were analysed with the MAPMAKER version 3.0 computer programme (Lander et al., 1987), using the markers generated in the study. Three CMD responses, namely mean shoot tip severity score (MST), mean whole plant severity score (MWP) and mean leaf severity score (MLS), were included in the data set as three quantitative traits. The Kosambi mapping function (Kosambi, 1944) was employed for converting recombination fractions to map distances in centiMorgans (cM). A two-point analysis was first performed to assign loci to groups, using a minimum LOD score of 3.0 and a maximum recombination fraction of 0.33. This function inferred linkage between markers. The multipoint likelihood function was then used to compare locus orders within a linkage group. Initially a subset of markers was compared to determine the best order. Then the remaining markers in the group were tried in every interval using the 'multipoint/try' function. Locus order was then produced using the MAP function of MAPMAKER. Unlinked markers were further assessed with the 'two-point/pairwise' function with reference to the already ordered linkage groups. The linkage groups were then assigned onto chromosomes and a framework map generated for further analysis.
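
For reference, the Kosambi mapping function converts a recombination fraction r into a map distance of 25 ln[(1 + 2r)/(1 - 2r)] cM. The sketch below implements this standard formula and its inverse in Python; the function names are hypothetical, and the code is shown only to make the conversion explicit, since MAPMAKER performed it internally in the study.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance (cM) for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(2 * cm / 100.0)


# Map distance at the maximum recombination fraction accepted for grouping (0.33).
print(kosambi_cm(0.33))
# Recombination fraction implied by a hypothetical 15 cM interval.
print(kosambi_r(15.0))
```

At small recombination fractions the Kosambi distance is close to 100r, and it grows faster than r as r approaches 0.5, reflecting the allowance the function makes for undetected double crossovers under interference.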

Table 3.5 Description of cassava SSR loci and their primer pairs used in genetic linkage analysis

Lab No   SSR locus Type of repeat                  Forward primer              Reverse primer                Size (bp)  MgCl2
SSR6     SSRY6     (CA)7(N)51(CA)17(N)47(CA)15     TTTGTTGCGTTTAGAAAGGTGA      AACAAATCATTACGATCCATTTGA      298        1.5
SSR7     SSRY7     (CT)26                          TGCCTAAGGAAAATTCATTCAT      TGCTAAGCTGGTCATGCACT          250        1.5
SSR30    SSRY28    (CT)26(AT)3AC(AT)2              TTGACATGAGTGATATTTTCTTGAG   GCTGCGTGCAAAACTAAAAT          180        1.5
SSR41    SSRY40    (GA)16                          TGCATCATGGTCCACTCACT        CATTCTTTTCGGCATTCCAT          231        1.5
SSR50    SSRY49    (GA)25                          TGAAAATCTCACTGGCATTATTT     TGCAACCATAGTGCCAAGC           300        1.5
SSR52    SSRY51    (CT)11CG(CT)11(CA)18            AGGTTGGATGCTTGAAGGAA        GGATGCAGGAGTGCTCAACT          298        1.5
SSR103   SSRY91    (GA)16                          GTCTGCATGGCTCGATGAT         TGCCTGCTTCATATGTTTTTG         300        1
SSR108   SSRY95    (CT)19(CA)16CC(CA)2CC(CA)3      CATGATTTGGATTTTGGAATGA      CAAAAGAAGCAACCTTCAGCA         282        1
SSR115   SSRY102   (GT)11                          TTGGCTGCTTTCACTAATGC        TTGAACACGTTGAACAACCA          179        1.5
SSR119   SSRY106   (CT)24                          GGAAACTGCTTGCACAAAGA        CAGCAAGACCATCACCAGTTT         270        1.5
SSR124   SSRY110   (GT)12                          TTGAGTGGTGAATGCGAAAG        AGTGCCACCTTGAAAGAGCA          247        1.5
SSR132   SSRY113   (GA)19                          TTTGCTGACCTGCCACAATA        TCAACAATTGGACTAAGCAGC         187        1.5
SSR157   SSRY135   (CT)16                          CCAGAAACTGAAATGCATCG        AACATGTGCGACAGTGATTG          253        1.5
SSR70    SSRY170   (TA)5(N)71(CT)24                TCTCGATTTGGTTTGGTTCA        TCATCCTTGTTGCAGCGTTA          299        1.5
SSR105   SSRY179   (GA)28                          CAGGCTCAGGTGAAGTAAAGG       GCGAAAGTAAGTCTACAACTTTTCTAA   226        1.5

(Adapted from Mba et al., 2001)

3.3.7.3 Quantitative trait loci (QTL) analysis

Simple phenotypic correlation and regression analysis were used to determine the association between the markers and the CMD responses. The CMD responses (mean shoot tip, mean whole plant, and mean leaf severity scores at 12 WAP) were the dependent variables, while the marker data were the independent variables for the regression analysis.
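
A single-marker regression of this kind can be sketched in a few lines of Python. The example is illustrative only: the function name is hypothetical, the marker scores and severity values are invented, and the convention that a marker score of 0 denotes missing data follows the scoring scheme in section 3.3.6.6. The coefficient of determination (R squared) returned by the function is the usual measure of the proportion of phenotypic variation accounted for by a marker.

```python
def single_marker_regression(marker_scores, trait_values):
    """Least-squares regression of a trait on one marker.

    Marker scores are 1 or 2 (parental pattern); 0 marks missing data and is
    skipped. Returns (slope, intercept, r_squared).
    """
    pairs = [(m, t) for m, t in zip(marker_scores, trait_values) if m != 0]
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    sxx = sum((m - mean_m) ** 2 for m, _ in pairs)
    syy = sum((t - mean_t) ** 2 for _, t in pairs)
    sxy = sum((m - mean_m) * (t - mean_t) for m, t in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("marker or trait does not vary among the scored plants")
    slope = sxy / sxx
    intercept = mean_t - slope * mean_m
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: marker scores and mean severity scores of nine F1 plants.
marker = [2, 2, 2, 1, 1, 1, 0, 2, 1]
severity = [1.2, 1.5, 1.0, 3.1, 2.8, 3.4, 2.0, 1.4, 2.9]
slope, intercept, r_squared = single_marker_regression(marker, severity)
print(f"slope={slope:.2f}, R^2={r_squared:.2f}")
```

A negative slope in this coding means that plants carrying the pattern of parent 2 have lower severity scores, i.e. the marker allele is associated with resistance.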
Associations between markers and the CMD response traits were declared significant on a per-marker basis at a significance threshold of p = 0.005 (Lander and Botstein, 1986). This stringent threshold was adopted to reduce the risk of type I errors (Dudley, 1993).

3.3.8 DNA markers associated with CMD resistance in various sources of resistance

Following the segregation and QTL analysis, markers associated with resistance to CMD were tested on the 23 cassava genotypes used as parents and checks in the NCD II and diallel experiments described in section 3.1 above.

Chapter Four

Results

4.1 Analysis of variance and combining ability analyses

4.1.1. 3X18 North Carolina Design II mating scheme

The analysis of variance for CMD symptom severity in the NCD II mating scheme is presented in Table 4.1. The results showed significant variation (p<0.01) among environments, replicates nested in environments, and genotypes, both in each of the individual environments and across environments. Significant variation among the genotypes was due to the variation among checks and among test genotypes. The orthogonal contrast between the checks and test genotypes was significant (p<0.01) in the individual environments but not across environments. The significant variation among the test genotypes was due to significant variation (p<0.01) among the parents and crosses in Ibadan 1998 and 1999 and across environments. In the Mokwa 1998 environment, significant variation among the test genotypes was due to the parents only. Variation among the 21 parents was due to the male parents, which varied significantly (p<0.01) in each environment and across environments, as well as to the female parents, which varied significantly only in the individual environments. The orthogonal contrast of female versus male parents was significant (p<0.05) in Mokwa 1998 and across environments.

Table 4.1 Analysis of variance for CMD severity at 12 weeks after planting (WAP) among all genotypes in the 3X18 NCD II mating scheme, evaluated in three environments

                              Across           Individual environments
Source of variation           environments              Mokwa 1998  Ibadan 1998  Ibadan 1999
                              df    MS         df (†)   MS          MS           MS
Environment (E)               2     59.17**
Replicates within E           3     1.40**     1        1.18**      2.98**       0.05**
Genotypes in population (G)   78    1.39**     78(77)   0.36**      0.49**       1.06**
Checks (Ck)                   3     7.72**     3(2)     2.46**      2.71**       3.48**
Test Genotypes (TG)           74    1.12**     74       0.30**      0.38**       0.96**
Ck versus TG                  1     1.32       1        2.57**      1.97**       1.56**
Parent (P)                    20    3.17**     20       0.91**      0.89**       1.90**
Female (F)                    2     1.64       2        0.55**      0.38*        1.24**
Male (M)                      17    3.51**     17       0.99**      0.99**       2.08**
F versus M                    1     0.46*      1        0.23*       0.07         0.19
Susceptible (S)               5     1.08       5        0.36**      0.68**       0.51**
Resistant (R)                 14    0.29       14       0.11**      0.16*        0.63**
R versus S                    1     53.98**    1        15.15**     12.55**      26.47**
Crosses (C)                   53    0.31**     53       0.04        0.19**       0.37**
F (GCA)                       2     1.40       2        0.18**      2.28**       0.42**
M (GCA)                       17    0.57*      17       0.06*       0.14         0.91**
F x M (SCA)                   34    0.11*      34       0.02        0.09         0.13**
P vs C                        1     3.10       1        1.77**      0.54*        13.22**
G x E                         155   0.28**
Ck x E                        5     0.07
TG x E                        148   0.26**
Ck vs TG x E                  2     2.71**
P x E                         40    0.26**
F x E                         4     0.26**
M x E                         34    0.28**
F vs M x E                    2     0.14
S x E                         10    0.36**
R x E                         28    0.36**
R vs S x E                    2     1.01**
C x E                         106   0.15**
F (GCA) x E                   4     0.54*
M (GCA) x E                   34    0.27*
F x M (SCA) x E               68    0.07
P vs C x E                    2     6.20**
Error (genotypes)             233   0.07       78(77)   0.04        0.09         0.07
Error (crosses)               159   0.06       53       0.03        0.08         0.08
GCA/(GCA + SCA)                     0.76                1.00        0.94         0.84

(†) Values in parentheses are the df in Mokwa 1998, where the check 91/02324 was missing
* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level

Further partitioning of the parents into resistant and susceptible parents revealed significant differences among parents for each group in the individual environments but not across environments.
The orthogonal contrast between resistant and susceptible parents, however, was significant (p<0.01) in all three environments and across environments. The variation among the F1 crosses in CMD symptom severity was significant (p<0.01) in Ibadan 1998 and 1999, and across environments. This was attributed to the GCA effects of the females in the Ibadan 1998 and 1999 environments and the GCA effects of the males in Ibadan 1999 and across environments. The SCA effect (female by male interaction) was also significant across environments and in Ibadan 1999. Although there was no significant variation among the crosses in Mokwa 1998, the GCA effects of the females and males were significant. The orthogonal contrast of parents versus crosses, which is a test of average heterosis, was significant in the individual environments but not across environments. The significant GXE interaction (p<0.01) was due to all the component interactions except the check by environment, the parent contrast (female versus male) by environment, and the SCA effect (female by male) by environment interactions.

The relative importance of GCA and SCA, estimated by the genetic ratio GCA/(GCA + SCA), was 0.76, 1.0, 0.84 and 0.96 across environments and in the Mokwa 1998, Ibadan 1998 and Ibadan 1999 environments, respectively. Heritability based on progeny means across environments was 0.516, with lower and upper 90% confidence limits of 0.268 and 0.668.

4.1.2. 4X4 Diallel mating scheme

The analysis of variance for the 4X4 diallel involving the resistant genetic stock clone 58308 and three improved cassava clones (I30001, I30555 and I30572) also used as females in the 3X18 NCD II mating scheme is presented in Table 4.2. There was significant variation (p<0.01) among environments, but replicate within environment was significant only in Mokwa 1998. The variation due to test genotypes was also significant (p<0.01) in the individual environments and across environments.
This was attributed to the significant variation due to checks and test genotypes. The contrast of checks versus test genotypes was significant in the Mokwa 1998 environment only. The results further showed that the significant variation (p<0.01) among the test genotypes in all three environments and across environments was due to variation among parents and crosses. The test for average heterosis, the parent versus cross contrast, was significant in all three environments but not across environments. Griffing's analysis of variance (based on method 1 for a complete diallel, and model 1 for fixed genotypes) for the crosses within and across environments revealed that GCA effects of the parents were significant in the individual environments but not across environments. SCA effects also contributed significantly to the variation among crosses in the Ibadan 1999 and Mokwa 1998 environments. Reciprocal effects were significant only in the Ibadan 1998 environment, and this was due to a significant maternal effect.

Table 4.2 Analysis of variance for CMD severity at 12 weeks after planting (WAP) among all genotypes in the 4X4 diallel mating scheme, evaluated in three environments

                              Across           Individual environments
Source of variation           environments              Mokwa 1998  Ibadan 1998  Ibadan 1999
                              df    MS         df       MS          MS           MS
Environment (E)               2     9.00**
Replicates within E           3     0.14       1        0.25**      0.08         0.08
Genotypes in population (G)   23    1.47**     23(21)‡  0.46**      0.67**       0.78**
Checks (Chk)                  3     6.74**     3†       2.72**      3.26**       1.95**
Test Genotypes (TG)           19    0.65**     19(18)‡  0.14**      0.27**       0.62**
Parent (P)                    3     1.08*      3(2)‡    0.11**      0.51**       0.81*
Cross (C)                     15    0.37*      15       0.15**      0.17**       0.37*
GCA                           3     0.79       3        0.47**      0.39**       0.51*
SCA                           6     0.40       6        0.1**       0.04         0.56*
Reciprocal                    6     0.13       6        0.05        0.19*        0.12
Maternal                      3     0.11       3        0.06*       0.36**       0.06
Non-maternal                  3     0.15       3        0.03        0.02         0.17
P vs C                        1     2.36       1        0.09*       1.33**       3.73**
Chk vs TG                     1     0.36       1        1.69**      0.14         0.56
G x E                         43    0.21*
Chk x E                       5     0.17
TG x E                        36    0.19**
P x E                         4     0.11
C x E                         30    0.16**
GCA x E                       3     0.47**
SCA x E                       6     0.23
R x E                         6     0.21
M x E                         3     0.35*
N x E                         3     0.06
P vs C x E                    1     1.16**
Chk vs TG x E                 1     1.09**
Error (genotypes)             66    0.11       23(21)‡  0.02        0.08         0.23
Error (cross)                 45    0.08       15       0.02        0.05         0.16
2GCA/(2GCA + SCA)                   0.19                0.39        0.76         0.10

‡ df of genotypes, test genotypes and parents in Ibadan 1998 are 21, 18 and 3 respectively, where 58308 was missing
† df of genotypes, test genotypes, checks and parents in Mokwa 1998 are 20, 17 and 2 respectively, where 58308 and 91/02324 were missing
* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level

Despite the lack of significance of the reciprocal effect in Mokwa 1998, significant maternal effects were detected. The partitioning of the reciprocal effects into their components revealed significant maternal effects in the Mokwa 1998 and Ibadan 1998 environments.
The significant genotype by environment interaction (p<0.05) was due to the interactions between environment and test genotypes, crosses, GCA effects, maternal effects, the parent versus cross contrast and the check versus test genotype contrast.

The relative importance of GCA and SCA, estimated by the genetic ratio 2GCA/(2GCA + SCA), was 0.19, 0.39, 0.76 and 0.10 for the combined environment analysis and the Mokwa 1998, Ibadan 1998 and Ibadan 1999 environments, respectively. Heritability based on progeny means was 0.568, with lower and upper 90% confidence limits of 0.028 and 0.786.

4.1.3 Combining ability effects of 3X18 NCD II mating scheme

The GCA effects and means of the parental clones in the three environments and across environments are presented in Table 4.3. Mean disease severity scores of the clones varied across and within each environment. In this study, negative GCA effects were desirable for resistance. Significant and negative GCA values of the clones indicated a contribution towards resistance, while significant and positive values indicated a contribution towards susceptibility. Similarly, a significant and negative SCA effect indicated that a cross was more resistant than average, and a significant and positive SCA effect indicated that a cross was more susceptible than average.

Table 4.3 Mean† CMD severity scores and GCA effects for CMD symptom severity of parents in the 3X18 NCD II mating scheme, evaluated in three environments

            Across environments Mokwa 1998          Ibadan 1998         Ibadan 1999
Parent      Mean      GCA       Mean      GCA       Mean      GCA       Mean      GCA
Female
I30001      1.20      0.13      1.00      0.08      1.15      0.29*     1.45      0.02
I30555      2.23      -0.04     1.95      -0.02     1.85      -0.11     2.88      0.02
I30572      1.55      -0.09     1.10      -0.06     1.94      -0.17*    1.60      -0.04
Mean        1.66                1.35                1.65                1.98
SE          0.12      0.06      0.25      0.02      0.22      0.04      0.18      0.04
LSD                   0.258               0.165               0.165               0.083
Male
I4(2)1425   1.96      -0.06     1.20      -0.04     1.87      0.16      2.80      -0.31*
I60142      1.63      -0.20     1.00      -0.19*    1.30      -0.06     2.60      -0.34*
I90257      1.48      -0.26*    1.30      -0.14     1.54      -0.01     1.60      -0.63*
TME1        1.48      -0.09     1.10      -0.04     1.58      -0.07     1.75      -0.17
TME2        2.68      0.55*     2.85      0.23*     2.13      0.44*     3.06      0.98*
TME4        1.18      -0.02     1.00      0.02      1.21      -0.08     1.34      -0.01
TME5        1.20      0.06      1.10      0.04      1.20      0.14      1.30      0.01
TME6        1.37      -0.07     1.25      -0.04     1.48      -0.19     1.38      0.01
TME7        1.12      -0.03     1.10      0.05      1.11      -0.05     1.15      -0.07
TME8        1.36      -0.12     1.30      0.00      1.08      -0.06     1.70      -0.29*
TME9        1.15      -0.05     1.05      0.14      1.25      -0.03     1.15      -0.27*
TME10       2.76      0.05      1.93      -0.01     2.78      0.09      3.55      0.07
TME11       1.44      -0.09     1.29      0.12      1.73      -0.22     1.30      -0.18
TME12       1.39      -0.09     1.30      0.03      1.43      -0.12     1.45      -0.20
TME14       1.37      0.06      1.10      0.03      1.30      0.09      1.70      0.06
TME31       2.77      0.20      3.05      -0.03     2.17      -0.04     3.10      0.67*
TME41       3.15      0.01      2.55      -0.10     3.05      -0.11     3.85      0.24*
TME117      3.46      0.17      2.65      -0.05     3.49      0.09      4.25      0.48*
Mean        1.83                1.56                1.76                2.17
SE          0.12      0.06      0.25      0.02      0.22      0.04      0.18      0.04
LSD                   0.238               0.235               0.235               0.144

* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level
† Mean disease score < 2 = resistant; mean disease score > 2 = susceptible

Negative GCA effects were estimated for the resistant clone TMS I30572 in all environments, but its effect was significant only in the Ibadan 1998 environment.
Although the resistant clone TMS I30001 had positive GCA in all environments and across environments, its effect was significant only in Ibadan 1998. All the improved clones used as males (TMS I4(2)1425, TMS I60142 and TMS I90257) had negative and significant GCA effects in Ibadan 1999. Negative and significant GCA effects were also estimated for clone TMS I60142 in the Mokwa 1998 environment and for TMS I90257 across environments. Negative and significant GCA effects were also detected for the resistant landraces TME8 and TME9 in the Ibadan 1999 environment. The susceptible landrace TME2 had significant and positive GCA in all three environments and across environments, while the susceptible landraces TME31, TME41 and TME117 had significant and positive GCA effects only in the Ibadan 1999 environment. The susceptible improved clone TMS I30555, used as a female, did not contribute significantly to susceptibility in any environment. The correlation analysis between parental means and GCA revealed significant and positive linear relationships in the Ibadan 1999 environment (r=0.551, p<0.05) and across environments (r=0.507, p<0.01).

The mean disease severity scores and specific combining ability effects of the crosses for environments with significant variation for the effect are presented in Table 4.4. Negative SCA effects were also desirable for resistance. The most resistant cross across environments was TMS I30572 X TMS I90257, which had the lowest mean of 1.48 and the largest negative SCA effect of -0.43, while TMS I30555 X TME2, with a mean of 2.57 and the largest positive SCA of 0.63, was the most susceptible cross across environments.

Table 4.4 Mean CMD severity scores and SCA effects for CMD symptom severity of crosses in the 3X18 design II crosses in environments with significant variation for SCA

                      Across environments Ibadan 1999
Cross                 Mean      SCA       Mean      SCA
I30001 X I4(2)1425    2.09      -0.03     2.34      -0.17
I30001 X I60142       1.84      -0.28*    2.12      -0.36*
I30001 X I90257       1.96      -0.16     2.48      0.30
I30001 X TME1         1.95      -0.17     2.25      -0.40*
I30001 X TME10        2.2       0.08      2.99      0.10
I30001 X TME11        1.92      -0.20     2.45      -0.19
I30001 X TME117       2.12      0.00      2.98      -0.31
I30001 X TME12        1.91      -0.21     2.64      0.02
I30001 X TME14        2.28      0.16      3.23      0.35*
I30001 X TME2         2.7       0.58*     3.81      0.02
I30001 X TME31        2.36      0.24*     3.72      0.23
I30001 X TME4         2.2       0.08      2.89      0.08
I30001 X TME41        2.21      0.09      3.25      0.19
I30001 X TME5         2.09      -0.03     2.66      -0.16
I30001 X TME6         2.09      -0.03     2.8       -0.03
I30001 X TME7         2.2       0.08      2.92      0.19
I30001 X TME8         2         -0.12     2.47      -0.06
I30001 X TME9         2.04      -0.08     2.75      0.20
I30555 X I4(2)1425    2         0.05      2.78      0.27
I30555 X I60142       1.74      -0.21     2.65      0.17
I30555 X I90257       1.74      -0.21     2.04      -0.15
I30555 X TME1         1.8       -0.09     2.56      -0.09
I30555 X TME10        2.14      0.13      2.98      0.09
I30555 X TME11        2.04      0.09      3.03      0.39*
I30555 X TME117       2.17      0.21      3.4       0.10
I30555 X TME12        1.69      -0.24*    2.49      -0.13
I30555 X TME14        1.97      0.02      2.78      -0.10
I30555 X TME2         2.57      0.62*     3.89      0.09
I30555 X TME31        2.06      0.11      3.36      -0.14
I30555 X TME4         1.78      -0.17     2.51      -0.30
I30555 X TME41        1.83      -0.13     2.85      -0.22
I30555 X TME5         2.23      0.28*     3.14      0.30
I30555 X TME6         1.93      -0.02     2.96      0.13
I30555 X TME7         1.84      -0.13     2.58      -0.20
I30555 X TME8         1.75      -0.20     2.44      -0.09
I30555 X TME9         1.83      -0.13     2.4       -0.15
I30572 X I4(2)1425    1.69      -0.22     2.36      -0.09
I30572 X I60142       1.79      -0.11     2.62      0.19
I30572 X I90257       1.48      -0.43*    1.98      -0.15
I30572 X TME1         1.95      0.05      3.08      0.49*
I30572 X TME10        1.77      -0.13     2.63      -0.20
I30572 X TME11        1.74      -0.17     2.37      -0.21
I30572 X TME117       2.19      0.29*     3.45      0.21
I30572 X TME12        2.08      0.18      2.66      0.10
I30572 X TME14        1.91      0.04      2.58      -0.24
I30572 X TME2         2.34      0.44*     3.63      -0.11
I30572 X TME31        2.15      0.24*     3.34      -0.10
I30572 X TME4         1.93      0.02      2.97      0.22
I30572 X TME41        1.97      0.06      3.03      0.03
I30572 X TME5         1.84      -0.06     2.63      -0.14
I30572 X TME6         1.73      -0.17     2.66      -0.11
I30572 X TME7         1.86      -0.04     2.68      0.01
I30572 X TME8         1.86      -0.04     2.61      0.14
I30572 X TME9         1.96      0.05      2.44      -0.05
SE                    0.10      0.09      0.20      0.16
LSD                             0.216               0.317

* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level
Negative and significantly different from zero at the 0.05 probability level

Significant and negative SCA was detected for TMS I30001 X TMS I60142 in Ibadan 1999 and across environments, for TMS I30001 X TME1 in Ibadan 1999, and also for TMS I30555 X TME12 across environments. Significant and positive SCA effects were estimated for all three crosses involving TME2, and for TMS I30001 X TME31, TMS I30555 X TME5, TMS I30572 X TME117 and TMS I30572 X TME31 across environments, and for TMS I30001 X TME14, TMS I30555 X TME11 and TMS I30572 X TME1 in Ibadan 1999.

4.1.4 Combining ability and maternal effects in the 4X4 Diallel mating scheme

The parental means, GCA and maternal effects of the four parents in environments with significant variation for these effects are given in Table 4.5. Significant and negative GCA effects were detected for the clones TMS I30572 and TMS I30001 in Mokwa 1998 and also for TMS I30572 in Ibadan 1998. Significant and positive GCA effects were also detected for the clones TMS I30555 in Mokwa 1998 and 58308 in Ibadan 1998. The maternal effect of 58308 was also significant and positive in the Mokwa 1998 and Ibadan 1998 environments.
In Mokwa 1998, significant and negative SCA effects were detected for the crosses TMS I30001 X TMS I30555 and TMS I30555 X TMS I30572, and in the Ibadan 1999 environment for the crosses TMS I30001 X 58308 and TMS I30555 X TMS I30572 (Table 4.6). The self crosses TMS I30001 X TMS I30001, TMS I30555 X TMS I30555 and 58308 X 58308 had significant and positive SCA effects in the Mokwa 1998 environment. In Ibadan 1999, the SCA effects of TMS I30001 X TMS I30001, 58308 X 58308 and TMS I30572 X 58308 were also significant and positive.

Table 4.5 Mean, GCA and maternal (MAT) effects for CMD symptom severity of parents in 4X4 diallel crosses in environments with significant variation for their effects

              Mokwa 1998                 Ibadan 1998                Ibadan 1999†
Clone         Mean    GCA      MAT       Mean    GCA      MAT       Mean    GCA      MAT
I30001        1.00   -0.18*   -0.02      1.01   -0.07    -0.13      1.34    0.00
I30555        1.95    0.21*   -0.03      2.01    0.10    -0.12      2.35    0.23
I30572        1.10   -0.10*   -0.03      1.63   -0.18*    0.09      1.95   -0.03
58308         x       0.07     0.09*     x       0.16*    0.17*     1.13   -0.20
Mean          1.35                       1.55                       1.69
SE            0.25                       0.20                       0.34
LSD                   0.086    0.086             0.136    0.136             0.245
LSD gi-gj             0.156                      0.237                      0.411
LSD mi-mj                      0.156                      0.237

x Not estimated due to missing plot
† Maternal effect not significant in the Ibadan 1999 environment
* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level

Table 4.6 Mean CMD severity scores and SCA effects for CMD symptom severity of crosses in the 4X4 diallel in environments with significant variation for their effects

                      Mokwa 1998           Ibadan 1999
Cross                 Mean     SCA         Mean     SCA
I30001 X I30001       1.23     0.18*       3.14     0.64*
I30001 X I30555       1.24    -0.12*       2.40    -0.08
I30001 X I30572       1.15     0.03        2.51    -0.02
I30001 X 58308        1.18    -0.09        2.08    -0.54*
I30555 X I30555       2.16     0.34*       3.07     0.13
I30555 X I30572       1.35    -0.18*       2.33    -0.34*
I30555 X 58308        1.42    -0.04        2.80     0.29
I30572 X I30572       1.24     0.04        2.43     0.01
I30572 X 58308        1.36     0.11        2.66     0.35*
58308 X 58308         1.55     0.29*       2.00     0.35*
Mean                  1.40                 2.50
SE                    0.11                 0.38
LSD sii                        0.156                0.410
LSD sij                        0.116                0.308
LSD sii-sjj                    0.208                0.548
LSD sii-sij                    0.224                0.612
LSD sii-sjk                    0.180                0.474
LSD sij-sik                    0.180                0.474
LSD sij-skl                    0.146                0.388

* Significantly different from zero at the 0.05 probability level
** Significantly different from zero at the 0.01 probability level
Negative and significantly different from zero at the 0.05 probability level

The significant reciprocal effect in Ibadan 1998 was due to the negative and significant effects of three reciprocal crosses: 58308 X TMS I30001, TMS I30572 X TMS I30555 and 58308 X TMS I30555.

4.1.5 Heterosis for resistance to CMD in F1 crosses

In the NCD II experiment, average values for mid-parent and high-parent heterosis were generally positive, indicating that the F1 crosses were generally more susceptible than their mid or high parents. The exception was the Mokwa 1998 environment, where the F1 differed from the mid-parent value by -4.95%. Significant and negative heterosis was estimated in only two crosses, both with clone TMS I30555 as the female parent and with TME117 and TME41 as the male parents. The most heterotic cross, which contributed significantly to the average mid-parent heterosis, was TMS I30555 X TME41. It had significant negative mid-parent heterosis of -31.94% across environments, and in Mokwa 1998 both its mid-parent heterosis (-48.44%) and its high-parent heterosis (-40.51%) were negative and significant. The next most heterotic cross was TMS I30555 X TME117, which also had negative heterosis in all environments. Its mid-parent heterosis of -50.00% and high-parent heterosis of -41.03% were significant in Mokwa 1998.

The crosses in the diallel experiment generally had positive mid-parent and high-parent heterosis and were therefore more susceptible than their mid-parent or high-parent values. Negative heterosis was detected only in the self cross TMS I30572 X TMS I30572 in Ibadan 1998 and for most crosses in the Mokwa environment; however, none of these negative heterotic effects was significant.
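The heterosis percentages reported in section 4.1.5 can be illustrated with a minimal sketch. It assumes the conventional definitions, in which mid-parent heterosis is the deviation of the F1 mean from the mean of its two parents and high-parent heterosis is its deviation from the better parent, each expressed as a percentage of the reference value. Because lower CMD severity scores indicate greater resistance, the sketch takes the better parent to be the one with the lower score. This reading, the function names and the example scores are assumptions made for illustration and are not taken from the thesis.

```python
def mid_parent_heterosis(f1, parent_a, parent_b):
    """Percent deviation of the F1 mean from the mid-parent value."""
    mid_parent = (parent_a + parent_b) / 2.0
    return 100.0 * (f1 - mid_parent) / mid_parent


def high_parent_heterosis(f1, parent_a, parent_b):
    """Percent deviation of the F1 mean from the better parent.

    Assumption: for disease severity scores the better ("high") parent
    is the more resistant one, i.e. the parent with the lower score.
    """
    better_parent = min(parent_a, parent_b)
    return 100.0 * (f1 - better_parent) / better_parent


# Hypothetical severity scores, not data from the thesis.
f1_mean, female, male = 1.0, 2.5, 1.5
mph = mid_parent_heterosis(f1_mean, female, male)    # -50.0
hph = high_parent_heterosis(f1_mean, female, male)   # about -33.3
print(f"MPH = {mph:.2f}%, HPH = {hph:.2f}%")
```

Under these definitions a negative value means that the F1 was less diseased than the reference parental value, so negative heterosis is the desirable direction when breeding for resistance.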
4.2 Genetic relationships among the various sources of resistance to CMD

4.2.1 Relationships among sources of resistance based on area under the disease progress curve (AUDPC) for CMD

Estimated AUDPC values of the cassava clones used as parents and checks, and their respective rankings for CMD resistance in each environment and across environments, are shown in Table 4.7. Clone 91/02324 and TME9 had the smallest AUDPC values for shoot-tip and whole plant severity and were ranked 1 (i.e. the most resistant clones) across environments. In Ibadan 1998 and 1999, TME9 consistently had a rank of 1 for shoot tip severity, but it ranked 4 for whole plant severity in Mokwa 1998. The clone 91/02324 also ranked 3 for whole plant severity in the Ibadan 1999 environment. The next best clone was the resistant improved line TMS I30001, which ranked 3 across environments for both shoot tip and whole plant severity, ranked 1 for shoot tip severity in the individual environments and ranked 4 for whole plant severity in Ibadan 1998. In the Mokwa 1998 environment, TMS I60142 also ranked 1 for shoot tip and whole plant severity and, in Ibadan 1998, for shoot tip severity. In the Ibadan 1998 environment, the resistant landraces TME5, TME7, TME8, TME11, TME12 and TME14 ranked 1 for shoot tip severity, while in Ibadan 1999 the resistant landraces TME4 and TME5 ranked 1 for shoot tip severity.

Based on their AUDPC values and ranks across environments and in Ibadan 1998 and 1999, the resistant landraces were generally more resistant than the improved clones TMS I30572 and TMS I4(2)1425. However, in Mokwa 1998, TMS I30572 had lower ranks for both shoot tip and whole plant severity, indicating that it was more resistant than the resistant landraces TME5, TME7, TME8, TME9, TME12 and TME14.

Table 4.7 Area under the disease progress curve (AUDPC) of cassava clones based on mean CMD severity scores and their ranks for CMD resistance.
                  Across environments   Mokwa 1998        Ibadan 1998       Ibadan 1999
Clone             AUDPC    Rank         AUDPC    Rank     AUDPC    Rank     AUDPC    Rank

Shoot Tip Severity
58308             x        x            x        x        x        x        12.00    1
91/02324          20.00    1            x        x        20.00    1        12.00    1
TMS I30001        22.60    3            12.00    1        20.00    1        12.00    1
TMS I30555        33.40    17           27.73    17       20.00    1        32.80    19
TMS I30572        29.31    15           13.28    7        28.23    16       18.50    15
TMS I4(2)1425     31.79    16           18.97    15       30.46    17       26.40    17
TMS I60142        27.91    13           12.00    1        20.00    1        22.30    16
TMS I90257        24.75    8            16.43    13       23.88    13       16.00    12
TME1              26.49    11           17.17    14       23.52    12       16.30    13
TME2              39.88    18           40.47    19       35.64    18       30.80    18
TME4              25.75    10           12.00    1        22.52    11       12.00    1
TME5              24.66    7            15.59    10       20.00    1        12.00    1
TME6              26.87    12           12.00    1        26.68    14       14.30    9
TME7              23.30    4            14.55    8        20.00    1        16.80    14
TME8              23.36    5            15.45    9        20.00    1        14.10    8
TME9              20.00    1            12.00    1        20.00    1        12.00    1
TME10             45.88    19           26.17    16       42.32    20       41.70    20
TME11             28.03    14           12.00    1        27.14    15       15.20    11
TME12             25.32    9            16.30    12       20.00    1        14.30    9
TME14             23.65    6            16.03    11       20.00    1        12.00    1
TME31             51.20    20           50.61    20       40.78    19       47.50    21
TME41             53.27    21           38.10    18       44.56    21       56.90    22
TME117            68.05    22           51.29    21       65.09    22       63.32    23

Whole Plant Severity
58308             x        x            x        x        x        x        12.00    1
91/02324          20.00    1            x        x        20.00    1        14.71    3
TMS I30001        24.51    3            12.00    1        23.60    4        18.09    4
TMS I30555        38.52    17           33.79    16       35.12    16       49.63    19
TMS I30572        33.09    14           15.11    5        35.65    17       24.42    11
TMS I4(2)1425     35.59    16           20.75    15       33.82    15       43.73    17
TMS I60142        33.11    15           12.00    1        27.78    10       38.62    16
TMS I90257        27.37    6            19.12    10       27.53    9        23.92    9
TME1              31.56    13           19.10    9        29.66    13       26.47    13
TME2              45.19    18           47.98    19       42.40    18       45.29    18
TME4              28.95    9            12.00    1        26.19    8        23.34    8
TME5              27.78    8            16.77    8        24.43    7        21.93    7
TME6              30.39    10           19.85    11       28.73    11       24.30    10
TME7              27.05    5            16.61    7        23.60    4        20.52    5
TME8              27.64    7            19.93    12       20.00    1        28.32    15
TME9              20.00    1            13.81    4        20.00    1        12.00    1
TME10             51.66    19           38.99    17       50.81    20       55.42    20
TME11             31.41    12           20.38    14       32.27    14       21.27    6
TME12             31.19    11           20.18    13       29.44    12       25.98    12
TME14             25.91    4            16.33    6        24.30    6        27.07    14
TME31             54.72    20           54.48    21       45.78    19       57.50    21
TME41             57.28    21           44.93    18       55.47    21       70.48    22
TME117            70.38    22           52.73    20       70.58    22       72.52    23

x Not estimated due to missing plot

Table 4.8 Spearman's rank correlation coefficient for AUDPC values for shoot-tip and whole plant CMD symptom severity

                  Across environments   Mokwa 1998   Ibadan 1998   Ibadan 1999

Shoot-tip
Across Env        -                     0.678**      0.835**       0.886**
Mokwa 1998        0.678**               -            0.573**       0.744**
Ibadan 1998       0.835**               0.573**      -             0.727**
Ibadan 1999       0.886**               0.744**      0.727**       -

Whole Plant
Across Env        -                     0.754**      0.951**       0.880**
Mokwa 1998        0.754**               -            0.757**       0.741**
Ibadan 1998       0.951**               0.757**      -             0.792**
Ibadan 1999       0.880**               0.741**      0.792**       -

As expected, the susceptible improved clone (TMS I30555) and the susceptible landraces (TME2, TME10, TME31, TME41 and TME117) had the largest AUDPC values and had higher ranks than the resistant clones. With the exception of whole plant severity in Mokwa 1998, TME117 was the most susceptible clone. Among the susceptible clones, TMS I30555 was the least susceptible, followed by TME2 or TME10 depending on the environment.

The rank correlation coefficients showed significant and positive associations among the AUDPC values in the different environments for both shoot-tip and whole plant CMD symptom severity (Table 4.8). The correlation coefficient (r) ranged from 0.57 to 0.89 for shoot tip CMD severity and from 0.74 to 0.95 for whole plant CMD severity.

4.2.2 Relationships among sources of resistance based on principal component analysis (PCA) of CMD responses

Eigenvalues (as a proportion of the total variance) and eigenvectors for the first three principal components (PC1, PC2 and PC3) for the CMD resistance responses among the 23 parental and check clones are presented in Table 4.9. Together, the first three principal components accounted for 78.87% of the total variation.
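As an illustration of how rank correlations such as those in Table 4.8 are obtained, the sketch below computes Spearman's coefficient as the Pearson correlation of the ranks, with tied values given their average rank. The input is a small subset of the shoot-tip AUDPC values in Table 4.7 (the six susceptible clones in Mokwa 1998 and Ibadan 1999), so the result is not expected to reproduce any coefficient in Table 4.8, which was computed over all clones. The helper names are assumptions made for this example, and the software and tie handling actually used in the study are not stated here.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1), averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Shoot-tip AUDPC from Table 4.7 for TMS I30555, TME2, TME10, TME31, TME41, TME117.
mokwa_1998 = [27.73, 40.47, 26.17, 50.61, 38.10, 51.29]
ibadan_1999 = [32.80, 30.80, 41.70, 47.50, 56.90, 63.32]
rho = spearman_rho(mokwa_1998, ibadan_1999)
print(f"Spearman rho for the six-clone subset = {rho:.3f}")
```

Computing the coefficient from average ranks, rather than from the squared rank differences, keeps the calculation valid when several clones share the same AUDPC value, as happens for the many clones tied at the minimum in Table 4.7.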
Table 4.9 Eigenvectors and eigenvalues as a proportion of total variance of the first three principal components for the CMD resistance responses of the 23 parental and check clones

Environment    CMD Response    PC1       PC2       PC3
Mokwa 1998     MST6WK          0.218    -0.058     0.238
               MCMD6WK         0.238     0.003     0.186
               MST12WK         0.232    -0.075     0.132
               MCMD12WK        0.228    -0.058     0.176
               ST6WKI          0.082     0.389     0.205
               CMD6WKI         0.163     0.047     0.304
               ST12WKI         0.231    -0.047     0.142
               CMD12WKI        0.201    -0.118    -0.032
Ibadan 1998    MST6WK          0.229    -0.006    -0.198
               MCMD6WK         0.227    -0.002    -0.230
               MST12WK         0.228    -0.008    -0.196
               MCMD12WK        0.225     0.001    -0.236
               MST20WK         0.218    -0.089    -0.013
               MCMD20WK        0.238    -0.052    -0.015
               ST6WKI          0.048     0.387    -0.230
               CMD6WKI         0.099     0.339    -0.185
               ST12WKI         0.041     0.411    -0.239
               CMD12WKI        0.106     0.323    -0.169
               MST20WKI        0.101     0.320     0.126
               MCMD20WKI      -0.007     0.317     0.362
Ibadan 1999    MST6WK          0.239    -0.067    -0.076
               MCMD6WK         0.237    -0.030     0.014
               MST12WK         0.237    -0.082     0.087
               MCMD12WK        0.231    -0.043     0.099
               ST6WKI          0.195    -0.053    -0.146
               CMD6WKI        -0.123     0.153     0.279
               ST12WKI         0.151     0.161     0.252
               CMD12WKI       -0.138     0.024     0.111
Eigenvalues (%)                57.33     13.78     7.76

MST6WK      Mean shoot-tip CMD severity at 6 WAP
MST12WK     Mean shoot-tip CMD severity at 12 WAP
MCMD6WK     Mean whole plant CMD severity at 6 WAP
MCMD12WK    Mean whole plant CMD severity at 12 WAP
MST20WK     Mean shoot-tip CMD severity at 20 WAP
MCMD20WK    Mean whole plant CMD severity at 20 WAP
ST6WKI      Mean shoot-tip CMD incidence at 6 WAP
ST12WKI     Mean shoot-tip CMD incidence at 12 WAP
CMD6WKI     Mean whole plant CMD incidence at 6 WAP
CMD12WKI    Mean whole plant CMD incidence at 12 WAP

The first principal component (PC1), which accounted for 57.33% of the total variation, gave higher weights to shoot tip and whole plant severity at 6 and 12 WAP in Mokwa 1998, at 6, 12 and 20 WAP in Ibadan 1998, and at 6 and 12 WAP in Ibadan 1999. In the Mokwa 1998 environment, PC1 also gave a high weight to shoot tip incidence at 12 WAP.
The second principal component (PC2), which accounted for 14% of the total variation, gave higher weights to shoot tip and whole plant incidence at 6, 12 and 20 WAP in the Ibadan 1998 environment. The third principal component (PC3) gave high weights to shoot tip severity at 6 WAP and whole plant incidence at 6 WAP in Mokwa 1998. In the Ibadan 1998 environment, whole plant severity at 6 and 12 WAP, shoot tip incidence at 6 and 12 WAP, and whole plant incidence at 20 WAP also had high weights in PC3, but PC3 accounted for only 8% of the total variation.

The scatter-plot of the first versus second principal component scores for the CMD response of the various sources of resistance in the three environments is presented in Fig 4.1. Since 58308 and 91/02324 were not present in all environments, they were excluded from the analysis. The results showed that, in general, the resistant clones grouped together and were separated from the susceptible clones. The first resistant group was made up of the resistant clones TME8, TME9, TMS I30001, TME7 and TME12. TMS I90257 clustered together with TME1 and TME6. The resistant landraces TME5 and TME14 formed another group, and TME11 clustered together with TMS I60142 and TMS I30572. The resistant landrace TME4 and the improved clone TMS I4(2)1425 each formed a single-clone group. The group of TMS I4(2)1425 was close to the largest susceptible group, which was made up of clones TME10, TME31 and TME2. The susceptible landrace TME117 formed a single-clone group away from all the other susceptible groups. TME41 and TMS I30555 also formed single-clone susceptible groups.