Comparative Genomic Analysis of Mycobacterium Tuberculosis Complex Strains From Ghana
Publisher
University of Ghana
Abstract
Background: Tuberculosis (TB), caused by a group of acid-fast bacilli collectively called the Mycobacterium tuberculosis complex (MTBC), is one of the leading causes of adult mortality globally, although it is treatable. TB in humans is mainly caused by the globally distributed M. tuberculosis sensu stricto (Mtbss) and the West Africa-restricted M. africanum (Maf); however, global TB control efforts are biased towards Mtbss to the neglect of Maf. Yet Maf is an important pathogen in West Africa, especially in Ghana, which harbours both Maf lineages (L5 and L6) in substantial numbers and where Maf causes about 20% of TB. This thesis sought to determine the genomic diversity among the dominant MTBC lineages (L4, L5 and L6) in Ghana and the functional implications of the observed genomic variability.
Methodology: A total of 297 clinical MTBC strains (L4, L5 and L6) from Ghana were whole-genome sequenced to establish the population structure of Maf and to compare the genomes of the three lineages for differential fixation of mutations, selection pressures among conserved genes, and lineage-stratified genomic erosion, together with their potential phenotypic implications. In addition, this study used 1490 prospectively isolated and genotyped MTBC strains to determine the potential influence of the strain genetic background on the emergence and fixation of drug resistance-conferring mutations.
Results: L6 had accumulated significantly more fixed single nucleotide polymorphisms (SNPs) (median = 1037; p < 0.05) than L4 (median = 822) and L5 (median = 928), and also showed significantly higher pairwise nucleotide diversity (π).
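As an illustration of the diversity statistic reported here, the following is a minimal sketch of pairwise nucleotide diversity (π) computed as the mean proportion of differing sites over all pairs of aligned sequences. It is not the pipeline used in the thesis; the function names and the toy sequences are hypothetical, and sites that are not A, C, G or T in both sequences of a pair are simply skipped.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def pairwise_differences(seq_a, seq_b):
    """Return (differing sites, compared sites) for two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differing = 0
    compared = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differing += 1
    return differing, compared


def nucleotide_diversity(sequences):
    """Mean per-site difference over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    total = 0.0
    pairs = 0
    for seq_a, seq_b in combinations(sequences, 2):
        differing, compared = pairwise_differences(seq_a, seq_b)
        if compared:  # ignore pairs with no comparable sites
            total += differing / compared
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy alignment (hypothetical data, not from the study).
toy_alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_alignment)
```

In practice π would be computed per lineage from the whole-genome alignment, so that the values for L4, L5 and L6 can be compared directly.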
Clustering of the whole-genome phylogeny of Maf in Ghana resolved five L5 clusters and three L6 clusters. Average nucleotide identity (ANI) between any two lineages (L4/L5, L4/L6 and L5/L6) was always greater than 99%. The π of all functional gene categories was significantly higher in L6 than in L4 and L5, most prominently among T cell epitopes (0.000061, 0.000063 and 0.000149 for L4, L5 and L6, respectively). Apart from T cell epitopes and regulatory proteins, all gene families were under purifying selection, that is, the ratio of nonsynonymous to synonymous changes was less than one (dN/dS < 1.0), irrespective of lineage. T cell epitopes of L4 and L5 were under purifying selection (median dN/dS = 0.57 and 0.64 for L4 and L5, respectively), in contrast to positive selection in L6 (median dN/dS = 1.53). Most T cell epitopes were conserved, with only 10.4% (128/1226) mutated, and these mutations showed lineage-specific variation. Genomic erosion was predicted to be more pronounced in L6 (3992 genes present) than in L5 (4035 genes present), with predicted lineage-specific impaired expression of proteins such as ESAT-6, mpt64 and a number of mce3 proteins. This study found the Ghana genotype of L4 to be associated with drug-resistant TB (DR-TB) in Ghana (p < 0.05) compared to the other dominant genotypes. Isoniazid (INH)-resistant Mtbss and Maf strains were more likely to carry the katG S315T and inhA promoter -15C/T mutations, respectively. This study also identified potential genotype-specific epistatic interactions between DR-conferring and compensatory mutations.
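To make the dN/dS interpretation above concrete, the sketch below shows how a ratio could be derived from counts of nonsynonymous and synonymous differences and sites, using the Jukes-Cantor correction that is commonly applied in Nei-Gojobori style estimates. The abstract does not state which estimator the thesis used, so this is an assumption, and all function names and the example counts are hypothetical. Only the medians 0.57, 0.64 and 1.53 come from the text.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """dN/dS from difference counts and site counts (assumed inputs)."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


def selection_regime(ratio, tolerance=1e-9):
    """Label a dN/dS ratio using the conventional threshold of 1."""
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Median T cell epitope dN/dS values reported in the abstract.
reported_medians = {"L4": 0.57, "L5": 0.64, "L6": 1.53}
regimes = {lineage: selection_regime(value)
           for lineage, value in reported_medians.items()}
```

With the reported medians, this labels L4 and L5 as purifying and L6 as positive, matching the reading given in the text.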
Conclusion: Maf L6 and L5 are genomically two different biovars of the MTBC, as different from each other as they are from L4. The observed genomic differences among the lineages have potential implications for virulence and for the applicability of TB control tools.