Tarr et al. BMC Genomics (2018) 19:894 https://doi.org/10.1186/s12864-018-5257-x RESEARCH ARTICLE Open Access Schizont transcriptome variation among clinical isolates and laboratory-adapted clones of the malaria parasite Plasmodium falciparum Sarah J. Tarr1*, Ofelia Díaz-Ingelmo1, Lindsay B. Stewart1, Suzanne E. Hocking1, Lee Murray1, Craig W. Duffy1, Thomas D. Otto2,3, Lia Chappell3, Julian C. Rayner3, Gordon A. Awandare4 and David J. Conway1* Abstract Background: Malaria parasites are genetically polymorphic and phenotypically plastic. In studying transcriptome variation among parasites from different infections, it is challenging to overcome potentially confounding technical and biological variation between samples. We investigate variation in the major human parasite Plasmodium falciparum, generating RNA-seq data on multiple independent replicate sample preparations of merozoite-containing intra-erythrocytic schizonts from a panel of clinical isolates and from long-term laboratory-adapted clones, with a goal of robustly identifying differentially expressed genes. Results: Analysis of biological sample replicates shows that increased numbers improve the true discovery rate of differentially expressed genes, and that six independent replicates of each parasite line allowed identification of most differences that could be detected with larger numbers. For highly expressed genes, focusing on the top quartile at schizont stages, there was more power to detect differences. Comparing cultured clinical isolates and laboratory- adapted clones, genes more highly expressed in the laboratory-adapted clones include those encoding an AP2 transcription factor (PF3D7_0420300), a ubiquitin-binding protein and two putative methyl transferases. In contrast, higher expression in clinical isolates was seen for the merozoite surface protein gene dblmsp2, proposed to be a marker of schizonts forming merozoites committed to sexual differentiation. Variable expression was extremely strongly, but not exclusively, associated with genes known to be targeted by Heterochromatin Protein 1. Clinical isolates show variable expression of several known merozoite invasion ligands, as well as other genes for which new RT-qPCR assays validate the quantitation and allow characterisation in samples with more limited material. Expression levels of these genes vary among schizont preparations of different clinical isolates in the first ex vivo cycle in patient erythrocytes, but mean levels are similar to those in continuously cultured clinical isolates. Conclusions: Analysis of multiple biological sample replicates greatly improves identification of genes variably expressed between different cultured parasite lines. Clinical isolates recently established in culture show differences from long-term adapted clones in transcript levels of particular genes, and are suitable for analyses requiring biological replicates to understand parasite phenotypes and variable expression likely to be relevant in nature. Keywords: Eukaryotic microbial genomics, Biological replicates, Clinical samples, Culture adaptation, RNA-seq, Transcriptomic methods, Cell invasion, Immunity * Correspondence: sarah.tarr@lshtm.ac.uk; david.conway@lshtm.ac.uk 1Pathogen Molecular Biology Department, London School of Hygiene and Tropical Medicine, London, UK Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Tarr et al. BMC Genomics (2018) 19:894 Page 2 of 13 Introduction in analysis [28, 29], and some random variation may be Variation in gene expression is key to survival and partly overcome statistically if it is possible to analyse large reproduction of microbes, affecting diverse phenotypes numbers of infections [30, 31]. However, P. falciparum such as sexual differentiation [1], adaptation to different parasites at the intra-erythrocytic schizont stage, contain- growth conditions [2] and evasion of host immunity [3]. ing merozoites ready to be released to invade other eryth- Malaria pathogenesis occurs as parasites undergo cycles rocytes, are normally sequestered in organ capillaries and of invasion and asexual replication inside erythrocytes. not present to be analysed in blood samples. This stage of Towards the end of each cycle, schizonts develop which the parasite may be studied by ex vivo development in the contain multiple merozoites that need to burst from the first cycle of following isolation from patients, and analysis host cell and invade new erythrocytes. The parasite tran- of matured schizont stages has shown variable expression scriptional program of the major human malaria parasite of particular genes [7, 13, 14], although the precision of Plasmodium falciparum is highly synchronised through- such analyses are limited without biological replicate out the asexual replication cycle in erythrocytes [4], but measurements. some variation exists between parasite clones [5, 6]. Here we present gene expression profiles of schizont-stage Analysis of naturally occurring polymorphism in P. fal- malaria parasites from recently culture-established clinical ciparum has shown that selection maintains multiple al- isolates, and laboratory-adapted clones previously cultured leles of many merozoite-stage genes [7, 8], some of for many years. We conduct RNA-seq analysis with multiple which encode targets of naturally acquired immunity replicates of each parasite line, and show that high numbers [9–11]. Genes expressed at this stage can also exhibit ap- of biological replicates improves the true-positive discovery parently pronounced variation in transcription among rate for identifying differentially expressed genes. This iden- parasites [7, 12–14], but most analyses have been con- tifies schizont-stage genes that have differing transcript levels ducted on sample preparations that do not incorporate between the long-term laboratory-adapted clones and cul- many biological replicates, so that the extent of variation tured clinical isolates, as well as those variably expressed among different parasite lines is unclear. among the clinical isolates. The results confirm variable Accurate quantitation of differential gene expres- transcription in particular genes encoding ligands involved sion between organisms or groups of organisms in in erythrocyte invasion, and variation in expression of genes whole-transcriptome analyses requires biological rep- that have been less characterised was quantitatively validated licate sampling [15, 16]. The importance of minimis- by targeted RT-qPCR assays, allowing analysis to be ex- ing sampling error in order to detect real differential tended to additional samples from which material is more gene expression is widely recognised [17], and some limited. The expression levels of these genes in a panel of bioinformatic tools can guide the determination of clinical isolates cultivated to schizont stage in the first ex replicate numbers appropriate to experimental designs vivo cycle were similar on average to those in continuously [18, 19]. In transcriptomic studies of malaria parasites, cultured clinical isolates. This encourages use of recently cul- achieving large numbers of replicate sample preparations ture-established parasite lines for studies requiring biological is difficult, particularly from clinical isolates [5], and most replication. detailed understanding of transcription has been derived from a small number of parasite lines that have been grown in culture for many years. Although Results long-term culture-adapted parasites are phenotypically Biological replicate sampling improves detection of and transcriptionally diverse [6, 12, 20], it is not clear to differentially expressed genes what extent they reflect the diversity of parasites causing RNA-seq paired-end short-read sequences from all sam- clinical malaria. Examination of genome sequences of ples were mapped to the P. falciparum strain 3D7 refer- culture-adapted parasites has identified premature stop ence genome sequence, and regions of genes with high codon mutants affecting some transcription factor genes levels of allelic polymorphism were masked from ana- and cell signalling protein genes [21, 22], as well as par- lysis in order to minimize mapping bias among the dif- ticular gene deletions and amplifications not seen in para- ferent parasite lines (Additional file 1). Genes previously sites sampled directly from patients [23–25]. described as deleted in some parasites were not ex- Studies of parasite transcripts in clinical isolates face a cluded, as most individual deletions are very rare [32] challenge of interpreting the mixture of developmental and exclusion would result in unnecessary loss of data stages that may be present in a blood sample, with para- for genes unaffected in the isolates sampled. After ex- sites at different stages of the asexual cycle as well as para- cluding the subtelomeric large gene families var, rifin sites that have committed to sexual development [26–28]. and stevor for which short read sequence mapping is not Some of the unknown variation in parasite stage distribu- generally locus-specific, a total of 5134 gene loci were tion may be accounted for using deconvolution methods analysed. Tarr et al. BMC Genomics (2018) 19:894 Page 3 of 13 We assessed the impact of numbers of biological repli- and 111 genes, for two, four, six and eight replicates re- cates on detection of genes with differing transcript spectively. The median false-positive rates were very low levels between parasite lines, by first analysing RNA-seq in all cases (among the genes that did not show differ- data for the 3D7 and D10 laboratory clones, focusing on ences with 10 replicates, the proportions showing appar- schizonts purified by density gradient centrifugation ent differences with two, four, six and eight replicates from ten different culture preparations of each clone. were 0.001, 0.004, 0.005 and 0.004 respectively). Analysing data from all ten replicates identified 123 We also examined the sensitivity to detect differential genes with adjusted P values < 0.01 and log2 differences expression when focusing specifically on genes that are of > 2 in relative transcript levels between the two clones highly expressed in schizont-stage samples, as these in- (equivalent to at least 4-fold differences, Additional file 2). clude genes most likely to be functionally important for We assessed what proportions of these 123 genes were the merozoite invasive stage. The FPKM values for each captured as being differentially expressed in 100 random sample in each of the comparisons were averaged and samples of two, four, six and eight replicates within each the genes were ranked by their maximum expression group. The true-positive proportion of differentially level. The top quartile of most highly expressed genes expressed genes detected (using the same statistical were analysed for differences between clone 3D7 and criteria of absolute log2 fold difference > 2 and adjusted D10, in all ten replicates and subsamples of lower repli- P value < 0.01) increased with the number of replicates cate numbers, with 100 randomisations of each. In com- within each group, median true-positive proportions be- parison with all ten replicates, there were median ing 0.28, 0.52, 0.67 and 0.79, for two, four, six and eight true-positive rates of 0.45, 0.63, 0.78 and 0.86 for com- replicates respectively (Fig. 1). These true-positives parisons respectively containing two, four, six and eight accounted for the majority of genes indicated as differ- replicates, higher proportions than for the analysis of all entially expressed, out of median totals of 42, 81, 108 genes (Fig. 1). Based on these data, it is evident that in- creasing biological replicate numbers for parasite RNA-seq improves discovery, and under our experimen- tal conditions the use of six independent biological repli- cates detects the majority of genes shown to be differentially expressed using ten replicates. Comparison of gene expression in cultured clinical and laboratory-adapted isolates In addition to the ten biological replicate preparations from each of clones 3D7 and D10 analysed above, another nine biological replicate preparations of schizonts from clone 3D7 were analysed, as well as seven biological replicates for each of two other laboratory-adapted clones (HB3 and Dd2), and five or six replicates of each of six cultured clinical isolates from Ghana (isolates 217, 278, 280, 286, 293 and 296: Additional file 3). Despite being enriched for mature schizont stages, multiple biological replicates cannot each have exactly the same distribution of parasite developmental stages, so to assess stage variability the RNA-seq read depth for each gene (FPKMs) for each replicate sample were corre- lated with FPKM values in reference RNA-seq data for seven cultured time points over a ~ 48 h P. falciparum cycle [33]. This identified only a small number of replicate sam- ples that did not have a maximum Spearman’s correlation Fig. 1 Increasing numbers of sample replicates improves identification with parasites at schizont stages (either 40 or 48 h of schizont-stage genes varying in expression between different P. falciparum lines. Assessment of the proportion of genes captured as post-invasion), and these were excluded from further ana- being differentially expressed between two different parasite clones lyses (Additional file 3). Although there is variation evident (3D7 and D10), by taking 100 random samples of two, four, six and among the remaining replicates that were analysed, within eight replicates of each (out of ten initially analysed replicates that each isolate the FPKMs among replicates correlated highly, identified 123 genes with log2 differences of > 2 in relative transcript with average Spearman’s ρ > 0.95 for pairwise correlations levels between the two clones) across all genes (Fig. 2). Tarr et al. BMC Genomics (2018) 19:894 Page 4 of 13 effects of fine differences in schizont maturity among samples, expression levels of the ten most differentiated genes were compared between samples having overall highest correlations with either the 40-h or the 48-h time points of a reference transcriptome dataset [33]. This showed that the gene expression distributions were not due to differences in estimated maturity, and that differ- ences between clinical isolates and laboratory clones remained when only analysing replicates having peak cor- relation with the 40-h time point (Additional file 5). Eight of the ten highly expressed genes showing a differ- ence between the groups had higher transcript levels in the long-term laboratory-adapted clones than in the cul- tured clinical isolates (Table 1). These include an AP2 transcription factor gene PF3D7_0420300, as well as two predicted methyltransferase genes (PF3D7_0422900 and PF3D7_0522300), and gene PF3D7_0104300 encoding Fig. 2 Distribution of correlations of FPKM expression values among ubiquitin-binding protein 1 which is involved in protein biological sample replicate transcriptomes for each isolate or clone. turnover [36] (Fig. 3). The genes that had higher transcript The final numbers of replicates, and median Spearman’s correlation levels in clinical isolates were PF3D7_1036300 which en- coefficients among replicates were: 3D7 n = 19, ρ = 0.89; D10 n = 10, codes the merozoite surface protein MSPDBL2 (Fig. 3), ρ = 0.93; Dd2 n = 6, ρ = 0.95; HB3 n = 7, ρ = 0.89; 271 n = 3, ρ = 0.94; 278 n = 5, ρ = 0.91; 280 n = 6, ρ = 0.93; 286 n = 6, ρ = 0.97; 293 n = 6, and PF3D7_1302100 which encodes the gamete antigen ρ = 0.93; 296 n = 3, ρ = 0.90 27/25. MSPDBL2 has previously been shown to be only expressed in a minority of schizonts in culture, with clone HB3 having more positive schizonts than other In an overall comparison of the schizont preparations laboratory-adapted clones [7], consistent with the higher from four long-term laboratory-adapted clones and six transcript level in HB3 seen here and previously. Recent cultured clinical isolates, 132 genes (2.6% of those ana- experimental analysis has shown mspdbl2 transcript levels lysed) appeared to have different transcript levels be- to be markedly increased in parasite cultures when func- tween the two groups, each having log2 fold difference > tion of heterochromatin protein 1 (HP1) is suppressed by 2 and adjusted P value < 0.01 (Additional file 4). Among gametocyte development protein 1 (GDV1) leading to the genes within the top quartile of expression values gametocytogenesis [37, 38]. Although gamete antigen 27/ overall, 18 genes (1.4%) had such a difference between 25 had higher transcript levels in clinical isolates and was the groups, a lower proportion than among the rest of previously shown to be expressed in early gametocytes the genes (Odds Ratio 0.52, Fisher’s exact P = 0.006). [38, 39] another gene PF3D7_1327300 that is transcribed Examining the 18 genes with differing transcript levels in gametocytes [40] showed lower expression in the clin- among the top quartile of expressed genes, several were ical isolates (Table 1). seen to contain a deletion in one of the parasite lines or We investigated whether there was an overall association to be members of multigene families. For example, between the genes differentially expressed and targets of PF3D7_1371600 (ebl-1) and PF3D7_0935700 (near the HP1 regulation [37, 41]. Of the HP1-regulated genes previ- end of chromosome 9) are deleted in the HB3 strain and ously identified [37], 118 genes are among those analysed D10 strain respectively (Fig. 3), as previously docu- here (var, rifin and stevor gene families were excluded). Of mented [34, 35]. After excluding these genes and others the 132 genes differentially expressed between the two for which one or more of the samples were presumed to groups, 24 (18.2%) were targets of HP1 regulation, a higher contain genetic deletions, and excluding members of proportion than among the rest of the genes (Odds Ratio sub-telomeric multi-gene families, ten highly expressed 11.3, Fisher’s exact P = 1.3 × 10− 5). single-locus genes showed differences between labora- tory and clinical isolates (Table 1). Differential expression of schizont-stage genes among There was considerable variation in the measurements cultured clinical isolates for individual genes among the different biological repli- Overall, 214 genes (4.2%) showed a log2 fold difference > 2 cate preparations of each parasite clone or isolate, with for at least one of the 15 pairwise comparisons among cul- five to ten-fold ranges of levels being not unusual (Fig. 3), tured clinical isolates. Of those genes in the top quartile of illustrating the importance of sampling and analysing expression levels, 39 genes (3.0%) showed a difference for multiple biological replicates. To check for potential at least one pairwise comparison, a lower proportion than Tarr et al. BMC Genomics (2018) 19:894 Page 5 of 13 Fig. 3 Gene transcript levels among preparations of schizont-enriched P. falciparum cultures of laboratory-adapted clones and clinical isolates. Plots of normalised RNA-seq read counts for eight of the genes showing differences between four long-term laboratory-adapted clones (3D7, D10, Dd2 and HB3) and six recently culture-established Ghanaian clinical isolates (271, 278, 280, 286, 293 and 296). Each point shows the value for an independent biological replicate sample. For all genes, a wide variation in transcript levels among biological replicates may be noted. a The two panels show examples of genes for which one of the laboratory-adapted clones had a deletion, D10 which is missing the PF3D7_0935700 gene as part of a deleted region near the end of chromosome 9, and HB3 which is missing gene PF3D7_1371600 that encodes erythrocyte binding-like protein 1. These, and other genes with suspected or known deletions in the sampled parasites, are excluded from the list of differentially expressed genes in Table 1 but are listed Additional file 4: Table S2. b The six panels show examples of genes listed in Table 1 that are in the top quartile of expression and show a log2 fold difference > 2 between the groups. Most of these genes have lower transcript levels in the clinical isolates (including AP2 transcription factor gene PF3D7_0420300, as well as predicted methyltransferase genes PF3D7_0422900 and PF3D7_0522300, and ubiquitin-binding protein 1 gene PF3D7_0104300). In contrast, the mspdbl2 gene PF3D7_1036300 shown in the bottom right panel had higher transcript levels in clinical isolates Table 1 P. falciparum genes in the top quartile of expression showing a difference in schizont stage transcript levels between long- term laboratory-adapted clones and recent clinical isolates in culture Gene ID Product annotation Log2 fold difference Wald test Higher transcription in long-term laboratory-adapted clones PF3D7_0422900 methyltransferase, putative −2.50 −6.48 PF3D7_0501300 skeleton-binding protein 1 −2.50 −4.98 PF3D7_0410000 erythrocyte vesicle protein 1 −2.38 −10.62 PF3D7_0104300 ubiquitin carboxyl-terminal hydrolase 1, putative −2.18 −6.13 PF3D7_0522300 18S rRNA (guanine-N(7))-methyltransferase, putative −2.15 −8.67 PF3D7_0420300 AP2 domain transcription factor, putative −2.02 −7.19 PF3D7_1030200 claudin-like apicomplexan microneme protein, putative −2.00 −9.21 PF3D7_1327300 conserved Plasmodium protein, unknown function −2.00 −4.97 Higher transcription in clinical isolates PF3D7_1036300 duffy binding-like merozoite surface protein 2 3.02 5.32 PF3D7_1302100 gamete antigen 27/25 2.02 3.99 Tarr et al. BMC Genomics (2018) 19:894 Page 6 of 13 among the less expressed genes (Odds Ratio 0.66, Fisher’s antigen-3 [43]). These gene transcripts were quantified Exact P = 0.02) (Additional file 6). Some of the variably by RT-qPCR in 49 RNA preparations from the laboratory expressed genes encode merozoite proteins previously clones and clinical isolates that had been analysed by characterised as transcriptionally variable among ex vivo RNA-seq. Transcript levels were normalised to those of a clinical isolates [14], including dblmsp2 and msp6, as well housekeeping gene (serine-tRNA ligase PF3D7_0717700) as erythrocyte binding antigen genes eba-175, eba-181, and correlated with similarly normalised FPKM expres- and eba-140 (which is also deleted in one of the laboratory sion values for the RNA-seq data. Each gene showed adapted clones) (Additional file 7). Notably, among the 214 strong positive and highly significant correlation between genes variably expressed among cultured clinical isolates, 38 RNA-seq and RT-qPCR-derived expression measures, all (17.8%) have been shown to be targets of HP1 regulation, except one having correlation coefficients > 0.8 (Fig. 4). compared with only 1.6% of the non-variably expressed genes (Odds Ratio 12.9, Fisher’s Exact P < 2.2 × 10− 16). This Variable expression among isolates with schizonts association was even more extreme among genes in the top sampled in the first ex vivo cycle quartile of overall expression values, 16 (41%) of the 39 vari- We next determined the expression levels in additional ably expressed genes being targets of HP1 regulation com- clinical samples from Ghana matured during the first pared with only 0.6% of the non-variably expressed genes cycle ex vivo until a large proportion of parasites were (Odds Ratio 118.6, Fisher’s exact P < 2.2 × 10− 16). schizonts. For nine of these isolates there was sufficient To validate the data obtained through RNA-seq, a RNA yield for RNA-seq to be performed, out of which subset of genes was selected for quantitation by seven gave adequate data for transcriptome ana- reverse-transcription quantitative PCR (RT-qPCR). We lysis. With remaining material from two of these as well chose differentially expressed genes encoding proteins as another six isolates there was sufficient to analyse the containing predicted transmembrane domains or signal selected eight gene panel by RT-qPCR. Characterisation peptides, thereby likely to enter the parasite secretory of the ex vivo samples showed considerable variation in pathway, excluding genes previously characterised as relative expression levels of each gene, but the mean variably expressed or members of multi-gene families. levels across the isolates were similar to those deter- Eight genes were selected for assay (Additional file 8), mined in the continuously-cultured parasites that had two of which encode known antigens (merozoite-asso- extensive sample replication (Fig. 5). Without sample ciated tryptophan-rich antigen [42] and liver-stage replication of the first cycle ex vivo clinical samples, Fig. 4 High correlations between RNA-seq and RT-qPCR expression measures. Eight genes identified as differentially expressed among clinical isolates by RNA-seq were validated by RT-qPCR for 49 of the independent RNA preparations from the four laboratory-adapted clones and six cultured clinical isolates under study. Scatter plots for each gene show the RT-qPCR gene of interest (GOI) copies (log2-transformed) normalised to house-keeping gene copies (HKG; PF3D7_0717700), correlated with the RNA-seq GOI FPKM values (log2-transformed) normalised to HKG FPKM values. Spearman’s correlation coefficients are shown for each plot Tarr et al. BMC Genomics (2018) 19:894 Page 7 of 13 routine technical variation and random sampling effects the data to calculate the true- and false-discovery rates ob- are likely to exaggerate the apparent variation in expres- tained using fewer replicates. Consistent with studies on sion levels. However, the observation that for each gene other cultured eukaryotes [17], we show that for malaria the normalised expression values have similar means to parasites increasing the sample replication improves the those for cultured clinical lines indicates there are no true-positive discovery rate in identifying variably systematic differences in expression between the first ex expressed genes, and under the experimental conditions vivo cycle and under continuous culture conditions. here six independent replicates were useful to balance ac- These results support the use of cultured clinical isolates curacy and experimental feasibility. Where these numbers as a means of studying parasite gene expression reflect- of replicates are not obtainable, it is recommended to ing original populations, while enabling the necessary focus on genes that are highly expressed in order to experimental replication for many analyses. achieve as much accuracy as possible per gene. Con- versely, where a very high sensitivity is needed it may be Discussion worth studying even more biological replicates for each Transcriptomic analyses perform optimally with increased parasite line and condition being investigated. levels of sample replication to minimise the impact of sto- It has been proposed that spontaneous transcrip- chastic or technical variation in individual samples [17, tional variation within parasite populations is a strat- 44]. This is particularly important for studies focusing on egy that ensures fitness of parasites facing a range of a defined developmental stage, as errors may be caused by potential and changing selective pressures [6], and it is subtle differences between samples. However, the nature likely that the rates and mechanisms of regulation of clinical samples of malaria parasites is such that replica- differ among most genes. In comparison between tion is technically very difficult. In order to assess the de- long-term laboratory-adapted clones and cultured clin- gree of replication useful to identify differentially ical isolates, we found that among the genes most dif- expressed genes, we first generated RNA-seq profiles from ferentially expressed were an AP2-transcription factor multiple replicates of laboratory-adapted clones, and used [45] and two methyltransferases, the functional roles of Fig. 5 Transcript levels of variably expressed genes for first cycle ex vivo clinical samples compared with continuously cultured isolates. For each gene, the log2 normalised expression value is plotted as a ratio of the value for a housekeeping gene (PF3D7_0717700). For the cultured clinical isolates (red, N = 6), each point represents the mean expression value for each gene calculated across multiple replicates of each isolate. For the ex vivo clinical isolates, each point represents values for single samples without replicates. The panel of ex vivo A samples (green N = 8) were analysed by RT-qPCR, while the panel of ex vivo B samples (black, N = 7) were analysed by RNA-seq Tarr et al. BMC Genomics (2018) 19:894 Page 8 of 13 which remain to be determined although genome-wide and other protein modifications [49, 50] may also identify mutagenesis studies indicate they are important for P. fal- functional determinants, and large-scale analysis of parasite ciparum asexual erythrocytic stage replication [46]. It is protein variation has not yet been attempted on clinical iso- possible that use of different oxygen concentrations in the lates. A more complete description of transcriptome vari- culture of long-term adapted clones and recent clinical ation may also be derived in future by analysis of individual isolates here might affect transcription of some genes, par- infected cells, which has not been undertaken on parasites ticularly in earlier trophic stages of the intra-erythrocytic from clinical isolates, as initial single-cell sequencing studies cycle [47], and although we did not consider oxygen me- of laboratory model malaria parasites have only recently tabolism in analysis of schizont stages it could be exam- begun to probe the expression pathways of different ined in future. Most of the differentially expressed genes life-cycle stages [51–53]. Scaling up transcriptomic analyses noted here had lower transcription in the clinical isolates, to clinical studies is challenging, and biological replicate and we note that HP1 is responsible for extensive gene si- preparations of parasite samples may only be feasible in a lencing in malaria parasites [38], and there was an over limited number of cases. In an epidemiological context, re- representation of HP1-regulated genes among those dif- duction of error also requires large sample sizes and tight ferentially expressed between laboratory-adapted clones definition of clinical phenotypes, and more statistical ana- and cultured clinical isolates. It would be interesting to lysis and modelling of covariates may be appropriate, ideally undertake studies directly on HP1 and other chromatin including data on host transcriptome variation within the modifications in recent clinical isolates, as experimental same samples [54]. studies on chromatin have so far focused on long-term laboratory-adapted clones. Interestingly, among the top Conclusions quartile of expressed genes, the one most markedly highly The transcriptomes of Plasmodium falciparum erythrocytic expressed in clinical isolates was mspdbl2 that encodes a schizont preparations of long-term laboratory-adapted merozoite surface protein expressed within only a minor- clones and recently established clinical isolates were ana- ity of schizonts [7] which may be a marker of parasites lysed, using large numbers of biological replicate samples committed to sexual development [37]. Further studies of to minimise the impact of inter-replicate variation on ob- clinical isolates are needed to understand the process of served patterns of differential expression. Testing subsam- parasite commitment to sexual differentiation and trans- ples of up to ten replicates of each cultured line shows that mission, as indicated by other transcriptome comparisons six replicate preparations gives power to detect most differ- [5, 30]. ences. Particular genes show differential expression in We identified many genes differentially expressed among laboratory-adapted clones compared to cultured clinical the clinical isolates. Targeted RT-qPCR assays of isolates, or among different clinical isolates, including those schizont-stage ex vivo cultured clinical isolates have previ- predicted to affect differentiation and encoding targets of ously shown some of the genes to be differentially immunity. In schizonts grown in the first cycle ex vivo, ex- expressed [7, 13, 14]. Focusing on genes for which variable pression levels of a panel of these genes vary among iso- expression has not been previously studied, we selected a lates, but mean levels are similar to those in continuously panel of eight highly differentially expressed genes that en- cultured clinical isolates suitable for experimental analyses code secreted proteins. Of these, the merozoite-associated requiring biological replicates to understand relevant para- tryptophan-rich antigen and liver stage antigen 3 have site phenotypes. been previously identified as merozoite proteins [42, 43], and another gene (PF3D7_0423900) is adjacent to loci en- Methods coding the cysteine-rich protective antigen (CyRPA) and P. falciparum isolates sampled from clinical malaria cases reticulocyte binding homologue 5 (Rh5) which are targets Blood samples were collected from clinical malaria cases of antibodies that inhibit merozoite invasion. We exten- attending Ghana government health facilities between sively validated the RNA-seq data by RT-qPCR for these 2012 and 2013, in Kintampo (Brong-Ahafo Region of genes, and extended quantitation to additional ex vivo central Ghana), and Navrongo (Kassena-Nankana East clinical isolate samples. Despite the lack of sample repli- Municipality, in the Upper East Region of northern cates for the latter, results showed that average expression Ghana). Patients were eligible to participate in the study if levels for each gene were similar to those for the cultured they had uncomplicated clinical malaria, were aged 2–14 samples that had biological replicates. years, tested positive for P. falciparum malaria by lateral Together, these data highlight new candidates for investi- flow rapid diagnostic test and slide microscopy, and had gation as potential determinants of alternative developmen- not taken antimalarial drugs during the 72 h preceding tal pathways or targets of immunity. Characterisation of sample collection. Venous blood samples (up to 5ml) were gene transcript levels is only one level of analysis, as prote- collected into heparinized Vacutainer tubes (BD Biosci- omic characterisation [48] and analysis of phosphorylation ences). Blood samples were centrifuged, plasma separated Tarr et al. BMC Genomics (2018) 19:894 Page 9 of 13 and leukocyte buffy coat cell layer removed, and erythro- For discontinuous density centrifugation purification, cytes were cryopreserved in glycerolyte and stored frozen at parasites were maintained as 25 ml cultures at 2.5% − 80 °C or in liquid nitrogen until shipment on dry ice to hematocrit. Cultures were used when > 1% erythrocytes the London School of Hygiene and Tropical Medicine for contained parasites with multiple nuclei. Schizonts were subsequent culture. purified using 70% Percoll (GE Healthcare)/2.93% sorb- itol/PBS overlayed with 35% Percoll/1.47% sorbitol/PBS, which was overlayed with parasitized erythrocytes, Parasite culture and enrichment for schizont stages allowing the schizonts to be separated by centrifugation Parasites from ex vivo clinical samples were matured at 2500 g for 10 min at 24 °C, with a light brake at the within the original patient erythrocytes at 2–5% haema- end of centrifugation. Purified schizonts were washed toctit for up to 48 h in RPMI 1640 medium containing once in complete medium and the pellet volume was es- 2% human AB serum (GE Healthcare) and 0.3% Albu- timated, following which the pellet was resuspended max II (Thermo Fisher Scientific) under an atmosphere with six times the pellet volume of 50% haematocrit of 5% O2, 5% CO2, and 90% N2 at 37 °C, until most para- erythrocytes, a slide was prepared for microscopic exam- sites were at the schizont stage of parasite development. ination, and the cells were then incubated in 6 ml of Laboratory-adapted clones and continuously cultured complete culture medium. Of this, 1 ml was used as a clinical isolates were grown in erythrocytes at 2–5% haem- control untreated sample to track parasite egress, and atocrit in RPMI 1640 medium containing 0.5% Albumax 10 μM final concentration of E64 was added to the II, at 37 °C. The biological replicate schizont preparations remaining 5 ml, and both were cultured at 37 °C in a 5% of clinical isolates for RNA-seq analysis were made by CO2 static incubator for 5.5 h. Schizonts from the sampling at multiple time points between 182 and 250 E64-treated culture were overlaid on 70% Percoll/2.93% days after first initiation into continuous culture. sorbitol/PBS and separated by centrifugation at 2500 g Laboratory-adapted clones which have been in culture for for 10 min at 24 °C, with a light brake at the end of cen- many years (3D7, HB3, Dd2, and D10) were maintained trifugation. The schizont layer was washed once in under atmospheric air with 5% CO2, and cultured clinical complete culture medium and final cell pellets of ap- isolates were maintained in 5% O2, 5% CO2, 90% N2. proximately 10–20 μl were used for RNA extraction. Schizonts were purified using magnetic-activated cell sort- ing (MACS) for all of the lines, except for the D10 and RNA extraction HB3 clones and ten of the 3D7 replicates which were puri- Erythrocytes containing the prepared schizonts were re- fied using Percoll density gradient centrifugation. suspended in 500 μl TRIzol® reagent (Thermo Fisher Sci- For MACS purification of cultured clinical isolates, entific) and stored at − 80 °C until RNA extraction using parasites were prepared from 100ml cultures with at RNeasy mini columns (Qiagen). RNA pellets were dis- least 0.7% of erythrocytes containing schizonts. Parasites solved in 100 μl RNase-free H2O, and a second RNA were isolated by magnetic purification using magnetic clean-up and on-column DNase treatment was carried LD Separation columns (Miltenyi Biotech). One column out with RNA eluted in 30–50 μl RNase-free H2O, and was used per 25ml culture, and first washed twice in 3ml concentration quantified by Qubit High Sensitivity RNA of culture medium at room temperature. Parasite culture Assay (Thermo Fisher Scientific). Samples containing at erythrocytes were pelleted by centrifugation at 500 g for 5 least 500 ng RNA were considered for RNA-seq process- min, then re-suspended in 3ml culture medium per 1ml ing after the RNA integrity was checked on an Agilent of packed cell volume. The re-suspended material was Bioanalyzer using RNA 6000 Nano reagents and chips bound to the MACS column, which was then washed (Agilent Genomics). three times with 3ml culture medium, and schizonts were eluted twice by removing the magnet from the column RNA-seq library preparation and sequencing and forcing 2ml culture medium through the column into RNA-seq libraries of each of the replicate preparations a 15ml collection tube. Finally, the schizonts were pelleted of parasites were prepared with TruSeq Stranded mRNA by centrifugation at 500 g for 5min and the pellet volume Library Prep Kits (Illumina) using 500 ng – 1 μg RNA was estimated, with 0.5 μl used for Giemsa-stained micro- following the Illumina TruSeq Stranded mRNA protocol. scopical examination to assess staging, 1 μl added back to Libraries were validated on an Agilent Bioanalyzer using 250 μl culture at 0.8% hematocrit to follow the progres- DNA 1000 reagents and chips (Agilent Genomics) to sion. Remaining parasites were re-suspended in 1.5ml of quantify library sizes and confirm the absence of primer culture medium with 10 μM E64 in a 12 well plate, and dimers. Libraries were quantified using a KAPA parasites were incubated for 5.5 h in 5% CO2 at 37 °C, be- Universal Library Quantification kit (Roche Diagnostics fore centrifugation in a 1.5ml tube and processing for Limited) on a 7500 Fast Real-Time PCR System RNA extraction. (Thermo Fisher Scientific) and library concentrations Tarr et al. BMC Genomics (2018) 19:894 Page 10 of 13 were adjusted for library size. 12–15 pM pooled libraries above enabled efficient preparation of mature schizonts, were sequenced on a MiSeq System (Illumina) using a so this was used for preparations of schizonts through- MiSeq Reagent Kit v3 (Illumina) with 2 × 75 cycles. out the study. RNA-seq libraries of ex vivo schizont-enriched P. fal- ciparum isolates (not cultured beyond the first cycle) RT-qPCR assays were prepared using a modified protocol. PolyA+ RNA 150–500 ng total RNA from each preparation of parasite (mRNA) was selected using magnetic oligo-d(T) beads, schizonts was reverse transcribed using Superscript II® and mRNA was reverse transcribed using Superscript (Thermo Fisher Scientific) with 250 ng random hexamer III® (Thermo Fisher Scientific), primed using oligo-d(T) oligonucleotide primers per 20 μl reaction. Quantitative PCR primers, with dUTP included during second-strand (qPCR) was carried out using SYBR® Select Master Mix cDNA synthesis. The resulting double stranded cDNA (Thermo Fisher Scientific) with 500 nM forward and reverse was fragmented using a Covaris AFA sonicator. Sheared primers, in a Prism 7500 Fast qPCR machine. For each gene, double stranded cDNA was dA-tailed, end repaired, and threshold-cycle values were quantified against a serially di- “PCR-free” barcoded sequencing adaptors (Bioo Scien- luted genomic DNA (Dd2 strain) standard curve, run on the tific) [55] were ligated. Libraries were cleaned twice, same plate. Cycling parameters were: 50 °C for 2min, 95 °C using solid phase reversible immobilisation beads, and for 2min followed by 40 cycles of 95 °C for 15 s and 60 °C eluted in EB buffer (Qiagen). Second strand cDNA was for 1min. All wells were run as 10 μl volumes in technical removed using uracil-specific excision reagent enzyme duplicate. The qPCR copy numbers were normalised against mix (NEB) prior to library generation, and libraries were copies of a house-keeping gene, PF3D7_0717700 [60]. PCR assayed by quantitative PCR prior to sequencing on primer pair sequences are as follows: PF3D7_0102700 a HiSeq system (Illumina). 5’-CAACCAGACAAACAACAAGAAA-3′ and 5’-TCCAT TCTGATGCTTTCCAC-3′, PF3D7_0220000 5’-GTAAAT Data analysis GGTGAATTAGCTAGTGAAG-3′ and 5’-CCTTTATATC Raw Illumina sequence reads were aligned to the P. fal- TTCGGCTTCTTCT-3′, PF3D7_0423900 5’-GAGAAGAA ciparum 3D7 v3 genome using HISAT2 [56] and con- GCCAAAGTAAATGAAC-3′ and 5’-GAATGTGTCAAG verted to ‘BAM’ format using SAMtools [57]. Reads with TGCATCATAA-3′, PF3D7_0605900 5’-CGCAATAACAA- MAPQ scores < 60 were removed. Reads were counted GAAGTCACAAA-3′ and 5′-AAGATTGTAGGAGGGTG using the “summarizeOverlaps” feature of the Genomi- TTAAT-3′, PF3D7_0935300 5’-GGGCTGCTGTTATACC cAlignments package [58] in R, against a previously pub- TTG-3′ and 5’-AGAATGAGGAGGTTCAGTTTG-3′, PF3 lished P. falciparum genome annotation file that had D7_1252900 5’-CCTTAGTAGAACTTCAAAGTGACA-3′ been masked for regions of polymorphism (outlined in and 5’-TGTAACCAACTACGAAATTCCC-3′, PF3D7_14 Additional file 1, which explains the coded annotation in 61700 5’-TGCTGACTTTCAAGAGATAAGG-3′ and 5’-T Additional files 9 and 10). Fragments Per Kilobase of TTCATTTGTTCATTTTGTACAACC-3′, PF3D7_147650 transcript per Million mapped reads (FPKMs) for the data 0 5’-CTTCGATTCACAAATGCAGAAA-3′ and 5’-CGTA here and those of previously published data on a develop- TTCCACACCTTGTTCTAT-3′, PF3D7_0717700 5’-AAG- mental time course of the clone 3D7 [33] were calculated TAGCAGGTCATCGTGGTT-3′ and 5’-GTTCGGCACAT using the ‘fpkm’ function of DESeq2 [59] in R. FPKMs for TCTTCCATAA-3′. all genes in each of our parasite preparations were corre- lated using a Spearman’s Rank correlation with FPKMs for Additional files each of the seven time points (0, 8, 16, 24, 32, 40 and 48 h post-invasion; Additional file 2: Table S1). Replicates with Additional file 1: Masking of P. falciparum genome annotation file to peak Spearman correlation values of ρ > 0.7 at the latest exclude polymorphic regions of genes. (PDF 546 kb) time points (40 or 48 h) were included for further analysis. Additional file 2: Table S1. List of genes with different detected transcript Differential expression analysis was conducted using levels between 3D7 and D10 strains by absolute log2 fold > 2 (at least 4 folddifference). Note that this list contains some genes that are deleted in either DESeq2 in R. line. There are also two highly polymorphic genes that were not masked in To test that the protocol for E64 treatment to prevent the initial annotation file, asterisked below (*) and excluded from subsequent schizont egress from erythrocytes had no major effect analyses. (XLSX 18 kb) on the transcriptomes, a comparison of four paired rep- Additional file 3: Figure S1. Multiple biological replicate preparations ofP. falciparum schizont stage transcriptomes in correlation with reference licates of schizont preparations of cultured P. falciparum stage-specific transcriptome data. Multiple parasite preparations were made clone 3D7 was performed, which showed that only a sin- from four long-term laboratory adapted clones (between seven and ten gle gene had log fold difference > 2 (Additional file 11), replicates of each) and six Ghanaian clinical isolates (five or six replicates of2 each) for RNA-seq analysis. Each parasite culture was enriched for schizont- a difference considered as potentially random and stage parasites, with egress blocked using E64 treatment. Plots show the marginal compared to differences anticipated between correlations of FPKM values across all genes in comparisons with previous different parasite lines. The E64 treatment as described data from seven time-points across the P. falciparum asexual erythrocytic Tarr et al. BMC Genomics (2018) 19:894 Page 11 of 13 cycle [33], with peak correlations indicating the predominant parasite stage in Authors’ contributions each replicate. Red lines plot data for samples with peak correlation at either SJT contributed to technical design, performed parasite culture, RNA-seq data 40 or 48 h post-invasion, and grey lines plot replicate samples that did not analysis and drafted the manuscript. OD-I contributed to technical design, maximally correlate with either of these time points, which were therefore performed parasite culture and RNA-seq data generation. LBS contributed to excluded from further analysis. (TIFF 92 kb) technical design, performed parasite culture and RNA-seq data generation. SEH, LM, and CWD assisted with RNA-seq bioinformatics data analysis. TDO advised Additional file 4: Table S2. Genes showing significant differences in and gave input on RNA-seq bioinformatics data analysis and interpretation. LC transcript levels (log2 fold > 2, at least 4 fold difference on average) in performed RNA-seq data generation. JCR contributed to technical design, RNA- comparison of six cultured clinical isolates and four long-term laboratory seq data generation and manuscript revision. GAA contributed to technical adapted clones, including all biological replicate samples. For genes among the design, managed sample collection and contributed to manuscript revision. top quartile of expression values genome-wide (top 18 genes in the table), DJC designed and supervised the study, contributed to technical design and members of multigene families and genes in which strain-specific deletions RNA-seq data analysis and interpretation, and managed manuscript drafting may be responsible for differences are annotated with asterisks (*). (XLSX 20 kb) and revision. All authors read and approved the final manuscript. Additional file 5: Figure S2. Normalised read counts for the ten most highly differentiated genes between cultured clinical isolates and laboratory- Ethics approval adapted clones. Individual sample replicates are plotted according to the time This study of clinical parasite transcriptomes was approved by the Ethics of overall peak transcriptome correlation with reference time course data committees of the Ghana Health Service, the Noguchi Memorial Institute for (either 40 or 48 h). Replicates from clinical isolates are in red (those having Medical Research, University of Ghana, the Kintampo Health Research Centre, peak correlation with 40 h are plotted), and replicates from laboratory isolates the Navrongo Health Research Centre and the London School of Hygiene are in blue. (PDF 312 kb) and Tropical Medicine. Written informed consent was obtained from parents Additional file 6: Table S3. Log2 fold differences in transcript levels of or legal guardians of participating children with malaria, and additional genes differentially expressed among pairwise comparisons of six assent was received from the children. cultured clinical isolates with multiple schizont preparations of each, among genes within the top quartile of expression overall. (XLSX 24 kb) Consent for publication Additional file 7: Figure S3. Differential expression of merozoite All authors declare consent for publication. invasion-related genes among schizonts from different parasite cultures. Distributions of read counts (normalised to library size) for eight genes, Competing interests for replicated laboratory-adapted and clinical isolate samples, showing The authors declare that they have no competing interests. data from each replicate culture preparation of each strain. (PDF 324 kb) Additional file 8: Figure S3. Gene expression levels for eight genes newly detected as differentially expressed among clinical isolates (Table Publisher’s Note 1). Distributions of read counts (normalised to library size) for eight Springer Nature remains neutral with regard to jurisdictional claims in genes, showing data from each replicate culture preparation of each published maps and institutional affiliations. strain. (PDF 84 kb) Additional file 9: Sequence annotation file Author details 1 Pf3D7.May2015.NoSplice.LSHTM.gtf (see Additional file 1 for explanatory Pathogen Molecular Biology Department, London School of Hygiene and 2 details). (GTF 2322 kb) Tropical Medicine, London, UK. Institute of Infection, Immunity and Additional file 10: Inflammation, University of Glasgow, Scotland, UK. 3Wellcome Sanger Sequence annotation file GTF_VarRifStev_filtered out.gtf (see Additional file 1 for explanatory details). (GTF 2102 kb) Institute, Hinxton, Cambridge, UK. 4West African Centre for Cell Biology of Infectious Pathogens, Department of Biochemistry, Cell and Molecular Additional file 11: RNA-seq comparison of gene expression in paired E64- Biology, University of Ghana, Legon, Ghana. treated and untreated P. falciparum 3D7 schizont preparations. (PDF 365 kb) Received: 21 August 2018 Accepted: 16 November 2018 Acknowledgements We are grateful to malaria patients who contributed blood samples for References analysis of parasite transcription, and to colleagues who supported the 1. Mata J, Wilbrey A, Bahler J. Transcriptional regulatory network for sexual clinical sample collection and initial laboratory processing. We are also differentiation in fission yeast. Genome Biol. 2007;8:R217. grateful to Mandy Sanders for support in sequencing some of the samples, 2. Cases I, de Lorenzo V, Ouzounis CA. Transcription regulation and and to Alistair Miles and Antoine Claessens for help towards identification of environmental adaptation in bacteria. Trends Microbiol. 2003;11:248–53. genomic coordinates of highly polymorphic sequences within genes. 3. Cestari I, Stuart K. Transcriptional regulation of telomeric expression sites and antigenic variation in trypanosomes. Curr Genomics. 2018;19:119–32. Funding 4. Bozdech Z, Llinás M, Pulliam BL, Wong ED, Zhu J, de Risi JL. The Funding for the study was granted by the European Research Council (AdG-2011- transcriptome of the intraerythrocytic developmental cycle of P falciparum. 294428, https://erc.europa.eu/) to DJC, the Leverhulme Trust-Royal Society PLoS Biol. 2003;1:e5. (AA110050, https://royalsociety.org/) to GAA and DJC, the Biotechnology and 5. Mackinnon MJ, Li J, Mok S, Kortok MM, Marsh K, Preiser PR, Bozdech Z. Biological Sciences Research Council (BBSRC, London Interdisciplinary Doctoral Comparative transcriptional and genomic analysis of Plasmodium falciparum Programme https://bbsrc.ukri.org/) to LM, SEH and DJC, and the Wellcome Trust field isolates. PLoS Pathog. 2009;5:e1000644. (098051, https://wellcome.ac.uk) to JCR. The funders had no role in study design, 6. Rovira-Graells N, Gupta AP, Planet E, Crowley VM, Mok S, Ribas de Pouplana data collection and analysis, decision to publish, or preparation of the manuscript. L, Preiser PR, Bozdech Z, Cortes A. Transcriptional variation in the malaria parasite Plasmodium falciparum. Genome Res. 2012;22:925–38. 7. Amambua-Ngwa A, Tetteh KK, Manske M, Gomez-Escobar N, Stewart LB, Availability of data and materials Deerhake ME, Cheeseman IH, Newbold CI, Holder AA, Knuepfer E, et al. The datasets supporting the conclusions of this article, including the Population genomic scan for candidate signatures of balancing read count matrices, are available in the Gene Expression Omnibus selection to guide antigen characterization in malaria parasites. PLoS (https://www.ncbi.nlm.nih.gov/geo/), entry: GSE113718 for the P. Genet. 2012;8:e1002992. falciparum RNA-seq data from the laboratory-adapted and cultured 8. Ochola LI, Tetteh KK, Stewart LB, Riitho V, Marsh K, Conway DJ. Allele clinical isolates, and in the European Nucleotide Archive (https://www.e- frequency-based and polymorphism-versus-divergence indices of balancing bi.ac.uk/ena), study: ERP103955 for the P. falciparum RNA-seq data from selection in a new filtered set of polymorphic genes in Plasmodium the first cycle ex vivo isolates. falciparum. Molr Biol Evol. 2010;27:2344–51. Tarr et al. BMC Genomics (2018) 19:894 Page 12 of 13 9. Osier FH, Mackinnon MJ, Crosnier C, Fegan G, Kamuyu G, Wanaguru M, Ogada composition from gene expression in human malaria. PLoS Comp Biol. E, McDade B, Rayner JC, Wright GJ, et al. New antigens for a multicomponent 2013;9:e1003392. blood-stage malaria vaccine. Sci Trans Med. 2014;6:247ra102. 30. Rono MK, Nyonda MA, Simam JJ, Ngoi JM, Mok S, Kortok MM, Abdullah AS, 10. Richards JS, Arumugam TU, Reiling L, Healer J, Hodder AN, Fowkes FJ, Cross Elfaki MM, Waitumbi JN, El-Hassan IM, et al. Adaptation of Plasmodium N, Langer C, Takeo S, Uboldi AD, et al. Identification and prioritization of falciparum to its transmission environment. Nature Ecol Evol. 2017;1. merozoite antigens as targets of protective human immunity to 31. Mok S, Ashley EA, Ferreira PE, Zhu L, Lin Z, Yeo T, Chotivanich K, Imwong M, Plasmodium falciparum malaria for vaccine and biomarker development. J Pukrittayakamee S, Dhorda M, et al. Drug resistance. Population Immunol. 2013;191:795–809. transcriptomics of human malaria parasites reveals the mechanism of 11. Tetteh KK, Osier FH, Salanti A, Kamuyu G, Drought L, Failly M, Martin C, artemisinin resistance. Science. 2015;347:431–5. Marsh K, Conway DJ. Analysis of antibodies to newly described Plasmodium 32. Cheeseman IH, Miller B, Tan JC, Tan A, Nair S, Nkhoma SC, De Donato M, falciparum merozoite antigens supports MSPDBL2 as a predicted target of Rodulfo H, Dondorp A, Branch OH, et al. Population structure shapes copy naturally acquired immunity. Infect Immun. 2013;81:3835–42. number variation in malaria parasites. Mol Biol Evol. 2016;33:603–20. 12. Stubbs J, Simpson KM, Triglia T, Plouffe D, Tonkin CJ, Duraisingh MT, Maier AG, 33. Otto TD, Wilinski D, Assefa S, Keane TM, Sarry LR, Bohme U, Lemieux J, Winzeler EA, Cowman AF. Molecular mechanism for switching of P. falciparum Barrell B, Pain A, Berriman M, et al. New insights into the blood-stage invasion pathways into human erythrocytes. Science. 2005;309:1384–7. transcriptome of Plasmodium falciparum using RNA-Seq. Mol Microbiol. 13. Gomez-Escobar N, Amambua-Ngwa A, Walther M, Okebe J, Ebonyi A, Conway 2010;76:12–24. DJ. Erythrocyte invasion and merozoite ligand gene expression in severe and 34. Peterson DS, Wellems TE. EBL-1, a putative erythrocyte binding protein of mild Plasmodium falciparum malaria. J Infect Dis. 2010;201:444–52. Plasmodium falciparum, maps within a favored linkage group in two 14. Bowyer PW, Stewart LB, Aspeling-Jones H, Mensah-Brown HE, Ahouidi AD, genetic crosses. Mol Biochem Parasitol. 2000;105:105–13. Amambua-Ngwa A, Awandare GA, Conway DJ. Variation in Plasmodium 35. Nacer A, Roux E, Pomel S, Scheidig-Benatar C, Sakamoto H, Lafont F, Scherf falciparum erythrocyte invasion phenotypes and merozoite ligand gene A, Mattei D. Clag9 is not essential for PfEMP1 surface expression in non- expression across different populations in areas of malaria endemicity. Infect cytoadherent Plasmodium falciparum parasites with a chromosome 9 Immun. 2015;83:2575–82. deletion. PLoS One. 2011;6:e29039. 15. Hansen KD, Wu Z, Irizarry RA, Leek JT. Sequencing technology does not 36. Henriques G, Hallett RL, Beshir KB, Gadalla NB, Johnson RE, Burrow R, van eliminate biological variability. Nat Biotechnol. 2011;29:572–3. Schalkwyk DA, Sawa P, Omar SA, Clark TG, et al. Directional selection at the 16. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a pfmdr1, pfcrt, pfubp1, and pfap2mu loci of Plasmodium falciparum in single cell. Science. 2002;297:1183–6. Kenyan children treated with ACT. J Infect Dis. 2014;210:2001–8. 17. Schurch NJ, Schofield P, Gierlinski M, Cole C, Sherstnev A, Singh V, Wrobel 37. Filarsky M, Fraschka SA, Niederwieser I, Brancucci NMB, Carrington E, Carrio N, Gharbi K, Simpson GG, Owen-Hughes T, et al. How many biological E, Moes S, Jenoe P, Bartfai R, Voss TS. GDV1 induces sexual commitment of replicates are needed in an RNA-seq experiment and which differential malaria parasites by antagonizing HP1-dependent gene silencing. Science. expression tool should you use? RNA. 2016;22:839–51. 2018;359:1259–63. 18. Seo J, Gordish-Dressman H, Hoffman EP. An interactive power analysis 38. Brancucci NMB, Bertschi NL, Zhu L, Niederwieser I, Chin WH, Wampfler R, tool for microarray hypothesis testing and generation. Bioinformatics. Freymond C, Rottmann M, Felger I, Bozdech Z, et al. Heterochromatin 2006;22:808–14. protein 1 secures survival and transmission of malaria parasites. Cell Host 19. Busby MA, Stewart C, Miller CA, Grzeda KR, Marth GT. Scotty: a web tool for Microbe. 2014;16:165–76. designing RNA-Seq experiments to measure differential gene expression. 39. Carter R, Graves PM, Creasey A, Byrne K, Read D, Alano P, Fenton B. Bioinformatics. 2013;29:656–7. Plasmodium falciparum: an abundant stage-specific protein expressed 20. Llinas M, Bozdech Z, Wong ED, Adai AT, DeRisi JL. Comparative whole during early gametocyte development. Exp Parasitol. 1989;69:140–9. genome transcriptome analysis of three Plasmodium falciparum strains. 40. Lu XM, Batugedara G, Lee M, Prudhomme J, Bunnik EM, Le Roch KG. Nucleic Acids Res. 2006;34:1166–73. Nascent RNA sequencing reveals mechanisms of gene regulation in the 21. Claessens A, Affara M, Assefa SA, Kwiatkowski DP, Conway DJ. Culture human malaria parasite Plasmodium falciparum. Nucleic Acids Res. 2017; adaptation of malaria parasites selects for convergent loss-of-function 45:7825–40. mutants. Sci Rep. 2017;7:41303. 41. Fraschka SA, Filarsky M, Hoo R, Niederwieser I, Yam XY, Brancucci NMB, 22. Kafsack BF, Rovira-Graells N, Clark TG, Bancells C, Crowley VM, Campino SG, Mohring F, Mushunje AT, Huang X, Christensen PR, et al. Comparative Williams AE, Drought LG, Kwiatkowski DP, Baker DA, et al. A transcriptional heterochromatin profiling reveals conserved and unique epigenome switch underlies commitment to sexual development in malaria parasites. signatures linked to adaptation and development of malaria parasites. Cell Nature. 2014;507:248–52. Host Microbe. 2018;23:407–20. 23. Cheeseman IH, Gomez-Escobar N, Carret CK, Ivens A, Stewart LB, Tetteh KK, 42. Ntumngia FB, Bouyou-Akotet MK, Uhlemann AC, Mordmuller B, Kremsner PG, Conway DJ. Gene copy number variation throughout the Plasmodium Kun JF. Characterisation of a tryptophan-rich Plasmodium falciparum antigen falciparum genome. BMC Genomics. 2009;10:353. associated with merozoites. Mol Biochem Parasitol. 2004;137:349–53. 24. Jeffares DC, Pain A, Berry A, Cox AV, Stalker J, Ingle CE, Thomas A, Quail MA, 43. Morita M, Takashima E, Ito D, Miura K, Thongkukiatkul A, Diouf A, Fairhurst Siebenthall K, Uhlemann AC, et al. Genome variation and evolution of the RM, Diakite M, Long CA, Torii M, et al. Immunoscreening of Plasmodium malaria parasite Plasmodium falciparum. Nat Genet. 2007;39:120–5. falciparum proteins expressed in a wheat germ cell-free system reveals a 25. Nair S, Nkhoma S, Nosten F, Mayxay M, French N, Whitworth J, Anderson T. novel malaria vaccine candidate. Sci Rep. 2017;7:46086. Genetic changes during laboratory propagation: copy number at the 44. Robinson MD, Smyth GK. Moderated statistical tests for assessing differences reticulocyte-binding protein 1 locus of Plasmodium falciparum. Mol in tag abundance. Bioinformatics. 2007;23:2881–7. Biochem Parasitol. 2010;172:145–8. 45. Balaji S, Babu MM, Iyer LM, Aravind L. Discovery of the principal specific 26. Daily JP, Scanfeld D, Pochet N, Le Roch K, Plouffe D, Kamal M, Sarr O, transcription factors of Apicomplexa and their implication for the evolution Mboup S, Ndir O, Wypij D, et al. Distinct physiological states of Plasmodium of the AP2-integrase DNA binding domains. Nucleic Acids Res. 2005;33: falciparum in malaria-infected patients. Nature. 2007;450:1091–5. 3994–4006. 27. Pelle KG, Oh K, Buchholz K, Narasimhan V, Joice R, Milner DA, Brancucci NM, 46. Zhang M, Wang C, Otto TD, Oberstaller J, Liao X, Adapa SR, Udenze K, Ma S, Voss TS, Ketman K, et al. Transcriptional profiling defines dynamics of Bronner IF, Casandra D, Mayho M, et al. Uncovering the essential genes of parasite tissue sequestration during malaria infection. Genome Medicine. the human malaria parasite Plasmodium falciparum by saturation 2015;7:19. mutagenesis. Science. 2018;360. 28. Lemieux JE, Gomez-Escobar N, Feller A, Carret C, Amambua-Ngwa A, 47. Duffy S, Avery VM. Routine in vitro culture of Plasmodium falciparum: Pinches R, Day F, Kyes SA, Conway DJ, Holmes CC, et al. Statistical experimental consequences? Trends Parasitol. 2018;34:564–75. estimation of cell-cycle progression and lineage commitment in 48. Kumar K, Srinivasan P, Nold MJ, Moch JK, Reiter K, Sturdevant D, Otto TD, Squires Plasmodium falciparum reveals a homogeneous pattern of transcription in RB, Herrera R, Nagarajan V, et al. Profiling invasive Plasmodium falciparum ex vivo culture. Proc Natl Acad Sci U S A. 2009;106:7559–64. merozoites using an integrated omics approach. Sci Rep. 2017;7:17146. 29. Joice R, Narasimhan V, Montgomery J, Sidhu AB, Oh K, Meyer E, Pierre-Louis 49. Doerig C, Rayner JC, Scherf A, Tobin AB. Post-translational protein W, Seydel K, Milner D, Williamson K, et al. Inferring developmental stage modifications in malaria parasites. Nat Rev Microbiol. 2015;13:160–72. Tarr et al. BMC Genomics (2018) 19:894 Page 13 of 13 50. Jones ML, Collins MO, Goulding D, Choudhary JS, Rayner JC. Analysis of protein palmitoylation reveals a pervasive role in Plasmodium development and pathogenesis. Cell Host Microbe. 2012;12:246–58. 51. Poran A, Notzel C, Aly O, Mencia-Trinchant N, Harris CT, Guzman ML, Hassane DC, Elemento O, Kafsack BFC. Single-cell RNA sequencing reveals a signature of sexual commitment in malaria parasites. Nature. 2017;551:95–9. 52. Reid AJ, Talman AM, Bennett HM, Gomes AR, Sanders MJ, Illingworth CJR, Billker O, Berriman M, Lawniczak MK. Single-cell RNA-seq reveals hidden transcriptional variation in malaria parasites. eLife. 2018;7:e33105. 53. Walzer KA, Kubicki DM, Tang X, Chi JT. Single-cell analysis reveals distinct gene expression and heterogeneity in male and female Plasmodium falciparum gametocytes. mSphere. 2018;3:e00130. 54. Lee HJ, Georgiadou A, Otto TD, Levin M, Coin LJ, Conway DJ, Cunnington AJ. Transcriptomic studies of malaria: a paradigm for investigation of systemic host-pathogen interactions. Microbiol Mol Biol Rev. 2018;82:e00071. 55. Kozarewa I, Ning Z, Quail MA, Sanders MJ, Berriman M, Turner DJ. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nat Methods. 2009;6:291–5. 56. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12:357–60. 57. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Genome Project Data Processing S. The sequence alignment/ map format and SAMtools. Bioinformatics. 2009;25:2078–9. 58. Lawrence M, Huber W, Pages H, Aboyoun P, Carlson M, Gentleman R, Morgan MT, Carey VJ. Software for computing and annotating genomic ranges. PLoS Comp Biol. 2013;9:e1003118. 59. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. 60. Salanti A, Staalsoe T, Lavstsen T, Jensen AT, Sowa MP, Arnot DE, Hviid L, Theander TG. Selective upregulation of a single distinctly structured var gene in chondroitin sulphate A-adhering Plasmodium falciparum involved in pregnancy-associated malaria. Mol Microbiol. 2003;49:179–91.