Original Research
Highlight article

High-throughput genotyping assays for identification of glycophorin B deletion variants in population studies

Dominic SY Amuzu1,2,3, Kirk A Rockett3,4, Ellen M Leffler4,5, Felix Ansah1,2, Nicholas Amoako1,2, Collins M Morang’a1,2, Christina Hubbart3, Kate Rowlands3, Anna E Jeffreys3, Lucas N Amenga-Etego1,2, Dominic P Kwiatkowski3,4,6 and Gordon A Awandare1,2

1West African Centre for Cell Biology of Infectious Pathogens, University of Ghana, Accra, GH 0233, Ghana; 2Department of Biochemistry, Cell and Molecular Biology, University of Ghana, Accra, GH 0233, Ghana; 3Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK; 4Wellcome Sanger Institute, Hinxton CB10 1SA, UK; 5Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, UT 84112-5330, USA; 6Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7FZ, UK

Corresponding authors: Kirk A Rockett. Email: krockett@well.ox.ac.uk; Gordon A Awandare. Email: gawandare@ug.edu.gh

Impact statement

Glycophorins are of interest because they serve as receptors for pathogens including Plasmodium sp., Babesia sp., Influenza virus, Vibrio cholerae El Tor Hemolysin, and Escherichia coli. Variation in the genes encoding these receptors may be important in influencing disease susceptibility. Due to the high sequence homology (96%) between the glycophorin A, B, and E genes, it is challenging to design assays that genotype specific variations in the three genes. This work reports the development of two separate high-throughput assays for reliably genotyping glycophorin B deletions (DEL1 and DEL2) on a large scale. This is important for population prevalence studies and for identifying affected individuals in order to investigate the functional effects of the gene variation. Furthermore, the work identified the location of the breakpoint for GYPB DEL2, which is important for understanding the gene cluster and the deletion mechanisms.

Abstract

Glycophorins are the most abundant sialoglycoproteins on the surface of human erythrocyte membranes. Genetic variation in the glycophorin region of human chromosome 4 (containing the GYPA, GYPB, and GYPE genes) is of interest because the gene products serve as receptors for pathogens of major public health interest, including Plasmodium sp., Babesia sp., Influenza virus, Vibrio cholerae El Tor Hemolysin, and Escherichia coli. A large structural rearrangement and hybrid glycophorin variant, known as Dantu, which was identified in East African populations, has been linked with a 40% reduction in risk for severe malaria. Apart from Dantu, other large structural variants exist, with the most common being deletion of the whole GYPB gene and its surrounding region, resulting in multiple different deletion forms. In West Africa particularly, these deletions are estimated to account for between 5 and 15% of the variation in different populations, mostly attributed to the forms known as DEL1 and DEL2. Due to the lack of specific variant assays, little is known of the distribution of these variants. Here, we report a modification of a previous GYPB DEL1 assay and the development of a novel GYPB DEL2 assay as high-throughput PCR-RFLP assays, as well as the identification of the crossover/breakpoint for GYPB DEL2. Using 393 samples from three study sites in Ghana as well as samples from the HapMap and 1000G projects for validation, we show that our assays are sensitive and reliable for genotyping GYPB DEL1 and DEL2. To the best of our knowledge, this is the first report of such high-throughput genotyping assays by PCR-RFLP for identifying specific GYPB deletion types in populations.
These assays will enable better identification of GYPB deletions for large genetic association studies and functional experiments to understand the role of this gene cluster region in susceptibility to malaria and other diseases.

Keywords: Glycophorins, malaria, Plasmodium, red blood cell, invasion, GYPB deletion

Experimental Biology and Medicine 2021; 246: 916–928. DOI: 10.1177/1535370220968545
ISSN 1535-3702. Copyright © 2020 by the Society for Experimental Biology and Medicine

Introduction

Malaria is still an important public health issue worldwide and the leading cause of death among children in sub-Saharan Africa (sSA). It is estimated that every year about 216 million cases of malaria and 445,000 deaths occur globally, with sSA being the most affected [1]. Plasmodium falciparum, which is responsible for most of these deaths, has evolved complex machinery for invading erythrocytes. The mechanism is mediated by multiple redundant parasite ligands and specific human host receptors on the surface of erythrocytes to facilitate invasion [2–4]. Many studies have shown that host-pathogen factors influence malaria outcomes, and the parasite has most certainly affected the evolution of the human genome over the years [5,6]. Several host genetic factors, from single nucleotide polymorphisms to large structural variants, are known to influence an individual’s susceptibility or resistance to malaria [7–10].

Glycophorins (GYP) are glycosylated sialoglycoproteins found on the surface of human and animal erythrocytes [11]. Human GYPA and GYPB are determinants of the major MNS blood group system, while GYPC and GYPD are determinants of the Gerbich blood group. Some of these glycophorins have also been shown to be receptors on erythrocytes used by P. falciparum to invade these cells. These include GYPA, which interacts with the P. falciparum protein erythrocyte binding antigen (EBA)-175 [16–18], GYPB, which interacts with erythrocyte binding ligand 1 (EBL-1) [19,20], and GYPC, which interacts with EBA-140, all in a sialic acid-dependent manner [21–23]. In addition, glycophorins serve as receptors for other pathogens such as Babesia sp., Influenza virus, encephalomyocarditis virus, Vibrio cholerae El Tor Hemolysin, and E. coli [11–15]. The GYPE gene is not known to be expressed as a protein on the erythrocyte surface [24].

The genes GYPE, GYPB, and GYPA are located in a gene cluster on the long arm of chromosome 4 (4q28-q31) that is approximately 360 kb long. Each gene lies in a segmental duplication unit (SDU) spanning 120 kb, comprising a gene region of 30 kb and an intergenic region of 90 kb (Figure 1). GYPC and GYPD are located on chromosome 2 and are not discussed here. The GYPA, GYPB, and GYPE genes are evolutionarily related, with at least 95% sequence homology between them resulting from duplication events, whereby GYPB evolved from GYPA, and GYPE evolved from GYPB [9,25,26].

Recently, a large structural variant in the chromosome 4 GYP region that gives rise to the Dantu glycophorin (DUP4) has been associated with about a 40% reduction in risk for severe malaria [9,27]. Interestingly, this Dantu variant was found predominantly in East African populations, but there are many other common structural variants across all the West and East African populations that have been studied [9]. The most common variants identified were deletions of the whole GYPB gene and the surrounding region, known as GYPB DEL1 and GYPB DEL2 (Figure 1). However, little research has been conducted on these variants and their population distributions due to the lack of high-throughput methods for genotyping these structural variants. Given the difficulties in screening for these deletions and other variants, there is also a lack of functional data on the effect of GYPB deletions on erythrocyte invasion, the growth of P. falciparum, and the changes that occur on the surface of the erythrocytes with respect to protein expression. Here, we show the development of two separate high-throughput assays for reliably detecting and genotyping the GYPB deletions DEL1 and DEL2 that can be used to determine their distribution in populations and identify phenotypes for functional investigations of these deletions on P. falciparum erythrocyte invasion and growth. Using data from 393 samples from different ethnic populations in southern Ghana as well as DNA samples from the HapMap and 1000G projects, we show the development of high-throughput assays for GYPB DEL1 and DEL2. We also show that these are sensitive and reliable for screening population samples with little interference from other glycophorin structural variants.

Materials and methods

Location of putative breakpoints for the GYPB whole-gene deletions DEL1 and DEL2

The breakpoints for DEL1 have previously been located on GRCh37 at chr4:144835160–144835280 (4:143914007–143914127 in GRCh38) in the 3′ region of the GYPE unit of the SDU, and chr4:144945398–144945517 (4:144024245–144024364 in GRCh38) in the 3′ region of the GYPB unit of the SDU [9], while the predicted location of the DEL2 breakpoint was given as 206,000 bases from the 5′ end of the GYP region (GRCh37:4:144706830, GRCh38:4:143785677) [9]. We downloaded 4 kb of sequence surrounding the DEL1 or DEL2 putative breakpoint coordinates for each of the GYPE, GYPB, and GYPA SDUs from Ensembl (http://grch37.ensembl.org/Homo_sapiens/). The sequences were aligned using Clustal Omega (www.ebi.ac.uk/Tools/msa/clustalo/) and manually finished where required (Figure 1 and Supplementary Files 1 and 2). While initial designs were made using GRCh37, we have since compared our analysis with GRCh38 and provided coordinates with respect to GRCh38 or both where appropriate.

Figure 1. Schematic diagram of the human reference GYP gene region on chromosome 4. The three GYP segmental duplication units (SDUs) are indicated by different colors. The GYP gene-region boundary locations and genes are shown with respect to GRCh37. The approximate locations of the DEL1 and DEL2 deletions are shown. Intergenic region names used in the main text are shown [9].

Assay design for DEL1 and DEL2 putative breakpoint regions

We developed a PCR-RFLP version of the published DEL1 assay [9] using a similar primer strategy. A forward primer was positioned in the unique sequence 3′ to the GYPE gene, with a common reverse primer positioned in the GYPE-GYPB and GYPB-GYPA regions (Figure 2(a) and 2(b)) but placed to generate a shorter PCR amplicon (2 kb) than that published. From the human genome reference sequence alignments (Supplementary File 1) of the equivalent SDUs, a restriction enzyme (AciI) site was identified that distinguished between the wild-type and DEL1 sequences (Figure 2(a) and (b)).

The assay for GYPB DEL2 used a strategy similar to that of the GYPB DEL1 assay, but with a unique primer placed at the GYPA end of the sequence (reverse primer) and a common primer placed at the 5′ end (forward primer) (Figure 2(c) and (d)). From the human genome reference sequence alignments (Supplementary File 2) of the equivalent SDUs, a restriction enzyme (BsrBI) site was identified that distinguished between the wild-type and GYPB DEL2 sequences (Figure 2(c) and (d)). For both GYPB DEL1 and DEL2, the restriction enzymes used were non-palindromic and therefore strand oriented (Table 1, Supplementary Files 2 to 5).

Figure 2. Schematic representation of strategies for amplifying and testing for the GYPB DEL1 and DEL2 structural variants. (a) Schematic representation of the alignment for the GYP SDUs showing the location of PCR primers (blue rectangles), putative breakpoint (gold rectangle), and AciI restriction site (yellow rectangle). The forward primer GYP_DEL1_F10 is specific to the region upstream of GYPE in the GYPE-GYPB region. The reverse primer GYPB_DEL1_R2B5 binds upstream of the GYPB gene in the GYPB-GYPA region. In a normal or wild-type individual, the GYPB_DEL1_R2B5 primer in the GYPE-GYPB region and the GYP_DEL1_F10 forward primer form a PCR product made of sequences in the GYPE_GYPB region. In the GYPB DEL1 state, the PCR product formed is made of sequences in the GYPE and GYPA regions because GYPB is deleted. (b) Alternate schematic representation of the GYPB DEL1 RFLP assay showing a normal chromosome and the GYPB DEL1 chromosome aligned. Genes (green, orange, and purple rectangles) and primers (blue rectangle) are indicated as well as the AciI restriction site and PCR-digestion fragment lengths. (c) Schematic representation of the alignment for the GYP SDUs showing the location of PCR primers, putative breakpoint (gold rectangle), and BsrBI restriction site (yellow rectangle). The forward primer GYP_DEL2_F3 is common to the GYPA downstream of the GYPB-GYPA region. The reverse primer GYPB_DEL2_R3 specifically binds upstream of the GYPE gene in the GYPE-GYPB region. In a normal or wild-type individual, the GYPB_DEL2_F3 primer in the GYPB-GYPA region and the GYP_DEL2_R3 primer form a PCR product made of sequences in the GYPB_GYPA region. In the GYPB DEL2 state, the PCR amplicon formed is made of sequences in the GYPE and GYPA regions because GYPB is deleted. (d) Alternate schematic of the GYPB DEL2 RFLP assay showing a normal chromosome and the GYPB DEL2 chromosome aligned. Genes (green, orange, and purple rectangles) and primers (blue rectangle) are indicated as well as the BsrBI restriction site (red dotted rectangular area) and PCR-digestion fragment lengths. Coordinates of sequences are given with respect to GRCh38.

Generation of GYPB variant control DNA

Cell lines from the 1000G and associated projects with known GYPB deletions or wild-type status (identified from the Leffler et al. [9] study and Table 2) were identified, and these cell lines or their genomic DNA were sourced from the NHGRI repository (Coriell Institute, NY, USA [https://www.coriell.org/1/Browse/Biobanks]).

GYPB DEL1 and DEL2 PCR conditions and restriction digest

The QIAGEN Fast Cycling PCR Kit (Qiagen, UK) was used for both the GYPB DEL1 and DEL2 PCR reactions as detailed in Table 3a and 3b, with primers purchased from IDT (Leuven, Belgium). Restriction enzymes AciI (Catalog number: R0551L, NEB, UK) and BsrBI (Catalog number: R0102L, NEB, UK) were purchased (Table 4a). Reactions were prepared in 96-well plates (#AB-800, ThermoFisher Scientific, UK) and cycled on an MJ Tetrad (BioRad, UK) as described in Table 3a and 3b.
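Because AciI and BsrBI recognise non-palindromic sequences (5′-CCGC-3′ and 5′-CCGCTC-3′, Table 4), a recognition site can lie on either strand of an amplicon, so a scan for candidate sites has to check both the sequence and its reverse complement. The Python sketch below illustrates only that point. The function names and the toy sequence are hypothetical and are not taken from the study, which identified its diagnostic sites from reference sequence alignments.

```python
# Minimal sketch: find recognition sites of non-palindromic enzymes on both strands.
# The recognition sequences come from Table 4; everything else is illustrative.
RECOGNITION_SITES = {"AciI": "CCGC", "BsrBI": "CCGCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> list[tuple[int, str]]:
    """Return sorted (0-based start on the given strand, strand) hits.

    '+' means the site reads 5'->3' on the given strand; '-' means it
    reads 5'->3' on the opposite strand.
    """
    seq = seq.upper()
    motifs = [(site.upper(), "+")]
    rc = reverse_complement(site)
    if rc != site.upper():  # a palindromic site would otherwise be reported twice
        motifs.append((rc, "-"))
    hits = []
    for motif, strand in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # step by 1 to allow overlapping hits
    return sorted(hits)


# Hypothetical toy sequence, not a GYP sequence.
TOY = "AACCGCTCTTGCGGAA"
acii_hits = find_sites(TOY, RECOGNITION_SITES["AciI"])
bsrbi_hits = find_sites(TOY, RECOGNITION_SITES["BsrBI"])
```

In the toy sequence, AciI has one site on each strand while BsrBI has a single site on the given strand, which is why the orientation of each site matters when predicting digestion fragments.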
The PCR products were digested with the relevant restriction enzymes (NEB, UK) for 2 hours at 37 °C, and the digestion fragments were then separated by electrophoresis on a 1% agarose gel containing ethidium bromide (3 ng/µL) for 2–2½ hours. Products were visualized under UV light and photographed to allow genotype assignment. All plates contained control samples obtained from the NHGRI repository (Table 2).

Genotyping cell lines

We tested the DEL1 and DEL2 assays on several cell lines (Table 2) that were identified from whole genome sequence analysis as having different GYP variants [9] to check for cross-reactions or aberrant products.

Screening for GYP Dantu (DUP4)

Samples were screened for the glycophorin variant Dantu (DUP4) using the assay described by Leffler et al. [9]

Ethical approval and screening population in Ghana for GYPB DEL1 and DEL2

Ethical approval for two ongoing studies on glycophorins and malaria was granted by the Ethics Committee for Basic and Applied Sciences, College of Basic and Applied Sciences, University of Ghana (CPN: ECBAS 037/18–19), and the Noguchi Memorial Institute for Medical Research IRB, University of Ghana (CPN 004/11–12). Written informed consent was obtained from all the study participants or, in the case of children, from their parents/guardians. The assays developed were used to genotype DNA samples obtained from volunteers who had been enrolled in various ongoing studies in three areas of Ghana, namely Accra, Kintampo, and Hohoe.

Venous blood samples were collected. Following curation of self-reported ethnicity to include only individuals of Ghanaian origin, the sample set comprised Kintampo (n = 147), Hohoe (n = 43), and Accra (n = 203) (Figure 6, Table 5). Genomic DNA was extracted using the Qiagen QIAamp Blood Mini Kit or Chelex-100 as described for different batches of samples [28]. The DNA samples were quantified using Picogreen as described above.
The gDNA samples were diluted to 20 ng/µL and stored in 96-well PCR plates at −20 °C until ready for genotyping. GYPB DEL1 and DEL2 genotyping assays were undertaken as given in Tables 3 and 4.

Table 1. Samples selected for testing and sequencing to identify the breakpoints for GYPB DEL1 and DEL2.

Use             Target  Primer            Sequence (5′-3′)               Dir  GC (%)  Tm (°C)  GYP region  GRCh37 location        GRCh38 location
PCR/sequencing  DEL1    GYP_DEL1_F10      GGACTGCCGCATGTTCAG             Fwd  61      53       GYPE-GYPB   4:144833378-144833395  4:143912225-143912242
PCR/sequencing  DEL1    GYP_DEL1_R2B5     CTCTGGTAGCCCTCCTCAAG           Rev  60      56       GYPE-GYPB   4:144835652-144835671  4:143914499-143914518
                                                                                               GYPB-GYPA   4:144945890-144945909  4:144024737-144024756
                                                                                               GYPA-3′     4:145067236-145067255  4:144146083-144146102
PCR/sequencing  DEL2    DEL2_GYPEBAc_F3   GGTCATGAGAAAACGTTTGAATTTTCCAG  Fwd  37.9    59       5′-GYPE     4:144791036-144791060  4:143869883-143869907
                                                                                               GYPE-GYPB   4:144911831-144911855  4:143990678-143990702
                                                                                               GYPB-GYPA   4:145015096-145015120  4:144093943-144093967
PCR/sequencing  DEL2    DEL2_GYPBAs_R3    CAGTTCTGCCAACTCTCATCTT         Rev  45      56       GYPB-GYPA   4:145017216-145017238  4:144096063-144096085
Sequencing      DEL2    DEL2_BP_seq_Rev1  CTATGGGTCCCTCTCTGTGGA          Rev                   5′-GYPE     4:144792394-144792414  4:143870624-143870648
                                                                                               GYPE-GYPB   4:144913188-144913208  4:143991419-143991443
                                                                                               GYPB-GYPA   4:145016443-145016458  4:144094684-144094708
Sequencing      DEL2    DEL2_BP_seq_Fwd   CATGTCTCACATCCAGTTAATGCTG      Fwd                   5′-GYPE     4:144791777-144791801  4:143871241-143871261
                                                                                               GYPE-GYPB   4:144912572-144912596  4:143992035-143992055
                                                                                               GYPB-GYPA   4:145015837-145015861  4:144095290-144095305

Note: Primers were designed using Primer3 and purchased from IDT (see methods). Alignments are shown in the GYP region. One primer from each PCR (GYP_DEL1_F10 and DEL2_GYPBAs_R3) was unique to the GYPB DEL1 and GYPB DEL2 breakpoints, respectively, while the other primers were designed against homologous sequence (GYP_DEL1_R2B5 and DEL2_GYPEBAc_F3). Two primers were designed to internal regions of the GYPB DEL2 amplicon to aid with sequencing (DEL2_BP_seq_REV1 and DEL2_BP_seq_FWD).

Table 2. Samples selected for testing and sequencing to identify DEL1 and DEL2 breakpoints.

Coriell identifier*  Code   Population        Country       Genotype   DEL1  DEL2
HG00097              GBR    British           UK            N/N        II    II
GM06985              CEU    CEPH              USA           N/N        II    II
GM06986              CEU    CEPH              USA           N/N        II    II
GM06994              CEU    CEPH              USA           N/N        II    II
GM18522              YRI    Yoruba            Nigeria       N/N        II    II
GM19140              YRI    Yoruba            Nigeria       N/N        II    II
GM19141              YRI    Yoruba            Nigeria       N/N        II    II
GM19152              YRI    Yoruba            Nigeria       N/N        II    II
GM18523              YRI    Yoruba            Nigeria       DEL1/N     DI    II
GM19207              YRI    Yoruba            Nigeria       DEL1/N     DI    II
GM19223              YRI    Yoruba            Nigeria       DEL1/N     DI    II
HG02464              GWD    Mandinka          Gambia        DEL1/DEL1  DD    II
HG02545              ACB    Afro-Caribbean    Barbados      DEL1/DEL1  DD    II
HG03072              MSL    Mende             Sierra Leone  DEL1/DEL1  DD    II
HG03139              ESN    Essan             Nigeria       DEL1/DEL1  DD    II
GM18519              YRI    Yoruba            Nigeria       DEL1/DEL1  DD    II
GM18856              YRI    Yoruba            Nigeria       DEL2/N     II    DI
GM19144              YRI    Yoruba            Nigeria       DEL2/N     II    DI
GM17125              ASW    African-American  USA           DEL2/N     II    DI
HG03385              MSL    Mende             Sierra Leone  DEL2/DEL2  II    DD
GM20867              GIH    Gujarati Indian   USA           DEL14/N    XX    II
GM18858              YRI    Yoruba            Nigeria       DEL17/N    XX    XX
HG02586              GWD    Mandinka          Gambia        DUP1/N     XX    II
HG02588              GWD    Mandinka          Gambia        DUP1/N     II    II
GM18502              YRI    Yoruba            Nigeria       DUP1/N     II    XX
GM18870              YRI    Yoruba            Nigeria       DUP1/N     XX    II
HG02250              CDX    Dai               China         DUP2/N     II    II
HG02798              GWD    Mandinka          Gambia        DUP2/N     II    II
GM18552              CHB    Han               China         DUP2/N     II    II
GM18593              CHB    Han               China         DUP2/N     XX    II
GM18605              CHB    Han               China         DUP2/N     II    II
GM12829              CEU    CEPH              USA           DUP23/N    XX    XX
GM12249              CEU    CEPH              USA           DUP28/N    II    II
HG02554              ACB    Afro-Caribbean    Barbados      DUP4/N     II    II
HG02585              GWD    Mandinka          Gambia        DUP6/N     II    II
GM18545              CHB    Han               China         TRP1/N     XX    II
GM18620              CHB    Han               China         TRP1/N     XX    II
GM12341              CEU    CEPH              USA           TRP13/N    II    II
GM11894              CEU    CEPH              USA           TRP5/N     XX    II
GM18852              YRI    Yoruba            Nigeria       UNK        XX    II
GM19221              YRI    Yoruba            Nigeria       UNK        XX    II

Ghana identifier§    Code   Population        Country       Genotype   DEL1  DEL2
GX0387               Ghana                    Ghana         N/N        II    II
GX0540               Ghana                    Ghana         N/N        II    II
GX0600               Ghana                    Ghana         N/N        II    II
GX0610               Ghana                    Ghana         N/N        II    II
GX0531               Ghana                    Ghana         N/N        II    II
GX0258               Ghana                    Ghana         DEL1/DEL1  DD    II
GX0458               Ghana                    Ghana         DEL1/DEL1  DD    II
GX0537               Ghana                    Ghana         DEL1/DEL1  DD    II
GX0403               Ghana                    Ghana         DEL2/DEL2  II    DD
GX0440               Ghana                    Ghana         DEL2/DEL2  II    DD
GX0300               Nigeria                  Nigeria       DEL1/DEL2  DI    DI

Note: A set of 1000G samples obtained from the NHGRI repository at the Coriell Institute, NY, USA, and with known GYP variation status of GYPB DEL1 homozygote or DEL2 homozygote (Leffler et al., 2017) were selected for sequencing. A second set of samples, selected from the Ghana sample collection described in this manuscript, were also sequenced following GYPB DEL1 or DEL2 status identification from the RFLP assays described in this manuscript. N/N: wild-type or at least not GYPB DEL1/DEL2 positive; GYPB DEL1/DEL1: DEL1 homozygous; GYPB DEL2/DEL2: DEL2 homozygous; GYPB DEL1/DEL2: heterozygous.
*Coriell identifier for 1000G and HapMap samples (https://www.coriell.org/1/Browse/Biobanks).
§Ghana identifier is an anonymised identifier with no relationship to local identifiers. The individual listed with Nigeria as country (GX0300) was collected at the Accra study site and was later found to be of Nigerian origin.

Table 3. PCR assay conditions for detecting GYPB DEL1 and DEL2.

A: PCR reaction components

Reagent                      Stock     DEL1 volume (µL)  DEL2 volume (µL)
gDNA template                20 ng/µL  1                 1
Forward primer               10 µM     0.25              0.75
Reverse primer               10 µM     0.25              0.25
Qiagen Fast Cycling PCR Kit  2X        2.5               2.5
MilliQ water                 -         7                 6.5
Total                        -         11.00             11.00

B: PCR cycling conditions

Step  Description           DEL1 temp (°C)  DEL1 time     DEL2 temp (°C)  DEL2 time
1     Initial denaturation  95              5 min         95              5 min
2     Denaturation          96              5 sec         96              5 sec
3     Annealing             59              6 sec         55              6 sec
4     Extension             68              3 min 15 sec  68              3 min 15 sec
5     Repeats               Step 2, x39                   Step 2, x39
6     Final extension       72              1 min         72              1 min
7     End                   15              4 min         15              4 min

A: PCR volumes and concentrations for amplifying GYPB DEL1 and DEL2. B: PCR conditions for amplifying GYPB DEL1 and DEL2.

Table 4. Restriction digest conditions for detecting GYPB DEL1 and DEL2.

A: Restriction digest conditions

Reagent                  Stock    DEL1 volume (µL)  DEL2 volume (µL)
Enzyme                   10 U/µL  0.2               0.2
CutSmart buffer          10x      1                 1.2
Water                    -        3.8               5.6
PCR reaction             -        5                 5
Total reaction vol (µL)  -        10                12

B: Restriction digest products

Genotype               DEL1 (AciI enzyme)                 DEL2 (BsrBI enzyme)
Recognition site       5′-CCGC-3′                         5′-CCGCTC-3′
Product code           NEB: R0551L                        NEB: R0102L
Non-DEL1-DEL2 sample   (3, 7, 29, 38), 1921, 296          804, 29, 1318
Homozygous sample      (3, 7, 29, 38), 2225               2141
Heterozygous sample    (3, 7, 29, 38), 1921, 296, 2225    804, 29, 1318, 2141

Note: 2 hours digest at 37 °C then 5 minutes at 65 °C to inactivate the enzyme (optional). Add 5 µL loading gel to the full reaction. Load 10 µL onto a 1% agarose gel with ethidium bromide. Run at 100 V for 2–2½ hours using Bioline Hyperladder 1 kB (BIO-33053).

Sequence analysis of GYPB DEL1 and DEL2 PCR products

For Sanger sequencing, PCR amplicons were separated on agarose gels and extracted from the gel using the Qiagen PCR gel-extraction kit (Qiagen QIAquick Gel Extraction) as described by the manufacturer. The concentration of the DNA recovered was determined by Quant-iT PicoGreen assay (Invitrogen, UK). Samples were prepared following the instructions of the sequencing company and sent for Sanger sequencing by Eurofins Genomics (Ebersberg, Germany [https://www.eurofinsgenomics.eu/en/custom-dna-sequencing/eurofins-services]). The sequence data were inspected and curated using Chromas (https://technelysium.com.au/wp/chromas/) to generate FASTA files for the different sequencing reactions. These data were aligned using the multiple-sequence-alignment tool Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/), after which pile-ups were manually curated and residues annotated according to the consensus sequence with respect to the PCR amplicon primers. Paralogous sequence differences between the SDUs of the three genes were used to confirm the PCR products for both DEL1 and DEL2 and also to identify the putative breakpoint regions for GYPB DEL1 and DEL2.

Table 5. Details of sample collection from 3 study sites in Ghana.

Location / gender  Age mean (years)  Age range (years)  Akan  Ewe  Ga  Gurunsi  Konkomba  Mo  Other  Total
Accra              5.44 ± 4.18       1-15               78    48   63  5        0         0   9      203
  Female           5.15 ± 4.21       1-15               48    22   30  4        0         0   3      107
  Male             5.67 ± 4.14       1-15               30    26   33  1        0         0   8      98
Hohoe              Adults            >18                3     35   2   1        0         0   2      43
  Female           Adults            >18                2     21   0   1        0         0   2      26
  Male             Adults            >18                1     14   2   0        0         0   0      17
Kintampo           3.25 ± 2.85       1-15               17    2    0   24       20        21  63     147
  Female           3.39 ± 3.72       1-12               5     2    0   9        8         13  24     61
  Male             3.15 ± 2.93       1-15               12    0    0   15       12        8   39     86
Overall            4.52 ± 3.83                          98    85   65  30       20        21  76     393

Note: The six named ethnic groups were represented by more than 20 samples in the full sample set.

Results

Development of novel GYPB DEL1 and DEL2 PCR-RFLP assays

We have successfully designed a PCR-RFLP assay for GYPB DEL1 using an AciI restriction enzyme digest to differentiate GYPB DEL1 from the reference (“normal”/wild type) or non-DEL1 forms. A schematic representation of the strategy for the restriction digest is presented in Figure 2(a) and (b) and Supplementary Figure 2. The DEL2 deletion assay was designed in a similar way to the DEL1 assay, using the BsrBI restriction enzyme (Figure 2(c) and (d) and Supplementary Figure 3). Several cell lines identified as DEL1 or DEL2 homozygous or heterozygous [9] were used to test both assays, and the expected banding patterns were observed (Figure 3). The non-DEL1 homozygous reference samples gave two visible bands on agarose gel electrophoresis after AciI digest (1.9 kb and 0.3 kb), while the DEL1 deletion homozygote samples were not cut and gave a single visible 2.2 kb amplicon. Samples that are heterozygous for DEL1 gave a combination of all three bands (0.3 kb, 1.9 kb, and 2.2 kb) (Figures 2(a), 2(b), 3(a) and Table 4). Four other smaller products (all less than 50 bp) are produced in this AciI digestion, but they are present in both the reference and alternate PCR amplicons (Table 4) and are generally not visible or resolvable. Similarly, the GYPB DEL2 assay showed clear discrimination of genotypes using the BsrBI restriction enzyme (Figures 2(c), 2(d) and 3(b)). Reference non-DEL2 samples gave a single uncut band at 2.1 kb, while the alternate DEL2 variant gave visible bands at 1.3 kb and 0.8 kb, showing that it was cut by this enzyme. Similar to the DEL1 assay, the DEL2 heterozygous samples showed three distinct bands. It is worth noting that samples that show a negative result for either deletion should be classified as non-DEL1 or non-DEL2, respectively, since each assay only identifies whether a sample is homozygous or heterozygous for the deletion it targets. In addition, the PCR-RFLP assays that we developed were used on non-DEL1 and non-DEL2 cell lines that carried other GYP structural variants to confirm the specificity of the reactions and assays (Figure 4). Across the other cell lines tested, only non-DEL1 and non-DEL2 banding patterns were observed after restriction digest and gel electrophoresis, thus confirming the specificity of the assays.

Figure 3. GYPB DEL1 (a) and DEL2 (b) assays on cell lines with known GYP states. PCR using the GYPB DEL1 (a) or DEL2 (b) primers was carried out on the samples followed by AciI or BsrBI restriction enzyme digestion, respectively. (a) DEL1 assay: the first seven wells (1–7) contain wild-type cell lines giving bands at 1.9 kb and 0.3 kb (V and W, respectively); five wells (11–15) contain homozygous cell lines, which give a single uncut band at 2.2 kb (U); three cell lines heterozygous for DEL1 (lanes 8–10) show two upper bands and one small band (2.2 kb [U], 1.9 kb [V], and 0.3 kb [W]). Lanes 16–19 are GYPB DEL2 positive cell lines that are all cut by the AciI enzyme (1.9 kb [V] and 0.3 kb [W]), indicating “normal” or non-DEL1. (b) GYPB DEL2 assay: the first seven wells (1–7) contain wild-type cell lines giving a single uncut band at 2.1 kb (X); lane 19 contains a GYPB DEL2 homozygous cell line giving two bands (1.3 kb [Y] and 0.8 kb [Z]); three wells (16–18) contain heterozygous cell lines, which give three bands (2.1 kb [X], 1.3 kb [Y], and 0.8 kb [Z]). Lanes 8–15 are DEL1 positive cell lines that are not cut by BsrBI, indicating “normal” or non-DEL2.

Figure 4. GYPB DEL1 (a) and DEL2 (b) assays on cell lines with known GYP states other than GYPB DEL1 and DEL2. PCR using the GYPB DEL1 (a) or DEL2 (b) primers was carried out on the samples followed by AciI or BsrBI restriction enzyme digestion, respectively. Lanes 22–24 are negative control wells. (a) DEL1 assay: 1.9 kb (V) and 0.3 kb (W) identify the bands expected for a non-DEL1 sample, and 2.2 kb (U) identifies the presence of GYPB DEL1. (b) GYPB DEL2 assay: the 2.1 kb (X) band identifies a non-DEL2 sample, and the 1.3 kb (Y) plus 0.8 kb (Z) bands identify the presence of GYPB DEL2. See also Figure 4. Sample designations identified from Leffler et al. [9]

Sanger sequencing of PCR amplicons for GYPB DEL1 and DEL2

Several HapMap/1000G cell lines (“normal,” DEL1 homozygote and DEL2 homozygote) as well as Ghanaian samples, which comprised three “normal” (non-DEL1, non-DEL2, and Dantu negative), three GYPB DEL1 homozygotes, one DEL2 homozygote, and one DEL1-DEL2 heterozygote, were selected for Sanger sequencing of the PCR-RFLP amplicons. Pile-up sequences were aligned with the genome reference sequences (GRCh38) (Supplementary Files 2 to 5), showing that the “normal” amplicons were derived from the expected GYP regions (GYPE-GYPB for the DEL1 assay and GYPB-GYPA for the DEL2 assay [Supplementary Figure 1]). Amplicons from samples identified as homozygous for either DEL1 or DEL2 showed hybrid sequences (GYPE-GYPB/GYPB-GYPA).

For DEL1, the sequence changed from GYPE-GYPB to the GYPB-GYPA sequence between bases 1672 and 1783 in the amplicon (corresponding to 4:143914016–143914126 [GYPE-GYPB] and 4:144024254–144024364 [GYPB-GYPA] in GRCh38). The 5′ boundary was identified by a tandem repeat motif made up of CA and AT repeats (XXXXXXXX in Figure 5(a)), while the 3′ end was marked by a single paralogous base difference (A/G, marked Y in Figure 5(a)) between the reference sequences. A further 62 bases upstream from the 3′ boundary, there is also a 2-base paralogous difference between the reference sequences (ZZ in Figure 5(a)). The region bounded by these distinguishing motifs identifies a 111-base sequence within which the putative breakpoint occurs. It is 8.5 kb from the GYPE ATG start site and 4.9 kb from the GYPB ATG start site, and the deletion removes 110 kb to form DEL1 (Supplementary Files 4 and 6).

For DEL2, the sequence changed from GYPE-GYPB to the GYPB-GYPA sequence between bases 1043 and 1172 in the amplicon (corresponding to 4:143991718–143991848 [GYPE-GYPB] and 4:144094974–144095103 [GYPB-GYPA] in GRCh38). The 5′ boundary is identified by 4 different paralogous motifs, all within 50 bases of each other (a 5-base INDEL, two 2-base differences, and one single-base difference, marked with X’s in Figure 5(b)). The 3′ end is marked by 3 separate single paralogous base differences between the reference sequences, all within 23 bases of each other (marked as Y in Figure 5(b)). This identifies a 129-base sequence within which the putative breakpoint occurs. It is located 86 kb from the GYPE ATG start site and 76 kb from the GYPB ATG start site, and the deletion removes 103 kb to form DEL2 (Supplementary File 5).

Figure 5. Breakpoint sequences identified for GYPB DEL1 and DEL2 from Sanger sequencing. (a) The GYPB DEL1 highlighted regions correspond to 4:143914016-143914126 (GYPE-GYPB) and 4:144024254-144024364 (GYPB-GYPA) in GRCh38; XXXXXXXX identifies the 5′ boundary for the GYPB DEL1 breakpoint, Y identifies the 3′ end, and ZZ identifies 2 further distinguishing bases 62 bases downstream. (b) The GYPB DEL2 highlighted regions correspond to 4:143870925-143871054 (5′-GYPE), 4:143991718-143991848 (GYPE-GYPB) and 4:144094974-144095103 (GYPB-GYPA) in GRCh38. Bases marked with X’s identify the group of distinguishing bases at the 5′ end of the GYPB DEL2 breakpoint, while the Y’s identify the 3′ end of the GYPB DEL2 breakpoint. Supplementary Files 1–5 provide further detail and Sanger sequencing pile-ups. The GYPE sequence was included to confirm that the amplicons were not amplifying this gene region.

Distribution of GYPB DEL1 and DEL2 genotypes in Ghana

The assays developed were used to genotype DNA samples obtained from volunteers who had been enrolled into various ongoing studies in three areas of Ghana, namely Accra, Kintampo, and Hohoe. Individuals from Accra and Kintampo were all children between the ages of 1 and 15 years (mean 5.4 years from Accra [47:53% male:female ratio] and 3.25 years from Kintampo [41:59% male:female ratio]), while individuals from Hohoe were all adults (>18 years) with a 39:61% male:female ratio (Table 5 and Figure 6). In total, 393 individuals were included for genotyping after curating for self-reported ethnicity to include only those of Ghanaian origin. There were 21 distinct self-reported ethnicities (Supplementary Table 1), of which 6 had N ≥ 20 individuals (number of chromosomes ≥ 40, providing a minimum detectable allele frequency of 2.5%; Table 5).

The two new GYPB deletion assays as well as the assay for detecting the Dantu variant were used to measure the frequency of the DEL1, DEL2, and Dantu variants among the selected individuals at the three locations in Ghana (Table 6 and Supplementary Table 2). GYPB DEL1 was present at all three sites, with an allele frequency of between 4.1% and 6.2% across sites, while the GYPB DEL2 allele frequency was estimated at between 1.3% and 5.0%. In Accra, the frequency of DEL1 was 4-fold higher than that of DEL2, while in Hohoe and Kintampo they were similar (4.1% vs. 5.0%, and 5.7% vs. 4.2%, respectively). As expected, we did not identify any individuals carrying the Dantu variant [9] (Supplementary Table 2). The number of individuals from different ethnic backgrounds

Table 6. Genotypes for DEL1 and DEL2 in Ghana.

DEL1
Group      Accra                         Hohoe                         Kintampo                      Overall
            II  DI  DD  XX    N      %    II  DI  DD  XX    N      %    II  DI  DD  XX    N      %    II  DI  DD  XX    N      %
Akan        63  11   0   4   78   7.43     3   0   0   0    3   0.00    14   0   0   3   17   0.00    80  11   0   7   98   6.04
Ewe         38   4   0   6   48   4.76    27   3   0   5   35   5.00     1   0   0   1    2   0.00    66   7   0  12   85   4.79
Ga          51   5   1   6   63   6.14     2   0   0   0    2   0.00    NA  NA  NA  NA   NA     NA    53   5   1   6   65   5.93
Gurunsi      4   0   0   1    5   0.00     1   0   0   0    1   0.00    22   1   0   1   24   2.17    27   1   0   2   30   1.79
Konkomba    NA  NA  NA  NA   NA     NA    NA  NA  NA  NA   NA     NA    17   1   0   2   20   2.78    17   1   0   2   20   2.78
Mo          NA  NA  NA  NA   NA     NA    NA  NA  NA  NA   NA     NA    16   2   0   3   21   5.56    16   2   0   3   21   5.56
Other        7   1   0   1    9   6.25     1   0   0   1    2   0.00    49   7   2   5   63   9.48    58   9   2   7   74   8.96
Total (N)  163  21   1  18  203   6.22    34   3   0   6   43   4.05   119  11   2  15  147   5.68   317  36   3  39  393   5.79

DEL2
Group      Accra                         Hohoe                         Kintampo                      Overall
            II  DI  DD  XX    N      %    II  DI  DD  XX    N      %    II  DI  DD  XX    N      %    II  DI  DD  XX    N      %
Akan        73   1   0   4   78   0.68     2   1   0   0    3  16.67    14   2   0   1   17   6.25    89   4   0   5   98   2.15
Ewe         44   1   0   3   48   1.11    30   3   0   2   35   4.55     2   0   0   0    2   0.00    76   4   0   5   85   2.50
Ga          52   3   0   8   63   2.73     1   0   0   1    2   0.00    NA  NA  NA  NA   NA     NA    53   3   0   9   65   2.68
Gurunsi      5   0   0   0    5   0.00     1   0   0   0    1   0.00    20   3   0   1   24   6.52    26   3   0   1   30   5.17
Konkomba    NA  NA  NA  NA   NA     NA    NA  NA  NA  NA   NA     NA    19   1   0   0   20   2.50    19   1   0   0   20   2.50
Mo          NA  NA  NA  NA   NA     NA    NA  NA  NA  NA   NA     NA    18   1   2   0   21  11.90    18   1   2   0   21  11.90
Other        9   0   0   1    9   0.00     2   0   0   0    2   0.00    59   1   0   3   63   0.83    70   2   0   4   74   1.39
Total (N)  182   5   0  16  203   1.34    36   4   0   3   43   5.00   132   8   2   5  147   4.23   351  18   2  24  393   2.96

Note: Genotypes for GYPB DEL1 and GYPB DEL2 in individuals from the 6 most-represented ethnic groups (N ≥ 20) in the study, and by collection site.
A total of 393 individuals were genotyped for GYPB DEL1, DEL2, and GYP Dantu (DUP4) variants by PCR-RFLP with 325 (76%) of the participants representing 6 ethnic groups. The GYPB DEL1 and DEL2 allele frequencies are shown as percentages (%) within ethnic populations and across Ghana with N as the total number of participants and NA as no samples represented. Genotypes were coded II – ’normal’ with respect to the assay; DI – heterozygote for variant assayed; DD – homozygous for the variant assayed and XX for failed samples. Failed samples are included as these may be due to technical reasons or other GYP variants not detected by the assays used here. GYP Dantu was not found at any study site (Supplementary Table 2). Mo, and Dagarti; Tables 1 and 2). GYPB DEL1 varied between 1.8% and 9.0% overall, while GYPB DEL2 varied between 0.7% and 11.9% overall. When analyzed by the study site, the DEL1/2 frequency estimates became more unreliable due to small sample sizes but where sites-ethnic groups had N 20, the estimates ranged from 2.17% to 7.4% (DEL1) and 0.7% to 11.9% (DEL2) (Table 6). In total, there were 21 ethnic groups represented of which 11/21 possess DEL1 (of the other 10 groups where DEL1 was not detected, all had 4 or fewer individuals). For DEL2, 7/21 groups showed the presence of DEL2 (but not neces- sarily the same groups as DEL1, Supplementary Table 2). GYPB DEL2 was not detected in 14/21 groups (of which 10 had 4 or fewer individuals and 4 groups between 8 and 19 individuals). Discussion Figure 6. Location of sampling site in Ghana. Boarders for level 2 administrative districts are shown within Ghana with the sampling districts filled in blue. The Glycophorins on the surface of erythrocyte are used by major towns where sampling was conducted are shown and named. The offset malaria parasites to mediate invasion,11 as such, variants Africa map shows the location of Ghana in red. 
The map was generated using R of the gene may protect against malaria parasite infection (https://www.r-project.org/) using a shape file downloaded from GADM (https:// gadm.org/download_country_v3.html). Town GPS coordinates were identified through mechanisms such as slowing parasite growth or 8,9 from Google Maps (https://www.google.com/maps). reducing chances of developing severe malaria. To better assess the effects of these structural variants on resis- tance to malaria, there is a need for population surveys in influenced the overall allele frequencies which skewed the malaria-endemic regions across sSA to generate prevalence overall estimates. The data were therefore analyzed using data and identify phenotypes for conducting functional the main ethnic groups represented in the dataset which studies. However, surveys have been limited by the lack were at least 20 individuals for any given group. of reliable high-throughput assays for the identification of Of the 393 samples across the 3 study sites in Ghana, 325 such phenotypes. The challenge of designing high- came from 6 ethnic groups (Akan, Ewe, Ga, Konkomba, throughput PCR-based screening assays is due to the 926 Experimental Biology and Medicine Volume 246 April 2021 ............................................................................................................................................................... high sequence homology (>96%) between the GYPE, further optimization of the PCR conditions and also the GYPB, and GYPA genes.29,30 In this study, we overcame use of Sanger sequencing to validate the amplicons pro- this challenge and successfully designed two high through- duced by the PCRs. When the homozygous GYPB DEL1 put PCR-RFLP protocols that detect the presence of the two or DEL2 samples were amplified, the sequences obtained most common GYPB deletion variants in West African pop- could be seen to change from one reference sequence to ulations, GYPB DEL1 and DEL2. 
The assays were con- another and the paralogous sequence differences were firmed by Sanger sequencing and analyzing the PCR used to identify the region where the switch occurred products of the assays. Furthermore, these assays can be from GYPA to GYPE. Further sequence data would be performed in a 96-well plate format, followed by agarose- required to identify whether these breakpoint boundaries gel electrophoresis, making it possible to run over 90 sam- are the same for all GYPB DEL1 and DEL2 chromosomes ples in a single experiment. The assays were validated in and begin to understand whether the flanking sequences the field by comparing their performance with an existing were important for the mismatches during chromosome assay for the detection of the GYP Dantu variant (DUP4) in replication. In both GYPB DEL1 and DEL2 deletions, the screening 393 individuals from three study sites in Ghana. equivalent of a whole SDU was removed amounting to The specificity and sensitivity of the assays in the field dem- 100 kb each. onstrate their applicability as less time-consuming and less The performance of the two assay systems we have expensive options to long-read sequencing for conducting developed was evaluated by screening nearly 400 individ- large-scale studies on the population distribution of these uals from three different sites in Central to Southern Ghana. variants. Considering the ethnic diversity at each study site and the Details for the putative location of the breakpoints for ethnic group sample numbers, the overall allele frequency GYPB DEL1 and DEL2 originally came from a previous of GYPB DEL1 varied between 4.1% and 6.2%, while that of study that used whole-genome sequencing from the GYPB DEL2 varied between 1.3% and 5.0%. 
Work done by 1000G samples, plus additional African samples.9 This ini- Gassner et al.31 reported that allele frequencies of the other tial information was used to align the reference sequences three distinct deletions within African ethnicities varied forGYPA, GYPB, andGYPE across the putative breakpoints greatly, to the extent that among the Congolese Mbuti and identify the paralogous differences. Due to the high Pygmy populations, cumulative allele frequencies were as homology between the three GYP regions, identifying pri- high as 23.3%. The frequencies reported in our current mers that will specifically amplify each region was chal- study are similar to frequencies reported in other West lenging. It was however easy to design a single primer African populations.9 In this study, analysis of the main that could anchor the three genes and their surrounding ethnic groups with at least 20 individuals, showed the regions. These features were used to overcome the chal- allele frequencies of DEL1 and DEL2 varied between 1% lenge of developing an assay to differentiate between the and 12%, with GYPB DEL2 mostly lower than DEL1. It is three genes and their surrounding regions. For GYPB worth noting that the frequency estimation within ethnic DEL1, a forward primer was designed to bind to the groups where the sample sizes are below 100 (n¼ 200 chro- unique sequences near the GYPE transcription start site, mosomes) will require confirmation by screening larger while the counterpart (reverse primer) was common to sample sizes to have a high power of study and confidence. the three regions. In the case of DEL2, designing a specific In general, larger sample sizes would be required for DEL2 primer was more challenging because the breakpoint was surveys as the allele frequency was less than that of GYPB not close to any entirely unique region. To overcome this, DEL1. 
In all the ethnic groups with more than 20 individ- the DEL2 specific primer was placed at a location on the uals, we detected GYPB DEL1 and DEL2; however, non- gene cluster close to the GYPB DEL2 breakpoint with suf- DEL1 and non-DEL2 individuals could only be identified ficient sequence variation to allow the PCR conditions to by increasing the number of study participants across all discriminate between the sequences. In view of the fact that the ethnic groups. These two new assays will thus allow both the resulting reference and deletion amplicons for surveys on a larger scale to determine the distribution of each assay would be of the same length, restriction these two main variants in other West African populations enzymes were used to distinguish between them. The and also identify phenotypes for functional assays to inves- restriction enzymes AciI for the digestion of GYPB DEL1 tigate the effects of GYPB DEL1 and DEL2 on the suscepti- PCR product and BsrBI for GYPBDEL2 PCR products were bility of erythrocytes to being invaded by P. falciparum and identified and selected. Other restriction enzymes may the resulting impact on disease pathogenesis. work well but have not been explored or tested in this The ability to identify these genotypes of interest from study. One problem with using restriction enzymes is the large populations accurately and rapidly has been an possibility that the recognition sites themselves may con- important goal in high-throughput genetic screening tain population variation that could complicate the inter- assay development.32,33 Therefore, the current DEL1 and pretation of the assays; however, from current information DEL2 assays offer an opportunity for rapid screening of in genome-browsers and variation databases (dbSNP153), populations for these GYPB polymorphisms that are this variation appears to be uncommon for the restriction common especially in West Africa. Furthermore, for any sites used here. 
malaria-related studies, it is important to collect alongside The two assays were used to analyze DNA from the GYP variants, other key genetic information such as HapMap and 1000G cell-lines that had known GYPB sickle (rs334) which is present throughout Africa,34 HbC types identified in the Leffler et al. study.9 This allowed (rs33930165; present in Ghana and other West African Amuzu et al. Genotyping assays for identification of glycophorin B deletion variants 927 ............................................................................................................................................................... Countries),35 and G6PD,36 as these may act as confounders ((NMIMR-IRB CPN 004/11–12). Written informed consent in studies examining associations with susceptibility to was obtained from all the study participants or their malaria or effects on malaria parasite invasion and growth. parents/guardians in the case of the children. Developing less expensive high throughput assays tar- geting the less common GYP variants will provide a better FUNDING understanding of the distribution and functional effects of The author(s) disclosed receipt of the following financial sup- these variants on susceptibility and pathogenesis of malaria port for the research, authorship, and/or publication of this and other disease causing pathogens that also use GYPB as article: The study was supported by the DELTAS Africa a receptor. Such understanding may prove useful in guid- grant (DEL-15–007: Awandare). The DELTAS Africa Initiative ing the design of vaccines or other therapeutic interven- is an independent funding scheme of African Academy of tions targeting the pathogen interactions with these GYP Sciences (AAS)’s Alliance for Accelerating Excellence in proteins on the erythrocyte surface. 
These assays are also Science in Africa (AESA) which is supported by the New important for identifying individuals with the various gen- Partnership for Africa’s Development Planning and otypes of GYPB (homozygous, heterozygous and wild Coordinating Agency (NEPAD Agency) with funding from type), which is necessary for investigating the functional Wellcome (107755/Z/15/Z: Awandare) and the UK govern- significance of these gene variations. ment. The Malaria Genomic Epidemiology Network Resource Centre in Oxford is supported by Wellcome AUTHORS’ CONTRIBUTIONS: The research was conceptualized (090770/Z/09/Z; 204911/Z/16/Z: Kwiatkowski) and by GAA, DPK, KAR, DSYA, and EML; while AEJ, CH, KR, Wellcome also provides a core award to The Wellcome KAR, DSYA, and CMM curated the data; the formal analysis Centre for Human Genetics (203141/Z/16/Z) and the was done by DSYA, KAR, and EML; the funding for the Wellcome Sanger Institute (206194). research was acquired by GAA and DPK; the experiment were The funders were not involved in the research design, carried out by DSYA, KAR, AEJ, CH, and KR; GAA and DPK method development, data collection and analysis, manuscript administered the project; AEJ, CH, KR, KAR, DSYA, CMM, preparation, publication decision or choice of journal. The NA, FA provided resources for the project; GAA, DPK, LNA, opinions and interpretations expressed in this publication are and KAR supervised the project; validation and visualization those of the author(s) and not the funders. was done by DSYA and KAR; writing of the original manu- script draft was done by GAA, LNA, DSYA and KAR; the ORCID iDs manuscript was reviewed and edited by all the authors, DSYA, Dominic SYAmuzu https://orcid.org/0000-0002-8920-0708 KAR, EML, CMM, FA, NA, CH, KR, AEJ, LNA, DPK, and Kirk A Rockett https://orcid.org/0000-0002-6369-9299 GAA. 
Collins M Morang’a https://orcid.org/0000-0002-6988-150X Christina Hubbart https://orcid.org/0000-0001-9576-9581 ACKNOWLEDGEMENTS Anna E Jeffreys https://orcid.org/0000-0002-6454-3093 We thank all the study participants for consenting to be part of Gordon A Awandare https://orcid.org/0000-0002-8793- this study. 3641 SUPPLEMENTAL MATERIAL DATA AVAILABILITY The authors declare that the data supporting the findings of Supplemental material for this article is available online. this study are available within the manuscript and its Supplementary information files. Information on data access REFERENCES for the Leffler et al.9 study is available from the MalariaGEN resource at https://www.malariagen.net/resource/23. The 1. WHO. World malaria report 2018. Luxembourg: World Health Organizaiton, 2018, p. 210 sequence data generated for GYPB DEL1, DEL2 deletions can 2. CowmanAF, Crabb BS. Invasion of red blood cells bymalaria parasites. be found in the supplementary files. Cell 2006;124:755–66 3. Gaur D, Mayer DG, Miller LH. Parasite ligand–host receptor interac- DECLARATION OF CONFLICTING INTERESTS tions during invasion of erythrocytes by plasmodium merozoites. Int J Parasitol 2004;34:1413–29 The author(s) declared no potential conflicts of interest with 4. Tham W-H, Wilson DW, Lopaticki S, Schmidt CQ, Tetteh-Quarcoo PB, respect to the research, authorship, and/or publication of this Barlow PN, Richard D, Corbin JE, Beeson JG, Cowman AF. article. Complement receptor 1 is the host erythrocyte receptor for Plasmodium falciparum PfRh4 invasion ligand. Proc Natl Acad Sci USA ETHICAL APPROVAL 2010;107:17327–32 5. Kwiatkowski DP. How malaria has affected the human genome and This study was performed in accordance with the Declaration what human genetics can teach us about malaria. Am J Hum Genet of Helsinki. Ethical approval for two ongoing studies on gly- 2005;77:171–92 6. 
Ndila CM, Uyoga S, Macharia AW, Nyutu G, Peshu N, Ojal J, Shebe M, cophorins and malaria was granted by the Ethics Committee Awuondo KO, Mturi N, Tsofa B. Human candidate gene polymor- for Basic and Applied Sciences, College of Basic and Applied phisms and risk of severe malaria in children in kilifi, Kenya: a case- Sciences, University of Ghana (CPN: ECBAS 037/18–19) using control association study. Lancet Haematol 2018;5:e333–e45 adults, and the Noguchi Memorial Institute for Medical 7. Malaria Genomic Epidemiology N. Reappraisal of known malaria Research Institutional Research Board, University of Ghana resistance loci in a large multicenter study. Nat Genet 2014;46:1197–204 928 Experimental Biology and Medicine Volume 246 April 2021 ............................................................................................................................................................... 8. Band G, Rockett KA, Spencer CC, Kwiatkowski DP, Malaria Genomic 26. Polin H, Danzer M, Reiter A, Brisner M, Gaszner W, Weinberger J, Epidemiology NA novel locus of resistance to severe malaria in a Gabriel C. MN typing discrepancies based on GYPA-B-a hybrid. Vox region of ancient balancing selection. Nature 2015;526:253–7 Sang 2014;107:393–8 9. Leffler EM, Band G, Busby GB, Kivinen K, Le QS, Clarke GM, Bojang 27. Kariuki SN, Marin-Menendez A, Introini V, Ravenhill Bj Lin Y-C, KA, Conway DJ, Jallow M, Sisay-Joof F. Resistance to malaria through Macharia A, Makale J, Tendwa M, Nyamu W, Kotar J. Red blood cell structural variation of red blood cell invasion receptors. Science tension controls Plasmodium falciparum invasion and protects against 2017;356:eaam6393 severe malaria in the dantu blood group. bioRxiv 2018;475442 10. Malaria Genomic Epidemiology N. Insights into malaria susceptibility 28. Amoah LE, Opong A, Ayanful-Torgby R, Abankwa J, Acquah FK. 
using genome-wide data on 17,000 individuals from Africa, Asia and Prevalence of G6PD deficiency and Plasmodium falciparum parasites in Oceania. Nat Commun 2019;10:5732 asymptomatic school children living in Southern Ghana. Malar J 11. Jaskiewicz E, Jodlowska M, Kaczmarek R, Zerka A. Erythrocyte glyco- 2016;15:388 phorins as receptors for plasmodium merozoites. Parasit Vectors 29. Williams TN, Wambua S, Uyoga S, Macharia A, Mwacharo JK, Newton 2019;12:317 CR, Maitland K. Both heterozygous and homozygous alphaþ thalasse- 12. Eder AF, Spitalnik S. Blood group antigens as receptors for pathogens. mias protect against severe and fatal Plasmodium falciparum malaria on Molecular biology and evolution of blood group and MHC antigens in pri- the Coast of Kenya. Blood 2005;106:368–71 mates. Berlin: Springer, 1997, pp. 268–304 30. Willemetz A, Nataf J, Thonier V, Peyrard T, Arnaud L. Gene conversion 13. Matrosovich M, Herrler G, Klenk HD. Sialic acid receptors of viruses. events between GYPB and GYPE abolish expression of the S and s SialoGlyco chemistry and biology II. Berlin: Springer, 2013, pp. 1–28 blood group antigens. Vox Sang 2015;108:410–6 14. Kathan RH,Winzler RJ, Johnson CA. Preparation of an inhibitor of viral 31. Gassner C, Denomme GA, Portmann C, Bensing KM, Mattle- hemagglutination from human erythrocytes. J Exp Med 1961;113:37–45 Greminger MP, Meyer S, Trost N, Song Y-L, Engstro€m C, Jungbauer 15. Burness A. Glycophorin and sialylated components as receptors for viruses. C. Two prevalent 100-kb GYPB deletions causative of the Virus receptors. Berlin: Springer, 1981, pp. 63–84 GPB-Deficient blood group MNS phenotype S–s–U–in black Africans. 16. Camus D, Hadley TJ. A Plasmodium falciparum antigen that binds to Transfus Med Hemother 2020;47:326–36 host erythrocytes and merozoites. Science 1985;230:553–6 32. Faruqi AF, Hosono S, Driscoll MD, Dean FB, Alsmadi O, Bandaru R, 17. Orlandi PA, Klotz FW, Haynes JD. 
Amalaria invasion receptor, the 175- Kumar G, Grimwade B, Zong Q, Sun Z. High-throughput genotyping kilodalton erythrocyte binding antigen of Plasmodium falciparum recog- of single nucleotide polymorphisms with rolling circle amplification. nizes the terminal Neu5Ac(alpha 2-3)gal- sequences of glycophorin A. BMC Genom 2001;2:4 J Cell Biol 1992;116:901–9 33. Oliphant A, Barker DL, Stuelpnagel JR, Chee MS. BeadArray technol- 18. Duraisingh MT, Maier AG, Triglia T, Cowman AF. Erythrocyte-binding ogy: enabling an accurate, cost-effective approach to high-throughput antigen 175 mediates invasion in Plasmodium falciparum utilizing sialic genotyping. Biotechniques 2002;32:56–8 acid-dependent and-independent pathways. Proc Natl Acad Sci USA 34. Piel FB, Patil AP, Howes RE, Nyangiri OA, Gething PW, Williams TN, 2003;100:4796–801 Weatherall DJ, Hay SI. Global distribution of the sickle cell gene and 19. Mayer DC, Cofie J, Jiang L, Hartl DL, Tracy E, Kabat J, Mendoza LH, geographical confirmation of the malaria hypothesis. Nat Commun Miller LH. Glycophorin B is the erythrocyte receptor of Plasmodium 2010;1:104 falciparum erythrocyte-binding ligand, EBL-1. Proc Natl Acad Sci USA 35. Piel FB, Howes RE, Patil AP, Nyangiri OA, Gething PW, Bhatt S, 2009;106:5348–52 Williams TN, Weatherall DJ, Hay SI. The distribution of haemoglobin 20. Peterson DS, Wellems TE. EBL-1, a putative erythrocyte binding pro- C and its prevalence in newborns in Africa. Sci Rep 2013;3:1671 tein of Plasmodium falciparum, maps within a favored linkage group in 36. Clarke GM, Rockett K, Kivinen K, Hubbart C, Jeffreys AE, Rowlands K, two genetic crosses. Mol Biochem Parasitol 2000;105:105–13 Jallow M, Conway DJ, Bojang KA, Pinder M, Usen S, Sisay-Joof F, 21. Maier AG, Duraisingh MT, Reeder JC, Patel SS, Kazura JW, Sirugo G, Toure O, Thera MA, Konate S, Sissoko S, Niangaly A, Zimmerman PA, Cowman AF. 
Plasmodium falciparum erythrocyte inva- Poudiougou B, Mangano VD, Bougouma EC, Sirima SB, Modiano D, sion through glycophorin C and selection for gerbich negativity in Amenga-Etego LN, Ghansah A, Koram KA, Wilson MD, Enimil A, human populations. Nat Med 2003;9:87–92 Evans J, Amodu OK, Olaniyan S, Apinjoh T, Mugri R, Ndi A, Ndila 22. Thompson JK, Triglia T, Reed MB, Cowman AF. A novel ligand from CM, Uyoga S, Macharia A, Peshu N, Williams TN, Manjurano A, Plasmodium falciparum that binds to a sialic acid-containing receptor on Sepulveda N, Clark TG, Riley E, Drakeley C, Reyburn H, Nyirongo the surface of human erythrocytes. Mol Microbiol 2001;41:47–58 V, Kachala D, Molyneux M, Dunstan SJ, Phu NH, Quyen NN, Thai 23. Ashline DJ, Duk M, Lukasiewicz J, Reinhold VN, Lisowska E, CQ, Hien TT, Manning L, Laman M, Siba P, Karunajeewa H, Allen S, Jaskiewicz E. The structures of glycophorin C N-glycans, a putative Allen A, Davis TM, Michon P, Mueller I, Molloy SF, Campino S, component of the GPC receptor site for Plasmodium falciparum EBA- Kerasidou A, Cornelius VJ, Hart L, Shah SS, Band G, Spencer CC, 140 ligand. Glycobiology 2015;25:570–81 Agbenyega T, Achidi E, Doumbo OK, Farrar J, Marsh K, Taylor T, 24. Rahuel C, Elouet JF, Cartron JP. Post-transcriptional regulation of the Kwiatkowski DP, Malaria GENC. Characterisation of the opposing cell surface expression of glycophorins A, B, and E. J Biol Chem effects of G6PD deficiency on cerebral malaria and severe malarial 1994;269:32752–8 anaemia. Elife 2017;6:e15085 25. Chen J-M, Ferec C, Cooper DN. Gene conversion in human genetic disease. Genes 2010;1:550–63 (Received September 20, 2020, Accepted October 5, 2020)
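Several figures quoted in the Results follow from simple arithmetic: the allele frequencies in Table 6, the 2.5% minimum detectable allele frequency for groups with N ≥ 20, and the approximately 110 kb and 103 kb deletion sizes for DEL1 and DEL2. The following is a minimal sketch of those calculations, not the authors' analysis code. The function names are hypothetical. It assumes that failed samples (XX) are excluded from the allele-frequency denominator, and that the deleted length can be approximated by the offset between the start coordinates of the two paralogous breakpoint intervals in GRCh38.

```python
def allele_frequency(ii: int, di: int, dd: int) -> float:
    """Variant allele frequency (%) among successfully typed individuals.

    ii, di, dd are the counts of non-carriers, heterozygotes and homozygotes.
    Assumption: failed samples (XX) are left out of the denominator.
    """
    typed = ii + di + dd
    if typed == 0:
        raise ValueError("no successfully genotyped individuals")
    variant_alleles = di + 2 * dd  # one copy per heterozygote, two per homozygote
    total_alleles = 2 * typed      # two chromosomes per typed individual
    return 100.0 * variant_alleles / total_alleles


def min_detectable_frequency(n_individuals: int) -> float:
    """Smallest non-zero allele frequency (%) observable in n individuals,
    i.e. a single variant copy among 2n chromosomes."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return 100.0 / (2 * n_individuals)


def deletion_span_kb(upstream_start: int, downstream_start: int) -> float:
    """Approximate deleted length in kb.

    Assumption: the deletion joins the two paralogous breakpoint intervals,
    so its length is the offset between their start coordinates.
    """
    return (downstream_start - upstream_start) / 1000.0


if __name__ == "__main__":
    # Table 6, DEL1, Akan group in Accra: II = 63, DI = 11, DD = 0
    print(round(allele_frequency(63, 11, 0), 2))
    # Table 6, DEL1, Kintampo total: II = 119, DI = 11, DD = 2
    print(round(allele_frequency(119, 11, 2), 2))
    # Detection limit for a group of 20 individuals (40 chromosomes)
    print(min_detectable_frequency(20))
    # GRCh38 breakpoint interval starts quoted in the text for DEL1 and DEL2
    print(round(deletion_span_kb(143914016, 144024254)))
    print(round(deletion_span_kb(143991718, 144094974)))
```

Under these assumptions, the Akan group in Accra gives 11 variant alleles out of 148 typed chromosomes (7.43%) and the Kintampo total gives 15 out of 264 (5.68%), both matching Table 6. A group of 20 individuals gives a detection limit of 2.5%, and the coordinate offsets round to 110 kb and 103 kb.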