Introduction to Protozoan Infections
2.7 Genomic and post-genomic exploration of protozoan biology
In 1996, the genome of Saccharomyces cerevisiae was published in Science, heralding the arrival of whole-genome sequencing for eukaryotes. It was six years before the first parasitic protozoan genome (P. falciparum) was completed but, within three years, seven more parasite genomes had joined P. falciparum in GenBank. Additional projects rapidly joined the queues at the sequencing centres and, to date, there are over 20 completed genomes from parasitic protozoa (Table 2.2), as well as numerous genome sequences of free-living species.
Many of these projects proved extremely challenging because of genome composition (e.g. the high AT content of Plasmodium spp.) or content (more than 50 per cent of the T. cruzi genome consists of repetitive elements). However, the effort to acquire the genome sequences has borne numerous fruits, increasing our knowledge of the biology of these organisms, as well as providing numerous novel drug or vaccine targets to the research community. In addition, it is now common for several pathogens within a single group to have fully sequenced genomes, and this has kick-started the era of comparative genomic analysis, facilitating the exploration of phylum- and genus-specific biology.
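As an illustration of what "genome composition" means in practice, the following is a minimal sketch of how AT content can be computed for a nucleotide sequence. The function name `at_content` and the two toy fragments are hypothetical and chosen only for illustration; they are not real genomic sequence, and genome projects use dedicated tools for such statistics.

```python
def at_content(sequence: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are A or T.

    Ambiguous symbols such as N are ignored rather than counted.
    """
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return at / total


# Hypothetical toy fragments, not taken from any real genome.
at_rich = "ATATTTAATTATAAATTTGA"
balanced = "ATGCGCATTAGCCGATGCTA"

print(f"AT-rich fragment:  {at_content(at_rich):.2f}")
print(f"Balanced fragment: {at_content(balanced):.2f}")
```

A strongly skewed composition of this kind reduces sequence complexity, which is one reason such genomes are harder to clone, sequence and assemble.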
Table 2.2 A selection of protozoan genome projects that have been completed (C) or are nearing completion (P). These are listed along with the research institution(s) contributing to each project and associated references. A large number of protozoan gene discovery and transcriptome-based projects have also been published, but they are too numerous to list here.
| Species | Model | Status | Sequencing institute | References |
| --- | --- | --- | --- | --- |
| Amoebozoa | | | | |
| Entamoeba histolytica | Human pathogen | C | TIGR, WTSI | Loftus, B et al. (2005). The genome of the protist parasite Entamoeba histolytica. Nature 433(7028), 865–868. |
| E. dispar | Commensal and model Entamoeba species | P | JCVI | |
| E. invadens | Reptile pathogen and model Entamoeba species | P | JCVI | |
| Acanthamoeba castellanii | Human pathogen | P | HGSC | |
| Excavata | | | | |
| Giardia lamblia (assemblage A) | Human pathogen | C | KI | Morrison, HG et al. (2007). Genomic minimalism in the early diverging intestinal parasite Giardia lamblia. Science 317(5846), 1921–1926. |
| Giardia intestinalis (assemblage B) | Human pathogen | C | WTSI | Franzen, O et al. (2009). Draft genome sequencing of Giardia intestinalis assemblage B isolate GS: is human giardiasis caused by two different species? PLoS Pathog 5(8), e1000560. |
| Giardia intestinalis (assemblage E) | Human pathogen | C | KI | Jerlstrom-Hultqvist, J et al. (2010). Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate. BMC Genomics 11, 543. |
| Trichomonas vaginalis | Human pathogen | C | TIGR/JCVI | Carlton, JM et al. (2007). Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315(5809), 207–212. |
| Naegleria gruberi | Model Naegleria sp. | C | CIG | Fritz-Laylin, LK et al. (2010). The genome of Naegleria gruberi illuminates early eukaryotic versatility. Cell 140(5), 631–642. |
| Leishmania braziliensis | Human pathogen | C | WTSI | Peacock, CS et al. (2007). Comparative genomic analysis of three Leishmania species that cause diverse human disease. Nature Genetics 39(7), 839–847. |
| L. donovani | Human pathogen | P | WTSI | |
| L. infantum | Human pathogen | C | WTSI | Peacock, CS et al. (2007). Comparative genomic analysis of three Leishmania species that cause diverse human disease. Nature Genetics 39(7), 839–847. |
| L. major | Human pathogen | C | WTSI | Ivens, AC et al. (2005). The genome of the kinetoplastid parasite, Leishmania major. Science 309(5733), 436–442. |
| Trypanosoma brucei | Human pathogen | C | WTSI/TIGR | Berriman, M et al. (2005). The genome of the African trypanosome Trypanosoma brucei. Science 309(5733), 416–422. |
| T. cruzi | Human pathogen | C | TIGR | El-Sayed, NM et al. (2005). The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. Science 309(5733), 409–415. |
| Harosa | | | | |
| Babesia bovis | Major cattle pathogen and model of B. microti | C | WSU-CVM | Brayton, KA et al. (2007). Genome sequence of Babesia bovis and comparative analysis of apicomplexan hemoprotozoa. PLoS Pathogens 3(10), 1401–1413. |
| Cryptosporidium parvum | Opportunistic human pathogen | C | BMGC | Abrahamsen, MS et al. (2004). Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 304(5669), 441–445. |
| C. hominis | Opportunistic human pathogen | C | BMGC | Xu, P et al. (2004). The genome of Cryptosporidium hominis. Nature 431(7012), 1107–1112. |
| C. muris | Rodent model of Cryptosporidium infection | P | TIGR/JCVI | Heiges, M et al. (2006). CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Research 34(Database issue), D419–422. |
| Plasmodium berghei | Rodent model of malaria infection | C | WTSI | Hall, N et al. (2005). A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses. Science 307(5706), 82–86. |
| P. chabaudi | Rodent model of malaria infection | C | WTSI | Hall, N et al. (2005). A comprehensive survey of the Plasmodium life cycle by genomic, transcriptomic, and proteomic analyses. Science 307(5706), 82–86. |
| P. falciparum | Human pathogen | C | SGTC, TIGR and WTSI | Summarised in Gardner, MJ et al. (2002). Genome sequence of the human malaria parasite Plasmodium falciparum. Nature 419(6906), 498–511. |
| P. knowlesi | Primate model of malaria infection. Can also infect humans. | C | WTSI | Pain, A et al. (2008). The genome of the simian and human malaria parasite Plasmodium knowlesi. Nature 455(7214), 799–803. |
| P. ovale | | P | WTSI | |
| P. vivax | Human pathogen | C | TIGR | Carlton, JM et al. (2008). Comparative genomics of the neglected human malaria parasite Plasmodium vivax. Nature 455(7214), 757–763. |
| P. yoelii yoelii | Rodent model of malaria infection | C | TIGR | Carlton, JM et al. (2002). Genome sequence and comparative analysis of the model rodent malaria parasite Plasmodium yoelii yoelii. Nature 419(6906), 512–519. |
| Toxoplasma gondii | | C | TIGR/JCVI | Unpublished but summarised in Gajria, B et al. (2008). ToxoDB: an integrated Toxoplasma gondii database resource. Nucleic Acids Research 36(Database issue), D553–556. |
| Blastocystis hominis | | C | GSP | Denoeud, F et al. (2011). Genome sequence of the stramenopile Blastocystis, a human anaerobic parasite. Genome Biology 12(3), R29. |
Abbreviations: CIG, Center for Integrative Genomics, University of California, Berkeley, CA, USA; BMGC, Biomedical Genomics Center, University of Minnesota, St. Paul, MN, USA; GSP, Genoscope (CEA) and CNRS UMR 8030, Université d'Evry, Evry, France; HGSC, Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA; JCVI, J. Craig Venter Institute, Rockville, MD, USA; KI, Karolinska Institute, Stockholm, Sweden; SGTC, Stanford Genome Technology Center, Stanford School of Medicine, Palo Alto, CA, USA; TIGR, The Institute for Genomic Research (now J. Craig Venter Institute), Rockville, MD, USA; WSU-CVM, Washington State University, College of Veterinary Medicine, Pullman, WA, USA; WTSI, Wellcome Trust Sanger Institute, Cambridge, UK.
From these studies, several global observations have emerged that are relevant to those interested in the immunology of these organisms.
First, many of these organisms have large gene families encoding species-specific surface antigens. Many of these gene families are novel and were identified during genome annotation. It is unclear what their functions are, or what proportion of each family is used by the parasite at any given time. However, these data suggest that, like T. brucei, many protozoan parasites may frequently vary their antigenic determinants as an immune-avoidance mechanism. Surprisingly, many of these organisms have opted to organise these gene families in large sub-telomeric arrays. In T. brucei, expression of a clone-specific variant surface glycoprotein (VSG) is controlled by its position within the genome. Whether other parasites have independently developed analogous mechanisms to regulate the expression of polymorphic surface antigens is being explored.
Second, for some species, the genomes of a number of different strains or isolates have been (or are being) sequenced, or genomic data have facilitated large-scale genotyping efforts. Assessments from these projects have hinted at high levels of genetic diversity within some protozoan populations. These two observations have important implications for the development of vaccines, which will need either to target highly conserved antigens or to contain polyvalent components that are effective against a broad spectrum of the parasite population.
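To make the notion of genetic diversity among isolates concrete, the sketch below computes nucleotide diversity: the average proportion of sites that differ between pairs of aligned sequences. This is one common summary statistic, not necessarily the measure used in the projects referred to above. The function name and the three short "isolate" sequences are hypothetical and for illustration only.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average per-site proportion of differences over all sequence pairs.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pair_fractions = []
    for a, b in combinations(aligned, 2):
        # Count positions at which the two sequences differ.
        differences = sum(1 for x, y in zip(a, b) if x != y)
        pair_fractions.append(differences / length)
    return sum(pair_fractions) / len(pair_fractions)


# Hypothetical aligned fragments from three isolates (toy data).
isolates = ["ATGCATGCAT", "ATGCATGCAA", "ATGTATGCAT"]
print(f"{nucleotide_diversity(isolates):.3f}")
```

Applied to a candidate antigen gene, a low value would be consistent with a conserved target, whereas a high value would point towards the need for a polyvalent formulation.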
Raw genome data can be a tremendous resource and have stimulated many new avenues of enquiry in the protozoan research community; indeed, studies building on this platform hold the prospect of a number of important advances. For instance, transcriptomic and proteomic analyses have become important tools, allowing parasite gene expression and/or protein content to be assayed at any life cycle stage or from limited clinical samples. Subcellular proteomics now allows the biology of important organelles, such as the apical complex, hydrogenosomes or apicoplasts, to be explored at a level of detail not previously possible.
Most importantly, methods for genetically manipulating many of the major protozoan parasites are now routinely used. This contrasts with the helminths, where, despite many years of research, these techniques are still being developed. In a few species, such as T. brucei, transgenesis and RNA interference (RNAi) are efficient techniques for over-expressing genes or down-regulating their expression, allowing the function of potential pathogenicity factors and immune evasion genes to be directly interrogated. This, in combination with the ease of genetic manipulation of relevant model hosts such as mice, gives immunologists an unprecedented opportunity to explore in detail how specific parasite or host genes shape the course of infection and where they offer potential intervention points.