Overview of Molecular Epidemiological Methods for the Subtyping and Comparison of Viruses
Uses of molecular epidemiological methods
Subtyping
- in some viruses, different subtypes are associated with different clinical manifestations, e.g. enteroviruses, adenoviruses, and human papillomaviruses.
General Epidemiology
- by identifying the viral subtypes at different times and geographical locations, one can detect major changes in the epidemiological patterns of infection, e.g. HIV and HCV.
Investigation of Outbreaks
- to support or disprove a link bet
Methods Used - Complete or Partial Genome?
For the greatest degree of accuracy, the complete genome should be used for comparison.
However, since viral genomes range from 3,500 bp to over 200,000 bp, it would be highly impractical to sequence the whole genome.
Certain simple methods are still used for the comparison of complete genomes, e.g. RFLP for CMV, HSV, and adenoviruses.
Nowadays in practice, a small part of the genome is amplifi
Strategies for identification of the PCR Product (Commonly used methods)
Sequencing of the PCR product -
the gold standard, but expensive and not widely available.
PCR product may be sequenced directly or cloned before sequencing.
However, it is the test of choice in outbreak situations where there are
serious public health and/or medical-legal implications.
Sequencing can be used to confirm results of other molecular
epidemiological assays. As a matter of fact, all other assays can be considered as simpler screening assays.
Restriction Fragment Length Polymorphism (RFLP) -
a very simple, rapid and economical technique, but the results may be difficult to read.
Hybridization with a specific oligonucleotide probe -
A wide variety of
Principles behind Restriction Enzyme Analysis and Hybridization Probes
[Figure: a 100 bp target with an EcoRI site (GAATTC) at position 32, cut into fragments of 32 bp and 68 bp; hybridization probes binding the target.]
PCR-RFLP (PRA)
The gene target must be present in all viral strains.
It is amplified with primers directed against conserved areas in the target gene so that all subtypes can be amplified.
The PCR product is then digested with one or more restriction enzymes and run on an agarose or polyacrylamide gel.
The species or genotype is identified from the restriction patterns seen.
Therefore PRA can be considered as probably the simplest DNA fingerprinting technique.
The principle of PRA is similar to that of RFLP of whole viral genomes and pulsed-field gel electrophoresis.
It is quick, simple and cheap, which is why it is preferred by many molecular biologists.
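The digestion step can be mimicked in silico. The following is a minimal sketch, not a validated tool: the two 100 bp amplicons and the subtype names are made up for illustration, and the function searches one strand only, which is adequate for palindromic sites such as GAATTC.

```python
def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear amplicon at every site.

    cut_offset is the number of bases of the recognition site that stay on
    the left-hand fragment (1 for EcoRI, which cuts G^AATTC).
    """
    amplicon = amplicon.upper()
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    edges = [0] + cuts + [len(amplicon)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Hypothetical 100 bp amplicons for two made-up subtypes.
subtype_a = "C" * 31 + "GAATTC" + "C" * 63  # one EcoRI site, cut at position 32
subtype_b = "C" * 100                       # no EcoRI site

pattern_a = digest(subtype_a, "GAATTC", 1)  # two bands on the gel
pattern_b = digest(subtype_b, "GAATTC", 1)  # one uncut band
print(pattern_a, pattern_b)
```

Comparing the fragment-length lists plays the role of comparing band patterns on the gel; in practice fragments of similar size may not be resolved.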
Nature of Restriction Enzymes
4-cutter Enzymes (frequency of cutting = 1/256)
- TaqI: TCGA
- HaeIII: GGCC
- Sau96I: GGNCC
6-cutter Enzymes (frequency of cutting = 1/4096)
- EcoRI: GAATTC
- HindIII: AAGCTT
8-cutter Enzymes (frequency of cutting = 1/65536)
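The cutting frequencies above follow from 1/4^n for a site with n specified bases. As a rough sketch, assuming all four bases are equally likely and independent (real viral genomes deviate from this), the expected number of sites in a genome can be estimated as follows; the function name is illustrative only.

```python
def expected_sites(genome_length: int, site_length: int) -> float:
    """Expected number of recognition sites in a random sequence.

    Assumes equal, independent base frequencies, so a site with
    site_length specified bases occurs once every 4**site_length bp.
    """
    return genome_length / 4 ** site_length


# Genome sizes at the two extremes quoted earlier (3,500 bp and 200,000 bp).
for genome_length in (3_500, 200_000):
    for n in (4, 6, 8):
        print(genome_length, n, round(expected_sites(genome_length, n), 1))
```

This is why frequent cutters suit short PCR products, while rare cutters are more useful on large whole genomes.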
Specific Oligonucleotide Probe
Simple to carry out, particularly suitable for large-scale testing.
Results are usually easier to read than restriction enzyme analysis (REA) and require less skill to interpret.
Preferred strategy of commercial companies, e.g. INNO-LiPA HCV, Sorin DEIA, Roche Amplicor and TaqMan.
Can be made into a highly automated closed system, e.g. Roche Amplicor.
Therefore more attractive than PRA for the routine laboratory, but the costs could be prohibitive.
Specific nucleic acid probe assays are available where the specimen is
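The idea of typing by subtype-specific probes can be sketched as a simple string match. This is only an illustration: real hybridization depends on temperature, salt and probe thermodynamics, which are ignored here, and the probe sequences and subtype names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def probe_binds(amplicon: str, probe: str, max_mismatches: int = 0) -> bool:
    """True if the probe matches either strand with at most max_mismatches."""
    probe = probe.upper()
    for target in (amplicon.upper(), reverse_complement(amplicon)):
        for i in range(len(target) - len(probe) + 1):
            window = target[i:i + len(probe)]
            if sum(a != b for a, b in zip(window, probe)) <= max_mismatches:
                return True
    return False


def assign_subtype(amplicon: str, probes: dict[str, str],
                   max_mismatches: int = 0) -> list[str]:
    """Return the names of all probes that bind the amplicon."""
    return [name for name, probe in probes.items()
            if probe_binds(amplicon, probe, max_mismatches)]


# Hypothetical probes that differ at two positions.
probes = {"subtype_1": "GGATCCTTAGCA", "subtype_2": "GGATCGTTAGGA"}
amplicon = "TTT" + "GGATCCTTAGCA" + "AAA"
print(assign_subtype(amplicon, probes))
```

Raising max_mismatches loosens the stringency: with two mismatches allowed, both hypothetical probes would bind and the subtypes could no longer be told apart.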
Choice of Genomic Region
The choice of genomic region to use for analysis is critical and
could affect the outcome of results.
Too conserved – will not be able to demonstrate any differences
between subtypes.
Too variable – may not be able to demonstrate a link between
source and recipient viruses in outbreak studies because of the high mutation rate.
In general, RFLP is not suitable for highly conserved regions, while nucleic acid probes are not suitable for highly variable regions.
It is often advisable to use more than one gene region, especially
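One rough way to judge whether a candidate region is too conserved or too variable is the mean pairwise identity across a set of aligned sequences. The sketch below assumes the sequences are already aligned and of equal length; the three short sequences are invented for illustration.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_identity(aligned: list[str]) -> float:
    """Mean pairwise identity over all pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical aligned fragments of one region from three strains.
region = ["ACGTACGT", "ACGTACGA", "ACGTACCT"]
print(round(mean_identity(region), 3))
```

A value close to 1.0 suggests the region cannot separate subtypes, while a very low value suggests that even linked strains may look different.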
DNA Sequencing
DNA sequencing is the gold standard of molecular virological investigation.
Automated Sequencing facilities are now used in many
routine diagnostic laboratories.
The PCR product is directly sequenced without cloning.
A BLAST search of the sequence is then carried out against GenBank. This is normally good enough for identification.
For epidemiological investigations, a phylogenetic analysis
GenBank
GenBank is the most important bioinformatics resource; it may be accessed through Entrez or BLAST searches.
However, there are a number of problems.
Sequences are not refereed.
Many sequences are very old and were obtained by manual sequencing techniques that were not very accurate.
Many sequences are from strains of microorganisms that had not been well characterized.
A lot of the sequences deposited in GenBank are contaminated with plasmid vector sequences or PCR primer sequences.
Phylogenetic Analysis
DNA sequences of outbreak strains are compared to those of the
suspected source strains and reference strains.
A phylogenetic tree is drawn up with bootstrap resampling analysis.
Where the outbreak strains and source strains are similar, they should
be close together on the tree with a high bootstrap value.
The actual gene that is used and the length that is sequenced are critical.
Too conserved – will not be able to demonstrate any differences between
subtypes.
Too variable – may not be able to demonstrate a link between source and
recipient strains in outbreak studies because of the high mutation rate.
Most useful for RNA viruses such as HIV, HCV and Norwalk where
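Bootstrap resampling can be illustrated without building a full tree. The sketch below is a deliberately simplified stand-in for tree-based bootstrap analysis: it resamples alignment columns with replacement and counts how often the suspected source is the single closest sequence to the outbreak strain by p-distance. The sequences, names and parameter defaults are all hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_support(alignment: dict[str, str], outbreak: str, source: str,
                      replicates: int = 200, seed: int = 0) -> float:
    """Fraction of replicates in which `source` is uniquely closest to `outbreak`."""
    rng = random.Random(seed)
    length = len(alignment[outbreak])
    others = [name for name in alignment if name != outbreak]
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {name: "".join(seq[c] for c in cols)
                     for name, seq in alignment.items()}
        dists = {name: p_distance(resampled[outbreak], resampled[name])
                 for name in others}
        best = min(dists.values())
        if [name for name, d in dists.items() if d == best] == [source]:
            hits += 1
    return hits / replicates


# Hypothetical aligned sequences.
alignment = {
    "patient": "ACGTACGTACGTACGTACGT",
    "source": "ACGTACGTACGTACGTACGT",
    "reference": "TGCATGCATGCATGCATGCA",
}
support = bootstrap_support(alignment, "patient", "source")
print(support)
```

A proper analysis would build a tree for every replicate and report support for each node; the proportion computed here conveys only the resampling principle.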
Investigation of Norovirus Outbreak
[Figure: phylogenetic tree with bootstrap values at the nodes and a scale bar of 0.05. Taxa: Camber-Gp2, Lorsdale-2, HAWAII-Gp2, Waiter, SMA-Gp2B, NOR92UK-2A, NOR89JD-2A, TV24-Gp2, Melksham-2, D82330, Oyster, Patient3, Patient1, Patient2, SOV-Gp1, DSV-Gp1, M87611-Gp1, NOR89JB-Gp, NOR84J-Gp1.]
Summary
A wide variety of molecular epidemiological methods are available, of which DNA sequencing is the gold standard.
It is now usual to analyze a small part of the genome rather than the complete genome. The target fragment is first amplified by PCR before analysis.
The most widely used screening methods involve either restriction enzyme analysis or hybridization with specific nucleic acid probes, or a combination of the two.
Other screening methods such as SSCP, dHPLC and other heteroduplex analysis techniques are rarely used outside a research setting because they often suffer from poor inter-laboratory reproducibility.
The choice of the genomic region to use is critical: it is often advisable to use more than one genomic region.
It is important to remember that all molecular epidemiological methods available
Points to Consider
Molecular epidemiology techniques can be used to good effect to disprove a link between donor and recipient strains of a particular infectious agent, but they cannot prove a definitive link.
Therefore a negative result has much greater predictive value than a positive result.
The probability of a link depends on many factors including the prevalence of that particular genotype and the methods used.
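The dependence on genotype prevalence can be made concrete with a simple Bayesian sketch. It rests on strong assumptions that are not part of the methods described here: linked strains always match, an unlinked strain matches by chance with probability equal to the genotype prevalence, and the prior probability of a link is supplied by the investigator.

```python
def posterior_link(prior: float, prevalence: float, match: bool) -> float:
    """Posterior probability of a true link given a genotype comparison.

    Assumes P(match | link) = 1 and P(match | no link) = prevalence.
    """
    if not match:
        # Under these assumptions a mismatch rules out a link.
        return 0.0
    return prior / (prior + (1 - prior) * prevalence)


# Same prior, common versus rare genotype (illustrative numbers only).
common = posterior_link(prior=0.5, prevalence=0.5, match=True)
rare = posterior_link(prior=0.5, prevalence=0.01, match=True)
print(round(common, 3), round(rare, 3))
```

A match involving a common genotype barely raises the probability of a link, whereas a mismatch excludes it outright under these assumptions.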
Where the outbreak carries huge medical-legal implications e.g. HIV transmitted through blood factors, the case would have to be argued on an individual basis in court, preferably with the help of a statistician. It is important to remember that molecular epidemiological