Sites under Positive Selection using Datamonkey

CHAPTER 3: Detection of Positive Selection Pressure in Acute Phase HIV Positive Treatment

3.4 Results

3.4.3 Sites under Positive Selection using Datamonkey

Figure 3.1: Positive selection detection for Sanger and UDP Sequences

Numbers in the circles indicate the number of positive sites.

Figure 3.1 shows that, of the conventional methods used by the HyPhy package in Datamonkey, the REL method detected the most sites for positive selection. It did so consistently for both the SS and UDP sequence alignments and across all four models of substitution used to analyse the alignments at the 5% level of significance. In the SS alignments, REL identified five possible sites for positive selection using the F81 model and four sites for each of the HKY85, TrN93 and REV models. For the UDP alignments, REL identified five sites for each of F81 and REV, and four sites for each of HKY85 and TrN93.

Internal branch FEL (iFEL), on the other hand, detected only one site for positive selection in UDP alignments using MEME, and did not detect any sites in the SS alignments. The SLAC and FEL methods did not detect any sites for positive selection. A summary of the positively selected sites identified in pol using REL, MEME and iFEL, together with the total frequencies detected by REL and MEME across the models of substitution, is illustrated in Figure 3.2.

[Figure 3.1 plot: y-axis, positive sites (n); x-axis, model of substitution (F81, HKY, TRN, REV), shown separately for the Sanger and ultradeep alignments; legend: iFEL, MEME, REL.]

Figure 3.2: REL, MEME and iFEL positively selected sites in pol across F81, HKY85, TrN93 and REV models of substitution

i) The table on the left has three broad categories of comparison (REL, MEME and iFEL in the top row). Each of these has, as subcategories, the substitution models that detected positive sites for it (F81, HKY, TRN and REV). These comparison categories were used for both SS and UDP sequence alignments (highlighted rows). The first column of the table indicates the codon sites that were identified by the entire HyPhy analysis at the 95% CI.

ii) The figure on the right is a summary of all the positive sites detected by either the REL method or the MEME method across all four models of substitution (i.e. it combines both SS and UDP sites for each of the two methods).

Using the REL method, F81 detected six unique sites for positive selection overall, where a site common to both SS and UDP is counted as one unique site (see Figure 3.2). Four of these, namely Pr19, Pr63, RT123 and RT169, were common to both SS and UDP alignments. Like F81, TrN93 identified six unique sites for positive selection. However, it detected only RT positions 36 and 169 as sites common to both SS and UDP alignments. REV detected five unique sites for positive selection, four of which were common to both SS and UDP alignments, specifically Pr19, RT36, RT169 and RT196. HKY85 also identified five unique sites, three of which were common to both SS and UDP, namely Pr63, RT36 and RT169.
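The unique-site totals for REL can be checked by inclusion-exclusion: sites found in SS plus sites found in UDP, minus sites common to both. The following is a minimal sketch using only the counts reported in the text; the names `rel_counts` and `unique_sites` are illustrative and not part of HyPhy or Datamonkey.

```python
# Counts reported in the text for the REL method, per substitution model:
# (sites in SS alignments, sites in UDP alignments, sites common to both)
rel_counts = {
    "F81": (5, 5, 4),
    "HKY85": (4, 4, 3),
    "TrN93": (4, 4, 2),
    "REV": (4, 5, 4),
}


def unique_sites(n_ss, n_udp, n_common):
    """Inclusion-exclusion: a site seen in both alignments is counted once."""
    return n_ss + n_udp - n_common


unique_by_model = {
    model: unique_sites(*counts) for model, counts in rel_counts.items()
}

for model, n in unique_by_model.items():
    print(f"{model}: {n} unique sites")
```

The results agree with the totals stated above: six unique sites for F81 and TrN93, and five for HKY85 and REV.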

The MEME method detected the same number of positively selected sites across all models of substitution (i.e. adding both SS and UDP sites). All of the identified sites were in RT, with positions 36 and 207 common to both SS and UDP alignments. Internal branch FEL, on the other hand, detected Pr19 in UDP alignments as the only site for diversifying selection.

Reverse transcriptase codon 169 was the only site selected by all substitution models using the REL method for both SS and UDP alignments (refer to the diagram on the right in Figure 3.2). This was followed by RT36, which was selected seven times (four times in SS and three times in UDP). Protease codon 19 was selected three times each in SS and UDP, while RT196 was selected three times in SS and twice in UDP. Protease 63 and RT123 were each identified four times.

In MEME, both RT36 and RT207 were identified as positive sites across all four models of substitution in both SS and UDP alignments. Reverse transcriptase codon 195 was also identified as a positive site, but only in the SS alignments.

Table 3.6 shows the collective AA mutations in either Pr or RT corresponding to the AA sites identified in the preceding figure. Although Table 3.6 may appear to present data similar to those in Figure 3.2, its purpose is to relate the observed positively selected sites to their closest known SDRM.

Table 3.6: Positive selection sites in Pr and RT for REL, MEME and iFEL for SS and UDP sequences and nearest known Drug Resistance Site

Codon site  REL SS  REL UDP  MEME SS  MEME UDP  iFEL UDP  Mutation(s), wt∆mut*  Nearest DRM site, wt∆mut**
Pr19        √       √        -        -         √         T19VIL                K20TV
Pr63        √       √        -        -         -         L63HPSVT              Q58E
RT36        √       √        √        √         -         A36E                  E40F
RT39        -       √        -        -         -         E39DKT                E40F
RT123       √       √        -        -         -         G123SND               V118I
RT169       √       √        -        -         -         E169DA                V179DEFL
RT195       -       -        √        -         -         I195N                 G190ASEQ
RT196       √       √        -        -         -         G196ER                G190ASEQ
RT207       -       -        √        √         -         E207ATK               L210W

* wt AAs based on Ref.C.ZA.04.04ZASK146.AY772699 consensus sequence; **DRMs taken from Stanford University HIV Drug Resistance Database (2017)

Observed mutations in Pr were T19VIL and L63HPSVT. Protease position 63 was the most highly variable locus of all the positively selected sites, toggling between six residues (i.e. wild-type plus the five mutations). Several mutations were observed in RT. These were A36E, E39DKT, G123SND, E169DA, I195N, G196ER and E207ATK. The more conserved sites were at positions 36 and 195. Reverse transcriptase position 36 kept toggling between A and E, whereas there was only a single mutation in one of the SS alignments at position 195. The more volatile sites for positive selection in RT were E39DKT, G123SND and E207ATK.
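The variability comparison above can be reproduced from the mutation labels in Table 3.6. The sketch below assumes the usual reading of the wt∆mut notation: the first letter is the wild-type residue, the digits are the codon position, and each remaining letter is one observed mutant residue. The function names `parse_mutation` and `n_states` are illustrative only.

```python
import re

# Mutation labels as listed in Table 3.6, keyed by codon site.
mutations = {
    "Pr19": "T19VIL",
    "Pr63": "L63HPSVT",
    "RT36": "A36E",
    "RT39": "E39DKT",
    "RT123": "G123SND",
    "RT169": "E169DA",
    "RT195": "I195N",
    "RT196": "G196ER",
    "RT207": "E207ATK",
}

# Assumed notation: wild-type letter, position digits, then mutant letters.
PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]+)$")


def parse_mutation(label):
    """Split a label such as 'L63HPSVT' into (wild type, position, mutants)."""
    match = PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, position, mutants = match.groups()
    return wild_type, int(position), list(mutants)


def n_states(label):
    """Number of distinct residues at the site: wild type plus mutants."""
    wild_type, _, mutants = parse_mutation(label)
    return len({wild_type, *mutants})


states = {site: n_states(label) for site, label in mutations.items()}

# List sites from most to least variable.
for site, count in sorted(states.items(), key=lambda item: -item[1]):
    print(f"{site}: {count} residues")
```

Under this reading, Pr63 has six residues (wild type plus five mutants), while RT36 and RT195 have two each, in line with the description of these as the most variable and the more conserved sites respectively.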

From the document: Sequence analysis of an HIV-1 subtype C acutely infected cohort from Durban, South Africa (pages 71-74)