Materials and Methods - Teal

is not affected by oxidative stress conditions and that there is no homolog to SoxS [94].

These recent results suggest that endogenously produced small molecules may be ubiquitous and regulate a variety of systems. In particular, it appears that the E. coli SoxRS paradigm does not hold for microorganisms outside of the family of enterics, and that SoxR may instead act as a response regulator under conditions of low nutrients and high cell density, such as in biofilms. To gain insight into the functionality of SoxR across all types of bacterial species, we conducted bioinformatic analyses to determine the binding sites of SoxR and the genes that are potentially regulated by SoxR. Our binding site predictions matched published data in P. aeruginosa, P. putida, E. coli, and S. enterica [34, 69, 93, 94], and we demonstrate that the majority of genes regulated by SoxR do not match the E. coli paradigm, but are similar to those found in P. aeruginosa. SoxS is not found outside of the enterics; instead, the genes regulated by SoxR are transporters and oxidoreductases. Experiments with Streptomyces coelicolor confirm our binding site predictions and show that these genes are activated in the presence of the endogenously produced antibiotic actinorhodin, suggesting that the more pervasive use of SoxR in the bacterial domain may be as a response regulator to endogenous redox-active compounds.

Mer family. Therefore, the sequences were checked to determine if they had four cysteines in the C-terminal domain. Alignment of the SoxR proteins was made using ClustalW and displayed using CLC Protein Workbench 2.
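The cysteine check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used: the function names, the 40-residue window taken as the "C-terminal domain", and the choice to accept four or more cysteines are all assumptions made for the example.

```python
def count_c_terminal_cysteines(protein, window=40):
    """Count cysteines (C) in the last `window` residues of a protein.

    `window` is a hypothetical stand-in for the C-terminal domain length.
    """
    return protein[-window:].upper().count("C")


def has_four_cysteines(protein, window=40, required=4):
    """Return True if the C-terminal window holds at least `required` cysteines."""
    return count_c_terminal_cysteines(protein, window) >= required
```

A candidate that fails this check would be treated as another member of the Mer family rather than as SoxR.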

6.3.2 Genomes

Bacterial genomes and protein table files were obtained for all genomes completed through April 29, 2007, from ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.fna.tar.gz (genomes) and ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/all.ptt.tar.gz (protein tables). The genome for Pseudomonas aeruginosa PAO1 was obtained from http://www.pseudomonas.com/, the updated Pseudomonas aeruginosa PA14 DNA Sequence (2005-Sept-27). Genomes searched were those containing SoxR. A list of these genomes is available at:

http://idyll.org/~tracyt/sox/list-genomes.txt

6.3.3 Soxbox Matrix Construction

SoxR binds to a near-palindromic sequence known as the soxbox. Three 26 bp soxbox sites from Kobayashi and Tagawa, 2004 [69] and three sites known to be regulated by SoxR in Pseudomonas aeruginosa PAO1 were used as the initial sequences for creating a matrix:

E. coli (from Kobayashi and Tagawa 2004) TTTACCTCAAGTTAACTTGAGGAATT; Xanthomonas axonopodis (from Kobayashi and Tagawa 2004) TTGACCTCAACTTAGGTTGAGGCAGG; Chromobacterium axonopodis (from Kobayashi and Tagawa 2004) TTGACTTCAAGTTAACTTGAACTTTG; P. aeruginosa upstream of PA2274 (from Pseudomonas.com) TTGACCTCAAGTTTGCTTGAGGTTTT; P. aeruginosa upstream of PA4205 (from Pseudomonas.com) TTGACCTCAACTTAACTTGAGGTTTT; and P. aeruginosa upstream of PA3718 (from Pseudomonas.com) TTTACCTCAAGTTAACTTGAGCTATC.

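An energy matrix for this binding site was created as in Brown and Callan, 2003 [17], from which the total binding energy E of a site could be calculated:

$$E = \sum_{i=1}^{L} \sigma_{b(i),i}$$

where $L$ is the length of the site and $b(i)$ is the base found at position $i$.

To calculate the position energy $\sigma_{b,i}$ for each position $i$ in a site, the number of occurrences $N_i(b)$ of each DNA base $b$ at that position in the list of $N$ sites is counted. Each matrix element is calculated using

$$\sigma_{b,i} = \ln\frac{N + 1}{N_i(b) + 1}$$

The values at each position are then shifted so that the lowest-energy base has an energy of zero, so that any site differing from the lowest-energy sequence has E > 0 [8, 9].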

As an example:

Given a position Z at which the six sites have the bases C, C, C, T, C, and C, so that $N = 6$:

$$A = \ln\frac{6 + 1}{0 + 1} = 1.946$$

$$G = \ln\frac{6 + 1}{0 + 1} = 1.946$$

$$C = \ln\frac{6 + 1}{5 + 1} = 0.154$$

$$T = \ln\frac{6 + 1}{1 + 1} = 1.253$$

Normalizing to zero, 0.154 is the lowest energy value, so 0.154 is subtracted from the value for each base, such that A = 1.792, G = 1.792, C = 0, and T = 1.099.

The program openfill.py [17] implements this algorithm and was used to create the matrix, soxbox-sequences-long PWM.open, from the initial set of soxbox binding sites. openfill.py takes as input the list of known sites and generates an energy matrix as output.
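The matrix construction can be illustrated with a short Python sketch. This is not the code of openfill.py; it is a minimal reimplementation that assumes each element is ln((N + 1) / (N_i(b) + 1)), shifted so the lowest value at each position is zero, as in the worked example. The names `SITES`, `build_energy_matrix`, and `site_energy` are hypothetical.

```python
import math

# The six 26 bp soxbox sites listed in Section 6.3.3.
SITES = [
    "TTTACCTCAAGTTAACTTGAGGAATT",  # E. coli
    "TTGACCTCAACTTAGGTTGAGGCAGG",  # Xanthomonas axonopodis
    "TTGACTTCAAGTTAACTTGAACTTTG",  # Chromobacterium
    "TTGACCTCAAGTTTGCTTGAGGTTTT",  # upstream of PA2274
    "TTGACCTCAACTTAACTTGAGGTTTT",  # upstream of PA4205
    "TTTACCTCAAGTTAACTTGAGCTATC",  # upstream of PA3718
]


def build_energy_matrix(sites):
    """Return one {base: energy} dict per position of the aligned sites."""
    n = len(sites)
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        # Pseudocount of 1 in numerator and denominator, as in the example.
        raw = {b: math.log((n + 1) / (column.count(b) + 1)) for b in "ACGT"}
        lowest = min(raw.values())
        # Shift so the most frequent base at this position has energy 0.
        matrix.append({b: e - lowest for b, e in raw.items()})
    return matrix


def site_energy(matrix, site):
    """Total energy E of a site: the sum of its per-position energies."""
    return sum(matrix[i][b] for i, b in enumerate(site))
```

Calling `build_energy_matrix(["C", "C", "C", "T", "C", "C"])` reproduces the single-position example above, and `site_energy(build_energy_matrix(SITES), SITES[0])` scores one of the known sites against the matrix built from all six.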

6.3.4 Genomic Distribution of Sites.

A modified version of the program pyscangen.py [17] was used to find the energy distribution of all sites in all genomes containing SoxR. This program takes as input a genome, its protein table (the NCBI ptt file containing information on the bounding coordinates and the names of all coding regions, or open reading frames, in that genome), the energy matrix, the initial list of known sites, and a binding energy threshold value. pyscangen.py determines the number of sites that occur in genes and the number of sites that occur in intergenic regions. These statistics are of interest because functional binding sites should be enriched in intergenic regions.
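The genic versus intergenic tally can be sketched as follows. This is not pyscangen.py itself: the function names are hypothetical, the parser assumes only that each gene line of a ptt file begins with a `start..end` location followed by a tab, and assigning a site by its centre coordinate is an assumption chosen for simplicity.

```python
import re


def parse_ptt_locations(lines):
    """Collect (start, end) gene coordinates from ptt-style lines.

    Header lines that do not begin with 'start..end<TAB>' are skipped.
    """
    genes = []
    for line in lines:
        match = re.match(r"(\d+)\.\.(\d+)\t", line)
        if match:
            genes.append((int(match.group(1)), int(match.group(2))))
    return sorted(genes)


def count_genic_intergenic(site_starts, site_length, genes):
    """Return (genic, intergenic) counts, classifying each site by its centre."""
    genic = 0
    for start in site_starts:
        centre = start + site_length // 2
        if any(g_start <= centre <= g_end for g_start, g_end in genes):
            genic += 1
    return genic, len(site_starts) - genic
```

The linear scan over genes keeps the sketch short; for whole genomes an interval index would be the more practical choice.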

6.3.5 Binding-site Search.

A modified version of the program pyscangenes.py [17] was used to find the binding sites for SoxR and the genes up and downstream of the predicted binding sites in each genome containing SoxR. The program takes as input the energy matrix, a genome and a cutoff value. The output is a list of binding sites below the cutoff E value with their energy value and the predicted genes up and downstream of the binding site.
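A sliding-window scan of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the pyscangenes.py code: the matrix is taken to be a list of `{base: energy}` dicts (one per position), both strands are scored, windows containing non-ACGT characters are skipped, and the lookup of neighbouring genes is omitted. All names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_sequence(matrix, genome, cutoff):
    """Return (position, strand, site, energy) for every window with E < cutoff.

    Hits are sorted from lowest to highest energy.
    """
    width = len(matrix)
    hits = []
    for pos in range(len(genome) - width + 1):
        window = genome[pos:pos + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip ambiguous bases such as N
        for strand, site in (("+", window), ("-", reverse_complement(window))):
            energy = sum(matrix[i][base] for i, base in enumerate(site))
            if energy < cutoff:
                hits.append((pos, strand, site, energy))
    return sorted(hits, key=lambda hit: hit[3])
```

Because the soxbox is nearly palindromic, a strong site will often score below the cutoff on both strands at the same position, so downstream analysis may want to merge such pairs.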

6.3.6 Determination of Background.

A background model was used to determine how many binding sites from a particular energy matrix would be predicted to occur at random in a given genome. Using the program, background-intergenic.py, the GC and AT content for the intergenic region of each genome was calculated. We used a theoretical genome model to calculate the number of predicted sites expected in a genome of that size and GC composition with the given matrix at that threshold. The input for this program is the energy matrix, a genome and a cutoff value. The output is the number of sites that were predicted to occur at random in that genome below a given cutoff (Brown, 2005, unpublished, http://cartwheel.idyll.org/).
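One way to estimate such a background is sketched below. The theoretical genome model used by background-intergenic.py is not described here, so this example assumes the simplest possible one: independent bases drawn with frequencies set by the GC content. Under that assumption the probability that a random window scores below the cutoff can be computed by convolving the per-position energy distributions. The function names, the 0.01 energy binning, and the factor of two for both strands are assumptions.

```python
def base_frequencies(gc_content):
    """Base frequencies for an i.i.d. genome with the given GC fraction."""
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    return {"A": at, "T": at, "G": gc, "C": gc}


def probability_below(matrix, gc_content, cutoff, resolution=0.01):
    """P(E < cutoff) for a random window, with energies binned to `resolution`."""
    freqs = base_frequencies(gc_content)
    dist = {0: 1.0}  # binned energy -> probability, built position by position
    for column in matrix:
        updated = {}
        for e_bin, prob in dist.items():
            for base, energy in column.items():
                key = e_bin + round(energy / resolution)
                updated[key] = updated.get(key, 0.0) + prob * freqs[base]
        dist = updated
    return sum(p for e_bin, p in dist.items() if e_bin * resolution < cutoff)


def expected_random_sites(matrix, gc_content, cutoff, n_positions, strands=2):
    """Expected number of chance hits in `n_positions` windows on `strands` strands."""
    return strands * n_positions * probability_below(matrix, gc_content, cutoff)
```

Comparing this expected count with the number of sites actually found below the same cutoff indicates how many predictions are likely to be chance matches.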

6.3.7 Energy Matrix Refinement.

The energy values of predicted soxbox binding sites across all genomes containing SoxR were determined using the initial energy matrix, soxbox-sequences-long PWM.open, with a cutoff of 12. Regions with high energy values (> 8.0), upstream of soxR, confirmed as described above, were taken and used to create a refined energy matrix, soxbox-sequences-long+sub-soxRs PWM.open. This energy matrix was used for all results presented.

6.3.8 Software and File Availability

All software and files, including programs, genomes, sequence lists, PWMs and output files, are available at: http://idyll.org/~tracyt/soxbox/

6.3.10 Bacterial strains and growth conditions

Streptomyces coelicolor A3(2) strains M145 (SCP1-, SCP2-, Pgl+) and M512 (M145 ∆redD ∆actII-ORF4) [38] were kindly provided by Andrew Hesketh (John Innes Institute, Norwich, UK). Streptomyces were grown at 30 °C on R5- agar plates (103 g/l sucrose, 0.25 g/l K2SO4, 10.12 g/l MgCl2·6H2O, 10 g/l H2O, 0.1 g/l Difco Casaminoacids, 2 ml trace element solution, 5 g/l Difco yeast extract, 5.73 g/l TES buffer, 22 g/l Difco Bacto agar; after autoclaving, 7 ml 1 N NaOH were added).

6.3.11 RNA isolation and Q-RT-PCR experiments

S. coelicolor strains M145 and M512 were grown on R5- plates that were overlaid with cellophane membranes. After one day (prodiginine production) or three days (actinorhodin production), cells from three plates were resuspended in 1 ml H2O and 2 ml RNAProtect (Qiagen). The mixture was incubated for 5 min at room temperature, then centrifuged for 10 min at 5000 × g. The pellet was frozen in liquid nitrogen and homogenized with mortar and pestle, and total RNA was isolated and purified using the RNeasy Plant Kit (Qiagen), including the optional DNase treatment step. cDNA was generated in a random-primed reverse transcriptase reaction (iScript, BioRad) and subsequently used as template for quantitative PCR (Real Time 7300 PCR Machine, Applied Biosystems) using the Sybr Green detection system (Applied Biosystems). The signal was standardized to SCO4548 using the following equation: Relative expression = $2^{(C_{T,\mathrm{standard}} - C_{T,\mathrm{sample}})}$, where $C_T$ (cycle time) was determined automatically by the Real Time 7300 PCR software (Applied Biosystems).
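The standardization can be written as a one-line function. This is a minimal sketch that assumes the exponent is the CT of the standard (SCO4548) minus the CT of the gene of interest; the function name and argument names are hypothetical.

```python
def relative_expression(ct_standard, ct_sample):
    """Relative expression = 2 ** (CT of standard - CT of sample).

    A sample that crosses the threshold two cycles earlier than the
    standard is therefore reported as four times more abundant.
    """
    return 2.0 ** (ct_standard - ct_sample)
```

The formula treats every PCR cycle as an exact doubling of product, which is the usual simplification when amplification efficiencies are not measured separately.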

Primers (Integrated DNA Technologies) for Q-RT-PCR were designed using Primer3 software [109]. Criteria for primer design were a melting temperature of 60 °C, primer length of 20 nucleotides, and an amplified PCR fragment of 100 base pairs.
