
of Dideoxynucleotides by a Modified T7 Polymerase


Academic year: 2023



Full text

In vitro, these enzymes are among the essential tools of the modern molecular biologist. The molecular biologist exploits the same effects of the same molecules (the dideoxynucleotides and their analogs) to determine the sequence of DNA molecules. The final part of this thesis is devoted to a statistical analysis of the data and to an investigation of the role of sequence context in determining the incorporation of dideoxynucleotides.

Sanders presents statistics from a total of 903 bases sequenced by each of the 3 polymerases, but does not relate this to sequence context, except for s.

An Aside

This raises the level of characterization of this defect from merely anecdotal and qualitative to a statistically justifiable quantitative measurement.

Chapter 2

DNA, Polymerases and Dideoxy Sequencing

Chemistry of DNA and its Polymerases

In 1953, Watson and Crick showed that DNA molecules naturally exist as double helices, consisting of two complementary DNA molecules held together by hydrogen bonds between the bases. The two DNA strands in a double helix are oriented opposite to each other, with the 5' end of one strand being adjacent to the 3' end of the other. Although the entire process of DNA replication is rather complicated, the main process involves a class of enzymes known as DNA polymerases.

A typical polymerase requires three main things: a piece of template DNA, a short piece of DNA (or RNA) complementary to the 3' end of the template (the "primer"), and a supply of the 4 deoxynucleotide triphosphates.

Sanger Dideoxy Sequencing Reactions

Suppose a reaction mixture is prepared consisting of a template-primer complex, sufficient amounts of the 4 deoxynucleotide triphosphates and a small proportion, say one percent, of e.g. dideoxythymidine triphosphate (ddT). Assuming that the polymerase does not distinguish between dT and ddT (often an invalid assumption, it turns out), each A in the template (A is complementary to T) will have a one percent chance of terminating the elongation process. The products of this reaction (the "T" reaction) will be dominated by DNA segments ending in ddT and complementary to the initial sequence of the template (Figure 2.3). If the polymerase molecule falls off in the middle of extending the primer, and neither it nor another polymerase molecule picks up where it left off, then the fragment may end at an inappropriate position.
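The termination argument above can be illustrated with a short calculation. This is a minimal sketch, not the thesis's own code: the function name `termination_probabilities`, the example template, and the assumption that the template is read in the order the polymerase copies it are all hypothetical, and the polymerase is assumed not to discriminate between dT and ddT.

```python
def termination_probabilities(template, ddntp_base="T", p_term=0.01):
    """Probability that extension stops at each eligible template position.

    template   -- template bases in the order the polymerase copies them
                  (hypothetical convention for this sketch)
    ddntp_base -- the dideoxy base present in the reaction ("T" reaction here)
    p_term     -- chance that a dideoxy base is incorporated at an eligible site

    Returns (probs, surviving): probs maps the 1-based extension length to the
    probability of terminating exactly there; surviving is the probability of
    copying the whole template without termination.
    """
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    probs = {}
    surviving = 1.0
    for length, base in enumerate(template, start=1):
        # Only positions whose complement is the dideoxy base can terminate.
        if complement[base] == ddntp_base:
            probs[length] = surviving * p_term
            surviving *= 1.0 - p_term
    return probs, surviving


probs, surviving = termination_probabilities("GATTACA")
for length in sorted(probs):
    print(length, probs[length])
print("unterminated:", surviving)
```

Under these assumptions the fragment lengths follow a geometric-style distribution over the A positions of the template, which is why longer fragments are progressively rarer in the "T" reaction.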

This loop inhibits the action of the polymerase, making it more likely to fall off in the wrong place and misterminate the primer.


  • Gel Electrophoresis
  • Detection
  • Chapter 3 Technologies
    • Sequenase
    • The ABI 373 Sequencer
    • Running A Gel
  • Chapter 4

It then takes 0.50 seconds to turn around before returning to the other edge of the gel. There is some play in the horizontal position of the comb, so the lanes are never in exactly the same position on the gel from run to run. The surface of the gel often has a bump in the center of the well due to the teeth of the comb pushing down at the edges.

This appears in the image as an added glow at the edges of the lanes.

Generating the Data: A Sequencing Project

Cosmid 2-47

All data used in this thesis came from a cosmid that was sequenced as part of a project to sequence part of the mouse T-cell receptor locus, a part of the immune system. It was sequenced primarily between January and May 1992 by Don Seto et al., using four ABI 373 sequencers at Caltech. The four dyes used on the primers are known as FAM, JOE, TAMRA and ROX.

Chapter 5

  • Common-Time Resampling
  • Gel Straightening

It will be important later, when the four-color data for each pixel is transformed into four-dye data, that the different colors be matched in time. For each pixel, in each color, the two values before the reference time plus the value after the reference time are fit with a quadratic function. The value of this quadratic at the reference time is used as the value of that pixel in that color for this four-color scan (see Figure 5.1).

This effectively resamples all data in a four-color scan at the common reference time.
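The resampling step can be sketched as a three-point quadratic interpolation. This is an illustrative sketch only: the function name `quadratic_resample` and the sample times are hypothetical, and the Lagrange form is one of several equivalent ways to fit a quadratic through three samples.

```python
def quadratic_resample(times, values, t_ref):
    """Evaluate the quadratic through three (time, value) samples at t_ref.

    times  -- the three sample times for one pixel in one color
              (for example two before and one after the reference time)
    values -- the detector readings at those times
    t_ref  -- the common reference time for the four-color scan
    """
    t0, t1, t2 = times
    v0, v1, v2 = values
    # Lagrange basis polynomials evaluated at the reference time.
    l0 = (t_ref - t1) * (t_ref - t2) / ((t0 - t1) * (t0 - t2))
    l1 = (t_ref - t0) * (t_ref - t2) / ((t1 - t0) * (t1 - t2))
    l2 = (t_ref - t0) * (t_ref - t1) / ((t2 - t0) * (t2 - t1))
    return v0 * l0 + v1 * l1 + v2 * l2


# Samples taken from f(t) = 2t^2 - 3t + 1 at t = 0, 1, 3; resample at t = 2.
print(quadratic_resample((0.0, 1.0, 3.0), (1.0, 0.0, 10.0), 2.0))
```

Because exactly three points determine a quadratic, the fit is an exact interpolation rather than a least-squares fit, so no smoothing is introduced at this stage.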

[Figure 5.1 labels: path of scanning optics in space-time; filter color in front of photomultiplier tube]

  • Lane Finding
  • Horizontal Averaging
  • Comparison with ABI Software

From the left edge, the correlations between adjacent segments are calculated for a series of displacements of the right-hand segment. The algorithm then shifts over one segment to the right and repeats the process using the newly shifted segment as the left-hand segment of the pair. The purpose of the lane finding procedure is to find the left and right boundaries of the lanes.
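The displacement search described above can be sketched as follows. This is a hedged illustration, not the thesis's implementation: treating each segment as a one-dimensional intensity profile, using the Pearson correlation, and the names `correlation` and `best_displacement` are all assumptions made for the example.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_displacement(left, right, max_shift):
    """Find the shift of `right` that best matches `left`.

    A shift s pairs left[i] with right[i + s]; only the overlapping
    samples are compared. Returns (best_shift, best_correlation).
    """
    best_shift, best_corr = 0, -2.0
    for shift in range(-max_shift, max_shift + 1):
        lo = max(0, -shift)
        hi = min(len(left), len(right) - shift)
        if hi - lo < 3:
            continue  # too little overlap to correlate meaningfully
        corr = correlation(left[lo:hi], right[lo + shift:hi + shift])
        if corr > best_corr:
            best_shift, best_corr = shift, corr
    return best_shift, best_corr


left = [0, 0, 1, 5, 1, 0, 0, 2, 6, 2, 0, 0]
right = [0, 0] + left[:-2]  # the same profile displaced by two samples
print(best_displacement(left, right, max_shift=4))
```

Repeating this for each adjacent pair, with the newly shifted segment as the next left-hand segment, accumulates the displacements across the gel.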

To extend the lane edges along the gel, certain adjustments are necessary due to the tendency of the lanes to bend more.

Figure 5.2: Comparison of data extracted by my software (upper trace) and ABI's Data Collection software (Version 1.0

Extracting Incorporation Ratios

  • Dye-Space Transformation
  • Baseline Subtraction
  • Mobility-Shift Corrections
  • Base Calling (first two passes)
    • Filtering
    • Finding the Primer Peak
    • Finding Peaks
    • First-Pass Base Calling
    • Graph Normalization
    • Second-Pass Base Calling
  • Base Calling (third pass)

This transformation varies slightly between machines due to variations in the exact transmission spectra of the filters. The presence of the dye moiety on the 5' end of the DNA molecules also changes the mobility of the molecule. This is the second most important reason for calling bases incorrectly, but correcting for it would require a priori knowledge of the sequence.

The sequence is determined from the order of the different color peaks in the data. Specifically, the sequence is that of the primer strand reading in the 5'→3' direction with increasing scan number. The parameters of the two Gaussians vary as a function of the scan number to compensate slightly for the increasing peak width during the run.

This filtering scheme tends to sharpen peaks, but it is not intended to be interpreted as a "corrected" representation of the data. Since it represents the beginning of the real data for a lane, it is an important milestone. Later, after the second pass, I'll locate the SmaI site of the fragment as a more accurate landmark.

When this ratio is the lowest, the primer peak is in the middle of the 100 scan window. /* This routine looks at a portion of the data to try relative strengths of the different color signals.

6.4.5 Graph Normalization

6.5.1 The SmaI Site

The primer position is too coarse for the final results; the main use I make of the primer position is to find the SmaI site.

  • Reference Spacing
  • Third-Pass Base Calling
  • Consensus Alignment
  • Computing The Spacing Graph
  • Quantitating Peaks
  • Chapter 7 The Data
    • Intro
    • Statistical Analysis

The primer position is too crude for the final pass; in fact, the biggest use I make of the primer position is to find the SmaI site. was incorporated into its idea of the proper base spacing, causing more errors to be made later. To a first approximation, the velocity with which a molecule moves through the gel is proportional to the electric field and inversely proportional to the viscosity of the buffer solution in the gel. I make the assumptions that the electric field in the gel is proportional to the recorded voltage and that the viscosity of the buffer solution is proportional to the viscosity of water at the recorded temperature.
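The velocity model above can be sketched numerically. The proportionalities (velocity proportional to voltage and inversely proportional to the viscosity of water) come from the text; everything else is an assumption for illustration, including the function names and the particular empirical viscosity fit (a Vogel-type formula, viscosity = 2.414e-5 × 10^(247.8 / (T − 140)) Pa·s with T in kelvin), which the source does not specify.

```python
def water_viscosity(temp_c):
    """Approximate dynamic viscosity of water in Pa*s at temp_c degrees Celsius.

    Uses a common Vogel-type empirical fit; the choice of this formula is an
    assumption of the sketch, not something stated in the source.
    """
    temp_k = temp_c + 273.15
    return 2.414e-5 * 10 ** (247.8 / (temp_k - 140.0))


def relative_velocity(voltage, temp_c, ref_voltage, ref_temp_c):
    """Migration velocity relative to reference run conditions.

    Velocity is taken as proportional to voltage / viscosity, so the unknown
    proportionality constants cancel in the ratio.
    """
    velocity = voltage / water_viscosity(temp_c)
    ref_velocity = ref_voltage / water_viscosity(ref_temp_c)
    return velocity / ref_velocity


# Hypothetical readings: same voltage, gel 5 degrees warmer than the reference.
print(relative_velocity(1000.0, 45.0, 1000.0, 40.0))
```

Working with a ratio against reference conditions avoids needing the absolute field strength or the true buffer viscosity, which matches the two proportionality assumptions stated above.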

NOTE THAT FBASE IS RELATIVE TO FRAGD->INSERT_BASE. THIS MAKES FBASE COMPATIBLE WITH COMP_NOM_SPACING. It is necessary to compare the fragment to both the given consensus and its complement, as the fragment could come from either strand of the cosmid. You can then use that raw positioning information to align the entire fragment to the correct strand of the consensus.
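The two-strand comparison can be sketched as below. This is a simplified illustration under stated assumptions: it scores ungapped placements by counting matching bases, whereas a real aligner would tolerate insertions and deletions, and the names `reverse_complement`, `best_ungapped_match` and `orient_fragment` are hypothetical rather than taken from the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_ungapped_match(fragment, consensus):
    """Slide fragment along consensus; return (matches, offset) of the best fit."""
    best = (-1, 0)
    for offset in range(len(consensus) - len(fragment) + 1):
        window = consensus[offset:offset + len(fragment)]
        matches = sum(a == b for a, b in zip(fragment, window))
        if matches > best[0]:
            best = (matches, offset)
    return best


def orient_fragment(fragment, consensus):
    """Decide which strand the fragment came from and where it sits.

    Returns (strand, offset, matches), where strand is "forward" or "reverse".
    """
    fwd = best_ungapped_match(fragment, consensus)
    rev = best_ungapped_match(reverse_complement(fragment), consensus)
    if fwd[0] >= rev[0]:
        return "forward", fwd[1], fwd[0]
    return "reverse", rev[1], rev[0]


consensus = "ACGTTGCATGCCGATAGGCT"  # made-up example sequence
print(orient_fragment("CCGATAGG", consensus))
print(orient_fragment("CCTATCGG", consensus))
```

The offset found this way is only a rough position; it serves as the starting point for aligning the whole fragment against the chosen strand.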

One of the criteria used in base calling is peak height; peaks that are too low are rejected, along with everything within 20 bases of them.

Figure 7.2 is a graph of the mean intensity of the data versus the position of this choking fragment. The first goal is to get an idea of how noisy the data is.

Figure 6.3: Graph of peak-to-peak spacing vs. base position. The spacing is normalized to 1.0 at base 100


7.2.2 Which Base Positions Are Important?

Effect of Particular Bases

From this point, I consider the data after averaging the intensities of the fragment bases at each consensus position. I sorted all the consensus positions by average intensity and divided the resulting list into bins, each having at least 30 points and spanning at least 0.005 intensity units.

Sequence Context           Intensity   Sequence Context           Intensity
CTTCCCATAC T CTGAGTGCTA    0.079       CCCCCAGCAG G TGCATCTTTT    2.016
AGCCAACTAT T CTTTAATTAT    0.146       AGTCGCTGAG G CTGCACTACC    1.856
TTTCCAGTGACTCA GGT0TCTGGACAT2. . 1.656
AACAGTCCTT T CTTTTTCCTA    0.281       TCTTTTGTGG G TGATTCAGTT    1.616
GTGTGCA GCATGC0. ACCTCTTC 1.595
TCTTTTATAT T CTGTTAGTGA    0.289       AGCAATTAGC G GGGGTTTTTG    1.590
TATCATATAC T CAAAATGCTT    0.295       TCCTCCAGTG G AGGCCCTGTC    1.581
TTTAATGTAC G ATTTTGATT1GT TTGGTTGGTG 1.528
GTTTGAATAT G CTTGGCCCAG    0.309       TTCCTAG C CCCCACACAA       1.479
CTCC ATTACATAC A CTAGCAAGAT 0.327      AAATGTTATC C CCTTTCCTGG    1.479
GCTCCTACACCAT CCTT3GTCTT0GT3GTTAG. 476
TATGCCTTAC T CTGGTATAGG    0.341       ATTATTTACA G AATCTCAATA    1.447
AGTAATGTAT G CAGCTTGAAT    0.343       CCCTTCCAAAG G ACAGCCATGC   1.423
TCCAGAATAC G TGACTCACGG 0.3TCCT4CT GGGCT 1.423
TTTCCTTTAT T CTCATTACAC    0.348       GCATCTGCCG C CACCACTTCT    1.421
AGCCAACTAA G CCCTCCTGGT    0.352       AATGGCTGAG C ATGGACCATG    1.406
TTTATGACTT T CTTTGTCCTG CTCCTATACCCTAA 9. TGCTGCACTT 0.362 GTGACTCACG G TCTACAACAA 1.391
TTTCCTATAC A ATGTATTCAT    0.369       TTTCTGTAATG T CACCAAGGAG   1.388
TTTAGACTAT G TTCACTGTGA    0.375       CATTACAACT T TCAGGATTGT    1.385
GGTACCGACA G GTTCCTCTTC    0.376       TCCCCATGTT T TTGAGTAATA    1.382
CCATCCATAC T CATGTACCAA    0.377       TATGTATGTG T TGAATCACTA    1.366
CCAAGACAAT G CTGAAAAGGA CTGAAAAGGA CCTTATG 0.377 TATGTATGTG T. C T CGCCCATCCA 0.386 TTTTTAAAAA A AGAAAGAAGA 1.359
TTTTAAATAT T CACAGCTAAG    0.387       CTTCCCATTG T TGAACATTTC    1.358
CAAATATTAT T CACTTTCCAG    0.391       TGGTTGTATT T TGTTGTAGAC    1.358

Table 7.
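The binning rule (sort by average intensity, then form bins of at least 30 points spanning at least 0.005 intensity units) can be sketched with a greedy pass. The greedy strategy, the handling of leftover points, and the name `bin_by_intensity` are assumptions for illustration; the source states only the two constraints.

```python
def bin_by_intensity(intensities, min_points=30, min_span=0.005):
    """Group sorted intensities into bins meeting both size constraints.

    A bin is closed as soon as it holds at least min_points values and
    spans at least min_span intensity units. Leftover values that cannot
    form a full bin are merged into the last bin (an assumption of this
    sketch), or returned as a single bin if no bin was ever closed.
    """
    ordered = sorted(intensities)
    bins = []
    current = []
    for value in ordered:
        current.append(value)
        if len(current) >= min_points and current[-1] - current[0] >= min_span:
            bins.append(current)
            current = []
    if current:
        if bins:
            bins[-1].extend(current)
        else:
            bins.append(current)
    return bins


# Made-up intensities: 100 evenly spaced values from 0.000 to 0.099.
example = [i * 0.001 for i in range(100)]
print([len(b) for b in bin_by_intensity(example)])
```

Requiring both a minimum count and a minimum span keeps densely populated intensity ranges from producing many very narrow bins while still giving each bin enough points for a meaningful average.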

Figure 7.3: Position -7

Chapter 8 Conclusions

Oxygen atom at the 2' position. Is the same mechanism responsible for the distinction between dATP and ddATP, two molecules that differ by one oxygen atom at the 3' position? Kristensen [Kristensen et al. 1988] suggests that knowing the effect of the sequence on peak height could be useful in calling bases.

Appendix A

Escherichia coli thioredoxin confers processivity on the DNA polymerase activity of the gene 5 protein of bacteriophage T7.
Effect of manganese ions on the incorporation of dideoxynucleotides by bacteriophage T7 DNA polymerase and Escherichia coli DNA polymerase I.
DNA sequence analysis with a modified bacteriophage T7 DNA polymerase (effect of pyrophosphorolysis and metal ions).

Figures

Figure 2.1: Molecular structures of the four deoxynucleotides and their dideoxy and triphosphate analogs
Figure 5.1: Space-Time diagram of gel scanning. Quadratic interpolation is used to resample all the data to make it appear as if all the data for a single scan had been sampled at the reference time
Figure 5.2: Comparison of data extracted by my software (upper trace) and ABI's Data Collection software (Version 1.0
Figure 5.3: Gel images showing the effect of band straightening. Top photo is raw data, bottom is after straightening
