for
From secondary structure to three-dimensional structure:
An improved dihedral angle probability distribution function
for use with energy searches for the native
structures of polypeptides and proteins
Betty Cheng§, Akbar Nayeem*, Harold A. Scheraga†
Baker Laboratory of Chemistry, Cornell University, Ithaca, N.Y. 14853-1301.
Proteins and polypeptides used to derive the trivariate probability distributions for the twenty amino acids.
Protein  PDB identifier
Porcine insulin A chain  1INS/A
Porcine insulin B chain  1INS/B
Bovine pancreatic phospholipase  1BP2
Rice ferricytochrome C  1CCR
Crambin  1CRN
Subtilisin Carlsberg complex with Eglin-C  1CSE/A
Eglin-C (see above)  1CSE/B
L7/L12 50S ribosomal protein (C-terminal domain)  1CTF
Erythrocruorin (hemoglobin)  1ECD
Human immunoglobulin KOL region (heavy chain)  1FB4/B
Peptococcus aerogenes ferredoxin  1FDX
Bovine glutathione peroxidase A chain  1GP1/A
Bovine glutathione peroxidase B chain  1GP1/B
Chromatium vinosum high potential iron protein  1HIP
Worm hemerythrin  1HMQ
Streptomyces tendae alpha-amylase inhibitor HOE-467A  1HOE
Human lysozyme  1LZ1
Poplar plastocyanin  1PCY
Porcine pepsinogen A chain  1PSG/A
Porcine pepsinogen C chain  1PSG/C
Scorpion neurotoxin  1SN3
Bovine pancreatic beta-trypsin  1TPO
Human ubiquitin  1UBQ
Actinidin  2ACT
Penicillopepsin first domain  2APP
A. denitrificans azurin  2AZA
Human erythrocyte carbonic anhydrase  2CAB
Rhodospirillum molischianum cytochrome C'  2CCY
D. vulgaris cytochrome C3  2CDV
Barley seed serine proteinase inhibitor 2  2CI2
Cytochrome P450CAM  2CPP
Carp parvalbumin E-F hand  2CPV
Porcine citrate synthase  2CTS
Yeast cytochrome C peroxidase  2CYP
Human deoxyhemoglobin alpha chain  2HHB/A
Human deoxyhemoglobin beta chain  2HHB/B
Lamprey hemoglobin  2LHB
Silver pheasant ovomucoid  2OVO
Cupredoxin from Alcaligenes faecalis  2PAZ
Proteinase K  2PRK
Immunoglobulin lambda variable domain  2RHE
Streptomyces griseus protease  2SGA
Staphylococcal nuclease  2SNS
Copper, zinc superoxide dismutase  2SOD
Trp repressor  2WRP
P. aeruginosa cytochrome C-551  351C
Rhizopus chinensis aspartic proteinase with reduced peptide inhibitor  3APR
Rhodospirillum rubrum cytochrome C2  3C2C
Lactobacillus casei dihydrofolate reductase  3DFR
Clostridial flavodoxin  3FXN
Human glutathione reductase  3GRS
Rat mast cell protease  3RP2
B. thermoproteolyticus thermolysin  3TLN
Wheat germ agglutinin isolectin 2  3WGA
Ferredoxin  4FD1
alpha-chymotrypsin A chain  5CHA/A
alpha-chymotrypsin B chain  5CHA/B
alpha-chymotrypsin C chain  5CHA/C
Bovine pancreatic carboxypeptidase A  5CPA
Bovine pancreatic trypsin inhibitor  5PTI
Clostridium pasteurianum rubredoxin  5RXN
Troponin C from turkey skeletal muscle  5TNC
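Distributions of conformational states in the ($\phi$, $\psi$, $\chi_1$) space for individual residues.$^a$
N$^b$  conf.$^c$  $\mu_{\phi}$  $\mu_{\psi}$  $\mu_{\chi_1}$  $\sigma_{\phi\phi}$  $\sigma_{\phi\psi}$  $\sigma_{\phi\chi_1}$  $\sigma_{\psi\psi}$  $\sigma_{\psi\chi_1}$  $\sigma_{\chi_1\chi_1}$
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)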
ALA
575 -67.36 -32.86 200.73 -136.30 245.90
337 -101.83 141.67 1341.46 -124.89 530.02
16
3
53.69 41.43 50.82 -93.52 369.15
2
3
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-76.96 -28.33 -64.27 473.02 194.63 -18.01 376.34 -18.78 109.67
16 -77.13 -20.98 61.30 590.69 -355.62 -67.19 451.64 17.86 193.00
26 t -65.64 -43.45 -176.33 62.21 1.98 20.91 48.84 6.69 93.14
69
0
-107.98 49.04 -65.48 635.30 -108.42 80.96 511.39 -67.00 172.60
19 -132.80 159.67 63.71 962.39 -56.75 102.05 136.81 7.91 99.33
45 t -105.53 125.41 -176.94 1278.05 -123.35 -44.80 315.63 77.81 185.38
4
3 0
55.50 38.62 -51.20 99.77 -201.74 28.61 435.19 -110.46 126.05
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-71.48 -29.36 -71.23 208.54 -175.45 -27.53 316.09 49.09 182.48
71 -94.23 -4.03 62.25 640.11 -283.38 -28.85 499.25 66.50 159.97
37 t -67.26 -38.22 -168.48 454.39 -109.26 -57.95 348.05 -75.38 643.77
68
0
-84.74 140.49 -70.88 451.85 191.29 20.03 815.60 92.80 233.45
44 -109.41 164.26 63.70 1028.19 138.21 46.17 1570.51 87.55 223.73
147 t -102.78 112.97 -173.44 895.89 -146.47 -58.50 861.96 88.39 165.95
18
3 0
65.01 36.18 -73.98 253.58 -212.12 -100.84 373.87 79.62 234.10
1
3
38.18 33.29 12.07 *** *** *** *** *** ***
6
3
t 58.44 36.65 -155.96 45.41 -66.94 17.74 219.87 132.74 533.95
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-72.19 -31.04 -66.75 323.30 -190.96 -31.49 341.79 49.23 330.25
42 -70.15 -24.25 57.61 226.95 -108.45 5.63 216.00 96.51 505.57
108 t -63.23 -40.50 -172.41 165.53 -63.46 27.27 119.01 6.18 391.01
76
0
-101.60 137.73 -65.40 807.73 -172.99 -108.40 587.76 27.44 240.15
27 -122.39 148.49 51.16 1304.84 -466.66 -19.98 814.37 16.21 568.69
71 t -97.47 128.25 -172.48 1032.97 0.78 62.21 334.94 67.25 453.14
7
3 0
61.10 33.67 -63.64 31.71 -41.33 -35.71 223.45 38.62 51.83
0
3
*** *** *** *** *** *** *** *** ***
2
3
t 143.36 89.32 -171.24 2.12 -3.21 -31.41 4.87 47.58 465.13
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-78.63 -24.10 -68.63 468.50 -357.58 -84.72 436.55 70.19 170.77
7 -100.41 -18.60 63.76 662.31 17.34 -64.85 58.27 -56.61 87.58
97 t -62.70 -45.16 -177.85 143.06 -20.03 -5.92 86.01 0.11 112.40
121
0
-106.77 132.86 -66.60 533.16 14.54 -71.19 755.61 -15.44 121.99
41 -145.32 160.22 60.98 312.49 -46.81 -45.59 108.34 -12.54 90.78
44 t -102.88 126.61 -173.83 1424.87 -44.25 62.29 586.75 -29.03 133.96
7
3 0
64.97 33.98 -60.87 102.63 -52.69 24.71 92.54 11.95 62.13
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
GLY
189 -73.60 -29.30 601.80 -284.54 601.92
253 -121.11 173.08 1901.94 -382.08 914.62
359
3
83.96 8.22 272.05 -230.90 493.41
196
3
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-87.47 -23.60 -69.57 554.14 -251.34 -81.82 682.91 -45.23 188.70
15 -73.34 -16.02 69.95 125.08 -105.45 45.87 190.96 55.32 163.62
41 t -68.65 -44.17 -173.10 329.46 -50.76 -23.13 150.98 30.38 190.79
61
0
-111.33 132.63 -63.15 577.98 -8.14 -82.69 1297.17 -10.74 182.53
12 -148.93 163.35 55.53 560.51 -4.90 1.02 74.66 23.64 180.36
42 t -101.00 127.00 -174.54 1370.77 -3.62 -45.83 614.01 27.39 150.79
9
3 0
57.75 38.59 -52.51 81.92 -99.39 -1.70 231.84 -6.91 55.49
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-64.81 -44.72 -66.27 130.52 -25.78 10.02 127.79 -26.89 108.04
36 -97.74 -6.37 70.35 262.04 -101.16 41.83 230.26 -136.85 319.06
22 t -64.60 -30.90 -162.81 73.16 -28.98 20.32 102.25 64.63 182.78
195
0
-105.75 124.37 -58.86 472.19 -17.15 -18.57 203.29 -21.64 113.33
59 -117.61 154.53 63.07 544.59 -66.87 -34.63 387.81 18.03 226.80
42 t -125.04 140.17 -177.03 740.76 -67.79 -157.54 104.32 83.02 466.70
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-74.78 -27.37 -70.22 364.65 -254.21 -86.78 362.74 84.73 317.49
25 -73.48 -24.27 75.54 427.13 -110.72 55.73 306.74 -173.51 421.14
122 t -62.80 -42.20 -172.27 175.57 -75.30 43.11 131.94 -27.41 406.49
140
0
-100.80 143.97 -64.71 621.98 -63.58 35.14 613.74 34.55 322.24
25 -115.16 152.15 66.29 1449.51 -415.44 155.50 498.40 -309.45 595.61
90 t -101.40 125.74 -171.20 1231.23 -113.01 34.14 292.90 72.49 323.21
10
3 0
67.41 39.14 -52.49 322.68 -16.37 215.44 532.86 266.31 522.68
0
3
*** *** *** *** *** *** *** *** ***
6
3
t 68.52 72.13 -161.40 873.89 -60.69 570.32 499.10 -39.64 456.99
0
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-73.38 -28.43 -71.44 270.30 -226.65 -61.80 359.86 63.02 225.60
2 -120.38 1.67 64.03 1203.44 -123.63 -273.75 12.70 28.12 62.27
137 t -64.19 -42.60 -170.54 140.14 -28.00 -30.99 81.33 37.24 315.30
244
0
-98.23 138.07 -65.94 481.65 36.61 -5.76 508.16 2.12 248.38
12 -147.98 151.68 57.32 151.07 -196.23 -20.45 1697.74 211.98 188.70
118 t -95.50 122.73 -176.04 651.32 105.60 107.60 225.76 36.28 206.05
5
3 0
71.13 17.08 -66.81 1151.03 1562.97 -903.39 2173.16 1232.76 790.22
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-71.39 -34.56 -69.50 221.52 -127.08 0.84 360.34 -0.14 125.97
4 -72.13 -18.64 68.51 552.62 -184.82 -228.26 74.45 100.68 150.48
27 t -63.39 -40.14 -173.16 330.79 -48.21 -38.03 338.04 -105.04 381.69
48
0
-106.11 135.91 -71.47 620.19 -69.42 -26.07 601.33 55.96 326.86
8 -143.04 159.22 64.66 152.55 -36.06 82.12 72.79 -30.75 118.26
30 t -106.28 119.55 -169.70 1078.67 -125.22 129.97 891.49 -194.22 496.05
3
3 0
52.73 51.51 -51.57 32.60 -163.71 30.42 951.99 -108.28 43.64
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-75.71 -24.24 -70.77 320.41 -324.52 -48.57 489.45 52.16 191.31
40 -104.86 11.61 64.30 534.11 -248.25 43.04 199.90 -49.55 175.93
36 t -79.33 -27.12 -165.29 734.79 -250.97 -281.87 1012.48 291.63 517.29
64
0
-102.07 131.29 -68.72 516.52 35.04 5.91 1064.85 51.68 197.27
37 -118.61 172.99 64.09 895.59 -221.37 35.33 1292.18 21.76 68.74
114 t -106.26 114.83 -172.12 958.27 191.14 46.16 758.97 84.60 181.27
51
3 0
59.92 32.56 -69.58 80.40 -94.43 -2.63 304.85 -43.73 252.84
0
3
*** *** *** *** *** *** *** *** ***
22
3
t 60.12 36.59 -155.39 95.75 -190.42 -27.41 732.45 144.44 189.35
4
3 0
55.71 -145.21 -53.23 203.21 -258.87 -105.11 798.66 38.21 78.09
0
3
*** *** *** *** *** *** *** *** ***
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
PRO
209 -63.05 -26.73 128.47 -127.07 220.44
266 -66.43 145.96 117.82 -33.15 384.16
0
3
*** *** *** *** ***
0
3
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-73.77 -28.85 -68.35 314.73 -186.84 -38.36 285.89 69.29 298.22
13 -69.52 -27.94 64.94 185.02 -138.54 51.13 344.87 -163.60 641.80
68 t -64.61 -39.33 -171.28 263.65 -68.53 18.03 106.54 33.23 338.84
101
0
-104.43 142.96 -62.58 598.65 -35.35 -101.04 395.59 22.89 230.87
20 -142.56 152.25 58.37 928.17 -99.19 -122.43 180.63 88.89 413.13
51 t -95.80 129.54 -174.14 958.28 -85.36 -23.76 388.96 21.86 245.07
9
3 0
54.84 44.68 -60.12 178.93 -138.12 143.06 276.44 -226.36 247.44
0
86.19 179.95 -70.62 2636.83 -4376.08 -937.52 7262.54 1555.91 333.34
1
3
115.25 -96.32 76.07 *** *** *** *** *** ***
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-74.95 -27.58 -67.43 324.47 -253.02 -25.91 321.54 7.87 209.22
16 -75.40 -20.72 75.12 400.61 -92.87 77.65 78.34 1.93 485.21
69 t -62.96 -42.14 -170.86 209.54 -74.99 -24.45 175.67 117.00 313.16
94
0
-107.64 140.03 -65.97 655.58 11.12 23.65 678.62 1.89 256.30
17 -140.96 162.82 64.70 988.79 3.75 208.55 52.31 20.55 260.25
52 t -102.88 129.33 -171.96 1054.85 -61.80 0.44 354.13 6.04 404.64
5
3 0
59.11 35.24 -63.82 62.96 -55.26 -4.20 167.78 -18.82 15.82
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-74.22 -26.85 -65.59 365.42 -244.42 -43.03 420.35 -14.19 274.70
213 -75.15 -18.16 66.86 379.13 -200.88 -1.11 265.32 0.00 189.71
66 t -69.23 -43.87 -176.91 488.46 80.68 123.93 505.63 22.15 521.44
111
0
-96.54 138.37 -66.27 743.62 58.87 -23.83 574.49 -12.48 295.71
160 -115.48 157.88 66.14 1178.20 -94.08 31.08 696.92 64.04 253.66
130 t -109.38 135.68 -179.60 1214.12 -90.52 -5.11 495.97 -14.10 256.54
8
3 0
53.35 41.42 -58.27 106.89 -126.96 -174.14 264.55 164.04 508.72
4
3
75.34 25.72 70.78 2523.36 -1090.23 -308.79 1488.18 -393.66 508.18
3
3
t 57.22 41.28 -167.63 46.80 -26.77 117.42 24.92 -20.91 517.31
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-67.61 -41.03 -65.37 243.03 -92.48 61.97 167.28 -79.66 238.31
157 -90.13 -12.51 63.70 441.78 -191.43 12.33 369.10 -35.08 129.97
14 t -83.60 -30.21 -170.75 1280.06 -384.19 65.99 243.30 71.06 1579.36
187
0
-101.20 129.03 -57.59 579.68 -23.95 -8.93 213.78 5.99 147.35
143 -109.03 160.77 65.61 553.75 44.52 -38.81 440.50 34.37 127.57
42 t -135.29 153.74 -171.84 559.22 -212.36 -35.24 681.50 65.54 260.68
0
3 0
*** *** *** *** *** *** *** *** ***
2
3
43.30 47.62 67.30 316.51 -819.96 150.96 2124.22 -391.08 772.00
1
3
t 74.56 -53.60 130.31 *** *** *** *** *** ***
2
3 0
86.62 138.46 -69.26 2136.62 416.73 863.54 81.28 168.43 349.01
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-92.75 -16.57 -58.81 547.15 -364.03 74.56 448.26 -71.05 366.59
24 -70.93 -23.67 81.06 446.70 -197.97 -275.91 209.10 152.67 274.98
222 t -67.00 -43.18 171.24 175.65 -15.77 3.28 66.16 -21.17 119.54
103
0
-116.54 157.21 -60.68 612.01 -93.93 22.24 168.65 -69.80 224.59
51 -119.11 136.88 55.54 562.54 -105.73 -7.33 208.06 150.21 753.18
341 t -107.02 125.70 176.89 452.02 -0.83 -28.83 159.39 -28.29 96.91
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-82.18 -14.59 -70.71 578.52 -490.48 46.34 688.04 -20.24 95.69
15 -69.39 -22.93 63.38 332.74 -319.77 163.19 345.30 -134.01 347.18
29 t -63.82 -44.53 179.99 73.01 -20.61 21.94 47.14 19.18 150.62
42
0
-106.33 132.89 -66.29 474.86 17.09 3.58 870.55 22.85 58.19
14 -141.54 162.93 65.01 292.62 -72.82 8.37 144.05 -51.37 64.98
23 t -95.73 139.87 -177.69 1355.58 -202.13 -84.92 210.24 83.02 87.72
(deg) (deg) (deg) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$) (deg$^2$)
-90.00 -17.11 -66.29 488.87 -226.93 -89.05 540.02 87.47 137.89
15 -72.38 -17.97 64.24 158.85 -124.76 42.80 180.89 -57.64 155.03
65 t -63.36 -43.78 -179.46 173.62 -23.70 -6.97 92.44 1.55 91.33
123
0
-110.73 138.15 -66.00 388.26 95.26 -63.28 652.54 5.03 104.47
43 -140.19 162.83 61.77 570.16 12.10 6.59 239.45 -96.26 114.09
79 t -91.28 125.49 -179.00 1034.43 33.54 -93.69 302.92 -71.29 102.80
15
3 0
67.09 35.19 -54.41 368.84 -234.18 30.01 529.13 159.16 203.69
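A tabulated row can be turned into a probability density numerically. The following is a minimal sketch under several assumptions that the table itself does not state: that each row lists three mean angles followed by the six independent covariance elements in the order $\phi\phi$, $\phi\psi$, $\phi\chi_1$, $\psi\psi$, $\psi\chi_1$, $\chi_1\chi_1$; that the distribution is a trivariate Gaussian; and that angular differences are wrapped into [-180, 180) degrees. All function and variable names are hypothetical, and the example values are copied from one of the tabulated rows.

```python
import math

def wrap_deg(delta):
    """Wrap an angular difference (degrees) into [-180, 180). Assumed convention."""
    return (delta + 180.0) % 360.0 - 180.0

def covariance_from_row(s_pp, s_ps, s_pc, s_ss, s_sc, s_cc):
    """Build the symmetric 3x3 covariance matrix from six tabulated elements.

    Assumed column order: phi-phi, phi-psi, phi-chi1, psi-psi, psi-chi1, chi1-chi1.
    """
    return [[s_pp, s_ps, s_pc],
            [s_ps, s_ss, s_sc],
            [s_pc, s_sc, s_cc]]

def invert_3x3(m):
    """Return (inverse, determinant) of a 3x3 matrix via the adjugate."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("singular covariance matrix")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj], det

def trivariate_density(angles, mean, cov):
    """Gaussian density (per deg^3) at (phi, psi, chi1), all in degrees."""
    inv, det = invert_3x3(cov)
    if det <= 0.0:
        raise ValueError("covariance matrix is not positive definite")
    diff = [wrap_deg(x - mu) for x, mu in zip(angles, mean)]
    quad = sum(diff[r] * inv[r][c] * diff[c] for r in range(3) for c in range(3))
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** 3 * det)

# Values copied from the tabulated row
# "26 t -65.64 -43.45 -176.33 62.21 1.98 20.91 48.84 6.69 93.14".
EXAMPLE_MEAN = (-65.64, -43.45, -176.33)
EXAMPLE_COV = covariance_from_row(62.21, 1.98, 20.91, 48.84, 6.69, 93.14)

if __name__ == "__main__":
    print(trivariate_density(EXAMPLE_MEAN, EXAMPLE_MEAN, EXAMPLE_COV))
    print(trivariate_density((-60.0, -45.0, 180.0), EXAMPLE_MEAN, EXAMPLE_COV))
```

Rows marked `***` have no covariance entries, and rows with very small N give poorly determined covariances, so a caller would need to skip or regularize them; how the original work treats such states is not shown in this table.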
a The quantities $\mu_i$ and $\sigma_{ij}$ are defined in equations 7 and 8, respectively.
b N is the number of occurrences of the conformational state. For example, the Cys residue is found in the 0 state 78 times in the data set.
c The 12 conformational states are defined in the section "Description of Conformational States".
d Both the cysteine (-SH) and the cystine (S-S) residues are counted in this Table.
e