While most current algorithms for the problem are heuristic in nature, the algorithm under study provides a solution with theoretical bounds on both execution time and accuracy. Each pair of points in Q will then be used as an axis of rotation about which to rotate P. At each angle, for each point in P we know a set of points in Q that it can match.
For each such rotation interval, the problem is to find a bijection that matches the most points in P and Q. To solve this, a bipartite graph is constructed where the vertices are the points in P and Q, and an edge is placed between every two points that can be matched; the maximum bipartite matching of the graph gives us the optimal bijection. An exhaustive search for the optimal bijection over all rotation axes and rotation angles gives us the solution to the problem.
The algorithm further uses some geometric properties and overlaps in the intermediate solutions to reduce runtime complexity. This project provides a full implementation of the algorithm in C++, with very minimal use of external libraries.
INTRODUCTION
Motivation and Problem Statement
Project Objective and Contribution
Background information
LITERATURE REVIEW
Literature Review
To improve the performance of the process, the size of each AFP should be within 30 residues of the current alignment ends (Marti-Renom, Capriotti, Shindyalov & Bourne 2009) and the maximum spacing should be set to an empirically determined value of 8 (Shindyalov & Bourne 1998). MAMMOTH (MATching Molecular Models Obtained from THeory) transforms protein structure coordinates into a set of six unit vectors calculated from the Cα trace of successive heptamers. After the alignment of the two structures is completed, the server identifies conformationally invariant regions using a genetic algorithm (Jogn & Wendy 2004).
SALIGN represents proteins through a set of properties or features calculated from protein sequences and structures, or arbitrarily defined by the user. The Sequential Structure Alignment Program (SSAP) algorithm uses double dynamic programming to compare 3D structures based on atom-to-atom vectors. Instead of alpha carbons, SSAP uses the Cβ atoms to generate a series of vectors for all residues except glycine, for which a dummy Cβ is used.
By applying dynamic programming to each resulting matrix, a set of selected matching positions is determined. The final structural alignment is then computed over the S-score matrix by a second dynamic programming pass (Marti-Renom, Capriotti, Shindyalov & Bourne 2009).
Polynomial Time Approximation Scheme to Implement
We call p1 and p2 a radial pair iff p2 is the farthest point from p1 among all the points in P, which is written as ||p1 – p2|| = max over p in P of ||p1 – p||. Therefore, an error of ε introduced by a transformation T′ on p1 and p2 with respect to T will introduce at most an error of 3ε on the rest of the points in P, and it is therefore sufficient to search for the transformation in a discretized solution space to obtain a solution with accuracy up to 3ε. With a rotation axis fixed, we find the angle θ about the axis through pi and pj such that the score S is maximized.
For a given p ∈ P and q ∈ Q, we find the angles of rotation that move p into and out of matching distance of q, as shown in Figure 3. Given an axis of rotation and an angle of rotation, we want to find the maximum number of one-to-one matches between points in P and Q. To find the maximum number of one-to-one matches, it suffices to find a maximum bipartite matching of G (Schneider 2002).
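The matching step for one fixed axis and angle can be sketched as follows. This is a minimal illustration, not the project's actual code: the names `Point`, `buildGraph`, `HopcroftKarp` and the `cutoff` parameter are assumptions introduced here, and the matching routine follows the textbook Hopcroft-Karp scheme with `std::vector` storage so that no compile-time array size is needed.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <queue>
#include <vector>

struct Point { double x, y, z; };  // hypothetical point type

const int kInf = std::numeric_limits<int>::max();

// Euclidean distance between two points.
double dist3(const Point& a, const Point& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
                     (a.z - b.z) * (a.z - b.z));
}

// adj[i] lists every index j such that Q[j] lies within `cutoff` of P[i].
std::vector<std::vector<int>> buildGraph(const std::vector<Point>& P,
                                         const std::vector<Point>& Q,
                                         double cutoff) {
    assert(cutoff >= 0.0);
    std::vector<std::vector<int>> adj(P.size());
    for (std::size_t i = 0; i < P.size(); ++i)
        for (std::size_t j = 0; j < Q.size(); ++j)
            if (dist3(P[i], Q[j]) <= cutoff) adj[i].push_back(static_cast<int>(j));
    return adj;
}

// Textbook Hopcroft-Karp: repeated BFS layering followed by DFS augmentation.
class HopcroftKarp {
public:
    HopcroftKarp(const std::vector<std::vector<int>>& adj, int nRight)
        : adj_(adj),
          nil_(static_cast<int>(adj.size())),  // sentinel "unmatched" left vertex
          pairL_(adj.size(), -1),
          pairR_(nRight, nil_),
          dist_(adj.size() + 1, 0) {}

    // Returns the size of a maximum matching.
    int maxMatching() {
        int size = 0;
        while (bfs())
            for (int u = 0; u < nil_; ++u)
                if (pairL_[u] == -1 && dfs(u)) ++size;
        return size;
    }

private:
    // Layers the graph from all free left vertices; true if a free right vertex is reachable.
    bool bfs() {
        std::queue<int> q;
        for (int u = 0; u < nil_; ++u) {
            if (pairL_[u] == -1) { dist_[u] = 0; q.push(u); }
            else dist_[u] = kInf;
        }
        dist_[nil_] = kInf;
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            if (dist_[u] >= dist_[nil_]) continue;
            for (int v : adj_[u]) {
                int w = pairR_[v];
                if (dist_[w] == kInf) {
                    dist_[w] = dist_[u] + 1;
                    if (w != nil_) q.push(w);
                }
            }
        }
        return dist_[nil_] != kInf;
    }

    // Follows the BFS layers to find one augmenting path starting at u.
    bool dfs(int u) {
        if (u == nil_) return true;
        for (int v : adj_[u]) {
            int w = pairR_[v];
            if (dist_[w] == dist_[u] + 1 && dfs(w)) {
                pairR_[v] = u;
                pairL_[u] = v;
                return true;
            }
        }
        dist_[u] = kInf;  // dead end for this phase
        return false;
    }

    std::vector<std::vector<int>> adj_;
    int nil_;
    std::vector<int> pairL_, pairR_, dist_;
};
```

Calling `buildGraph(P, Q, cutoff)` and then `HopcroftKarp(adj, Q.size()).maxMatching()` would give the number of one-to-one matches for the current placement of P, under the assumption that two points can be matched exactly when they lie within `cutoff` of each other.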
Therefore, we only need to construct bipartite graphs for the angles at which points come into or out of contact. However, we do not know which pair is a radial pair of P, nor do we know their matching points in Q.
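The observation that bipartite graphs are only needed at the angles where points come into or out of contact can be illustrated with a small sketch. The `ContactInterval` record and both function names are hypothetical, and the sketch assumes each interval is closed and does not wrap around 2π. Because the intervals are closed, the set of matchable pairs at any angle is contained in the set at a neighbouring event angle, so checking the event angles is enough.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record: point p of P is within matching distance of point q of Q
// for every rotation angle in [thetaIn, thetaOut].
struct ContactInterval {
    int p;
    int q;
    double thetaIn;
    double thetaOut;
};

// Sorted, de-duplicated angles at which some pair enters or leaves contact.
std::vector<double> eventAngles(const std::vector<ContactInterval>& intervals) {
    std::vector<double> events;
    for (const ContactInterval& c : intervals) {
        events.push_back(c.thetaIn);
        events.push_back(c.thetaOut);
    }
    std::sort(events.begin(), events.end());
    events.erase(std::unique(events.begin(), events.end()), events.end());
    return events;
}

// Adjacency lists of the bipartite graph at angle theta:
// adj[p] holds every q whose contact interval with p contains theta.
std::vector<std::vector<int>> graphAt(const std::vector<ContactInterval>& intervals,
                                      int numP, double theta) {
    assert(numP >= 0);
    std::vector<std::vector<int>> adj(numP);
    for (const ContactInterval& c : intervals)
        if (c.thetaIn <= theta && theta <= c.thetaOut) adj[c.p].push_back(c.q);
    return adj;
}
```

A driver would loop over `eventAngles(...)`, build `graphAt(...)` for each angle, and keep the largest matching found.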
METHODOLOGY AND TOOLS
Methodology and Tools
ALGORITHM IMPLEMENTATION
Program Input and Output
Main Procedure of Program
- Get user input
- Read .pdb file
- Get the number of common residue IDs of the two input structures (P and Q)
- Create Structure objects for the input structures
- Identify radial pair candidates from P and Q
- Transform structure Q so that it is along the y-axis
- Translation
- Rotation
- Cross Product
- Rotation axis
- Rotation angle
- Exhaustive search of positions for p1 in the discretized Dc sphere around q1
- Form a spherical cap for p2
- Output the maximum number of matches between both structures
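The transformation steps in the outline above (translation, cross product, rotation axis, rotation angle) can be sketched as below. The sketch assumes the step means moving one chosen point to the origin and rotating a second chosen point onto the positive y-axis; `Vec3`, `rotate` and `alignToYAxis` are illustrative names, and the rotation uses Rodrigues' formula, which may differ from the rotation-matrix code in the actual program.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };  // hypothetical vector type

Vec3 operator+(const Vec3& a, const Vec3& b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(double s, const Vec3& v) { return {s * v.x, s * v.y, s * v.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
double norm(const Vec3& v) { return std::sqrt(dot(v, v)); }

// Rodrigues' formula: rotate v by `angle` radians about the unit vector `axis`.
Vec3 rotate(const Vec3& v, const Vec3& axis, double angle) {
    double c = std::cos(angle), s = std::sin(angle);
    return c * v + s * cross(axis, v) + ((1.0 - c) * dot(axis, v)) * axis;
}

// Translate so that `a` sits at the origin, then rotate so that b - a points along +y.
std::vector<Vec3> alignToYAxis(const std::vector<Vec3>& pts, const Vec3& a, const Vec3& b) {
    const Vec3 yAxis{0.0, 1.0, 0.0};
    Vec3 dir = b - a;
    double len = norm(dir);
    assert(len > 0.0);  // the two reference points must differ

    // Rotation axis from the cross product, rotation angle from the dot product.
    Vec3 axis = cross(dir, yAxis);
    double axisLen = norm(axis);
    double angle = std::acos(std::max(-1.0, std::min(1.0, dot(dir, yAxis) / len)));
    if (axisLen < 1e-12) {
        // dir is parallel or antiparallel to y: any perpendicular axis will do.
        axis = {1.0, 0.0, 0.0};
        axisLen = 1.0;
    }
    axis = (1.0 / axisLen) * axis;

    std::vector<Vec3> out;
    out.reserve(pts.size());
    for (const Vec3& p : pts) out.push_back(rotate(p - a, axis, angle));
    return out;
}
```

For example, aligning the points (1,1,1), (3,1,1), (1,2,1) with the first two as reference points places them at (0,0,0), (0,2,0) and (-1,0,0); all pairwise distances are unchanged.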
In this function, the total number of common residue IDs of both structures and the indices of the common residues are stored. The coordinates of the common residues are copied to create two structure objects, as shown in Figure 4.2.4.1. Therefore, I examine the distance from each grid point to the center of the sphere; when the distance is more than the radius of the sphere, the grid point is not considered further.
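The grid-point filter described above can be sketched as follows. The function name, the `Vec3` type and the `step` parameter (the grid resolution) are assumptions for illustration; the sketch enumerates a cubic grid centered on the sphere and keeps only the points whose distance to the center does not exceed the radius.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };  // hypothetical point type

// Grid points of spacing `step`, centered on `center`, that lie inside or on
// the sphere of the given radius. Points outside the sphere are discarded.
std::vector<Vec3> gridPointsInSphere(const Vec3& center, double radius, double step) {
    assert(radius >= 0.0 && step > 0.0);
    std::vector<Vec3> points;
    const int n = static_cast<int>(std::floor(radius / step));
    for (int i = -n; i <= n; ++i) {
        for (int j = -n; j <= n; ++j) {
            for (int k = -n; k <= n; ++k) {
                double dx = i * step, dy = j * step, dz = k * step;
                // Compare squared distances to avoid a square root per point.
                if (dx * dx + dy * dy + dz * dz <= radius * radius)
                    points.push_back({center.x + dx, center.y + dy, center.z + dz});
            }
        }
    }
    return points;
}
```

With radius 1 and step 1 this keeps 7 of the 27 cube points (the center and its six axis neighbours); halving the step to 0.5 keeps 33 points.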
After attaching p1 to a grid point, a spherical cap must be formed for all possible positions of p2; the cap lies on the sphere centered on p1 with radius ||p1 – p2|| and must be inside the sphere around q2. If the grid point is inside the sphere around q2, the points roughly match and no spherical cap needs to be formed. To satisfy the conditions, all possible positions of p2 form a spherical cap inside the sphere around q2.
The spherical cap is generated in the program as follows. Generation starts with a point that lies on a straight line through p1 parallel to the y-axis (Figure 4.2.8.3). The step size of the movement is set according to the resolution of the discretization. The vector (vector1) is then rotated around another vector (vector2) that passes through p1 and is parallel to the x-axis; rotating it left or right corresponds to the left and right movement of the point.
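As a simplified illustration of what the cap contains, the sketch below samples the sphere around p1 in spherical coordinates (polar axis along y) and keeps the samples that fall inside the sphere around q2. This is a swapped-in technique, not the incremental vector-rotation procedure described above, and the names `sphericalCap`, `steps` and `d` (the radius of the sphere around q2) are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };  // hypothetical point type

// Samples the sphere of radius r around p1 and returns the samples lying
// within distance d of q2. `steps` is the number of polar subdivisions, so the
// angular resolution is pi / steps.
std::vector<Vec3> sphericalCap(const Vec3& p1, double r, const Vec3& q2, double d, int steps) {
    assert(r > 0.0 && d >= 0.0 && steps > 0);
    const double pi = std::acos(-1.0);
    std::vector<Vec3> cap;
    for (int i = 0; i <= steps; ++i) {
        double polar = pi * i / steps;  // angle measured from the +y direction
        // The two poles are single points, so only one azimuth is needed there.
        int azimuthCount = (i == 0 || i == steps) ? 1 : 2 * steps;
        for (int j = 0; j < azimuthCount; ++j) {
            double azimuth = 2.0 * pi * j / (2 * steps);
            Vec3 c{p1.x + r * std::sin(polar) * std::cos(azimuth),
                   p1.y + r * std::cos(polar),
                   p1.z + r * std::sin(polar) * std::sin(azimuth)};
            double dx = c.x - q2.x, dy = c.y - q2.y, dz = c.z - q2.z;
            if (std::sqrt(dx * dx + dy * dy + dz * dz) <= d) cap.push_back(c);
        }
    }
    return cap;
}
```

Each returned point is a candidate position for p2: it keeps the distance ||p1 – p2|| to p1 and lies inside the sphere around q2.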
Form a spherical cap for p2 on the q2 grid and try all coordinates on the cap. Here oriP1 denotes the original p1 before the grid is set up. The points are then rotated by the same rotation that moved p2 to its position on the spherical cap around q2. To find the intersection of the circle and the sphere, the intersection of the circle formed by the rotational path of p and the cut circle (the intersection of the plane and sphere Q) is calculated, as shown in Figure 4.2.9.5.
Suppose the radius of the cut circle is √ and the radius of the unit circle is √. After obtaining the intersection points, the angles are calculated using the Law of Cosines, as shown in Figure 4.2.9.7 and Figure 4.2.9.8. Note that θ1 and θ2 come in pairs, and θ2 must immediately follow θ1. For each angle, the matching of the corresponding p and q is preserved.
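The reduction from a circle-sphere intersection to a circle-circle intersection, followed by the Law of Cosines, can be sketched as below. The sketch assumes the rotation axis is the y-axis through the origin and that rotating by θ increases the angle atan2(z, x) of a point by θ; `contactInterval`, `rotateAboutY` and the matching distance `d` are hypothetical names, and the returned pair plays the role of θ1 and θ2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };  // hypothetical point type

// Rotation convention assumed in this sketch: atan2(z, x) grows by theta.
Vec3 rotateAboutY(const Vec3& p, double theta) {
    double c = std::cos(theta), s = std::sin(theta);
    return {p.x * c - p.z * s, p.y, p.x * s + p.z * c};
}

// Finds the rotation angles at which p enters (thetaIn) and leaves (thetaOut)
// the sphere of radius d around q. Returns false if p never touches the sphere.
bool contactInterval(const Vec3& p, const Vec3& q, double d,
                     double& thetaIn, double& thetaOut) {
    assert(d >= 0.0);
    const double twoPi = 2.0 * std::acos(-1.0);

    // The plane y = p.y contains the rotational path of p; cut the sphere with it.
    double h = p.y - q.y;
    if (std::fabs(h) > d) return false;    // plane misses the sphere
    double rc = std::sqrt(d * d - h * h);  // radius of the cut circle
    double rp = std::hypot(p.x, p.z);      // radius of the rotational path of p
    double D = std::hypot(q.x, q.z);       // axis to cut-circle center distance

    if (std::fabs(D - rp) > rc) return false;  // path never enters the cut disc
    if (D + rp <= rc) {                        // path lies entirely inside it
        thetaIn = 0.0;
        thetaOut = twoPi;
        return true;
    }

    // Law of Cosines in the triangle with sides rp, D, rc gives the half-width
    // of the contact arc as seen from the rotation axis.
    double cosAlpha = (rp * rp + D * D - rc * rc) / (2.0 * rp * D);
    double alpha = std::acos(std::max(-1.0, std::min(1.0, cosAlpha)));

    double start = std::atan2(q.z, q.x) - alpha - std::atan2(p.z, p.x);
    start = std::fmod(start, twoPi);
    if (start < 0.0) start += twoPi;
    thetaIn = start;
    thetaOut = start + 2.0 * alpha;  // may exceed 2*pi when the arc wraps around
    return true;
}
```

For p = (1, 0, 0), q = (0, 0, 1) and d = 1 the sketch yields the interval from π/6 to 5π/6, and rotating p by either end angle places it exactly at distance 1 from q.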
When another θ1 is smaller than the first angle in storage, a new angle is created after the first angle in storage, as shown in Figure 4.2.9.12 (left). When the angle is greater than the next cumulative angles, a new angle is also created (as in Figure 4.2.9.12).
Program Result
DISCUSSION AND CONCLUSION
Discussion
- Achievement
- Implementation Issues and Challenges
- Problem of Installing Cygwin
- Early conceptual mistakes in implementing the algorithm
- Lack of knowledge in circle-sphere intersection
- Implementation of Hopcroft-Karp algorithm
- Runtime Error and Memory Error
I translated the entire structure P so that p1 matches q1 and then checked whether the distance between p2 and q2 is within 2Dc. At first, I formed the grid on the cube around q2 and then formed the spherical cap only by checking the distance between each grid point and the center of the cube (q2). However, this is incorrect, because the spherical cap cannot be formed when its points do not lie on the grid.
I didn't know how to use the rotation matrix to rotate a point about an arbitrary axis. I also didn't know how to rotate a vector around an arbitrary axis, which is to change the direction of the axis of rotation. In my initial tests of the program, the program did not terminate even for very small instances of problems.
Another part of the program that I had a hard time coding is the part where I need to find the intersection of a circle and a sphere. I searched for solutions to this problem in many forums on the Internet, but could not find any information about the problem. My supervisor told me to reduce the problem to finding a circle-circle intersection by changing the sphere to a circle via a plane-sphere intersection.
So I extended one circle into a plane and cut the sphere with the plane to get a circle from the sphere. The Hopcroft-Karp algorithm pseudocode is available on the internet and I found that someone has already implemented it in C++. I tried to run his code but I didn't get the expected result because I didn't know what the code input should be.
Next, I started to change the input of the function to the form required by my program, namely matching the two structures P, Q at a specific angle. But it didn't work so well since it requires a constant, while length is the parameter passed in the function. After trying a few times, I found that the program will work when vector
Conclusion
REFERENCES
Li, 2008, 'Finding the largest well-predicted subset of protein structure models', Journal of Combinatorial Pattern Matching, vol.
Zhang, Y. & Skolnick, J., 2005, 'The protein structure prediction problem could be solved using the current PDB library', Journal of Proc Natl Acad Sci USA, vol.
Marti-Renom, Capriotti, Shindyalov & Bourne, 2009, 'Structure Comparison and Alignment', Journal of Structural Bioinformatics, vol. 44, pp.
Shindyalov, I.N. & Bourne, P.E., 1998, 'Alignment of protein structure by incremental combinatorial extension (CE) of the optimal pathway', Journal of Protein Engineering, vol.
Kedem, K., Chew, L. & Elber, R., 1999, 'Unit-vector RMS (URMS) as a tool to analyze molecular dynamic trajectories', Journal of Proteins, vol.
Schneider, 2002, 'A genetic algorithm for the identification of conformationally invariant regions in protein molecules', Journal in Acta Crystallogr D Biol Crystallogr, vol.
Myrvold, 2004, 'On the Cutting Edge: Simplified O(n) Planarity by Edge Addition', Journal of Graph Algorithms and Applications http://jgaa.info/, vol.
Karp, 1973, 'A maximum matching algorithm for bipartite matching in bipartite graphs', Journal of SIAM J.

Obtain the angle of rotation that moves P into and out of contact with Q, using the intersection between the Q sphere and the unit.