Ligand design through incremental construction


We have put in place the essential elements of Rosetta ligand design. These elements include code that (1) grows the small molecule by connecting fragments randomly chosen from a library of building blocks; (2) docks and scores the growing ligand, accepting or rejecting new growths based on Rosetta energy scores; (3) specifies growth termination criteria including molecular mass, number of hydrogen bond acceptors, number of hydrogen bond donors, number of heavy atoms, or total number of atoms; and (4) adds hydrogens to unsatisfied valences to cap the ligand after growth terminates.

With these elements in place we sought to implement the following algorithm, presented as pseudo-code…

generate fragment library with conformers for each fragment
for fragment in fragment library
    dock fragment sampling fragment conformers and side chain rotamers
    predict binding affinity
keep strongest binding affinity fragment as starting fragment
while user defined growth cutoffs have not been met
    connect random fragment at random connection point
    sample fragment conformers and side chain rotamers
    sample rigid body position for extended molecule
    accept or reject new growth using a Monte Carlo approach
    minimize backbone, side-chain, and ligand torsion angles
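The accept-or-reject step of the growth loop can be illustrated with a small, self-contained Python sketch. It is not Rosetta code: the Metropolis criterion is assumed here as the "Monte Carlo approach", and the fragment names, the `TOY_ENERGY` table, and the helper names `metropolis_accept`, `grow_ligand`, and `toy_score` are all hypothetical stand-ins for docking and Rosetta energy scoring.

```python
import math
import random


def metropolis_accept(old_score, new_score, temperature, rng):
    """Accept downhill moves always; accept uphill moves with Boltzmann probability."""
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / temperature)


def grow_ligand(start, fragments, score, is_complete,
                temperature=1.0, max_steps=100, seed=0):
    """Extend a ligand one random fragment at a time, keeping accepted growths."""
    rng = random.Random(seed)
    ligand = list(start)
    current = score(ligand)
    for _ in range(max_steps):
        if is_complete(ligand):          # user-defined growth cutoff
            break
        trial = ligand + [rng.choice(fragments)]
        trial_score = score(trial)       # stands in for docking plus scoring
        if metropolis_accept(current, trial_score, temperature, rng):
            ligand, current = trial, trial_score
    return ligand, current


# Hypothetical per-fragment energies; lower is better.
TOY_ENERGY = {"benzene": -2.0, "amide": -1.5, "methyl": -0.5, "nitro": 1.0}


def toy_score(ligand):
    """Toy additive score, assumed purely for illustration."""
    return sum(TOY_ENERGY[name] for name in ligand)


ligand, energy = grow_ligand(["benzene"], list(TOY_ENERGY), toy_score,
                             is_complete=lambda lig: len(lig) >= 4)
```

In a real protocol the score of a trial growth would come from docking the extended ligand, so each iteration is far more expensive than in this toy loop.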

Protocol for incremental construction of small molecules. A) Ligand design begins with a protein binding site (crescent) and a library of small molecule fragments (shapes). B) Low-resolution search for starting fragment. C) Refinement of complex. D) Low-resolution search for extension fragment. E) Refinement of complex.

Creation of ligand fragments

Before ligand design can occur, a library of small molecule fragments must be generated.

The script mol_to_params.py has been modified to allow creation of these small molecule fragments. Fragment creation begins with identifying MOL or MOL2/MDL files of molecules that contain the fragments the user would like to add to the fragment library. Lines of the format “M SPLT <ID_1> <ID_2>” are then appended to the bottom of these files. ID_1 and ID_2 should be replaced with the IDs of the two atoms in the MOL or MDL file whose connecting bond should be split to create fragments. If this bond is to a hydrogen atom, then only one fragment will be created.
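As an illustration of the annotation step, the following Python sketch appends split records to the text of a MOL file and reads them back. The helper names `add_split_records` and `read_split_records` are hypothetical and are not part of mol_to_params.py. The exact column spacing the script expects is not stated here, so the sketch writes single spaces as printed above and tolerates any whitespace when reading.

```python
def add_split_records(mol_text, bonds):
    """Append one 'M SPLT <ID_1> <ID_2>' line per bond to the end of the file text."""
    lines = mol_text.rstrip("\n").split("\n")
    for first_id, second_id in bonds:
        if first_id == second_id:
            raise ValueError("a split needs two different atom IDs")
        lines.append(f"M SPLT {first_id} {second_id}")
    return "\n".join(lines) + "\n"


def read_split_records(mol_text):
    """Return the (ID_1, ID_2) pairs found in 'M SPLT' lines, in file order."""
    pairs = []
    for line in mol_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] == "M" and fields[1] == "SPLT":
            pairs.append((int(fields[2]), int(fields[3])))
    return pairs


# Placeholder file body; a real MOL file carries atom and bond blocks here.
example = "example molecule\n(atom and bond blocks omitted)\nM  END\n"
annotated = add_split_records(example, [(3, 7), (7, 12)])
```

Calling `read_split_records(annotated)` returns the two atom ID pairs, which is a quick way to check a file before passing it to the fragment-generation script.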

GrowLigand

Ligand design begins with the docking of an initial molecule fragment from a library of fragments. This step proceeds using the standard docking procedure outlined in chapter 3. Next, the initial fragment is extended using the GrowLigand XML element. At this point GrowLigand does not have any scoring functionality. It simply selects a fragment at random from the provided fragment library and attaches that fragment to the ligand with the specified chain. The attachment is made by randomly selecting a connection point on the growing ligand and on the fragment to be connected. Docking and scoring of the extended ligand occur via the ligand docking XML described in chapter 2.
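The random attachment described above amounts to simple bookkeeping of open connection points. The Python sketch below is a toy model, not the GrowLigand implementation: the `Piece` and `GrowingLigand` classes, the atom labels, and the function `grow_once` are hypothetical, and no coordinates or scoring are involved.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Piece:
    """A library fragment and the atom labels that can still form a bond."""
    name: str
    open_points: list


@dataclass
class GrowingLigand:
    pieces: list = field(default_factory=list)       # fragment names, in order added
    bonds: list = field(default_factory=list)        # ((piece, atom), (piece, atom))
    open_points: list = field(default_factory=list)  # (piece index, atom label)


def start_ligand(piece):
    """Create a ligand that consists of a single starting fragment."""
    return GrowingLigand([piece.name], [], [(0, atom) for atom in piece.open_points])


def grow_once(ligand, library, rng):
    """Attach one random fragment at random connection points; False if none are open."""
    if not ligand.open_points:
        return False
    site = rng.choice(ligand.open_points)          # connection point on the ligand
    template = rng.choice(library)                 # fragment chosen at random
    atom = rng.choice(template.open_points)        # connection point on the fragment
    new_index = len(ligand.pieces)
    ligand.pieces.append(template.name)
    ligand.bonds.append((site, (new_index, atom)))
    ligand.open_points.remove(site)
    remaining = list(template.open_points)
    remaining.remove(atom)
    ligand.open_points.extend((new_index, label) for label in remaining)
    return True


library = [Piece("methyl", ["C1"]), Piece("phenylene", ["C1", "C4"])]
ligand = start_ligand(Piece("amide", ["C1", "N1"]))
grew = grow_once(ligand, library, random.Random(0))
```

Because every choice is uniform and unscored, a step like this only proposes a growth; deciding whether to keep it is left to the subsequent docking and scoring.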

Filters for terminating ligand design growth

Hydrogen bond acceptor and donor filters stop growth when a provided cutoff is surpassed. The HeavyAtom filter terminates growth when the number of non-hydrogen atoms has reached a cutoff value, while the AtomCount filter relies on the total number of atoms, including hydrogens. Similarly, the MolecularMass and MolarMass filters stop small molecule extension when their limits are exceeded. The CompleteConnections filter stops growth when no further growth is possible. The ChainExistsFilter is useful to ensure that ligand design only occurs for those starting fragments that have been positioned in the binding pocket.
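The size-based cutoffs can be illustrated with a short Python sketch that works on a plain list of element symbols. It is a toy model of the idea, not the Rosetta filters: the `GrowthLimits` class and the function names are hypothetical, the mass table covers only a few elements, and hydrogen bond donor and acceptor counting is left out because it requires chemical perception.

```python
from dataclasses import dataclass

# Standard atomic masses (g/mol) for a few common elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


@dataclass
class GrowthLimits:
    max_heavy_atoms: int
    max_atoms: int
    max_mass: float


def heavy_atom_count(elements):
    """Number of non-hydrogen atoms."""
    return sum(1 for element in elements if element != "H")


def molecular_mass(elements):
    """Sum of atomic masses over all atoms, hydrogens included."""
    return sum(ATOMIC_MASS[element] for element in elements)


def growth_should_stop(elements, limits):
    """Stop when an atom count reaches its cutoff or the mass exceeds its limit."""
    return (heavy_atom_count(elements) >= limits.max_heavy_atoms
            or len(elements) >= limits.max_atoms
            or molecular_mass(elements) > limits.max_mass)


ethanol = ["C", "C", "O"] + ["H"] * 6
```

With these definitions, `growth_should_stop(ethanol, GrowthLimits(3, 50, 500.0))` is true because ethanol already has three heavy atoms, while raising the heavy-atom cutoff to 10 lets growth continue.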

AddHydrogens

After terminating growth of a small molecule, it is likely that the molecule contains connection points that are not connected to other small molecule fragments. These connecting atoms retain geometry that suggests atoms are missing. The AddHydrogens code adds hydrogens to these unsatisfied connection points.
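A minimal Python sketch of the capping idea follows. It assumes that each open connection point needs exactly one hydrogen and it tracks only connectivity; placing the hydrogen with sensible geometry, which the real AddHydrogens code must do, is not shown. The function name `cap_open_points` and the index-based representation are hypothetical.

```python
def cap_open_points(elements, bonds, open_points):
    """Return new element and bond lists with one hydrogen bonded to each open point.

    elements    -- list of element symbols, indexed by atom
    bonds       -- list of (atom index, atom index) pairs
    open_points -- atom indices that still lack a bonding partner
    """
    capped_elements = list(elements)
    capped_bonds = list(bonds)
    for atom_index in open_points:
        capped_elements.append("H")
        capped_bonds.append((atom_index, len(capped_elements) - 1))
    return capped_elements, capped_bonds


# A methyl group whose carbon (index 0) still has one open connection point.
elements, bonds = cap_open_points(
    ["C", "H", "H", "H"], [(0, 1), (0, 2), (0, 3)], open_points=[0])
```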

Ligand design XML

Below is a complete XML algorithm for designing a small molecule fragment-by-fragment using the new Rosetta code. This algorithm has not been tested or optimized. It is intended as a starting point for future users and developers to improve upon. In short, this algorithm docks a starting fragment, then extends and docks repeatedly until the cutoff criteria are reached. Finally, ligand-specific scores are appended to the output PDB.

<ROSETTASCRIPTS>
  <SCOREFXNS>
    <ligand_soft_rep weights=ligand_soft_rep>
      <Reweight scoretype=rama weight=0.2/>
    </ligand_soft_rep>
    <hard_rep weights=ligand>
    </hard_rep>
  </SCOREFXNS>

  <LIGAND_AREAS>
    <docking_sidechain chain=X cutoff=6.0 add_nbr_radius=true all_atom_mode=true minimize_ligand=10/>
    <final_sidechain chain=X cutoff=6.0 add_nbr_radius=true all_atom_mode=true/>
    <final_backbone chain=X cutoff=7.0 add_nbr_radius=false all_atom_mode=true Calpha_restraints=0.3/>
  </LIGAND_AREAS>
  <INTERFACE_BUILDERS>
    <side_chain_for_docking ligand_areas=docking_sidechain/>
    <side_chain_for_final ligand_areas=final_sidechain/>
  </INTERFACE_BUILDERS>
  <MOVEMAP_BUILDERS>
  </MOVEMAP_BUILDERS>

</CompoundStatement>

</FILTERS>

single movers

</StartFrom>


compound movers

</ParsedProtocol>

</ParsedProtocol>

</ParsedProtocol>

</ParsedProtocol>

</MOVERS>

<PROTOCOLS>

Dock the starting fragment

Grow and dock in a loop, until cutoff filters are hit

<Add mover_name=grow_loop/>

Add final ligand scores

</PROTOCOLS>

</ROSETTASCRIPTS>

Future directions

While many of the essential elements of RosettaLigandDesign have been implemented, its effectiveness has not been demonstrated. Demonstrating it will first require the preparation of a database of design fragments. For each fragment, a rotamer library similar to amino acid rotamer libraries should be prepared. Rotamer libraries will allow flexibility through conformational sampling during design.

To evaluate RosettaLigand designs, it will be necessary to develop a design benchmark. This will consist of a collection of 20 protein/ligand complexes of known structure. Each ligand in this dataset will be split into fragments. The first benchmark test will require Rosetta to reassemble the fragments using the correct connectivity and to recover the correct binding pose. The second benchmark will build upon the first by mixing all the fragments from the 20-ligand benchmark and requiring Rosetta to select the correct fragments from among this set. Finally, Rosetta will be required to design small molecules using fragments from the complete ligand fragment library with rotamers. Results will be evaluated based on the number of key contacts that are recapitulated. Between each test it will be necessary to reevaluate and optimize the design algorithm. This will involve finding a balance between efficiency and accuracy. Fast grid-based low-resolution screening of fragments will be used to increase efficiency.
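One possible way to quantify recapitulated contacts is sketched below in Python. The definition is an assumption made for illustration, since "key contacts" is not defined here: a residue counts as contacted when any of its atoms lies within 4.0 Å of any ligand atom, and recovery is the fraction of native contacted residues that the design also contacts. The function names and the coordinates are hypothetical.

```python
import math


def contacted_residues(protein_atoms, ligand_coords, cutoff=4.0):
    """Return the set of residue IDs with an atom within `cutoff` of the ligand.

    protein_atoms -- iterable of (residue ID, (x, y, z)) pairs
    ligand_coords -- iterable of (x, y, z) ligand atom positions
    """
    ligand_coords = list(ligand_coords)
    return {
        residue_id
        for residue_id, position in protein_atoms
        if any(math.dist(position, atom) <= cutoff for atom in ligand_coords)
    }


def contact_recovery(native_contacts, design_contacts):
    """Fraction of native contacts that the designed ligand also makes."""
    if not native_contacts:
        raise ValueError("the native complex has no contacts to recover")
    return len(native_contacts & design_contacts) / len(native_contacts)


# Hypothetical coordinates in angstroms.
protein = [("A10", (0.0, 0.0, 0.0)), ("A20", (10.0, 0.0, 0.0)), ("A30", (0.0, 10.0, 0.0))]
native = contacted_residues(protein, [(1.0, 0.0, 0.0), (9.0, 0.0, 0.0)])
design = contacted_residues(protein, [(1.0, 0.0, 0.0)])
recovery = contact_recovery(native, design)
```

Counting contacts per residue rather than per ligand atom keeps the measure usable in the final benchmark, where the designed molecule need not share atom names with the native ligand.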


In document GORDON HOWARD LEMMON - Institutional Repository Home (pages 142-149)