Ligand Docking Server Documentation
RosettaLigand is a tool for docking small molecules into proteins. RosettaLigand takes as input an SDF file containing the small molecule ligand to be docked, and a PDB file containing the protein the ligand should be docked into.
- By default, only the provided ligand conformers will be used. If you don't want to use conformers, provide an SDF file with a single ligand conformation.
- Conformers can be generated using OpenEye Omega, the BCL, MOE or other conformer generation tools. All conformers in the same SDF file should have the same name. There's an option to have the server automatically generate a set of conformers using the BCL.
- All ligand conformers in the input file must have 3D coordinates and added hydrogens. (This applies even if the server-generated conformer option is being used.)
- The ROSIE Ligand docking protocol is not set up to do virtual high throughput screening (vHTS).
Each job submission should consist of a single small molecule being docked to a single protein.
Providing multiple, chemically distinct molecules in the input SDF file will result in an error.
- RosettaLigand cannot perform binding site detection. The location of the binding site should be known to within approximately 5 Å.
- While RosettaLigand is usually capable of making accurate binding predictions, some protein systems are very difficult to dock into.
The following guidelines can help maximize the likelihood of obtaining high quality predictions:
- Docking performance is highly dependent on backbone conformation. If possible, use an X-ray crystal structure with a resolution better than 2.0 Å as input.
- Apo structures are more difficult to dock into. If possible, use an input structure co-crystallized with a bound inhibitor or native ligand.
- If multiple crystal structures of the same protein exist, dock ligands into all of them.
- RosettaLigand is not optimized for docking into shallow binding pockets, or predicting surface binding interactions. For
best results, use relatively deep binding pockets.
- While RosettaLigand is capable of handling systems with co-factors, metal ions, or tightly bound waters at the protein-ligand interface,
these systems are enormously more complex, and the likelihood of RosettaLigand being capable of correctly handling these systems is reduced.
We strongly recommend against using this server for docking into protein-ligand systems with co-factors, metal ions, or waters at the interface.
- If a crystal structure with a bound inhibitor or native ligand exists, benchmark RosettaLigand by re-docking this inhibitor into the
crystal structure. If the lowest scoring model is not within 2.0 Å RMSD of the crystal structure, it is unlikely that Rosetta will be
capable of making accurate predictions with this protein system. See Interpreting Results for details.
- The RosettaLigand protocol used here typically requires 200 or more models to produce a high quality protein-ligand docking pose. (See DeLuca et al. PLoS ONE 10(7): e0132508 for performance details.)
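The input requirements above (one molecule per file, a shared name across conformers, 3D coordinates, hydrogens present) can be checked roughly before submission. The following is a minimal sketch, not part of the server: the function names are hypothetical, it assumes standard fixed-width V2000 atom lines, and its tests are heuristics only (a ligand that happens to lie exactly in the z = 0 plane would be flagged as 2D, and the presence of one hydrogen does not prove that all hydrogens were added).

```python
def split_sdf_records(text):
    """Split SDF text into records (lists of lines) at '$$$$' delimiter lines."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    if any(l.strip() for l in current):  # tolerate a missing final delimiter
        records.append(current)
    return records


def check_sdf(text):
    """Return a list of problems found; an empty list means no check failed."""
    records = split_sdf_records(text)
    if not records:
        return ["no records found"]
    problems = []
    names = set()
    for i, rec in enumerate(records, start=1):
        # Line 4 of a V2000 record is the counts line and ends with 'V2000'.
        if len(rec) < 4 or "V2000" not in rec[3]:
            problems.append(f"record {i}: no V2000 counts line")
            continue
        names.add(rec[0].strip())            # line 1 is the molecule name
        n_atoms = int(rec[3][0:3])           # first 3 columns: atom count
        atoms = rec[4:4 + n_atoms]
        z_values = [float(a[20:30]) for a in atoms]   # z is columns 21-30
        symbols = [a[31:34].strip() for a in atoms]   # element is columns 32-34
        if all(abs(z) < 1e-6 for z in z_values):
            problems.append(f"record {i}: all z coordinates are zero (2D?)")
        if "H" not in symbols:
            problems.append(f"record {i}: no hydrogen atoms")
    if len(names) > 1:
        problems.append("records have different names: " + ", ".join(sorted(names)))
    return problems
```

Calling check_sdf(open("ligand.sdf").read()) (the file name is a placeholder) returns an empty list when none of these checks fails. A file that passes may still be rejected by the server for other reasons.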
In general, the default input parameters for the RosettaLigand server are reasonable. The parameters have the following definitions:
- Input PDB File -- **required** A PDB file containing the protein without the ligand present. Residues which Rosetta does not natively recognize
(such as waters, crystallographic reagents and cofactors) will be automatically removed, although for best results it is recommended to
edit the PDB to contain just the protein portion of the molecule to be docked to before uploading.
- Input SDF File -- **required** An SDF file containing the conformers of a *single* ligand to be docked. (At this time only the common 'V2000'-style SDF files are supported.)
If no conformers are available, the SDF file should contain only a single ligand conformation. Every record in the SDF file must have 3D coordinates and all hydrogens added.
If your SDF file is not being recognized by the RosettaLigand protocol, we recommend passing it through OpenBabel or opening and resaving with Avogadro
to normalize the file format. These programs can also be used to convert PDB, MOL2, and other formats to SDF format, as well as adding hydrogens and converting 2D representations to 3D.
- Generate ligand conformers with the BCL -- If checked, conformers will be generated with the BCL,
using the first structure in the provided ligand SDF file. These conformers will be added to the set of conformers present in the input file.
If unchecked, just the conformer library provided in the input SDF file will be used.
- Maximal number of ligand conformers to generate: -- If "Generate ligand conformers with the BCL" is checked,
this is the maximal number of conformers which will be generated.
In practice, fewer than this number will actually be generated, due to limits on conformational flexibility of the molecule.
- Use the starting coordinates in the SDF -- If checked, the starting position of the ligand (the center of the binding site)
will be taken from the first conformer in the input ligand SDF.
If you use this option, it's highly recommended to check that the first conformer is positioned roughly in the correct binding site by loading
both the protein PDB and the ligand SDF in a structure viewing program like PyMOL prior to submission.
(Only the coordinates of the first conformer are used - subsequent conformers will be re-aligned.)
- X,Y,Z coordinates of starting position -- If "Use the starting coordinates in the SDF" is left unchecked,
the cartesian coordinates where the ligand should be initially placed prior to docking must be manually specified.
This should be as close as possible to the actual ligand binding site (within the "maximum radius to search"). If a crystal structure is available with a bound
ligand in the active site, the geometric center of that ligand is usually a good starting place. If a bound ligand is not available,
a good approach is to average the coordinates of atoms surrounding the desired binding pocket.
- Number of structures to generate -- The number of docking predictions to create. 200 is a good number of docking predictions for the current default settings.
- Ligand chain name -- This is what ROSIE will call the ligand chain when it's added to the output structures.
The default value here is "X", which is almost certainly acceptable. You can change this if your protein already has a chain "X".
- Maximum radius to search -- The radius from the starting position in Angstroms to sample during the initial phase of docking. 5 Å
is usually a good starting point.
- Maximum number of cycles of low-resolution Monte Carlo Sampling -- Initial low-resolution sampling is a Monte Carlo process of rotation,
translation, and conformer selection. This is the number of Monte Carlo steps. 500 is a good default.
- Initial Perturbation -- The starting position and orientation of each output structure will be randomized, within a sphere of the given radius from the starting position.
- Width of low-resolution scoring grid -- For speed, the low resolution sampling stage has a pre-computed scoring grid.
For accurate scoring, this should cover all the positions which the ligand can sample, so it should be greater than the maximum
length of the ligand plus the search radius.
- Monte Carlo move/angle steps -- The maximum size (in Å/degrees) of a single translational/rotational perturbation step in low resolution Monte Carlo sampling.
The defaults should be good for most cases.
- Total cycles of highres docking to perform -- The total number of cycles of high resolution docking to perform. 6 is almost always a good number.
- Repack every nth cycle of highres docking -- High resolution docking consists of alternating cycles of repacking and small perturbation moves.
This option specifies how often repacking occurs. 3 is a good default.
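For the "X,Y,Z coordinates of starting position" parameter, the geometric center of a co-crystallized ligand can be computed directly from the crystal structure. The sketch below is illustrative only: the function name ligand_center is hypothetical, and it assumes the ligand is stored as HETATM records in standard fixed-column PDB format and is identified by its three-letter residue name.

```python
def ligand_center(pdb_text, resname, chain=None):
    """Return the (x, y, z) geometric center of HETATM atoms with this residue name."""
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        if line[17:20].strip() != resname:       # residue name, columns 18-20
            continue
        if chain is not None and line[21:22] != chain:  # chain ID, column 22
            continue
        # x, y, z occupy columns 31-38, 39-46 and 47-54.
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"no HETATM atoms found for residue {resname!r}")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))
```

The same averaging applies when no bound ligand is available: collect the coordinates of the atoms lining the pocket and average them. If the residue name occurs more than once in the file (for example, one copy per chain), pass the chain argument so that only one copy contributes to the center.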
In general, the interface energy is the best metric for discriminating between ligand binding poses. Because RosettaLigand only minimizes protein atoms within 7 Å of the ligand, and because the Rosetta energy function is evaluated as a sum over all residues in the protein, the total score is generally very noisy. Thus, we recommend selecting the poses with the lowest interface_delta score. However, structures with an abnormally high total score (compared to the other structures in the run) may indicate a docked conformation which has contorted the protein in order to bind the ligand.
The transform_accept_ratio in the scorefile gives a rough diagnostic of how well the low-resolution stage performed. This should normally be between 0.2 and 0.8. Having more than a few structures with a transform_accept_ratio of 0 means that either the initial perturbation is set too high (for best results, this should normally be less than two thirds of the pocket size), the grid size is too small (this should normally be set to more than the length of the ligand plus twice the pocket size), or the pocket is too small for the ligand size (try additional conformers or a different starting backbone).
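The two rules of thumb in the paragraph above are simple arithmetic and can be checked before resubmitting a job. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the "pocket size" is an estimate you supply yourself in Å, and the ligand length is taken as the largest distance between any two ligand atoms.

```python
import itertools
import math


def max_extent(coords):
    """Largest distance (in Å) between any two atoms in a list of (x, y, z) tuples."""
    return max(
        (math.dist(a, b) for a, b in itertools.combinations(coords, 2)),
        default=0.0,
    )


def check_lowres_settings(initial_perturbation, grid_width, ligand_length, pocket_size):
    """Return warnings for settings that violate the two rules of thumb."""
    warnings = []
    # Rule 1: initial perturbation should be less than two thirds of the pocket size.
    if initial_perturbation >= (2.0 / 3.0) * pocket_size:
        warnings.append("initial perturbation is not below 2/3 of the pocket size")
    # Rule 2: grid width should exceed ligand length plus twice the pocket size.
    if grid_width <= ligand_length + 2.0 * pocket_size:
        warnings.append("grid width does not exceed ligand length + 2 x pocket size")
    return warnings
```

For example, with an 8 Å ligand and a 5 Å pocket estimate, a grid width of 15 Å falls short of the 18 Å threshold and would be flagged, while 25 Å would not.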
If a structure co-crystallized with a bound inhibitor is available, the native ligand should be re-docked into the crystal structure using the same input settings as will be used in the experimental study. This re-docking will serve as a validation experiment to determine if RosettaLigand is capable of correctly modeling the protein system. If the lowest scoring model generated has an RMSD of more than 2.0 Å from the crystallographic ligand position, it suggests that you are working with a protein system that Rosetta is unable to model effectively, and any predictions generated by this server should be viewed with skepticism.
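For reference, the 2.0 Å criterion refers to the root-mean-square deviation between the docked and crystallographic ligand atom positions, measured in the frame of the protein (the ligand is not superimposed first). A minimal sketch follows; the function name is hypothetical, and it assumes both coordinate lists contain the same atoms in the same order and ignores chemically equivalent (symmetric) atom assignments, which a proper ligand RMSD would account for.

```python
import math


def in_place_rmsd(coords_a, coords_b):
    """RMSD in Å between two equally ordered lists of (x, y, z) tuples, no superposition."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Typically only heavy atoms are compared, since hydrogen positions are not resolved in most crystal structures.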
When docking a ligand with unknown activity or binding position, generate at least 200 models, and select the best 1-20 models by interface_delta score (lower scores are better). The interface_delta score is the difference between the total Rosetta energy score with the ligand bound, and the ligand unbound. In general, RosettaLigand is capable of discriminating between well and poorly bound ligands based on score.
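Selecting the best models by interface_delta can be scripted. The sketch below assumes the scorefile follows the usual Rosetta layout, in which every data line starts with "SCORE:" and the first such line holds the column names; the exact column names in your output (interface_delta, transform_accept_ratio, description) are assumptions and should be checked against the downloaded file. The function names are hypothetical.

```python
def read_scores(text):
    """Parse scorefile text into a list of dicts keyed by column name."""
    header, rows = None, []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue                      # skip SEQUENCE: and any other lines
        fields = line.split()[1:]         # drop the leading 'SCORE:' tag
        if header is None:
            header = fields               # first SCORE: line names the columns
        elif len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def best_models(rows, n=10, column="interface_delta"):
    """Return the n rows with the lowest (best) value in the given column."""
    return sorted(rows, key=lambda r: float(r[column]))[:n]


def zero_accept_fraction(rows, column="transform_accept_ratio"):
    """Fraction of models whose low-resolution accept ratio is exactly 0."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if float(r[column]) == 0.0) / len(rows)
```

A typical use would be best_models(read_scores(open("score.sc").read()), n=20), where score.sc is a placeholder file name, followed by visual inspection of the listed models.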
The small ensemble of top scoring models should then be evaluated visually using a tool like PyMOL. The predicted binding poses should be evaluated in the context of existing crystal structure information, and whatever experimental or structural data is available to you.
RosettaLigand is intended to dock small molecule ligands only (metabolite- or drug-like organic molecules). It is not intended for docking protein, peptide, or nucleic acid ligands.
- For protein-protein docking see the ROSIE docking server.
- For protein-peptide docking see the FlexPepDock server.
- There is not currently a server for protein-nucleic acid docking with Rosetta.
Please cite the following article when referring to results from our ROSIE server:
DeLuca, S., Khar, K., Meiler, J. (2015).
Fully Flexible Docking of Medium Sized Ligand Libraries with RosettaLigand.
PLoS ONE 10(7): e0132508. doi:10.1371/journal.pone.0132508
Combs, S. A., DeLuca, S. L., DeLuca, S. H., Lemmon, G. H., Nannemann, D. P., Nguyen, E. D., et al. (2013).
Small-molecule ligand docking into comparative models with Rosetta. Nature Protocols,
8(7), 1277–1298. doi:10.1038/nprot.2013.074
BCL conformer generation:
Kothiwale, S., Mendenhall, J.L., Meiler, J. (2015)
BCL::Conf: small molecule conformational sampling using a knowledge based rotamer library. J. Cheminform.,
7, 47. doi:10.1186/s13321-015-0095-1
Lyskov S, Chou FC, Conchúir SÓ, Der BS, Drew K, Kuroda D, Xu J, Weitzner BD, Renfrew PD, Sripakdeevong P, Borgo B, Havranek JJ, Kuhlman B, Kortemme T, Bonneau R, Gray JJ, Das R.,
"Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE)".
PLoS One. 2013 May 22;8(5):e63906. doi: 10.1371/journal.pone.0063906. Print 2013.
We welcome scientific and technical comments on our server. For support please contact us at Rosetta Forums with any comments, questions or concerns.
Modeling tools developed by the Meiler Lab at Vanderbilt University. The ROSIE implementation was developed by Samuel DeLuca, Rocco Moretti and Sergey Lyskov.