HADDOCK
Introduction
Determining the structure of protein-protein complexes is a tedious and lengthy process, by both NMR and X-ray crystallography. Several docking-based methods for studying protein complexes have been developed over the past few years. Most of these approaches, however, are not driven by experimental data but are based on a combination of energetics and shape complementarity. HADDOCK (High Ambiguity Driven biomolecular DOCKing) instead makes use of biochemical and/or biophysical interaction data, such as chemical shift perturbation data from NMR titration experiments, mutagenesis data or bioinformatic predictions. This information is introduced as Ambiguous Interaction Restraints (AIRs) to drive the docking process. An AIR is defined as an ambiguous distance between all residues shown to be involved in the interaction.
The accuracy of our approach was initially demonstrated with three molecular complexes. Since the original 2003 JACS publication, HADDOCK has been extended to deal with a large variety of data and complexes. Next to protein-protein docking, HADDOCK has been applied to the modelling of protein-DNA, protein-RNA, protein-oligosaccharide and protein-ligand complexes. A list of articles with various application examples can be found here.
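To make the AIR definition above more concrete, the following minimal Python sketch computes an effective distance by summing inverse sixth powers over all atom pairs between two sets of atoms, the kind of averaging commonly used for ambiguous distance restraints. The formula, the function names `effective_distance` and `air_satisfied`, the 2.0 Å upper bound and the coordinates are not given in the text above; they are illustrative assumptions, and this is not HADDOCK's own code.

```python
from itertools import product
from math import dist


def effective_distance(atoms_a, atoms_b):
    """Return (sum of d**-6 over all atom pairs) ** (-1/6).

    atoms_a, atoms_b: non-empty lists of (x, y, z) coordinates in angstroms.
    Close pairs dominate the sum, so the result is never larger than
    the shortest individual pair distance.
    """
    inverse_sixth = sum(dist(p, q) ** -6 for p, q in product(atoms_a, atoms_b))
    return inverse_sixth ** (-1.0 / 6.0)


def air_satisfied(atoms_a, atoms_b, upper_bound=2.0):
    """Check the restraint against an assumed upper bound (hypothetical default)."""
    return effective_distance(atoms_a, atoms_b) <= upper_bound


# Made-up coordinates: one atom of a residue on molecule A,
# two atoms of candidate partner residues on molecule B.
residue_a = [(0.0, 0.0, 0.0)]
partner_atoms_b = [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(effective_distance(residue_a, partner_atoms_b))
print(air_satisfied(residue_a, partner_atoms_b))
```

Because the sum runs over every atom pair, a single close contact with any of the candidate partner residues is enough to satisfy the restraint, which is what makes it ambiguous.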
The concept of HADDOCK is described in the following paper:
As stated, HADDOCK is not limited to cases for which NMR experimental information is available.
Using HADDOCK together with information derived from various sources such as sequence conservation,
mutagenesis data and epitope mapping, we participated in the "blind" protein-protein complex
prediction competition CAPRI. CAPRI is a community-wide
experiment on the comparative evaluation of protein-protein docking, aimed at evaluating the
performance of various approaches for predicting the structure of protein complexes from the
individual components (see http://capri.ebi.ac.uk).
Our information-driven docking approach distinguishes itself from ab initio docking
programs in that the available information is used a priori to drive the docking
process, which limits the conformational search problem.
Using HADDOCK, we have participated in CAPRI since the fourth round. In the fourth round,
HADDOCK scored on average at the top for the four targets (1st for target 10, 2nd for
target 11 and 3rd for target 13, out of 200 submissions). For the most challenging
target, the low-pH trimeric form of the tick-borne encephalitis virus glycoprotein,
an excellent model was obtained (2.8 Å RMSD) that closely matches the crystal structure
(EMBO J. 2004:23, 728).
Our results in CAPRI are described in (note that this entire issue of Proteins is dedicated to CAPRI):
HADDOCK was first developed for protein-protein docking, but has since been applied to a variety of protein-ligand, protein-peptide and protein-DNA/RNA complexes (see publications). In
particular, we have developed protocols for flexible protein-DNA docking. The intrinsic flexibility of DNA has hampered the
development of efficient protein-DNA docking methods. Our new protocol consists of two sequential docking runs: in the semi-flexible refinement stage of the first docking run, flexibility is allowed for all DNA nucleotides and for the protein residues at the predicted interface. The resulting solutions are analysed and subsequently used to generate a library of pre-bent and twisted DNA structures that serve as input for a second docking round.
We evaluated our approach on the monomeric repressor-DNA complexes formed by bacteriophage 434 Cro, the Escherichia coli Lac headpiece, and bacteriophage P22 Arc. Starting from unbound proteins and canonical B-DNA, we could correctly predict the spatial disposition of the complexes and the specific conformation of the DNA in the published complexes. The resulting top-ranking solutions exhibit high similarity to the published complexes in terms of RMS deviations, intermolecular contacts and DNA conformation.

Best solutions of the unbound flexible docking using a library of pre-bent and twisted DNA structures
(blue) superimposed on the reference structure (yellow): Cro-O1R (left), Lac-O1 (middle), Arc-operator (right). The
structures were superimposed on all heavy atoms of the interface residues (interface RMSD values: Cro, 1.62 Å; Lac,
2.02 Å; Arc, 1.90 Å).
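The interface RMSD values quoted above are root-mean-square deviations over matched atoms. As a minimal sketch of that last step only, the Python function below assumes the two coordinate sets have already been superimposed and list the same atoms in the same order; the superposition itself is omitted, and the function name `rmsd` and the coordinates are hypothetical, not taken from HADDOCK.

```python
from math import sqrt


def rmsd(model, reference):
    """Root-mean-square deviation between two matched coordinate lists.

    model, reference: lists of (x, y, z) tuples in angstroms, already
    superimposed and listing the same atoms in the same order.
    """
    if not model or len(model) != len(reference):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return sqrt(squared / len(model))


# Made-up coordinates for two interface atoms.
model_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(model_atoms, reference_atoms))
```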
Our two-stage docking method is thus able to successfully predict protein-DNA complexes from unbound constituents using non-structural experimental data to drive the docking. For details see:
HADDOCK consists of Python scripts derived from ARIA, written by Michael Nilges and Jens Linge, and makes use of CNS as structure calculation software. Additional scripts (csh, awk, perl) and/or third-party software are also used either to prepare the data for HADDOCK or to analyze the results (see software links).
On our HADDOCK home page you will find:
- general information on HADDOCK
- tools to generate AIR restraint files and to set up projects
- instructions to obtain HADDOCK
- links to the various software packages required to run HADDOCK
- a manual describing the use of HADDOCK
- a frequently asked questions section
- various tutorials
Please send any suggestions or enquiries to Alexandre Bonvin