To validate the HADDOCK program, we initially performed the docking on three
known complexes (see the HADDOCK paper).
The data for all three complexes are provided in the haddock/examples directory.
This tutorial, however, deals only with the E2A-HPr complex. The structures in the free
form [1,2] and NMR titration data [3,4] are available. The complex structure
was solved experimentally by NMR [5] (PDB entry 1GGR).
Note that the example uses the NMR solution structure
of HPr (entry 1HDN) to allow the docking from an ensemble of
conformations. Because of that, the definition of active and
passive residues might differ from the one in the HADDOCK
paper since the filtering is based on the average accessibilities
(see the AIR restraints section).
Follow these steps to run this tutorial:
- Setup
- Defining the AIR restraints
- Setting up the project for HADDOCK
- Defining the parameters for the docking and running HADDOCK
- Analyzing the docking results
- Rerunning the HADDOCK analysis on a cluster basis
- Comparing the clusters and the reference structure
1. Setup
To start this tutorial, first copy the e2a-hpr directory from the haddock/examples directory:
cp -r $HADDOCK/examples/e2a-hpr .
Note: The $HADDOCK environment variable should be defined if haddock was
properly installed.
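The copy step can also be scripted with an explicit check on the environment
variable. The following is a minimal Python sketch, not part of the HADDOCK
distribution; the function name copy_example and its arguments are hypothetical.

```python
import os
import shutil


def copy_example(name="e2a-hpr", dest=".", env=None):
    """Copy $HADDOCK/examples/<name> into dest, failing early if HADDOCK is unset."""
    env = os.environ if env is None else env
    root = env.get("HADDOCK")
    if not root:
        raise RuntimeError("HADDOCK environment variable is not defined")
    src = os.path.join(root, "examples", name)
    if not os.path.isdir(src):
        raise FileNotFoundError(src)
    target = os.path.join(dest, name)
    # Roughly equivalent to: cp -r $HADDOCK/examples/e2a-hpr .
    # (copytree refuses to overwrite an existing target directory)
    shutil.copytree(src, target)
    return target
```

Calling copy_example() with no arguments from your working directory mirrors
the cp command above.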
Go into the e2a-hpr directory.
You will find in there the PDB files of the E2A and the HPr proteins:
- e2a_1F3G.pdb
- e2aP_1F3G.pdb (with phosphorylated histidine)
For HPr, we will use the first 10 conformations from the PDB entry 1HDN. These can be found
in the hpr directory:
- hpr_1.pdb
- ...
- hpr_10.pdb
The file listing those 10 structures, used for docking from an ensemble
of structures, is provided (Note: edit it to set the correct directory path!):
- hpr-files.list
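Editing the directory path in the list file can be done by hand or with a
small helper. The sketch below is an assumption-laden illustration: the exact
layout of hpr-files.list is not shown in this tutorial, so it assumes one PDB
path per line, optionally enclosed in double quotes. The function name
retarget_file_list is hypothetical.

```python
import posixpath


def retarget_file_list(lines, new_dir):
    """Return list-file entries with each PDB path moved into new_dir.

    Assumes one path per line; surrounding double quotes, if present,
    are preserved. Blank lines are dropped.
    """
    out = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        quoted = entry.startswith('"') and entry.endswith('"')
        path = entry.strip('"')
        new_path = posixpath.join(new_dir, posixpath.basename(path))
        out.append('"%s"' % new_path if quoted else new_path)
    return out
```

Read the file with readlines(), pass the result together with the full path
of your own hpr directory, and write the returned entries back one per line.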
The PDB file of the reference complex can be found in the ana_scripts directory
and is called
- e2a-hpr_1GGR.pdb.
The (average) per residue solvent accessibilities calculated with
NACCESS
(see the AIR restraints
section) are also provided for convenience
- e2a_1F3G.rsa
- hpr/hpr_rsa_ave.lis
You also find a number of additional files containing the AIR restraints that we
used for calculations,
- e2a-hpr_air.tbl
diffusion anisotropy restraints (simulated ones),
- dani.tbl
a number of Rasmol
scripts for visualization (see the "Defining AIRs" section) of the active and
passive residues for E2A and HPr,
- e2a_rasmol_active.script
- e2a_rasmol_active-passive.script
- hpr_rasmol_active.script
- hpr_rasmol_active-passive.script
two example files for setting up a new project
- new.html-example
- new.html-dani (for use with diffusion anisotropy restraints)
and to define all the parameters for docking
- run.html-refe
- run.html-dani
(see the Defining the parameters for the docking and running HADDOCK
section for usage and parameter definitions)
Various analysis scripts can be found in the ana_scripts directory. They will allow you
to calculate the interface RMSD (i-RMSD) (RMSD calculated on backbone atoms of all residues within 10 Å
from the partner molecule) and ligand RMSD (l-RMSD) (RMSD calculated on the smallest
component (hpr) after fitting on the largest component (e2a)). These are standard
CAPRI definitions. A script is also provided that allows
you to calculate the fraction of native contacts. For usage and details refer to the README file in the ana_scripts directory.
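To make these definitions concrete, here is a minimal Python sketch of the
quantities involved. It is not the code of the ana_scripts, which remain the
reference. It assumes coordinates are already read into plain Python
structures, it does not perform the superposition (the rmsd function expects
already fitted coordinates), and it takes residue-residue contacts as given
pairs because the contact criterion is not stated here. All function names
are hypothetical.

```python
import math


def interface_residues(mol_a, mol_b, cutoff=10.0):
    """Residues of mol_a with any atom within cutoff (in Angstrom) of mol_b.

    Each molecule is a dict: residue number -> list of (x, y, z) tuples.
    """
    atoms_b = [xyz for atoms in mol_b.values() for xyz in atoms]
    selected = set()
    for resid, atoms in mol_a.items():
        if any(math.dist(p, q) <= cutoff for p in atoms for q in atoms_b):
            selected.add(resid)
    return selected


def rmsd(coords_model, coords_ref):
    """Plain RMSD between two equally ordered, already superimposed atom lists."""
    if not coords_model or len(coords_model) != len(coords_ref):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_model, coords_ref))
    return math.sqrt(total / len(coords_model))


def fraction_native_contacts(model_contacts, native_contacts):
    """Fraction of the native residue-residue contacts also found in the model."""
    native = set(native_contacts)
    if not native:
        raise ValueError("no native contacts given")
    return len(native & set(model_contacts)) / len(native)
```

For the i-RMSD, rmsd would be applied to the backbone atoms of the residues
returned by interface_residues; for the l-RMSD, to the HPr atoms after fitting
the models on E2A.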
Finally, a script to copy data from an example HADDOCK run is provided as:
- copy_haddock_files.csh
2. Defining the AIR restraints
Some general explanations about how to define the interface can be found
in the HADDOCK AIR restraints section.
For this particular example, based on the available NMR titration data [3,4],
and the relative solvent accessibilities, we defined 11 active residues for E2A:
- active: D38, V40, I45, V46, K69, F71, S78, E80, D94, V96 and S141
and 10 active residues for HPr:
- active: H15, T16, R17, A20, F48, K49, Q51, T52, G54 and T56
Note: The active residues and the filtered ones
can be viewed in
Rasmol
using the provided Rasmol scripts, e2a_rasmol_active.script
and hpr_rasmol_active.script. For example for E2A, start rasmol with
rasmol e2a_1F3G.pdb
and then, at the rasmol prompt type:
RasMol> source e2a_rasmol_active.script
The active residues are colored in red and the filtered-out active residues (if any) in yellow.
For HPr, three active residues were filtered out because they are not solvent accessible:
residues 50, 53 and 55. Check in the hpr_rsa_ave.lis file, which gives the
relative solvent accessible surface area per residue for HPr, that these three residues
are indeed not solvent accessible (the criterion used for accessibility is average + SD > 50%).
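The filtering step can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not the procedure used to build the
provided files: it assumes the criterion means that the ensemble-averaged
relative accessibility plus its standard deviation must exceed 50%, and it
takes the per-residue values as dictionaries rather than parsing
hpr_rsa_ave.lis, whose format is not described here. The function name
filter_accessible is hypothetical.

```python
def filter_accessible(residues, rsa_mean, rsa_sd, threshold=50.0):
    """Split residues into (kept, filtered_out) lists.

    A residue is kept when mean relative accessibility + SD > threshold (%).
    rsa_mean and rsa_sd map residue number -> value in percent.
    """
    kept, filtered_out = [], []
    for resid in residues:
        value = rsa_mean[resid] + rsa_sd[resid]
        if value > threshold:
            kept.append(resid)
        else:
            filtered_out.append(resid)
    return kept, filtered_out
```

Residues returned in the second list would be the ones shown in yellow by the
Rasmol scripts.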
You will now have to define the passive residues. For this, for each structure
in turn (E2A and HPr), load a PDB file in Rasmol, highlight all active
residues (in Rasmol source the corresponding e2a/hpr_rasmol_active.script) and
pick all neighbors of active residues. Then filter the selected residues by their
solvent accessibility (see the
"filtering residues" or "filtering
from an ensemble of structures" sections).
Note: Rasmol scripts containing all the active and passive
residues that we defined for E2A and HPr are provided for
comparison (e2a_rasmol_active-passive.script and
hpr_rasmol_active-passive.script).
We defined the following passive residues for E2A and HPr:
- E2A passive: P37, V39, E43, G68, E72, E97, E109 and K132
- HPr passive: P11, N12, Q21, K24, K40, S41, L47, Q57 and E85
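To illustrate how the active and passive lists are combined, the Python sketch
below builds CNS-style ambiguous restraints in which each active residue of
one molecule is restrained to all active and passive residues of the partner.
It is only an illustration: the segids A and B, the distance fields
"2.0 2.0 0.0" and the line layout are assumptions, and the function name
air_restraints is hypothetical. The file produced by the HADDOCK web page is
the one to use for docking.

```python
def air_restraints(active_a, passive_a, active_b, passive_b,
                   segid_a="A", segid_b="B", dist="2.0 2.0 0.0"):
    """Return one CNS 'assign' statement per active residue of either molecule."""

    def block(actives, segid, partners, partner_segid):
        statements = []
        for resid in actives:
            # Ambiguous target: any active or passive residue of the partner
            targets = "\n     or ".join(
                "(resid %d and segid %s)" % (p, partner_segid) for p in partners)
            statements.append("assign (resid %d and segid %s)\n       (%s) %s"
                              % (resid, segid, targets, dist))
        return statements

    partners_b = list(active_b) + list(passive_b)
    partners_a = list(active_a) + list(passive_a)
    return (block(active_a, segid_a, partners_b, segid_b)
            + block(active_b, segid_b, partners_a, segid_a))


# Residue lists as given in this tutorial
e2a_active = [38, 40, 45, 46, 69, 71, 78, 80, 94, 96, 141]
e2a_passive = [37, 39, 43, 68, 72, 97, 109, 132]
hpr_active = [15, 16, 17, 20, 48, 49, 51, 52, 54, 56]
hpr_passive = [11, 12, 21, 24, 40, 41, 47, 57, 85]

restraints = air_restraints(e2a_active, e2a_passive, hpr_active, hpr_passive)
```

Joining restraints with newlines gives text that can be compared against the
provided e2a-hpr_air.tbl file.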
Having defined active and passive residues for both molecules, you can now generate the
AIR restraint file for use in HADDOCK. For this go to the HADDOCK
project setup section, click on
"generate AIR restraint file" and follow the instructions.
3. Setting up a new project for HADDOCK
To start a new project in HADDOCK you have to define a number of files such as
input PDB files for both molecules, various restraint files and the path and name
of the project. For this, go again to the
project setup page, click on
"start a new project" and follow the instructions.
After saving the new.html file to disk, type haddock2.1
in the same directory. This will generate a run directory containing all necessary
information to run haddock. See the "a new project"
section on the HADDOCK home page for more information and for a description of the
content and organization of the generated run directory.
Note: An example of a new.html
file can be found in the haddock/examples/e2a-hpr
directory as
new.html-example.
4. Defining the parameters for the docking and running HADDOCK
The next step consists in editing the various parameters that will govern the
docking process in the run.cns file in the newly generated run directory.
For this, again go to the project setup page,
enter the path of the run.cns file and click on "edit".
Refer to the run.cns section for a description
of the various parameters.
Note: Most of the parameters in run.cns are set by default
for the e2a-hpr example and you will only need to modify a few of those.
- Check the various paths to your project directory and the filenames
to be used ("Filenames" section)
- Check the definition of the semi-flexible interface
("Definition of the interface"
section). The default in HADDOCK2.1 is now -1, meaning that the semi-flexible
interface will be automatically defined by HADDOCK from a contact analysis.
- Check the number of structures to generate at the various steps
and decrease those if needed (strongly recommended if you are going
to run all the structure calculations on a single processor).
For the present tutorial, keep the default parameters (1000 structures
for the rigid body docking stage and 200 afterwards). You will not perform
a complete HADDOCK run but simply copy the data from the provided example
run directory and move on with the analysis of the results
("Number of structures to dock" section)
- Check the queue commands for running the jobs, the location of the
cns executable and the number of jobs to run simultaneously.
("Parallel jobs" section)
For the present tutorial, use csh as queue command and 1 as number
of jobs. The CNS executable path should be properly set.
Once you have properly set all parameters, save the modified run.cns
file to disk.
Note: An example of a run.cns
file can be found in the haddock/examples/e2a-hpr
directory as
run.cns-refe.
You would now be ready to run HADDOCK by simply typing haddock2.1
in the same directory and the docking would start.
Instead, to save time and move directly to the analysis section, go back into your project
directory (e2a-hpr) and run the provided copy_haddock_files.csh csh script,
which copies all the necessary HADDOCK data files from the example directory into your
current run directory. Provide as argument the name of your run directory, e.g.:
./copy_haddock_files.csh run1
This script should copy all data into your run directory (Note that this can take some
time since about 320MB of data need to be copied...). Once the run data have been copied,
when typing haddock2.1 in the same run directory you should see an output message telling
you that all files are present and that HADDOCK is finished:
|