Research highlights of Alexandre M.J.J. Bonvin

Modelling of Biomolecular interactions
NMR projects
Computational projects


Modelling of biomolecular interactions

(HADDOCK highlight)


The structure determination of protein-protein complexes is a rather tedious and lengthy process, both by NMR and by X-ray crystallography. Several docking-based methods for studying protein complexes have been developed over the past few years. Most of these approaches, however, are not driven by experimental data but are based on a combination of energetics and shape complementarity. HADDOCK (High Ambiguity Driven protein-protein DOCKing) is an approach that makes use of biochemical and/or biophysical interaction data, such as chemical shift perturbation data resulting from NMR titration experiments, or mutagenesis data. This information is introduced as Ambiguous Interaction Restraints (AIRs) to drive the docking process. An AIR is defined as an ambiguous distance between all residues shown to be involved in the interaction. The accuracy of our approach was demonstrated with three molecular complexes. For two of these complexes, for which both the complex and the free protein structures have been solved, NMR titration data were available. Mutagenesis data were used in the last example. In all cases, the best structures generated by HADDOCK, i.e. the structures with the lowest intermolecular energies, were the closest to the published structure of the respective complexes (within 2.0 Å backbone RMSD).
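To make the notion of an ambiguous distance concrete, the sketch below evaluates one restraint as a single effective distance over all candidate atom pairs. It assumes the r^-6 summed form that is commonly used for ambiguous distance restraints in NMR structure calculation; the paragraph above does not state the functional form, and the function name and coordinates are hypothetical.

```python
import math


def effective_distance(atoms_a, atoms_b):
    """Effective distance between two sets of atoms (coordinates in Å).

    Assumed form (not stated in the text above):
        d_eff = (sum over all atom pairs of d^-6) ^ (-1/6)
    atoms_a: atoms of one residue on the first molecule.
    atoms_b: atoms of all candidate partner residues on the second molecule.
    Coincident atoms (d = 0) are not handled in this sketch.
    """
    total = 0.0
    for a in atoms_a:
        for b in atoms_b:
            d = math.dist(a, b)
            total += d ** -6
    return total ** (-1.0 / 6.0)


if __name__ == "__main__":
    # Hypothetical coordinates: one atom, with one near and one far partner.
    residue = [(0.0, 0.0, 0.0)]
    partners = [(3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    print(effective_distance(residue, partners))
```

Because the sum is dominated by the shortest distances, the effective distance never exceeds the shortest individual pair distance. A restraint of this form is therefore satisfied as soon as any one of the candidate pairs comes close, without specifying in advance which pair it should be.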

For details see:

Since the original publication, HADDOCK has been extended to make use of other NMR data, such as residual dipolar couplings and NMR relaxation data, both of which provide orientational information on the components of a complex. For a description see:


As stated, HADDOCK is not limited to cases for which NMR experimental information is available. Using HADDOCK together with information derived from various sources, such as sequence conservation, mutagenesis data and epitope mapping, we participated in the "blind" protein-protein complex prediction competition CAPRI. CAPRI is a community-wide experiment on the comparative evaluation of protein-protein docking, aimed at evaluating the performance of various approaches for predicting the structure of protein complexes from the individual components (see http://capri.ebi.ac.uk). Our information-driven docking approach distinguishes itself from ab initio docking programs in that the available information is used a priori to drive the docking process, which limits the conformational search problem.

Using HADDOCK, we have participated in CAPRI from the fourth round onwards. In the fourth round, HADDOCK scored on average at the top for the four targets (1st for target 10, 2nd for target 11 and 3rd for target 13 out of 200 submissions). For the most challenging target, the low pH trimeric form of the tick-borne encephalitis virus glycoprotein, an excellent model was obtained (2.8 Å RMSD) that closely matches the crystal structure (EMBO J. 2004:23, 728). (capri10 HADDOCK solution)
Our results in CAPRI are described in (note that this entire issue of Proteins is dedicated to CAPRI):


Our participation in CAPRI also led us to develop an interface prediction program called WHISCY, which stands for "WHat Information does Surface Conservation Yield?". It combines surface conservation and structural information to predict protein-protein interfaces. The accuracy of the predictions is more than three times higher than that of a random prediction. These predictions have been combined with those of another interface prediction program, ProMate (Neuvirth et al., J Mol Biol 2004;338:181-199), resulting in an even more accurate predictor. The usefulness of the predictions was tested using HADDOCK in an unbound docking experiment, with the goal of generating as many near-native structures as possible. Unrefined rigid body docking solutions within 10 Å ligand RMSD from the true structure were generated for 22 out of 25 docked complexes. For 18 complexes, more than 100 of the 8000 generated models were correct. Our results demonstrate the potential of using interface predictions to drive protein-protein docking.
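The near-native criterion used above can be sketched in a few lines. The example assumes that each model's ligand coordinates have already been placed in the frame of the reference complex (for instance after superimposing the receptors) and that atoms are listed in the same order; the function names and the inclusive cutoff are illustrative choices, not the evaluation code used in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equally ordered atom lists.

    No superposition is performed here; the coordinates are assumed to be
    in a common frame already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def count_near_native(ligand_rmsds, cutoff=10.0):
    """Count models whose ligand RMSD is at or below the cutoff (Å)."""
    return sum(1 for value in ligand_rmsds if value <= cutoff)


if __name__ == "__main__":
    # Hypothetical ligand RMSD values for four docking models.
    print(count_near_native([2.5, 9.9, 10.0, 14.2]))
```

Applied to the 8000 models of one complex, such a count is what would be compared against the threshold of 100 correct models quoted above.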

(WHISCY overview)


This work is described in:



  • View Slide show or download PDF file (8.5 Mb) (Lecture given at the 2003 Keystone meeting on "Frontiers of NMR in Molecular Biology VIII", Taos New Mexico, Feb. 2003).
  • View Slide show or download PDF file (4.7 Mb) (Lecture given at the 2004 CBG+SPINE meeting on "Structural Proteomics and Protein-Protein Interactions", Amsterdam, June 2004).
  • View Slide show or download PDF file (2.8 Mb) (Lecture given at the 2005 FEBS meeting, Budapest, July 2005).

  • Visit the HADDOCK home page.

  • Visit the WHISCY home page.

  • Movies of the three docking phases:
    • Rigid body docking (QT (2.6 Mb))
    • Semi-flexible simulated annealing (QT (14 Mb))
    • Explicit solvent refinement (QT (8.2 Mb))


  • Computational aspects of biomolecular NMR


    Direct Use of Unassigned Resonances in NMR Structure Calculations with Proxy Residues.
    J. Am. Chem. Soc., 128, 7566-7571 (2006).

    PROXIES
    We have developed a method that significantly enhances the robustness of (automated) NMR structure determination by allowing the NOE data corresponding to unassigned NMR resonances to be used directly in the calculations. The unassigned resonances are represented by additional atoms or groups of atoms that have no interaction with the regular protein atoms except through distance restraints. These so-called 'proxy' residues can be used to generate NOE-based distance restraints in the same way as for the assigned part of the protein. If sufficient NOE information is available, the restraints are expected to place the proxies at positions close to the correct atoms for the unassigned resonances, which can facilitate subsequent assignment. Convergence can be further improved by supplying additional information about the possible identities of the unassigned resonances. We have implemented this approach in the widely used automated assignment and structure calculation protocols ARIA and CANDID. We find that it significantly increases the robustness of structure calculations with regard to missing assignments and yields structures of higher quality. Our approach is still able to find correctly folded structures with up to 30% randomly missing resonance assignments, and even when only backbone and beta resonances are assigned. This should be of significant value to NMR-based structural proteomics initiatives.
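The missing-assignment test described above can be mimicked on a toy assignment table. The sketch below only illustrates the bookkeeping, assuming assignments are stored as a mapping from atom name to chemical shift; the function name, the PROXY labels and the shift values are hypothetical and are not taken from the PROXIES scripts.

```python
import random


def withhold_assignments(assignments, fraction, seed=0):
    """Split an assignment table into assigned and proxy resonances.

    assignments: dict mapping atom name -> chemical shift (ppm).
    fraction:    fraction of assignments to withhold at random.
    Returns (kept, proxies). Each withheld resonance keeps its chemical
    shift but loses its atom identity and receives a generic proxy label.
    """
    rng = random.Random(seed)
    names = sorted(assignments)
    n_missing = round(fraction * len(names))
    missing = sorted(rng.sample(names, n_missing))
    withheld = set(missing)
    kept = {name: shift for name, shift in assignments.items()
            if name not in withheld}
    proxies = {f"PROXY{i + 1}": assignments[name]
               for i, name in enumerate(missing)}
    return kept, proxies


if __name__ == "__main__":
    # Hypothetical proton assignments (ppm) for a few atoms.
    table = {"A5.HA": 4.32, "A5.HB": 1.41, "L7.HA": 4.10, "L7.HB2": 1.65,
             "L7.HG": 1.58, "V9.HA": 3.95, "V9.HB": 2.05, "V9.HG1": 0.92,
             "V9.HG2": 0.88, "G10.HA2": 3.90}
    kept, proxies = withhold_assignments(table, 0.3)
    print(len(kept), len(proxies))
```

The point of the split is that the withheld resonances are not discarded: their shifts remain available for matching NOE peaks, only now through proxy labels rather than real atom names.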


  • Visit the PROXIES web site to download the PROXIES libraries and scripts.

  • CANDID structure calculations with and without PROXIES



    Comparison of structures calculated with and without proxies for DWNN. The reference structure (labeled 'ref') is the final water-refined structure, calculated with complete assignments and without the use of proxies. The rows show the 10 lowest-energy structures of the first and the final cycle of structures calculated with and without the use of proxies, for various amounts of missing assignments. The 35% missing assignment run corresponds to a case in which all sidechain (except beta) resonance assignments were replaced by proxy assignments.



    Rapid protein fold determination using secondary chemical shifts and cross-hydrogen bond 15N-13C' scalar couplings (3hbJNC').
    J. Biomol. NMR, 21, 221-233 (2001).

    The solution structure of CI2 was calculated using the sparse experimental information available at the stage of backbone assignment. The experimental data consisted of backbone phi/psi dihedral angle predictions for 32 residues, obtained from secondary chemical shift analysis with TALOS, and 18 hydrogen bond restraints identified from cross-hydrogen bond 3hbJNC' couplings. This information was sufficient to generate models with backbone rms deviations from the crystal structure as low as 1.3 Å for the secondary structure elements and 2.0 Å for the complete backbone. The fold was, however, not uniquely defined. Correct folds could be identified from a combination of clustering and knowledge-based potentials, while geometric and stereochemical criteria failed to distinguish between native and non-native folds. The discrimination ability of knowledge-based potentials was greatly improved after refining the structures in explicit water using full van der Waals and electrostatic energy terms.
    CI2 exp. H-bonds: Experimentally detected hydrogen bonds (red arrow) and backbone dihedral angle restraints (yellow residue) from TALOS analysis.
    CI2 NMR vs X-ray: CI2 structure from cross-hydrogen bond couplings versus crystal structure (red).


  • View Slide show or download PDF file (7.8 Mb) (Lecture given at the first CCPN meeting, Edinburgh April 2001).


  • A few selected previous research projects:




    Computer simulations of biomolecular structure & dynamics

    A pictorial overview...


    Atomic insight into the CD4 binding-induced conformational changes in HIV-1 gp120.

    Proteins: Struc. Funct. & Bioinformatics, 55, 582-593 (2004).

    The entry of HIV-1 into a target cell requires gp120 and receptor CD4 as well as coreceptor CCR5/CXCR4 recognition events associated with conformational changes of the involved proteins. The binding of CD4 to gp120 is the initiation step of the whole process involving structural rearrangements that are crucial for subsequent pathways. Despite the wealth of knowledge about the gp120/CD4 interactions, details of the conformational changes occurring at this stage remain elusive. We have performed molecular dynamics simulations in explicit solvent based on the gp120/CD4/CD4i crystal structure in conjunction with modeled V3 and V4 loops to gain insight into the dynamics of the binding process. Three differentiated interaction modes between CD4 and gp120 were found, which involve electrostatics, hydrogen bond and van der Waals networks. A binding funnel model is proposed based on the dynamical nature of the binding interface together with a CD4-attraction gradient centered in gp120 at the CD4-Phe43-binding cavity. Distinct dynamical behaviors of free and CD4-bound gp120 were monitored, which likely represent the ground and pre-fusogenic states, respectively. The transition between these states revealed concerted motions in gp120 leading to:
    i) loop contractions around the CD4-Phe43-insertion cavity
    ii) stabilization of the four-stranded bridging sheet structure and
    iii) translocation and clustering of the V3 loop and the bridging sheet leading to the formation of the coreceptor binding site.
    Our results provide new insight into the dynamics of the underlying molecular recognition mechanism, complementing the biochemical and structural studies.

    Movies of the concerted loop motions in gp120 upon CD4 binding:
    • QT (1.2 Mb)
    • orthogonal view QT (1.2 Mb)
    Correlated loop closure motions in gp120

    CD4 binding-induced conformational changes of gp120 extracted via essential dynamics analysis.


    Molecular dynamics studies of a molecular switch in the glucocorticoid receptor.

    J. Mol. Biol., 328, 325-334 (2003)

    The glucocorticoid receptor (GR) is a hormone-dependent nuclear receptor that regulates gene transcription when bound to the glucocorticoid response element (GRE). The GRE acts as an allosteric effector, inducing a structural change in the glucocorticoid receptor DNA-binding domain (GR DBD) upon binding, thereby switching the GR to an active conformation. A similar conformational change can be induced by two single point mutations: Ser459Ala and Pro493Arg. Structural and dynamical aspects of the conformational switch have been investigated by molecular dynamics simulations in explicit solvent using the GROMOS96 force field and programs. Our results indicate that these two mutants, which share a similar phenotype, exert their action at a structural level through different mechanisms. The two single point mutations induce conformational rearrangements in the D-loop and the short second helical region that are proposed to decrease unfavorable protein-DNA and protein-protein contacts and to allow unspecific DNA-binding, leading to the squelching phenotype of the mutants. The GR DBD can thus exist in two states, a transcriptionally active state and an inactive state. Switching between these states can be accomplished either by GRE binding or by the described mutations.
    GR Ser459Ala mutation-induced conformational changes
    The Ser459Ala mutant
    The modulation of the structure by this mutation originates from the disruption of the structurally important hydrogen bond between Arg496 and Ser459. The resulting rearrangements in the core are propagated to the second helix, as can be clearly seen from an essential dynamics analysis of the Ser459Ala MD simulation. This mutant mimics the effect of specific DNA-binding, upon which the Arg496-Ser459 hydrogen bond is broken because Ser459 forms a hydrogen bond to the DNA instead.
    GR activation mechanism
    Control mechanism of the glucocorticoid receptor activation
    Schematic model of the control mechanism of the glucocorticoid receptor activation upon specific or unspecific DNA-binding of the wild type (top panel) and the two single-point mutants Ser459Ala and Pro493Arg (bottom panel).


    A molecular dynamics view of the structural water in the Trp-operator.

    J. Mol. Biol. 282, 859-873 (1998).

    Crystallographic studies of Trp-repressor-operator complexes (1,2) have revealed the presence of structural waters at the specific protein-DNA recognition interface. These waters mediate hydrogen bonds between the protein and the DNA bases. Conserved waters have also been identified at similar sites in the crystal structure of the free operator (3). Solution NMR studies of the free operator (4) have, however, indicated that no water resides for more than 500 ps on the DNA surface. To resolve this apparent contradiction, we investigated the hydration of the free Trp-operator DNA in a 1.4 ns molecular dynamics simulation using the GROMOS96 force field.

    Our results show that the potential donors and acceptors in the DNA major and minor grooves are hydrogen-bonded to water molecules for about 80% of the time on average, with individual values up to 99%. The hydration sites on the DNA are well localized. These results are consistent with the observation of structural water in the crystal structures. The hydration is, however, a highly dynamic process, with an average hydrogen bond lifetime of 10 ± 15 ps and maximum observed hydrogen bond lifetimes on the order of 300 ps. This highly dynamic behaviour is consistent with the NMR observations.
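The two quantities reported above, the fraction of time a site is hydrogen-bonded and the lifetime of individual hydrogen bonds, can be illustrated on a simple time series. The sketch assumes one bonded/not-bonded flag per saved trajectory frame and defines a lifetime as an uninterrupted run of bonded frames; the text does not specify the definition or sampling used in the study, and the function name and default frame spacing are hypothetical.

```python
def hbond_statistics(bonded, dt_ps=1.0):
    """Occupancy and lifetimes for one hydrogen-bonding site.

    bonded: sequence of booleans, one per trajectory frame.
    dt_ps:  assumed time between frames in picoseconds.
    Returns (occupancy, mean_lifetime_ps, max_lifetime_ps), where a
    lifetime is the length of an uninterrupted run of bonded frames.
    """
    if not bonded:
        raise ValueError("empty time series")
    occupancy = sum(1 for state in bonded if state) / len(bonded)

    lifetimes = []
    run = 0
    for state in bonded:
        if state:
            run += 1
        elif run:
            lifetimes.append(run * dt_ps)
            run = 0
    if run:  # series ended while the bond was still present
        lifetimes.append(run * dt_ps)

    mean_lifetime = sum(lifetimes) / len(lifetimes) if lifetimes else 0.0
    return occupancy, mean_lifetime, max(lifetimes, default=0.0)


if __name__ == "__main__":
    # Hypothetical ten-frame series for a single acceptor atom.
    series = [True, True, True, False, True, False, False, True, True, True]
    print(hbond_statistics(series))
```

A site can thus combine a high occupancy with short lifetimes when many brief hydrogen bonds follow one another, which is the kind of behaviour that lets a well-localized hydration site coexist with fast water exchange.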

    Molecular dynamics is thus able to reconcile crystallographic and NMR experimental observations and gives insight into the dynamics of DNA hydration.

    References:

    1. Otwinowski Z., et al., Nature 335, 321 (1988).
    2. Lawson C.L., and Carey J., Nature 366, 178 (1993).
    3. Shakked Z. et al., Nature 368, 469 (1994).
    4. Sunnerhagen M., et al., J. Mol. Biol. 282, 847 (1998).


  • View Slide show or download PDF file (6.4 Mb).


  • Localisation and dynamics of sodium counterions around DNA in solution.

    Eur. Biophys. J. 29, 57-60 (2000).

    The localisation and dynamics of sodium counterions around the DNA duplex d(AGCGTACTAGTACGCT)2 corresponding to the trp operator fragment used in the crystal structure of the half site complex has been studied by a 1.4 ns molecular dynamics simulation in explicit solvent. A continuous and well-defined counterion density is shown to be present around the minor groove, while density patches are found in the major groove in regions where DNA bending is observed. A residence time analysis reveals the dynamic nature of these distributions. The resulting picture agrees with previous theoretical and experimental studies of A-tract DNA sequences, and is consistent with the polyelectrolyte condensation model.
    (Figure panels: ionic spine around minor groove, left; ionic spine around major groove, right)
    Sodium probability maps around the average DNA structure calculated from the 1.0 ns (orange) and 0.5 ns (magenta) trajectories, respectively: left) minor groove, right) major groove. The probability maps were calculated from 1000 (0.4-1.4 ns) and 500 (0.9-1.4 ns) snapshots taken at 1 ps intervals from the MD trajectory and are plotted at 2.5 standard deviations above the mean.
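The snapshot counts and the contour level quoted in the caption follow from simple arithmetic, sketched below. The sketch assumes the map is available as a flat list of grid values, that the mean and standard deviation are taken over all grid points, and that the population standard deviation is meant; none of this is specified in the caption, and the function names are hypothetical.

```python
import math


def n_snapshots(start_ns, end_ns, interval_ps=1.0):
    """Number of frames in a time window sampled at a fixed interval."""
    return round((end_ns - start_ns) * 1000.0 / interval_ps)


def contour_level(grid_values, n_sigma=2.5):
    """Contour threshold: mean + n_sigma * population standard deviation."""
    mean = sum(grid_values) / len(grid_values)
    variance = sum((v - mean) ** 2 for v in grid_values) / len(grid_values)
    return mean + n_sigma * math.sqrt(variance)


def cells_above(grid_values, level):
    """Indices of grid cells whose value exceeds the contour level."""
    return [i for i, v in enumerate(grid_values) if v > level]


if __name__ == "__main__":
    print(n_snapshots(0.4, 1.4), n_snapshots(0.9, 1.4))
    # Hypothetical map: nine empty cells and one densely populated cell.
    grid = [0.0] * 9 + [10.0]
    level = contour_level(grid)
    print(level, cells_above(grid, level))
```

Plotting at a fixed number of standard deviations above the mean keeps only the most persistently occupied regions, so maps built from windows of different length can be compared on the same footing.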


