Chapter Five
Validation of NMR Structures of Proteins and Nucleic Acids: Proton Geometry and Nomenclature 

Abstract 

A statistical analysis is reported of 1,200 of the 1,404 NMR-derived protein and nucleic acid structures deposited in the PDB until 1999. More than 100 entries have fewer than 95 % of the expected protons present and have been excluded, together with the entries that were not yet fully validated by the PDB. The aim is to assess the geometry of the protons in the remaining structures and to provide a check on their nomenclature. Deviations from estimated values were checked for bond lengths, bond angles, improper dihedral angles and planarity. Over 100 entries show anomalous protonation states for some of their amino acids. Approximately 250,000 (1.7 %) atom names differ from the consensus PDB nomenclature. Most of these inconsistencies are due to swapped prochiral labelling. Large deviations from the expected geometry exist for a considerable number of entries, many of which are average structures. The most common causes of these deviations seem to be poor minimisation of average structures and an improper balance between force-field constraints for experimental and holonomic data. Some specific geometry outliers are related to the refinement programs used. Recommendations are given for biomolecular databases, modelling programs, and authors submitting biomolecular structures.

Doreleijers, J.F., Vriend, G., Raves, M.L., and Kaptein, R. (1999). Manuscript submitted to Proteins: Struct. Funct. & Genetics.