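Entwicklung einer Datenbank und wissensbasierter Vorhersagemethoden zur Untersuchung von Wassermolekülen in Proteinstrukturen sowie ihrer Rolle in der Protein-Liganden-Bindung (Development of a database and knowledge-based prediction methods for investigating water molecules in protein structures and their role in protein-ligand binding)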
Format: | Doctoral Thesis |
---|---|
Language: | German |
Published: | Philipps-Universität Marburg, 2003 |
Summary:
This study is dedicated to the development of the
first database for characterizing water molecules in protein
structures. It was designed as a module for the receptor-ligand
database Relibase+ and covers all X-ray structures in the
Protein Data Bank (PDB). The data collection has subsequently
been utilized for the development and validation of
knowledge-based methods for the prediction of water sites. Of
particular interest are water molecules buried in the
protein-ligand interface, since they defy the
simple lock-and-key principle on which most rational drug design
methods rely.
Chapter 2 reviews current knowledge about water
molecules in protein structures in light of the
underlying experimental methods. Chapter 3
describes the design of the water database and some
application examples for the implemented tools. The
tools developed for comparative analysis of solvation patterns
allow different references to be used: either the structural
similarity of ligands or the sequence relationship
of proteins can serve as the reference for superimposing
the respective structures in three-dimensional space. Although
recurrent solvation patterns (conservation of water molecules)
are predominantly determined by physicochemical properties
exposed on the protein surface, the influence of the ligand
should not be underestimated. Moreover, apart from classical
hydrogen bonds, weaker interactions such as CH-hydrogen bonds
can play a relevant role.
The water database also includes a
tool for detecting crystallographically misassigned
water molecules, which extends known methods (see Chapter 7).
The method estimates whether a particle
assigned as a water molecule might instead be a sodium
(or magnesium) ion. The algorithm combines a set of
descriptors, which include the coordination geometry of the
particle, its B-factor and its electrostatic valence, which is
derived from the contact lengths to the atoms in the local
neighbourhood.
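
The valence descriptor can be illustrated with a small sketch. The abstract does not give the formula, so this assumes a common bond-valence form in which each contact of length d contributes (d / r0)^(-n). The function name `valence_sum`, the coordinates, and the parameter values for `r0` and `n` are illustrative placeholders, not values taken from the thesis.

```python
from math import dist

def valence_sum(center, neighbours, r0, n):
    """Sum of (d / r0) ** -n over all contacts from `center` to `neighbours`.

    Assumed bond-valence form; r0 (reference length in Å) and n (exponent)
    are element-pair parameters that must be taken from the literature.
    """
    total = 0.0
    for atom in neighbours:
        d = dist(center, atom)
        if d == 0.0:
            raise ValueError("neighbour coincides with the central particle")
        total += (d / r0) ** -n
    return total

# Hypothetical site: six oxygen atoms arranged octahedrally, 2.4 Å from the centre.
site = (0.0, 0.0, 0.0)
oxygens = [(2.4, 0, 0), (-2.4, 0, 0), (0, 2.4, 0),
           (0, -2.4, 0), (0, 0, 2.4), (0, 0, -2.4)]
# r0=1.6 and n=4.3 are placeholder parameters chosen only for illustration.
v = valence_sum(site, oxygens, r0=1.6, n=4.3)
```

In bond-valence terms, a sum close to the formal charge of the candidate ion would count in favour of the ion assignment; in the method described above it is only one descriptor alongside coordination geometry and B-factor.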
In Chapter 4, hydration structures in
proteins are examined statistically. This
analysis reveals important requirements that predictive methods
in rational drug design must meet to be
promising. Water molecules buried in the protein-ligand
interface are mostly conserved with respect to the ligand-free
structure, while a significant shift of their positions upon
ligand binding is only rarely observed. Thus, water sites in a
ligand-free structure indicate where
feasible water sites in a protein-ligand complex can be found.
However, the degree of conservation amongst water sites from
two sequence-identical protein pockets depends significantly on
the structural similarity of the two bound ligands.
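
As a rough illustration of how conservation between two structures might be counted, the sketch below treats a water site as conserved when the other structure has a water within a distance cutoff after superimposition. The function name, the coordinates, and the 1.0 Å cutoff are assumptions made for this example; the abstract does not state the criterion used in the thesis.

```python
from math import dist

def conserved_waters(waters_a, waters_b, cutoff=1.0):
    """Return indices of waters in A that have a water in B within `cutoff` Å.

    Both coordinate lists must already be in the same frame (superimposed).
    The 1.0 Å default is an assumed threshold, not the thesis's criterion.
    """
    return [i for i, wa in enumerate(waters_a)
            if any(dist(wa, wb) <= cutoff for wb in waters_b)]

# Hypothetical water oxygen positions (Å) after superimposition.
ligand_free = [(1.0, 2.0, 3.0), (5.0, 5.0, 5.0), (9.0, 0.0, 1.0)]
complexed = [(1.2, 2.1, 3.0), (8.0, 8.0, 8.0)]

kept = conserved_waters(ligand_free, complexed)
fraction = len(kept) / len(ligand_free)  # share of ligand-free sites that are conserved
```

Computing this fraction for pairs of pockets binding similar versus dissimilar ligands is one simple way to look at the ligand dependence described above.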
Chapter
5 deals with the prediction of conserved water molecules in
different scenarios using a GA/kNN algorithm (a genetic algorithm combined with k-nearest-neighbour classification). Using the
descriptors developed with the water database, a prediction
accuracy of 82% was achieved for the discrimination of
crystallographically determined water sites from non-solvated
positions on a protein surface. In this application scenario,
the approach outperformed all previously reported methods. A
similar improvement when compared with existing methods was
also achieved for the classification of conserved versus
non-conserved water molecules in a comparison of different
structures of the same protein (prediction accuracy 78%). This
required, first, considering as many
structural comparisons as possible for a reference protein when
compiling the knowledge base and, second, accounting for the
demonstrated bias introduced by the individual crystallographers who
authored the structures. For the discrimination of
water molecules conserved upon ligand binding versus others
that are non-conserved, a prediction accuracy of 73% was
obtained. In this scenario, it is primarily the influence of
the individual bound ligand that limits the performance of the
algorithm. Depending on the protein binding pocket,
pre-placing selected water molecules at fixed
positions when setting up, e.g., a virtual screening can thus
be inappropriate.
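
The general idea behind a GA/kNN classifier can be sketched as follows: a genetic algorithm searches for a subset of descriptors that maximises the leave-one-out accuracy of a k-nearest-neighbour classifier. This is a generic, minimal version under assumed settings (population size, mutation rate, k, toy data); it is not the implementation or the descriptor set used in the thesis, and all names are hypothetical.

```python
import random
from collections import Counter

def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of a k-NN classifier using only masked features."""
    cols = [j for j, use in enumerate(mask) if use]
    if not cols:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected features.
        neighbours = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = Counter(label for _, label in neighbours[:k])
        hits += votes.most_common(1)[0][0] == y[i]
    return hits / len(X)

def ga_select(X, y, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Evolve a boolean feature mask that maximises leave-one-out k-NN accuracy."""
    rng = random.Random(seed)
    n = len(X[0])  # needs at least two features for one-point crossover

    def fitness(mask):
        return knn_loo_accuracy(X, y, mask)

    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]       # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with probability p_mut per feature.
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = parents + children
    best = max(pop, key=fitness)
    return best, fitness(best)

# Toy data: feature 0 separates the two classes, features 1 and 2 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(30)]
X = [[label + data_rng.gauss(0, 0.1),
      data_rng.uniform(-5, 5),
      data_rng.uniform(-5, 5)] for label in y]
mask, acc = ga_select(X, y)
```

Because the best masks are carried over unchanged into each new generation, the best accuracy never decreases from one generation to the next.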
Chapter 6 therefore focuses on a
methodological enhancement of the Particle Concept implemented in
FlexX. This approach allows for flexible placement of water
molecules during the build-up of the individual ligand in the
binding pocket. Implementation of an enhanced version of the
scoring function DrugScore, which takes water molecules into
account, improved the energy ranking of the generated solutions
significantly. The new scoring scheme not only outperforms the
originally implemented empirical function by Boehm, it also
improves the recognition of near-native binding
modes at the top rank (RMSD ≤ 1.0 Å) by 15% compared with the
standard DrugScore version.
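
The top-rank criterion can be made concrete with a short sketch: for each complex, the best-scored docking pose counts as a success if its RMSD to the crystal pose is at most 1.0 Å. The data layout and function names are assumptions for illustration. The RMSD here is computed in the receptor frame without refitting and without handling ligand symmetry, which may differ from the evaluation in the thesis.

```python
from math import sqrt

def rmsd(pose, reference):
    """Root-mean-square deviation (Å) between two equally ordered atom lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and of equal length")
    sq = sum((p - r) ** 2
             for a, b in zip(pose, reference)
             for p, r in zip(a, b))
    return sqrt(sq / len(pose))

def top_rank_success(results, threshold=1.0):
    """Fraction of complexes whose best-scored pose lies within `threshold` Å.

    `results` maps a complex id to (ranked_poses, crystal_pose), with
    ranked_poses ordered best score first.
    """
    hits = sum(rmsd(poses[0], ref) <= threshold
               for poses, ref in results.values())
    return hits / len(results)

# Hypothetical two-atom ligands: one top pose is off by 0.5 Å, the other by 2.0 Å.
results = {
    "complex_a": ([[(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
                   [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]],
                  [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    "complex_b": ([[(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]],
                  [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
}
rate = top_rank_success(results)
```

Running the same evaluation with two scoring functions on the same set of complexes gives the kind of top-rank comparison reported above.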