Secbase: A Novel Tool to Correlate Secondary Structure Elements with Ligand Binding
|Online Access:||PDF Full Text|
|Summary:||The main focus of this work is the development of new methods for computer-aided drug design. The development and the resulting tools are described together with studies that show possible applications and demonstrate the usefulness of these tools. The first part of this work describes the integration of secondary structure element information, together with geometric descriptions, into a protein-ligand database (Relibase), which leads to the new module Secbase. The Python-based interface to Relibase (Reliscript) was used to add the information. Furthermore, the C++ core code of Relibase was extended to provide access to these data and to add Secbase constraints to the substructure search. As a result, Secbase constraints can be specified within the substructure search accessible through the Relibase web interface. The development of Secbase was motivated by two main ideas: firstly, Secbase should provide means to analyse protein-ligand interactions with respect to secondary structure elements and, secondly, it should allow the analysis and discovery of functional similarity within related folding patterns. This is based on the knowledge that the function of a protein often depends on its structure, and that spatial structure is more conserved in evolution than amino acid sequence. In general, Secbase, in combination with Relibase, can be used for knowledge discovery about the influence of secondary structure elements on protein-ligand interactions and should be valuable for structure-based drug design and molecular modelling. Two major analyses were carried out using Reliscript and the Relibase web interface. The first analysis revealed some notable trends in hydrogen bonding geometry across the different secondary structure elements. The mean hydrogen bond length of accumulated hydrogen bonds in α-helices and parallel β-sheets decreases with increasing number of helix turns and number of β-strands, respectively.
The cooperative effect, which leads to a decrease of the mean hydrogen bond length, can be explained by a similar directionality of the peptide bond dipole vectors and the backbone hydrogen bonds in α-helices and parallel β-sheets. The second analysis describes a survey of water molecules next to the N-terminus of an α-helix and shows their involvement in ligand binding. Furthermore, the kinked backbone shows interactions between two neighbouring backbone amide groups and carboxylate or phosphate groups. In agreement with theoretical calculations described in the literature, this analysis suggests that the first/last turn of an α-helix is the main source of charge-stabilising effects, mainly by providing hydrogen bonds. This is in contrast to the widely used explanation that the overall dipole of the helix has an influence. The second part of this work deals with turns as an irregular secondary structure element with a hydrogen bond or a specific Cα-Cα distance between the first and the last residue. Because of this irregularity, the classification into subfamilies has changed over the last decades with growing data from protein structures and is not yet fully adapted to the current data. Additionally, an overall classification covering all turn families was lacking. Therefore, a uniform classification for all normal (CO(i) - NH(i+n) hydrogen bond), open (a Cα(i)-Cα(i+n) distance of up to 10 Å) and reverse (NH(i) - CO(i+n) hydrogen bond) turn families is presented, based on current structural data. Emergent self-organizing maps (ESOM) were used to cluster all turn conformations of a non-redundant protein chain dataset. In combination with β-sheet and helix classification, on average 96% of a given protein chain is now successfully classified.
The classification can be used to identify similar protein domains or structural motifs within different turn families and, accordingly, to understand protein-ligand and protein-protein interactions. The resulting turn classification was used to classify the turn conformations within all protein structures. This information was also added to Secbase. Sequence-based turn prediction with high accuracy, based on machine learning methods, has already confirmed this new categorization as consistent and well defined. Hopefully, this classification will also support protein fold and structure prediction.|
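The geometric criterion the summary gives for open turns (a Cα-Cα distance of up to 10 Å between residue i and residue i+n) can be illustrated with a minimal Python sketch. Only the 10 Å cutoff comes from the summary. The function name, the plain coordinate-tuple input and the absence of any further filtering (for example, excluding residues inside helices or strands) are assumptions made for illustration, not the procedure used in the thesis or in Secbase.

```python
import math

# Cutoff stated in the summary for open turns, in Å.
OPEN_TURN_MAX_CA_DISTANCE = 10.0


def open_turn_candidates(ca_coords, n):
    """Return start indices i where the CA(i)-CA(i+n) distance is <= 10 Å.

    ca_coords: list of (x, y, z) C-alpha coordinates in Å, one per residue,
               in chain order (hypothetical input format).
    n:         sequence separation between the first and last turn residue.
    """
    candidates = []
    for i in range(len(ca_coords) - n):
        distance = math.dist(ca_coords[i], ca_coords[i + n])
        if distance <= OPEN_TURN_MAX_CA_DISTANCE:
            candidates.append(i)
    return candidates


# Made-up coordinates: a fully extended stretch with 3.8 Å CA-CA spacing.
extended = [(3.8 * k, 0.0, 0.0) for k in range(5)]

# Made-up coordinates: four residues folded back on themselves.
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

print(open_turn_candidates(extended, 3))
print(open_turn_candidates(bent, 3))
```

In this toy input the extended stretch yields no candidates for n = 3, because its end-to-end distance exceeds the cutoff, while the bent stretch yields one. Normal and reverse turns are defined by hydrogen bonds, so detecting them would additionally need a hydrogen bond criterion that the summary does not specify.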