Applications of Comparative Genomics: Dissemination and Phylogeny of Coding and Non-Coding Gene Families

Bibliographic details
Main author: Klemm, Paul Moritz Johannes
Contributors: Lechner, Marcus (Dr.) (doctoral thesis supervisor)
Format: Dissertation
Language: English
Published: Philipps-Universität Marburg 2023
Online access: PDF full text
Description
Summary: Comparative genomics is an interdisciplinary field that compares the genetic makeup of multiple species. It aims to understand the similarities and differences in the genomes of various organisms to gain insights into their evolutionary relationships, functional characteristics, and adaptations. Key applications of comparative genomics include phylogenetic reconstruction, where researchers construct evolutionary trees to visualize the evolutionary history of species or genes, and orthology prediction, where homologous genes with shared ancestry and similar functions are identified across different organisms. This work includes two biologically motivated projects that leverage bioinformatic tools from comparative genomics. Furthermore, advancements in sequencing technologies have revolutionized genomics by generating vast amounts of genomic data. On the one hand, this data flood provides unprecedented opportunities for comparative genomics, allowing researchers to explore genomic diversity on a large scale. On the other hand, the sheer volume of data poses significant challenges for data processing and storage. The third project addresses the challenge of coping with the ever-increasing flood of genomic data by revising a critical tool of the field.
The first project investigated the Kiwellin protein family in plants, which plays a critical role in plant-pathogen interactions. The research aimed to understand the structural features of this protein family and to distinguish it from the closely related Barwin-like proteins. The outcomes of this project were published in the article “Evolutionary reconstruction, nomenclature and functional meta-analysis of the Kiwellin protein family”, which introduces a systematic nomenclature that revealed three distinct sub-classes within the Kiwellin family. Additionally, a meta-analysis of publicly available transcriptome data revealed specific responses of Kiwellins in different plant tissues and cultivars, as well as their responses to biotic and abiotic stresses. This suggests that the protein family may act as a general communication molecule in plants. The research provides a valuable foundation for further investigations into plant-microbe interactions.
The second project centered on the small non-coding RNA known as 6S RNA, which is associated with stress-coping mechanisms in bacteria. Among the bacteria, the diverse group of lactic acid bacteria (LAB) plays a significant role in the food industry, serving as starter cultures for industrial fermentation processes or as probiotics, among other uses. However, some LAB can also act as pathogens, posing a potential threat. The primary objective was to identify this non-coding RNA and characterize its features in LAB. The outcomes of this project were presented in the publication “Insights into 6S RNA in lactic acid bacteria (LAB)”. The research combined secondary structure-guided alignments, synteny classification, and phylogenetic reconstruction, and produced a guide for identifying 6S RNAs. The findings offer valuable insights for optimizing fermentation processes, developing growth stage markers, or designing putative antibiotic supplements.
The third project revolved around the orthology prediction tool Proteinortho, which holds significant importance in comparative genomics, particularly in relation to the two previous projects. Orthologs are homologous genes that diverged through a speciation event and are believed to have retained similar functions across different species. The inference of orthologs is a critical step in multiple applications, such as genome annotation, phylogenetic analysis, and supertree analysis. Given the rapid growth in genomic data, the tools used for data processing must be optimized continually. In this project, we performed an algorithmic update of Proteinortho, with a specific emphasis on enhancing its primary stages: sequence comparison and clustering. The results of this project can be found in the article “Proteinortho6: Accelerating graph-based detection of (co-)orthologs in large-scale analyses”. Our improvements significantly enhanced the overall performance and scalability of the tool for current datasets and available computational resources. Additionally, the update increased the tool’s availability, interoperability, and usability, making it more accessible for researchers in the field of comparative genomics.
In summary, the presented projects help to paint a clearer picture of two important biological entities with direct industrial applications and highlight improvements to an established tool that is essential to the field of comparative genomics.
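The two stages named in the summary, sequence comparison and clustering, can be illustrated with a minimal Python sketch. It is not taken from the thesis and is not the Proteinortho algorithm itself: it uses plain reciprocal best hits followed by connected components as a simplified stand-in for graph-based orthology detection. The gene identifiers, the scores, and all function names are hypothetical.

```python
from collections import defaultdict

# Assumption: a gene is a (species, gene_id) tuple, and `scores` maps a
# directed (query, target) pair to an alignment score (higher is better).

def best_hits(scores):
    """Map (query gene, target species) to the best-scoring target gene."""
    best = {}
    for (query, target), score in scores.items():
        if query[0] == target[0]:
            continue  # ignore hits within the same species
        key = (query, target[0])
        if key not in best or score > best[key][1]:
            best[key] = (target, score)
    return {key: target for key, (target, _) in best.items()}

def reciprocal_best_hits(scores):
    """Return undirected edges between genes that are each other's best hit."""
    best = best_hits(scores)
    edges = set()
    for (query, _species), target in best.items():
        if best.get((target, query[0])) == query:
            edges.add(frozenset((query, target)))
    return edges

def ortholog_groups(scores):
    """Cluster the reciprocal-best-hit graph into connected components."""
    graph = defaultdict(set)
    for edge in reciprocal_best_hits(scores):
        a, b = tuple(edge)
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node] - seen:
                seen.add(neighbour)
                stack.append(neighbour)
        groups.append(sorted(component))
    return groups

# Hypothetical scores for three species A, B and C.
scores = {
    (("A", "a1"), ("B", "b1")): 200, (("B", "b1"), ("A", "a1")): 198,
    (("A", "a1"), ("B", "b2")): 50,  (("B", "b2"), ("A", "a1")): 52,
    (("A", "a2"), ("B", "b2")): 40,
    (("B", "b1"), ("C", "c1")): 150, (("C", "c1"), ("B", "b1")): 149,
    (("A", "a1"), ("C", "c1")): 121, (("C", "c1"), ("A", "a1")): 120,
}
groups = ortholog_groups(scores)
print(groups)
```

In this toy input, a1, b1 and c1 are mutual best hits and form one group, while a2 and b2 have no reciprocal partner and are left unassigned. Connected components are the simplest possible clustering step; a production tool would need a more careful strategy to avoid merging unrelated families through a single spurious edge.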
DOI:10.17192/z2023.0572