Comprehensive analysis of methylation data in non-model plant species

Bibliographic details
Main author: Can, Sultan Nilay
Contributors: Rensing, Stefan A. (Prof. Dr.) (Thesis advisor)
Format: Dissertation
Language: English
Published: Philipps-Universität Marburg 2021
Online access: PDF full text
Description
Summary: One of the goals of plant epigenetics is to detect differential methylation that may occur following specific treatments or in variable environments. This can be achieved at single-base resolution with standard methods for whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS). Another important goal is to exploit sequencing methods in combination with bisulfite treatment to associate genetics and epigenetics with phenotypic traits. In the past 19 years, this has become possible using so-called genome-wide association studies (GWAS) and epigenome-wide association studies (EWAS), the latter of which aim to reveal potential biomarkers linking phenotypic traits and epigenetic variation. In practice, such studies rely on software packages or “bioinformatics pipelines” that make the requisite computational processes routine and reliable. This thesis describes several such pipelines, developed within the framework of EpiDiverse, an Innovative Training Network (ITN) (https://epidiverse.eu/, accessed on 1 May 2021) carrying out comprehensive studies on pipelines for WGBS, differentially methylated region (DMR), EWAS, and single nucleotide polymorphism (SNP) analyses. Here I introduce the benchmark study of DMR tools, the EWAS pipeline, and the bioinformatics pipelines implemented within the EpiDiverse toolkit. First, by benchmarking seven DMR tools (metilene, methylKit, MOABS, DMRcate, Defiant, BSmooth, MethylSig) on simulated datasets for four plant species (Aethionema arabicum, Arabidopsis thaliana, Picea abies, and Physcomitrium patens), my coauthors and I showed that metilene has superior performance in terms of overall precision and recall. We therefore set it as the default DMR caller in the EpiDiverse DMR pipeline.
Afterward, I introduced extended features of the EWAS pipeline that go beyond the GEM R package, e.g., graphical outputs, novel missing data imputation, and compatibility with new input types. I then examined the effect of missing data using Picea abies (Norway spruce) data and showed that the pipeline imputes missing data in a logical manner. Furthermore, I obtained a significant overlap between the pipeline's results and the Quercus lobata (valley oak) analysis results. Following extensive benchmarking with various tools, a group of pipelines became publicly available as the EpiDiverse toolkit, which is suited to people working with WGBS datasets (https://github.com/EpiDiverse, accessed on 1 May 2021).
Extent: 121 pages
DOI: 10.17192/z2021.0481