Information Visualization Techniques for Metabolic Engineering
The main purpose of metabolic engineering is the modification of biological systems towards specific goals using genetic manipulations. For this purpose, models are built that describe the stationary and dynamic behaviour of biochemical reaction networks inside a biological cell. Based on these models, simulations are carried out to understand the cell's behaviour. The modeling process generates large amounts of data, both during the modeling itself and after simulation of the resulting models. Manual interpretation of these data is almost impossible; consequently, appropriate techniques for supporting their analysis and visualization are needed.
The purpose of this thesis is to investigate visualization and data mining techniques to support the metabolic modeling process. The work presented in this thesis is divided into several tracks:
-Visualization of metabolic networks and the associated simulation data.
Novel visualization techniques will be presented, which allow the visual exploration of metabolic network dynamics, beyond static snapshots of the simulated data plots. Node-link representations of the metabolic network are animated using the time series of metabolite concentrations and reaction rates. In this way, bottlenecks and active parts of metabolic networks can be distinguished.
Additionally, 3D visualization techniques for metabolic networks are explored for crossing-free drawing of the networks in 3D space. Steerable drawing of metabolic networks is also investigated. In contrast to other approaches for drawing metabolic networks, user-guided drawing allows the creation of high-quality drawings by including user feedback in the drawing process.
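The idea of animating a node-link representation with time series can be illustrated with a minimal sketch. It only computes one node radius per metabolite and frame; the rendering itself is left out. The function name `animation_frames`, the radius range, and the toy concentrations are assumptions made for illustration and are not taken from the thesis.

```python
from typing import Dict, List


def animation_frames(series: Dict[str, List[float]],
                     min_radius: float = 5.0,
                     max_radius: float = 25.0) -> List[Dict[str, float]]:
    """Map each metabolite's concentration time series to one node radius per frame.

    Values are normalized over all metabolites and all time points, so node
    sizes stay comparable from frame to frame.
    """
    n_frames = len(next(iter(series.values())))
    lo = min(min(values) for values in series.values())
    hi = max(max(values) for values in series.values())
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    frames = []
    for t in range(n_frames):
        frames.append({
            name: min_radius + (values[t] - lo) / span * (max_radius - min_radius)
            for name, values in series.items()
        })
    return frames


# Hypothetical toy data: three metabolites sampled at four time points.
concentrations = {
    "glucose":  [10.0, 6.0, 3.0, 1.0],
    "pyruvate": [0.0, 3.0, 5.0, 4.0],
    "lactate":  [0.0, 1.0, 2.0, 5.0],
}
frames = animation_frames(concentrations)
```

Each entry of `frames` could then be handed to any graph drawing toolkit to resize the nodes of a fixed layout. Reaction rates could be mapped to edge widths in the same way.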
-Comparison of XML/SBML files.
SBML (Systems Biology Markup Language) has become ubiquitous in metabolic modeling, where it serves to store and exchange models in XML format. Generally, the modeling process is an iterative task where the next generation model is a further development of the current model, resulting in a family of models stored in SBML format. An SBML file, however, includes a great deal of information, from the structure of the biochemical network to model parameters and measured data, which makes it hard to see how two models of a family differ.
Consequently, the CustX-Diff algorithm for a customizable comparison of XML files will be introduced. By customizing the comparison process through the specification of XPath expressions, an adaptable change detection process is enabled. Thus, the comparison process can be focused on specific parts of an XML/SBML document, e.g. on the structure of a metabolic network.
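The following sketch illustrates the general idea of focusing a comparison through an XPath expression. It is not the CustX-Diff algorithm, whose details are not given here. It assumes that elements carry an `id` attribute, compares attributes only, and uses simplified, namespace-free documents (real SBML files declare an XML namespace). The function name `focused_diff` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def focused_diff(old_xml: str, new_xml: str, xpath: str, key: str = "id"):
    """Compare only the elements selected by `xpath`, matched by a key attribute."""
    def select(doc: str):
        root = ET.fromstring(doc)
        # ElementTree supports a limited XPath subset, which is enough here.
        return {el.get(key): dict(el.attrib) for el in root.findall(xpath)}

    old, new = select(old_xml), select(new_xml)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Simplified SBML-like documents (assumption: no namespaces, attributes only).
OLD = """<model>
  <listOfReactions>
    <reaction id="r1" reversible="false"/>
    <reaction id="r2" reversible="false"/>
  </listOfReactions>
  <listOfParameters>
    <parameter id="k1" value="0.5"/>
  </listOfParameters>
</model>"""

NEW = """<model>
  <listOfReactions>
    <reaction id="r1" reversible="true"/>
    <reaction id="r3" reversible="false"/>
  </listOfReactions>
  <listOfParameters>
    <parameter id="k1" value="0.9"/>
  </listOfParameters>
</model>"""

# Focus on the network structure; the parameter change is ignored.
structure_changes = focused_diff(OLD, NEW, ".//reaction")
# Focus on parameters instead.
parameter_changes = focused_diff(OLD, NEW, ".//parameter")
```

Changing only the XPath expression redirects the comparison to a different part of the document, which is the kind of customization the paragraph above describes.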
-Visual exploration of time-varying sensitivity matrices.
Sensitivity analysis is a method used in simulation to quantify how strongly the behaviour of a model depends on its parameters. The results of sensitivity analysis of a metabolic network are large time-varying matrices, which need to be properly visualized.
However, the visualization of time-varying high-dimensional data is a challenging problem. For this purpose, an extensible framework is proposed, consisting of existing and novel visualization methods, which allow the visual exploration of time-varying sensitivity matrices. Tabular visualization techniques, such as the reorderable matrix, are developed further, and algorithms for their reordering are discussed. Existing and novel techniques for exploring proximity data, both in matrix form and projected using multi-dimensional scaling (MDS), are also discussed. Information visualization paradigms such as focus+context based distortion and overview+details are proposed to enhance such techniques.
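As an illustration of what reordering a matrix can achieve, the sketch below permutes rows so that similar rows end up adjacent. It uses a simple greedy nearest-neighbour heuristic chosen for brevity; this is an assumption and not necessarily one of the reordering algorithms discussed in the thesis. The function names and the toy sensitivity values are hypothetical.

```python
import math
from typing import List, Sequence


def row_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two matrix rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_row_order(matrix: List[List[float]]) -> List[int]:
    """Return a row permutation that places similar rows next to each other.

    Starts at row 0 and repeatedly appends the unplaced row closest to the
    row placed last.
    """
    if not matrix:
        return []
    remaining = list(range(1, len(matrix)))
    order = [0]
    while remaining:
        last = matrix[order[-1]]
        nearest = min(remaining, key=lambda i: row_distance(last, matrix[i]))
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Hypothetical sensitivity matrix at one time point (rows: parameters).
sensitivities = [
    [0.9, 0.8, 0.1],
    [0.1, 0.0, 0.9],
    [1.0, 0.7, 0.0],
    [0.0, 0.1, 1.0],
]
order = greedy_row_order(sensitivities)
reordered = [sensitivities[i] for i in order]
```

After reordering, rows with similar patterns form visible blocks. The same permutation could be applied to the columns, or recomputed per time step for a time-varying matrix.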
-Cluster ensembles for analyzing time-varying sensitivity matrices.
A novel relationship-based cluster ensemble, which relies on the accumulation of the evolving pairwise similarities of objects (i.e. parameters), will be proposed as a robust and efficient method for clustering time-varying high-dimensional data. The time-dependent similarities, obtained from the fuzzy partitions created during the fuzzy clustering process, are aggregated, and the final clustering result is derived from this aggregation.
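The accumulation step can be sketched as follows. The similarity of two objects at one time step is assumed here to be the inner product of their fuzzy membership vectors, and the final clustering is assumed to be obtained by thresholding the averaged similarities and taking connected components. Both choices are illustrative assumptions; the abstract does not specify how the similarities are computed or how the final result is derived. All names and data are hypothetical.

```python
from typing import List

Membership = List[List[float]]  # rows: objects, columns: fuzzy cluster memberships


def accumulate_similarities(partitions: List[Membership]) -> List[List[float]]:
    """Average the pairwise similarities of objects over all time steps."""
    n = len(partitions[0])
    acc = [[0.0] * n for _ in range(n)]
    for u in partitions:
        for i in range(n):
            for j in range(n):
                # Assumed similarity: inner product of membership vectors.
                acc[i][j] += sum(a * b for a, b in zip(u[i], u[j]))
    return [[value / len(partitions) for value in row] for row in acc]


def consensus_clusters(similarity: List[List[float]],
                       threshold: float = 0.5) -> List[int]:
    """Label connected components of the thresholded similarity graph."""
    n = len(similarity)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and similarity[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical fuzzy partitions of four parameters at two time steps.
# The cluster columns are swapped at the second step on purpose.
partitions = [
    [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]],
    [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.7, 0.3]],
]
similarity = accumulate_similarities(partitions)
labels = consensus_clusters(similarity)
```

Because only pairwise relationships are accumulated, the result does not depend on how clusters are labeled at each time step, so no matching of cluster labels across time is needed.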