How do different organs and tissues arise? What are the genetic and epigenetic mechanisms that drive this development?
To address these questions, we design statistical methods and algorithms and apply them to large-scale, genome-wide data. Ultimately, our goal is to generate, test, and confirm hypotheses that are relevant to human health. Active projects include modeling epigenomic marks during differentiation and development, developing methods to elucidate the role that transcriptional enhancer sequences play in vertebrate left-right patterning, and bioinformatics and modeling for characterizing the transcriptome of single heart cells.

At a glance


The role of enhancer sequences in vertebrate left-right patterning

During vertebrate development, the breaking of bilateral symmetry establishes the left-right body axis, and errors in axis determination are associated with structural birth defects such as congenital heart disease. RNA-sequencing data allow us to study expression signatures involved in this process without bias, and they highlight genes and molecular pathways involved in establishing and/or maintaining left-right asymmetry. In this context, the central hypothesis of this project is that more than a handful of known transcriptional enhancer sequences play a role in left-right patterning: integrating RNA-sequencing data across multiple developmental time points with computational enhancer predictions based on annotation databases (such as VISTA and FANTOM) will identify gene-regulatory enhancer sequences involved in left-right patterning. Further bioinformatics analysis of the discovered enhancer elements and their target genes has the potential to yield novel insights into the molecular mechanisms underlying lateral symmetry breaking.
This is joint work with Cecilia W. Lo (U Pitt) and is funded by a March of Dimes Basil O'Connor Starter Scholar Research Award.
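As a rough illustration of what such an integration step could look like, the sketch below pairs predicted enhancers with nearby genes that show a strong left-versus-right expression difference. It is not the project's actual pipeline: the record types, the function `candidate_pairs`, the 100 kb window, the fold-change cutoff, and all coordinates and gene names are hypothetical placeholders.

```python
from typing import List, NamedTuple, Tuple


class Enhancer(NamedTuple):
    chrom: str
    start: int
    end: int


class Gene(NamedTuple):
    name: str
    chrom: str
    tss: int                  # transcription start site
    log2_fold_change: float   # left vs. right expression (assumed precomputed)


def candidate_pairs(
    enhancers: List[Enhancer],
    genes: List[Gene],
    window: int = 100_000,      # assumed search distance around the TSS
    min_abs_lfc: float = 1.0,   # assumed cutoff for asymmetric expression
) -> List[Tuple[Enhancer, str]]:
    """Pair each asymmetrically expressed gene with enhancers near its TSS."""
    pairs = []
    for gene in genes:
        if abs(gene.log2_fold_change) < min_abs_lfc:
            continue  # gene shows no strong left-right difference
        lo, hi = gene.tss - window, gene.tss + window
        for enh in enhancers:
            # Keep enhancers on the same chromosome that overlap the window.
            if enh.chrom == gene.chrom and enh.start <= hi and enh.end >= lo:
                pairs.append((enh, gene.name))
    return pairs


# Made-up toy input, for illustration only.
enhancers = [
    Enhancer("chr1", 50_000, 51_000),
    Enhancer("chr1", 900_000, 901_000),
    Enhancer("chr2", 50_000, 51_000),
]
genes = [
    Gene("geneA", "chr1", 100_000, 2.0),
    Gene("geneB", "chr1", 900_500, 0.1),
]
pairs = candidate_pairs(enhancers, genes)
```

A fixed distance window is only one possible way to assign enhancers to target genes; it is used here because it keeps the example short.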

Statistical modeling of epigenomic marks during differentiation and development

During vertebrate development, complex gene regulatory mechanisms implement the precise spatio-temporal RNA expression patterns that are required during embryogenesis. It is known that various context-dependent biochemical and epigenetic modifications to DNA, such as transcription factor binding, histone modifications, and DNA methylation, play a role, but their respective interactions, dependencies, and redundancies remain poorly understood. Recent studies have shown that it is now possible to collect genome-wide data about RNA expression and biochemical modifications to DNA across development, spanning multiple lineages. These lineages are related by a lineage tree that links precursor cell types to (often multiple) descendant cell types. We developed a statistical approach to model DNA methylation and RNA expression across such trees, and we are currently extending this approach to other epigenetic marks. Importantly, this work allows us to remove correlation induced by lineage relationships from genome-wide data sets, thereby making them more suitable for downstream analyses. In the longer term, a unifying modeling framework for these types of data will enable the generation and testing of hypotheses about the interrelations of different epigenetic marks and gene expression.
This is joint work with John A. Capra (Vanderbilt University) and is funded by the National Institutes of Health (NIGMS, 1R01GM115836-01).
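To make the idea of lineage-induced correlation concrete, here is a deliberately simple sketch. It replaces each cell type's measurement with its difference from the parent cell type, so that signal shared along a branch of the tree is not counted repeatedly. This is an illustrative stand-in, not the statistical model described above; the function name, the tree, and the methylation values are all made up.

```python
from typing import Dict, Optional


def lineage_increments(
    parent: Dict[str, Optional[str]],
    value: Dict[str, float],
) -> Dict[str, float]:
    """Return child-minus-parent differences for every non-root cell type."""
    return {
        child: value[child] - value[par]
        for child, par in parent.items()
        if par is not None  # the root has no parent and yields no increment
    }


# Hypothetical lineage tree: each cell type maps to its precursor.
parent = {
    "ESC": None,
    "mesoderm": "ESC",
    "ectoderm": "ESC",
    "cardiomyocyte": "mesoderm",
    "neuron": "ectoderm",
}

# Made-up methylation levels at a single locus (fraction methylated).
methylation = {
    "ESC": 0.80,
    "mesoderm": 0.60,
    "ectoderm": 0.75,
    "cardiomyocyte": 0.20,
    "neuron": 0.70,
}

increments = lineage_increments(parent, methylation)
```

In this toy example the raw values for mesoderm and cardiomyocyte are both low relative to the root, partly because one descends from the other; the increments separate the change on each branch.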

RNA sequencing of single heart cells

Single-cell sequencing approaches have the potential to transform biological knowledge. We use and develop computational and statistical approaches to understand gene regulation during cardiac development, disease, and regeneration at the single-cell level. This specific project focuses on transcriptome profiling of single heart cells, for which we are establishing a single-cell bioinformatics infrastructure consisting of data management, quality control, data processing, and data analysis. In our analyses we focus on (i) discriminating cardiomyocytes from other heart cells, (ii) assessing gene expression variation (between cells of the same individual, and between individuals), and (iii) characterizing the influence of age on cardiomyocyte gene expression.
This is joint work with Bernhard Kühn (Pitt, UPMC) and receives support from the Fund for Genomic Discovery from the Children’s Hospital of Pittsburgh Foundation.
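The quality-control and cell-type discrimination steps might be sketched as follows, under strong simplifying assumptions. The thresholds are arbitrary, the marker list (TNNT2 and MYH6, two widely used cardiomyocyte markers) is an assumption rather than the project's actual gene set, and the toy cells are made up; a real analysis would rely on far more genes and a proper statistical classifier.

```python
from typing import Dict

# Assumed marker genes, for illustration only.
CARDIOMYOCYTE_MARKERS = ("TNNT2", "MYH6")


def passes_qc(
    counts: Dict[str, int],
    min_genes: int = 3,              # arbitrary toy threshold
    max_mito_fraction: float = 0.2,  # arbitrary toy threshold
) -> bool:
    """Keep cells with enough detected genes and few mitochondrial reads."""
    total = sum(counts.values())
    if total == 0:
        return False
    detected = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for gene, c in counts.items() if gene.startswith("MT-"))
    return detected >= min_genes and mito / total <= max_mito_fraction


def is_cardiomyocyte(
    counts: Dict[str, int],
    min_marker_fraction: float = 0.05,  # arbitrary toy threshold
) -> bool:
    """Call a cell a cardiomyocyte if marker genes carry enough of its reads."""
    total = sum(counts.values())
    marker = sum(counts.get(gene, 0) for gene in CARDIOMYOCYTE_MARKERS)
    return total > 0 and marker / total >= min_marker_fraction


# Made-up read counts per gene for three cells.
cells = {
    "cell_a": {"TNNT2": 30, "MYH6": 20, "ACTB": 40, "MT-CO1": 10},
    "cell_b": {"COL1A1": 50, "ACTB": 45, "MT-CO1": 5},
    "cell_c": {"ACTB": 10, "MT-CO1": 90},
}

kept = {name: c for name, c in cells.items() if passes_qc(c)}
labels = {name: is_cardiomyocyte(c) for name, c in kept.items()}
```

Filtering on the mitochondrial read fraction is a common quality-control heuristic, though for heart tissue the cutoff would need care because cardiomyocytes are rich in mitochondria.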