A new preprint, “High-dimensional Bayesian network inference from systems genetics data using genetic node ordering”, is available on bioRxiv. Bayesian networks are statistical models for gene regulatory networks, and their inference from large-scale omics data is a major problem in systems genetics. In this paper we present an algorithm to solve this problem that uses causal inference, topological sorting and variable selection, and that is much more efficient than traditional Markov chain Monte Carlo algorithms. The algorithm is implemented in the Findr and lassopv software packages.
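The node-ordering idea can be illustrated with a small sketch. It assumes a hypothetical matrix `scores`, where `scores[i, j]` is a pairwise score for "gene i regulates gene j" (for example, from a causal inference test). Edges are added greedily in descending score order, skipping any edge that would close a cycle, and the resulting acyclic graph is topologically sorted. This illustrates the general technique only and is not the Findr or lassopv implementation; all function names and values are made up for the example.

```python
import numpy as np

def has_path(adj, src, dst):
    """Depth-first search: is dst reachable from src along directed edges?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return False

def greedy_dag(scores):
    """Add edges i -> j in descending score order, skipping cycle-closing edges."""
    n = scores.shape[0]
    adj = {i: [] for i in range(n)}
    pairs = [(scores[i, j], i, j) for i in range(n) for j in range(n) if i != j]
    for _, i, j in sorted(pairs, reverse=True):
        if not has_path(adj, j, i):  # i -> j is safe if j cannot already reach i
            adj[i].append(j)
    return adj

def topological_order(adj):
    """Kahn's algorithm: repeatedly emit a node with no remaining incoming edges."""
    indegree = {v: 0 for v in adj}
    for v in adj:
        for w in adj[v]:
            indegree[w] += 1
    queue = sorted(v for v in adj if indegree[v] == 0)
    order = []
    while queue:
        v = queue.pop(0)
        order.append(v)
        for w in adj[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    return order

# Hypothetical pairwise scores for three genes (illustrative values only).
scores = np.array([
    [0.0, 0.9, 0.8],
    [0.1, 0.0, 0.7],
    [0.2, 0.3, 0.0],
])
adj = greedy_dag(scores)
order = topological_order(adj)

# Each gene's candidate parents are the genes that precede it in the ordering;
# a variable selection method such as the lasso would then choose among them.
candidate_parents = {g: order[:k] for k, g in enumerate(order)}
print(order, candidate_parents)
```

Once an ordering is fixed, parent selection for each gene becomes an independent regression problem restricted to the genes earlier in the ordering, so no search over graph structures is needed.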
We contributed two chapters to a Methods in Molecular Biology book on Gene Regulatory Networks: a chapter by Lingfei about the use of Findr for the inference of transcriptome-wide causal networks, and a chapter by Pau about the use of lemon-tree for the inference of differential module networks.
A large body of research shows that genetic information combined with gene expression data can be used to predict causal interactions between genes, because genetic variation among individuals causes gene expression variation but not vice versa (this PLOS CompBio article is a contribution to the field from our group and has links to earlier work). Anagha Joshi’s group asked whether this principle could be extended to other contexts, and the joint preprint “Causal gene regulatory network inference using enhancer activity as a causal anchor” gives an affirmative answer: variation of epigenetic activity at enhancer elements across multiple cell types or experimental treatments, together with gene expression data, also predicts causal interactions. The accompanying statistical methods have been implemented in our Findr software.
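The causal anchor principle can be illustrated on simulated data. The sketch below assumes a toy chain E → A → B, where `E` is an anchor (a genotype or an enhancer activity) that can influence expression but cannot be influenced by it. If A mediates the effect of E on B, then E and B are correlated, but that correlation disappears once A is regressed out. The reverse check does not vanish, which is what orients the edge. This is a simplified partial-correlation check with made-up effect sizes, not the statistical tests implemented in Findr.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated chain E -> A -> B (effect sizes are arbitrary assumptions).
E = rng.integers(0, 3, size=n).astype(float)  # anchor, e.g. genotype coded 0/1/2
A = 1.0 * E + rng.normal(size=n)              # expression of the regulator
B = 0.8 * A + rng.normal(size=n)              # expression of the target

def partial_corr(x, y, z):
    """Correlation of x and y after regressing z (plus an intercept) out of both."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

corr_EB = np.corrcoef(E, B)[0, 1]     # marginal association between anchor and target
pc_EB_given_A = partial_corr(E, B, A) # should be near zero if A mediates E -> B
pc_EA_given_B = partial_corr(E, A, B) # stays clearly non-zero: B does not mediate E -> A

print(f"corr(E, B)        = {corr_EB:.3f}")
print(f"pcorr(E, B | A)   = {pc_EB_given_A:.3f}")
print(f"pcorr(E, A | B)   = {pc_EA_given_B:.3f}")
```

The asymmetry between the two partial correlations is what distinguishes A → B from B → A in this toy setting.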
A preprint on “Model-based clustering of multi-tissue gene expression data” is available from arXiv. In this paper we present a Bayesian model-based clustering algorithm for large-scale, multi-tissue gene expression data, where expression profiles are obtained from multiple tissues or organs sampled from dozens to hundreds of individuals. Our model can incorporate prior information on physiological tissue similarity and produces a set of clusters, each consisting of a core set of genes conserved across tissues as well as differential sets of genes specific to one or more subsets of tissues. The algorithm has been implemented in the Lemon-Tree software as a new “task”, revamp.
Whole-transcriptome causal network inference with genomic and transcriptomic data (on bioRxiv) describes a protocol for reconstructing causal gene networks from genome-wide genotype and gene expression data using the Findr software.
Learning differential module networks across multiple experimental conditions (on arXiv) reviews the theory of module network inference and describes how differential module networks across multiple experimental conditions can be learned using the Lemon-Tree software.
New preprint posted: Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net. Related software here.
We have posted a preprint “Efficient causal inference with hidden confounders from genome-transcriptome variation data”. In this paper we introduce a new method for causal inference between gene expression traits that uses DNA variants in cis-regulatory regions as causal anchors. The method has been implemented in the Findr software and validated on the DREAM5 Systems Genetics Challenge and GEUVADIS datasets.
We published a paper describing the Lemon-Tree software in the PLOS Computational Biology Software article collection:
Bonnet E, Calzone L, Michoel T. (2015) Integrative multi-omics module network inference with Lemon-Tree. PLoS Comput Biol 11(2): e1003983.
We posted a preprint titled “Integrative multi-omics module network inference with Lemon-Tree” on arXiv. The preprint describes the current status of our module network inference software Lemon-Tree and demonstrates how it can be used to identify cancer driver genes from large-scale copy number variation and gene expression datasets such as those generated by The Cancer Genome Atlas. All of this is joint work with Eric Bonnet.