Congratulations and goodbye to Lingfei, who has joined Ramnik Xavier’s lab at the Broad Institute and Massachusetts General Hospital. Thanks for the good times and all the hard work, and all the best in your new position!
A large body of research shows that genetic information combined with gene expression data can be used to predict causal interactions between genes, on the basis that genetic variation among individuals causes gene expression variation but not vice versa (this PLOS CompBio article is a contribution to the field from our group and has links to earlier work). Anagha Joshi’s group asked whether this principle could be extended to other contexts. Our joint preprint “Causal gene regulatory network inference using enhancer activity as a causal anchor” gives an affirmative answer: variation of epigenetic activity at enhancer elements across multiple cell types or experimental treatments, together with gene expression data, also predicts causal interactions. The accompanying statistical methods have been implemented in our Findr software.
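The causal anchor principle can be illustrated with a small simulation. The sketch below is a toy example only, not the statistical tests implemented in Findr: the variable names, effect sizes and the use of partial correlation are all assumptions made for illustration, and hidden confounders are deliberately left out. An anchor E (a genotype, or enhancer activity) drives gene A. If A regulates B, then E is correlated with B, and that correlation vanishes once A is accounted for. If instead B regulates A, then E and B are uncorrelated.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after regressing z (plus intercept) out of both."""
    Z = np.column_stack([np.ones_like(z), z])
    res_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    res_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

rng = np.random.default_rng(1)
n = 2000

# Hypothetical causal anchor: genotype coded as 0, 1 or 2 copies of an allele.
E = rng.integers(0, 3, size=n).astype(float)

# Scenario 1: E -> A -> B (A regulates B).
A = 1.0 * E + rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)

# Scenario 2: E -> A <- B (B regulates A, the anchor still acts on A only).
B_rev = rng.normal(size=n)
A_rev = 1.0 * E + 0.8 * B_rev + rng.normal(size=n)

r_EB = np.corrcoef(E, B)[0, 1]          # clearly non-zero in scenario 1
r_EB_given_A = partial_corr(E, B, A)    # close to zero in scenario 1
r_EB_rev = np.corrcoef(E, B_rev)[0, 1]  # close to zero in scenario 2

print(f"A -> B: corr(E, B) = {r_EB:.2f}, partial corr given A = {r_EB_given_A:.2f}")
print(f"B -> A: corr(E, B) = {r_EB_rev:.2f}")
```

With real data the conditional independence step is not reliable when hidden confounders affect both genes, which is why this is only a sketch of the principle.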
A preprint on “Model-based clustering of multi-tissue gene expression data” is available from arXiv. In this paper we present a Bayesian model-based clustering algorithm for large-scale, multi-tissue gene expression data, where expression profiles are obtained from multiple tissues or organs sampled from dozens to hundreds of individuals. Our model can incorporate prior information on physiological tissue similarity, and results in a set of clusters, each consisting of a core set of genes conserved across tissues as well as differential sets of genes specific to one or more subsets of tissues. The algorithm has been implemented in the Lemon-Tree software as a new “task”, revamp.
We have posted a preprint, “Wisdom of the crowd from unsupervised dimension reduction”, on arXiv. In this paper we show that one-dimensional unsupervised dimension reduction, such as principal component analysis and Isomap, can be used to derive consensus predictions from the responses of multiple individuals to the same questions, and that it performs better than existing solutions. This is relevant for crowd wisdom applications in the social and natural sciences, including data fusion, meta-analysis, crowd-sourcing, and committee decision making.
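As a rough illustration of the idea, the sketch below derives a consensus from the first principal component of a questions-by-individuals response matrix. It is a minimal example under stated assumptions, not the procedure from the preprint: the function name `pca_consensus`, the z-scoring of each individual's responses, the sign convention and the simulated data are all choices made here for illustration.

```python
import numpy as np

def pca_consensus(responses):
    """Return one consensus score per question.

    responses: array of shape (n_questions, n_individuals), where column j
    holds individual j's answers to all questions.
    """
    X = np.asarray(responses, dtype=float)
    # Standardize each individual's responses (assumption: z-scoring, so that
    # noisier individuals do not dominate simply through larger variance).
    std = X.std(axis=0, keepdims=True)
    std = np.where(std > 0, std, 1.0)
    Xs = (X - X.mean(axis=0, keepdims=True)) / std
    # Scores on the first principal component via the singular value decomposition.
    U, s, _ = np.linalg.svd(Xs, full_matrices=False)
    scores = U[:, 0] * s[0]
    # The sign of a principal component is arbitrary; orient it so that it
    # agrees with the average standardized response.
    if np.dot(scores, Xs.mean(axis=1)) < 0:
        scores = -scores
    return scores

# Simulated example: 15 individuals answer 200 questions with varying noise.
rng = np.random.default_rng(0)
truth = rng.normal(size=200)
noise_levels = np.linspace(0.5, 3.0, 15)
responses = truth[:, None] + rng.normal(size=(200, 15)) * noise_levels[None, :]

consensus = pca_consensus(responses)
corr_pca = np.corrcoef(consensus, truth)[0, 1]
corr_mean = np.corrcoef(responses.mean(axis=1), truth)[0, 1]
print(f"correlation with truth: PCA consensus {corr_pca:.2f}, plain mean {corr_mean:.2f}")
```

The consensus is defined only up to an affine rescaling, so it should be compared with other aggregates by correlation or ranking rather than by absolute value.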
“Whole-transcriptome causal network inference with genomic and transcriptomic data” (on bioRxiv) describes a protocol for reconstructing causal gene networks from genome-wide genotype and gene expression data using the Findr software.
“Learning differential module networks across multiple experimental conditions” (on arXiv) reviews the theory of module network inference and describes how differential module networks across multiple experimental conditions can be learned using the Lemon-Tree software.
New preprint posted: Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net. Related software here.
A warm welcome to Sean, who has joined the group as a PhD student in the Precision Medicine Doctoral Training Programme. Sean will work on a project to identify and study tissue-specific gene networks that are affected by genetic variation for plasma cortisol and causally associated with cardiovascular disease phenotypes and type II diabetes, in collaboration with Filippo Menolascina and Brian Walker.
Lingfei’s paper “Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation” has been published in PLOS Computational Biology. Congrats Lingfei!
Lingfei has posted a preprint, “Comparable variable selection with Lasso”, on arXiv. In this paper we propose statistical tests to evaluate the quality of a set of p-values and to compare p-values across different experimental batches. We then use these tests to show that a newly proposed lasso-based variable selection statistic allows for unified FDR control across multiple variable selection tasks, unlike existing methods.