CEDAS conference 2021

Does your work involve data science? Are you curious about research in data science in Bergen? The CEDAS conference 2021 will be a 2-day event featuring talks and discussion by leading international scientists and local experts on the interaction between data science, statistics, machine learning, and AI, as well as their applications in science and society. The conference will also provide a friendly virtual poster session to present your own work, and an opportunity to connect (meeting physically if regulations allow) with like-minded data scientists from around Bergen.

Welcome Mariyam!

A warm welcome to Mariyam Khan, who joins us today as a PhD student!

Mariyam has a master's degree in Mathematical Data Science from the University of Göttingen and will work on the NFR-funded project “Intelligent systems for personalized and precise risk prediction and diagnosis of non-communicable diseases”.

The pandemic being what it is, Mariyam joins us remotely at first, and we look forward to welcoming her in Bergen properly once border restrictions ease.

Welcome again, Mariyam!

INTRePID project funded by NFR

We are very pleased that the Norwegian Research Council will fund our project “Intelligent systems for personalized and precise risk prediction and diagnosis of non-communicable diseases” as part of its IKTPLUSS initiative.

The aim of this project is to create computational methods for risk prediction and diagnosis of non-communicable diseases using multi-omics data, by developing, implementing and validating novel algorithms for structure learning and inference in large-scale, multi-organ causal Bayesian gene networks. The project will integrate unique multi-omics data from three Nordic studies for a proof-of-concept application in cardiovascular medicine.

The project partners are:

Frontiers Genetics paper

Lingfei’s paper, “High-dimensional Bayesian network inference from systems genetics data using genetic node ordering”, has been published in Frontiers in Genetics, in a Special Topic on Machine Learning and Network-Driven Integrative Genomics.

In this paper, we present a highly efficient approach for reconstructing Bayesian gene regulatory networks when prior information for the inclusion of edges exists or can be inferred from the available data. The method is implemented in the Findr software.

Bioinformatics paper

Pau’s paper “Model-based clustering of multi-tissue gene expression data” has been published in Bioinformatics. The paper introduces a method called “revamp” for finding clusters (groups of genes with shared activity patterns) in multi-tissue data, where gene expression profiles are available from multiple tissues or organs sampled from the same group of individuals. Revamp improves on existing methods through its ability to incorporate prior information on physiological tissue similarity, and by identifying a set of clusters, each consisting of a core set of genes conserved across tissues as well as differential sets of genes specific to one or more subsets of tissues. Revamp is implemented in the Lemon-Tree software.

Welcome Adriaan and Wouter!

A belated but nevertheless very warm welcome to Adriaan and Wouter, who joined the group in October!

Adriaan is a postdoc with master's and PhD degrees in physics who most recently worked as a postdoctoral researcher in the neurophysics group of Jordi Soriano in Barcelona. His focus there was on network inference methods for neuronal activity data, and he will use that experience to further develop our causal gene network inference methods for applications in systems genetics.

Wouter is a master's student in bioinformatics and systems biology in Amsterdam who is joining the group for a 6-month internship. He will work on a method for inferring data-driven gene ontologies from gene expression data.

Welcome both!

RSOS paper

Lingfei’s paper “Accurate wisdom of the crowd from unsupervised dimension reduction” has been published in Royal Society Open Science. The paper shows that wisdom of the crowd, the collective intelligence derived from the responses of multiple individuals to the same questions, is analogous to one-dimensional unsupervised dimension reduction in machine learning. This means that many off-the-shelf dimension reduction methods, such as good old PCA, can be repurposed as crowd-wisdom methods, usually with (much) better performance than existing default crowd-wisdom methods. Perhaps one of the more surprising results concerned the classification of skin images as cancerous or not. As part of the hype surrounding deep learning, it was recently found that a deep neural network trained on 130,000 images was better at classifying a test set of 111 skin images than 21 individual dermatologists. However, we found that by doing a simple PCA of the predictions of these 21 dermatologists, they collectively outperformed the deep neural network. As The Economist put it in their recent ad, “not all intelligence is artificial”. In fact some of it is collective.
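To make the idea concrete, here is a minimal sketch of PCA used as a crowd-wisdom method. It is an illustration only, not the code or data from the paper: the simulated respondents, their accuracies, and the function names `simulate_responses` and `pca_consensus` are all assumptions made for this example. Each respondent answers the same yes/no questions, and the first principal component of the response matrix is used as a one-dimensional consensus score.

```python
import numpy as np


def simulate_responses(n_questions=200, n_respondents=15, seed=0):
    """Hypothetical crowd: binary answers from respondents of varying accuracy."""
    rng = np.random.default_rng(seed)
    truth = rng.integers(0, 2, size=n_questions)
    # Assumed per-respondent accuracies, all better than chance.
    accuracy = rng.uniform(0.6, 0.9, size=n_respondents)
    correct = rng.random((n_questions, n_respondents)) < accuracy
    responses = np.where(correct, truth[:, None], 1 - truth[:, None])
    return responses, truth


def pca_consensus(responses):
    """Project responses (questions x respondents) onto their first principal component."""
    X = np.asarray(responses, dtype=float)
    Xc = X - X.mean(axis=0)  # center each respondent's answers
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    weights = Vt[0]  # one loading per respondent
    # The sign of a principal component is arbitrary. Orient it so that
    # respondents load positively overall (assumes most beat chance).
    if weights.sum() < 0:
        weights = -weights
    scores = Xc @ weights  # one consensus score per question
    return scores, weights


responses, truth = simulate_responses()
scores, weights = pca_consensus(responses)

# Thresholding at zero assumes the two answer classes are roughly balanced.
pca_answer = (scores > 0).astype(int)
majority_answer = (responses.mean(axis=1) > 0.5).astype(int)

pca_accuracy = (pca_answer == truth).mean()
majority_accuracy = (majority_answer == truth).mean()
print(f"PCA consensus accuracy:  {pca_accuracy:.3f}")
print(f"Majority vote accuracy:  {majority_accuracy:.3f}")
```

The loadings in `weights` act as data-driven respondent weights: respondents who agree more with the dominant pattern of answers contribute more to the consensus, without any access to the true labels. Majority voting is printed only as a familiar baseline for comparison.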

Welcome Ramin!

A warm welcome to Ramin Hasibi, who has joined the group as a PhD student. Ramin has a master's degree in computer networks and a strong background in deep learning and machine learning more generally. Welcome!