Lingfei’s paper “Accurate wisdom of the crowd from unsupervised dimension reduction” has been published in Royal Society Open Science. The paper shows that wisdom of the crowd, the collective intelligence derived from the responses of multiple individuals to the same questions, is analogous to one-dimensional unsupervised dimension reduction in machine learning. This means that many off-the-shelf dimension reduction methods, such as good old PCA, can be repurposed as crowd-wisdom methods, usually with (much) better performance than existing default crowd-wisdom methods. Perhaps one of the more surprising results concerned the classification of skin images as cancerous or not. As part of the hype surrounding deep learning, it was recently found that a deep neural network trained on 130,000 images was better at classifying a test set of 111 skin images than 21 individual dermatologists. However, we found that a simple PCA of the predictions of these 21 dermatologists allowed them to collectively outperform the deep neural network. As The Economist put it in their recent ad, “not all intelligence is artificial”. In fact, some of it is collective.
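To make the analogy concrete, here is a minimal sketch of using PCA as a crowd-wisdom method. It is an illustration only, not the code from the paper: the function name `pca_crowd_scores`, the simulated responders and their accuracies are all assumptions made up for this example. Each row is a question, each column an individual, and the first principal component gives one consensus score per question.

```python
import numpy as np

def pca_crowd_scores(responses):
    """Project a (questions x individuals) response matrix onto its
    first principal component to get one consensus score per question.
    Hypothetical helper for illustration only."""
    X = np.asarray(responses, dtype=float)
    Xc = X - X.mean(axis=0)                 # centre each individual's responses
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, 0] * s[0]                 # coordinates along the first PC
    # The sign of a principal component is arbitrary; orient it so that
    # higher scores agree with the plain average response.
    if np.dot(scores, Xc.mean(axis=1)) < 0:
        scores = -scores
    return scores

# Simulated data (assumption): 200 yes/no questions, 7 individuals
# with different, made-up probabilities of answering correctly.
rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=200)
accuracy = np.array([0.9, 0.85, 0.8, 0.6, 0.55, 0.55, 0.5])
correct = rng.random((200, 7)) < accuracy
responses = np.where(correct, truth[:, None], 1 - truth[:, None])

scores = pca_crowd_scores(responses)
consensus = (scores > 0).astype(int)        # threshold the 1-D score
majority = (responses.mean(axis=1) > 0.5).astype(int)
print("PCA consensus accuracy:", (consensus == truth).mean())
print("Majority vote accuracy:", (majority == truth).mean())
```

The intuition is that individuals who agree with each other more than chance get larger weights in the first principal component, whereas a majority vote weights everyone equally. Thresholding the score at zero is a simplification chosen for this sketch.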
I’ll be at the EMBO symposium on regulatory epigenomics next week. Looking forward to it!
A warm welcome to Ammar Malik, who has joined the group as a PhD student. Ammar has a degree in Computer Engineering and brings with him a lot of experience in machine learning. Welcome!
A postdoc position is available in my group, to develop machine learning methods for inferring causal gene networks from genome, epigenome and transcriptome sequencing data. For more information and application instructions, see here.
A new preprint, “High-dimensional Bayesian network inference from systems genetics data using genetic node ordering”, is available on bioRxiv. Bayesian networks are statistical models for gene regulatory networks, and their inference from large-scale omics data is a major problem in systems genetics. In this paper we present an algorithm to solve this problem that uses causal inference, topological sorting and variable selection, and that is much more efficient than traditional Markov chain Monte Carlo algorithms. The algorithm is implemented in the Findr and lassopv software packages.
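The general idea of ordering nodes first and selecting parents second can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the algorithm from the preprint and not the Findr or lassopv API: the precedence constraints are given by hand rather than obtained by causal inference, and a thresholded least-squares fit stands in for proper variable selection. The function names, the threshold and the simulated data are all hypothetical.

```python
import numpy as np
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def order_nodes(precedence):
    """Topologically sort nodes; `precedence` maps each node to the set
    of nodes that must come before it (hand-specified in this sketch)."""
    return list(TopologicalSorter(precedence).static_order())

def select_parents(data, order, threshold=0.1):
    """For each node, regress it on all earlier nodes in the ordering and
    keep those with a large coefficient. Thresholded least squares is a
    simple stand-in for a real variable selection method."""
    selected = {}
    for i, node in enumerate(order):
        preds = order[:i]
        if not preds:
            selected[node] = []
            continue
        A = np.column_stack([data[p] for p in preds])
        coef, *_ = np.linalg.lstsq(A, data[node], rcond=None)
        selected[node] = [p for p, c in zip(preds, coef) if abs(c) > threshold]
    return selected

# Simulated chain A -> B -> C (assumption, for illustration only).
rng = np.random.default_rng(1)
n = 500
a = rng.normal(size=n)
b = 0.8 * a + 0.5 * rng.normal(size=n)
c = 0.7 * b + 0.2 * rng.normal(size=n)
data = {"A": a, "B": b, "C": c}

# Hypothetical precedence constraints: A before B, and A and B before C.
order = order_nodes({"A": set(), "B": {"A"}, "C": {"A", "B"}})
selected = select_parents(data, order)
print(order)
print(selected)
```

Fixing the ordering first means each node only needs one regression on its predecessors, which is why this kind of approach avoids the expensive search over graph structures that Markov chain Monte Carlo methods perform.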
Four PhD positions in computer science are available at the Department of Informatics. These positions can be held in any of the department’s research areas (algorithms, bioinformatics, machine learning, optimization, programming theory, security, and visualization). Applicants with an interest in computational biology and machine learning are welcome to contact me prior to submitting their application.
For more details and application instructions, see here.
The pdf of my NeurIPS2018 poster is available here.