Gaël Varoquaux

Sat 31 December 2016


Our research in 2016: personal scientific highlights

2016 has been a productive year for science in my team. Here are some personal highlights: bridging artificial-intelligence tools to human cognition, markers of neuropsychiatric conditions from brain activity at rest, algorithmic speedups for matrix factorization on huge datasets…

Artificial-intelligence convolutional networks map well onto the human visual system

Eickenberg et al (preprint) showed that convolutional networks, machine-learning tools developed in artificial intelligence for image analysis, map well onto the human visual system. This is interesting because it shows that cognitive vision and artificial computer vision have evolved toward similar architectures. It is not that surprising, as they are both driven by the statistics of natural images. From the point of view of inference in neuroscience, what I found really interesting is that we demonstrated that our computational model of brain activity generalizes across experimental paradigms. To my knowledge, this is new.

Using brain activity at rest to predict autism status across clinical sites

Abraham et al (preprint) used resting-state brain activity to predict whether individuals were typical controls or diagnosed with autism. The important aspect of this study is that it was performed on a large data collection across many sites that had not coordinated with each other during the acquisition. Given that prediction was successful across sites, the study shows the viability of extracting predictive biomarkers from inhomogeneous multi-site data. I think that it is an important result for the future of psychiatric neuroimaging research. The paper also highlights the aspects of the predictive pipeline that were important for this success.
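To make "prediction across sites" concrete, here is a minimal sketch of leave-one-site-out splitting: each acquisition site is held out in turn, so a model is always tested on a site it never saw during training. The function name `leave_one_site_out` and the site labels are hypothetical, and this illustrates the general idea only, not necessarily the exact validation protocol of the paper.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx), one split per site.

    `sites` holds one site label per participant (hypothetical input).
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        # Train on every participant that does not come from the held-out site
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Made-up site labels, for illustration only
sites = ["A", "A", "B", "C", "B", "C", "A"]
splits = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in splits:
    print(f"test on site {held_out}: train={train_idx} test={test_idx}")
```

Because no site ever appears on both sides of a split, a good score cannot come from site-specific acquisition quirks alone.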

Dictionary Learning for Massive Matrix Factorization

On a pure machine-learning side, Mensch et al introduced a new algorithm for matrix factorization that gives 10-fold speedups compared to the state of the art on absolutely huge datasets (terabyte scale). The key aspect is to combine online learning with random subsampling that exploits redundancies in the data. For neuroimaging, this algorithmic advance is needed to tackle larger and larger resting-state data. We will use it to scale predictive models to epidemiologic cohorts. The original paper was purely heuristic, but later work comes with proofs, and we will soon be submitting a very rich journal paper about this class of algorithms.
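The combination of online learning and random subsampling can be illustrated with a toy sketch. This is not the algorithm of the paper: plain stochastic gradient steps are swapped in for its actual updates, and the function names, learning rate, and data below are all assumptions made for the example. It only shows the two ingredients, namely streaming one sample at a time and looking at a random subset of its features at each step.

```python
import random


def subsampled_online_mf(X, n_components=2, n_epochs=300, keep=0.5, lr=0.02, seed=0):
    """Approximate X (a list of rows) by codes A times dictionary D.

    Each step sees a single row (online learning) and only a random
    subset of its columns (subsampling). Toy stochastic-gradient sketch.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])
    n_keep = max(1, int(keep * n_cols))
    A = [[rng.gauss(0, 0.1) for _ in range(n_components)] for _ in range(n_rows)]
    D = [[rng.gauss(0, 0.1) for _ in range(n_cols)] for _ in range(n_components)]
    for _ in range(n_epochs):
        for i in rng.sample(range(n_rows), n_rows):        # stream rows in random order
            for j in rng.sample(range(n_cols), n_keep):    # random subset of columns
                pred = sum(A[i][k] * D[k][j] for k in range(n_components))
                err = X[i][j] - pred
                for k in range(n_components):
                    a, d = A[i][k], D[k][j]
                    A[i][k] += lr * err * d                # gradient step on the code
                    D[k][j] += lr * err * a                # gradient step on the dictionary
    return A, D


def squared_error(X, A, D):
    """Sum of squared reconstruction errors over all entries of X."""
    n_components = len(D)
    return sum(
        (X[i][j] - sum(A[i][k] * D[k][j] for k in range(n_components))) ** 2
        for i in range(len(X))
        for j in range(len(X[0]))
    )


# Made-up rank-one data: highly redundant, so subsampling loses little
u = [1.0, 1.5, 2.0, 0.5, 1.0, 1.5]
v = [1.0, 0.5, 2.0, 1.5, 1.0, 0.5, 1.0, 2.0]
X = [[ui * vj for vj in v] for ui in u]

A, D = subsampled_online_mf(X)
baseline = sum(x * x for row in X for x in row)  # error of predicting all zeros
final = squared_error(X, A, D)
print(f"all-zeros error: {baseline:.2f}, after fitting: {final:.4f}")
```

No step ever touches the full matrix, which is what makes this family of approaches attractive when the data do not fit in memory.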

A guide to cross-validation in neuroimaging

We published a review on cross-validation for neuroimaging (preprint). While this may sound less leading-edge than our other work, cross-validation is central to everything we do. Doing it right is important. We learned some interesting tradeoffs while doing the experiments for the review. One of them is that for predictive models that are quite stable, such as SVMs, it may be more profitable to use default hyper-parameters than to tune them by cross-validation. This is because, with the small sample sizes typical of neuroimaging, cross-validation is fairly noisy.
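A back-of-the-envelope calculation gives a feel for this noise. Assuming, optimistically, that test errors are independent, an accuracy p measured on n_test samples has a binomial standard error of sqrt(p (1 - p) / n_test). The sketch below computes that approximation only; it is not the analysis of the review, and the function name, the 75% accuracy, and the sample sizes are illustrative assumptions.

```python
import math


def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy measured on n_test independent samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)


for n_test in (100, 1000, 10000):
    se = accuracy_standard_error(0.75, n_test)
    print(f"n_test={n_test:>6}: standard error of {100 * se:.1f} accuracy points")
```

With around a hundred test samples, the standard error is about four points of accuracy, which can exceed the gain expected from tuning a hyper-parameter of a stable model.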

Though not in my team, Liem et al (preprint) collaborated with us on a beautiful study showing multimodal prediction of brain age from resting-state brain activity and brain anatomy. Interestingly, they showed that the discrepancy between predicted age and chronological age captures cognitive impairment.

We have many interesting things in the pipeline, but they will be for next year. On an unrelated note, I’ve been doing more art photography in my free time in 2016.
