News and thoughts – Page 2

Hiring someone to develop scikit-learn community and industry partners

Note

With the growth of scikit-learn and the wider PyData ecosystem, we want to recruit for a new role in the Inria scikit-learn team. Departing from our usual focus on excellence in algorithms, statistics, or code, we want to add to the team someone with some technical understanding, but an …

2020: my scientific year in review

The year 2020 has undoubtedly been interesting: the COVID-19 pandemic struck while I was on a work sabbatical in Montréal, at the MNI and the MILA, and it further pushed my interest in machine learning for health care. My highlights this year revolve around basic and applied data science for health.

Highlights …

Technical discussions are hard; a few tips

Note

This post discusses the difficulties of communicating while developing open-source projects and tries to give some simple advice.

A large software project is above all a social exercise in which technical experts try to reach good decisions together, for instance on GitHub pull requests. But communication is difficult, in …

Jean Dechoux, June 13th 1923 – Feb 9th 2020

Jean Dechoux was born between the first and the second world wars, in a small French town, close to Germany. His family was that of poor farmers, who would work in coal mines to make up for the small size of their crops.

He grew to become a pulmonologist, heading …

Survey of machine-learning experimental methods at NeurIPS2019 and ICLR2020

Note

A simple survey asking authors of two leading machine-learning conferences a few quantitative questions on their experimental procedures.

How do machine-learning researchers run their empirical validation? In the context of a push for improved reproducibility and benchmarking, this question is important to develop new tools for model comparison. We …

2019: my scientific year in review

My current research spans a wide range: from brain sciences to core data science. My overall interest is to build methodology that draws insights from data for questions that have often been addressed qualitatively. If I can highlight a few publications from 2019 [1], the common thread would be computational statistics, from dirty …

Comparing distributions: Kernels estimate good representations, l1 distances give good tests

Note

Given two sets of observations, are they drawn from the same distribution? Our paper “Comparing distributions: l1 geometry improves kernel two-sample testing”, at the NeurIPS 2019 conference, revisits this classic statistical problem, known as “two-sample testing”.

This post explains the context and the paper with a bit of hand …

Getting a big scientific prize for open-source software

Note

An important acknowledgement for a different view of doing science: open, collaborative, and more than a proof of concept.

A few days ago, Loïc Estève, Alexandre Gramfort, Olivier Grisel, Bertrand Thirion, and I received the “Académie des Sciences Inria prize for transfer”, for our contributions to the scikit-learn project …

2018: my scientific year in review

From a scientific perspective, 2018 [1] was once again extremely exciting thanks to awesome collaborators (at Inria, with DirtyData, and our local scikit-learn team). Rather than going over everything that we did in 2018, I would like to give a few highlights: We published major work using machine learning to …

A foundation for scikit-learn at Inria

We have just announced that a foundation will be supporting scikit-learn at Inria [1]: scikit-learn.fondation-inria.fr

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to have secure employment for some existing core …