News and thoughts

2024 highlights: of computer science and society

Note

For me, 2024 was full of back and forth between research, software, and connecting these to society. Here, I lay out some highlights on AI and society, as well as research and software, around tabular AI and language models.

As 2025 starts, I’m looking back on 2024. It …

When AIs must overcome the data

Improving conversational artificial intelligences or simpler prediction engines involves overcoming biases, that is, going beyond the limits of data. But the notion of bias is subtle, as it depends on the goals.

Note

This post was originally published in French as part of my scientific chronicle in Les Echos.

In …

Do AIs reason or recite?

Despite their apparent intelligence, conversational artificial intelligences often lack logic. The debate rages on: do they reason or do they recite snatches of text memorized on the Internet?

Note

This post was originally published in French as part of my scientific chronicle in Les Echos. I updated it with new …

CARTE: toward table foundation models

Note

Foundation models, pretrained and readily usable for many downstream tasks, have changed the way we process text, images, and sound. Can we achieve similar breakthroughs for tables? Here I explain why, with “CARTE”, we’ve made significant headway.

Skrub 0.2.0: tabular learning made easy

We just released skrub 0.2.0. This release markedly simplifies learning on complex dataframes.

model = tabular_learner('classifier')

Simple, yet solid default baseline

The highlight of the release is the tabular_learner function, which facilitates creating pipelines that readily perform machine learning on dataframes, adding preprocessing to a scikit-learn compatible learner …

Promoting open-source, from Inria to :probabl.

Note

Open-source efforts around scikit-learn at Inria are spinning off to a new enterprise, Probabl, in charge of sustainable development of a data-science commons.

People underestimate how impactful scikit-learn continues to be

Note

François Chollet rightfully said that people often underestimate the impact of scikit-learn. I give here a few illustrations to back his claim.

A few days ago, François Chollet (the creator of Keras, the library that democratized deep learning) posted:

Tweet from François Chollet: "People underestimate how impactful scikit-learn continues to be"

Indeed, scikit-learn continues to be the most popular machine …

Committee on artificial intelligence: national vision and strategy

English summary

I have been appointed to the government-level panel of experts on AI, to set the national vision and strategy in France.


I have the honor of being appointed to the French government’s committee on artificial intelligence.

The mission entrusted to us, to inform public action …

2022, a new scientific adventure: machine learning for health and social sciences

A retrospective on last year (2022): I embarked on a new scientific adventure, assembling a team focused on developing machine learning for health and social science. The team has existed for almost a year, and the vision is shaping up nicely. Let me share with you illustrations of where we …

My Mayavi story: discovering open source communities

The Mayavi Python software, and my personal history: a thread on the Python and SciPy ecosystems, building an open source codebase, and meeting really cool and friendly people

I am writing today as a goodbye to the project: I used to be one of the core contributors and maintainers but have been …