scikit-learn posts

Skrub 0.2.0: tabular learning made easy

We just released skrub 0.2.0. This release markedly simplifies learning on complex dataframes.

model = tabular_learner('classifier')

Simple, yet solid default baseline

The highlight of the release is the tabular_learner function, which facilitates creating pipelines that readily perform machine learning on dataframes, adding preprocessing to a scikit-learn compatible learner …

Promoting open-source, from Inria to :probabl.

Note

Open-source efforts around scikit-learn at Inria are spinning off to a new enterprise, Probabl, in charge of sustainable development of a data-science commons.

Contents

  • Prelude: funding scikit-learn is hard
  • The birth of a new ambition
  • Probabl, a mission-driven enterprise
  • Probabl is already having an impact
  • My position within Probabl …

People underestimate how impactful Scikit-learn continues to be

Note

François Chollet rightfully said that people often underestimate the impact of scikit-learn. I give here a few illustrations to back his claim.

A few days ago, François Chollet (the creator of Keras, the library that democratized deep learning) posted:

Tweet from François Chollet: "People underestimate how impactful scikit-learn continues to be"

Indeed, scikit-learn continues to be the most popular machine …

Hiring someone to develop scikit-learn community and industry partners

Note

With the growth of scikit-learn and the wider PyData ecosystem, we want to recruit in the Inria scikit-learn team for a new role. Departing from our usual focus on excellence in algorithms, statistics, or code, we want to add to the team someone with some technical understanding, but an …

Getting a big scientific prize for open-source software

Note

An important acknowledgement for a different view of doing science: open, collaborative, and more than a proof of concept.

A few days ago, Loïc Estève, Alexandre Gramfort, Olivier Grisel, Bertrand Thirion, and I received the “Académie des Sciences Inria prize for transfer”, for our contributions to the scikit-learn project …

A foundation for scikit-learn at Inria

We have just announced that a foundation will be supporting scikit-learn at Inria [1]: scikit-learn.fondation-inria.fr

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to have secure employment for some existing core …

Sprint on scikit-learn, in Paris and Austin

Two weeks ago, we held a scikit-learn sprint in Austin and Paris. Here is a brief report on progress and challenges.

Several sprints

We actually held two sprints in Austin: one open sprint, at the SciPy conference sprints, which was open to new contributors, and one core sprint, for more …

Scikit-learn Paris sprint 2017

Two weeks ago, we held a large international sprint on scikit-learn in Paris. It was incredibly productive and fun, as always. We are still busy merging in the work, but I think that now is a good time to try to summarize the sprint.

A massive workforce

We had a …

Scikit-learn 2014 sprint: a report

A week ago, the 2014 edition of the scikit-learn sprint was held in Paris. This was the third time that we held an international sprint, and it was hugely productive and great fun, as always.

Great people and great venues

We had a mix of core contributors and newcomers, which …

Scikit-learn 0.15 release: highlights

We have just released the 0.15 version of scikit-learn. Hurray!! Thanks to all involved.

A long development stretch

It’s been a while since the last release of scikit-learn. So a lot has happened. Exactly 2611 commits according to my count. Quite clearly, we have more and more existing code …

Google summer of code projects for scikit-learn

I’d like to welcome the four students that were accepted for the GSoC this year:

  • Issam: Extending Neural networks
  • Hamzeh: Sparse Support for Ensemble Methods
  • Manoj: Making Linear models faster
  • Maheshakya: Locality Sensitive Hashing

Welcome to all of you. Your submissions were excellent, and you demonstrated a good will …

Scikit-learn 0.14 release: features and benchmarks

I tagged and released scikit-learn 0.14 yesterday evening, after more than 6 months of heavy development from the team. I would like to give a quick overview of the highlights of this release in terms of features but also in terms of performance. Indeed, the scikit-learn …

Update on scikit-learn: recent developments for machine learning in Python

Yesterday, we released version 0.11 of the scikit-learn toolkit for machine learning in Python, and there was much rejoicing.

Major features gained in the last releases

In the last 6 months, there have been many things happening with the scikit-learn. While I do not wish to give an exhaustive …

3 Google summer of code for scikit-learn and more…

The scikit-learn got 3 students accepted for the Google summer of code.

  • Imanuel Bayer will work on making our sparse linear models, for regression and classification, faster. His proposal: Optimizing sparse linear models using coordinate descent and strong rules.
  • David Marek will implement multi-layer perceptrons for the scikit. His proposal …

Joblib beta release: fast compressed persistence + Python 3

Joblib 0.6: better I/O and Python 3 support

Happy new year, everyone. I have just released Joblib 0.6.0 beta. The highlights of the 0.6 release are a reworked enhanced pickler, and Python 3 support.

Many thanks go to the contributors to the 0.5 …
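As an illustration of the compressed persistence mentioned in the title, here is a minimal sketch using `joblib.dump` and `joblib.load` as they exist in current joblib; the exact interface of the 0.6 beta may have differed, and the data and file name are made up for the example.

```python
import os
import tempfile

import joblib

# Made-up payload: any picklable Python object can be persisted.
data = {"weights": [0.1 * i for i in range(1000)], "label": "demo"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.joblib")
    # compress takes a level from 0 (no compression) to 9 (smallest file, slowest).
    joblib.dump(data, path, compress=3)
    restored = joblib.load(path)

# The object read back from disk should equal the original.
round_trip_ok = restored == data
```

Higher compression levels trade write speed for smaller files, so a moderate level is a reasonable starting point.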

Scikit-learn NIPS 2011 sprint: international thanks to our sponsors

The NIPS conference: time for a sprint. The NIPS conference, one of the major conferences in machine learning, is hosted in Granada this year. I believe that it is the first time that it is hosted in Europe. As many of the scikit-learn developers are part of the wider NIPS …

Hiring a junior developer on the scikit-learn

Once again, we are looking for a junior developer to work on the scikit-learn. Below is the official job posting. As a personal remark, I would like to stress that this is a unique opportunity to be paid for two years to work on learning and improving the scientific Python …

My conference travels: Scipy 2011 and HBM 2011

The Scipy 2011 conference in Austin

Last week, I was at the Scipy conference in Austin. It was really great to see old friends, and Austin is such a nice place.

The Scipy conference was held in UT Austin’s conference center, which is a fantastic venue. This is the …

Hiring a junior engineer on the scikit-learn

The scikit-learn is a Python module for machine learning. The project builds on the scientific and numerical tools of the scipy community to provide state-of-the-art data analysis tools. It is developed by a community of open source developers to which my research team (Parietal, INRIA) contributes a lot and is …

Scikit-learn sprint on April 1st

The scikit-learn team is organizing a sprint on April 1st (that is, next Friday). Join us in Paris, Boston, or on IRC!

With the rise of the data sciences, the scikit-learn, a BSD-licensed Python package for machine learning, is becoming an asset for more and more endeavors. Machine learning has traditionally …