scientific computing posts

Getting a big scientific prize for open-source software

Note

An important acknowledgement for a different view of doing science: open, collaborative, and more than a proof of concept.

A few days ago, Loïc Estève, Alexandre Gramfort, Olivier Grisel, Bertrand Thirion, and I received the “Académie des Sciences Inria prize for transfer”, for our contributions to the scikit-learn project …

Beyond computational reproducibility, let us aim for reusability

Note

Scientific progress calls for reproducing results. Due to limited resources, this is difficult even in computational sciences. Yet, reproducibility is only a means to an end. It is not enough by itself to enable new scientific results. Rather, new discoveries must build on reuse and modification of the state …

Nilearn 0.2: more powerful machine learning for neuroimaging

After 6 months of effort, we just released version 0.2 of nilearn, dedicated to making machine learning in neuroimaging easier and more powerful.

This release integrates the features of the July sprint, and more.

Highlights

Better documentation …

Job offer: data crunching brain functional connectivity for biomarkers

My research group is looking to fill a post-doc position on learning biomarkers from functional connectivity.

Scientific context

The challenge is to use resting-state fMRI at the level of a population to understand how intrinsic functional connectivity captures pathologies and other cognitive phenotypes. Rest fMRI is a promising tool for …

Nilearn sprint: hacking neuroimaging machine learning

A couple of weeks ago, we held the second international nilearn sprint in Paris, dedicated to making machine learning in neuroimaging easier and more powerful.

It was such a fantastic experience, as nilearn is really shaping up as a simple yet powerful tool, and there is a lot of enthusiasm …

MLOSS: machine learning open source software workshop @ ICML 2015

Note

This year again we will have an exciting workshop on leading-edge open-source machine-learning software. This subject is central to many, because software is how we propagate, reuse, and apply progress in machine learning.

Want to present a project? The deadline for the call for papers is Apr 28th …

Publishing scientific software matters

Christophe Pradal, Hans Peter Langtangen, and I recently edited an issue of the Journal of Computational Science on scientific software, in particular software written in Python. We wrote an editorial defending writing and publishing open source scientific software that I wish to summarize here. The full text preprint is openly …

A journal promoting high-quality research code: dream and reality

Open Research Computation (ORC) was an attempt to create a scientific publication promoting high-quality and open source scientific code. The project went public in fall 2010, but last month, facing the low volume of submissions, the editorial board chose to reorient it as a special track of an existing journal …

Want features? Just code

Somebody just sent an email on a user’s mailing list for an open-source scientific package entitled “Feature foo: why is package bar not up to the task?”. To quote him:

Is there ANY plan for having such a module in package bar?? I think (personally) that this is a …

Book review: NumPy 1.5 Beginner’s guide

Packt Publishing sent me a copy of NumPy 1.5 Beginner’s guide by Ivan Idris.

The book actually covers more than just NumPy: it is a full introduction to numerical computing with Python. The table of contents is the following:

  • NumPy Quick Start
  • Beginning with NumPy Fundamentals
  • Get into …

Joblib beta release: fast compressed persistence + Python 3

Joblib 0.6: better I/O and Python 3 support

Happy new year, everyone. I have just released Joblib 0.6.0 beta. The highlights of the 0.6 release are a reworked enhanced pickler, and Python 3 support.

Many thanks go to the contributors to the 0.5 …

Cython example of exposing C-computed arrays in Python without data copies

Some advice on passing arrays from C to Python without copies. I use Cython as I have found the code to be more maintainable than hand-written Python C-API code.

I found out that there was no self-contained example of creating numpy arrays from existing data in Cython. Thus I created …

Python at scientific conferences

Top-notch scientific conferences are starting to add Python tracks to their programs. This is good news. Indeed, scientific Python conferences (namely SciPy, EuroSciPy and SciPy India) do a great job of bringing together people who have already heard about Python for science, but we need to reach out to …

Scikit-learn sprint on April 1st

The scikit-learn team is organizing a sprint on April 1st (that is next Friday). Join us in Paris, Boston, or on IRC!

With the rise of the data sciences, scikit-learn, a BSD-licensed Python package for machine learning, is becoming an asset for more and more endeavors. Machine learning has traditionally …

Interested in parallel computing and statistics? We are looking for a post-doc

My research group is kick-starting a new project, called AzureBrain, to do computational analysis of large brain imaging and genetics population-wise data. One of the goals of the project is to harness the power of grid computing to do statistical learning on fMRI data, finding features in an individuals …

Scientific publication for software development

The academic community seems to judge the validity and significance of any contribution by the number of papers published and the number of citations they get. To find funding, to get credit, you have to publish or perish. However, the natural output of software development tends not to be an …

ICA versus PCA in the scikit-learn: the value of code over pictures

When I was trying to get an intuitive feeling for the difference between Independent Component Analysis (ICA) and Principal Component Analysis (PCA), I wrote a few Python scripts producing visualizations that explain the difference; they have had a bit of success.

During the last sprint on scikit-learn, a machine learning …

Multitouch with VTK (and MedINRIA and Mayavi)

If the videos on this post are not showing, click through to see them.

A colleague of mine, Pierre Fillard, has just integrated multitouch in the next generation of the VTK-based medical imaging software MedINRIA. The nice thing is that it works on an Apple laptop out of the box …

Scikit Learn coding sprint

We have been really crap at communicating the next scikit-learn coding sprint. It’s next week!

The coding sprint will take place on 8 and 9 September at INRIA Saclay, near Paris, in room K110 (building K).

For those who cannot make it, it will be possible to participate …

SVG World map of countries

To be able to visualize some quantities attached to countries all over the world, I needed an image with various countries color-coded. The fantastic matplotlib basemap package was not an option as I really needed a static image.

So I generated an SVG image with all the countries. It was …