11 Jul

SciPy2008 early-bird registration deadline ends today

I have been planning to make a more interesting post highlighting the large trends of the SciPy2008 conference, but it is 3AM local time, and I am still hacking on Mayavi, so I think I’ll keep it short.

As far as the conference program goes, we can see a few major themes emerging. There will be talks about the use of Python for scientific work, but also talks about the growing stack of Python scientific tools. Interesting trends are the non-purely-numerical tools (symbolic computation and graph theory) and the race towards more optimisation through compilation from Python code. In addition, this year we see a major effort on documentation. I think this is the sign of a numerical stack that is maturing.

As for the tutorials, I am personally very interested in the advanced track. The newest and coolest technologies, like Cython, are also the ones I know least, and we have the chance to hear their authors present them.

The early bird registration deadline ends today, as I point out in my title. If you miss this deadline, the conference fees will be higher, and the reason is simply that late registration makes organisation harder and more expensive. I would be happier if everybody registered before this deadline and paid less. I am not too sure what the accountant would say.

27 Jun

Student sponsorship for the SciPy08 conference

I am delighted to announce that the Python Software Foundation has answered our call and is providing sponsorship for the SciPy08 conference.

We will use this money to sponsor the registration fees and travel for up to 10 college or graduate students to attend the conference. The PSF did not provide all the funds required for all 10 students and once again Enthought Inc. (http://www.enthought.com) is stepping up to fill in[1].

To apply, please send a short description of what you are studying and why you’d like to attend to info@enthought.com. Please include telephone contact information.

From my perspective, this is excellent news. First of all, this means that the SciPy community is working a bit more closely with the PSF and the broader Python community. But this is also very dear to me, as last year I was sponsored as a student to come to the SciPy conference. I got to meet fantastic people and discover thrilling new developments. This was the beginning of a move away from my core physics activity towards more software-related work, and the realization that yes, I could do it, I could maybe contribute something useful to the community (well, I’ll let you judge that). Thanks a lot to Travis Vaught from Enthought for bringing this project to a success.

[1] I feel like we (the SciPy community) are like an aging teenager, wanting a lot of independence, but still living largely off our parents’ money (Enthought). And I feel particularly concerned, as I am spending the summer working at Enthought to get a chance to work on interesting SciPy-related projects.

Disclaimer: The second part of this post reflects my own opinions, and not those of my employer (obviously) or the SciPy08 organising committee.

15 Jun

Austin is different (weird ?)

Yes it is, the lizards are more green than back home:

This is what is running on the balcony, as I check my email.

I am slowly discovering this different world. Yesterday I settled down in my office. As you might know, Enthought’s offices are located in the center of Austin. They are absolutely superb, with large offices, big desks, and nice views. I have a view of the Capitol from my office. I didn’t really get anything done apart from setting up my computer, but I finished the day by enabling the twin large screens on my workstation… That feels good.

Today’s achievement was shopping for groceries. Remember, here things are different! Eric has left me the keys to his car, and I certainly couldn’t do anything without a car, as I live way out in the countryside. The first challenge was to open the garage door. I eventually found the switch. The second one was the car itself… you see, I am used to manual shift cars. But as an experimental physicist, after a short while, I had things under control. The biggest challenge was finding a store, or rather finding one and coming back home. I had hoped that the car would have a GPS, but no. I couldn’t locate the street on the different maps I could get hold of, and the internet was down (no Google Maps). So I went out with no maps, without knowing the country, not knowing where I was going. I memorized the way out, random walked until I found a store, and was able to drive back.

So indeed, Austin is different. It is supposed to be weird, but that I still have to discover.

Oh, and I almost forgot, I got a mobile. For those who know me it is like losing my religion, but I have already found plenty of practical uses for the phone. First of all it is a decent flashlight (I left all of mine at home).

PS: I usually make informative posts, I hope you don’t mind if I sometimes dilute the content a bit…

14 Jun

Alex Martelli giving the SciPy2008 Keynote

On behalf of the SciPy2008 conference organizing committee, I am happy to announce that the Keynote at the conference will be given by Alex Martelli.

It is a pleasure for us to receive Alex. He currently works as “Uber Tech Leader” at Google and is the author of two of the Python classics: “Python in a Nutshell” and the “Python Cookbook”. Alex graduated in electronic engineering from the University of Bologna and worked in chip design first for Texas Instruments, and later for IBM Research. During the 8 years he spent at IBM, he gradually shifted from hardware design to software development while winning three Outstanding Technical Achievement Awards. Then he joined think3 inc., an Italian CAD company, as Senior Software Consultant, where he developed libraries, network protocols, GUI engines, event frameworks, and web access frontends. After 12 years at think3, he worked for 3 years as a freelance consultant, mostly doing Python development, before joining Google.

Alex won the 2002 Activators’ Choice Award, and the 2006 Frank Willison award for outstanding contributions to the Python community.

Alex has also taught courses on programming, development methods, object-oriented design, and numerical computing, at Ferrara University (Italy) and other venues. Alex’s proudest achievement is the articles that appeared in Bridge World (January/February 2000), which were hailed as giant steps towards solving issues that had haunted contract-bridge game theoreticians for decades.

This biography was loosely adapted from Alex’s autobiography (http://www.aleax.it/bio.txt). More information can be found on his website, http://www.aleax.it .

13 Jun

Arrived in Texas

I just arrived in Austin, Texas. I need to settle down a bit more and blog about my fantastic holidays, but I wanted to give an update on where I am.

The hospitality here has been fantastic so far. I am sitting in a comfy chair, sipping a fresh orange juice, after having spent a night in a very cosy bungalow and waking up to see two fawns in the garden. I would say that the Enthought guys really know how to treat their guests. It seems to be something the Python scientific community is really good at, judging from my different experiences (Fernando, and JB Poline).

16 May

Offline for three weeks

I am leaving tomorrow for a three-week trip to central Asia with Emmanuelle. We are going to spend one week in Uzbekistan, visiting fabulous cities like Samarkand or Bukhara. After this, we will spend two weeks in Kyrgyzstan, back-country trekking.

I will be offline during all this time. It will be good to finally take holidays. I have been unemployed for two weeks. I took a few days in the mountains during the first week, with my parents (I climbed Mont Blanc, that was fantastic, maybe I’ll find time to blog about it), but during the second week, I have been running between embassies, specialized shops, and change offices to sort out my various trips (this one, and the next, professional, to the States). In addition, I have been working hard to prepare a small surprise for the SciPy2008 conference (I do hope Jarrod will blog about it when it is ready).

Soooo, I am wasted. Enough said, I am not going to touch a computer for three weeks, and that’s good.

05 May

Update on my life

I am currently changing jobs and changing countries. This is why I have been really bad at dealing with questions on the mailing-lists, bug-reports or feature requests.

Before

So far I have been working as a physicist, doing atomic physics (Bose-Einstein condensation). I studied quantum physics, mostly theory, and I did a PhD in an experimental lab, building a couple of experiments on Bose-Einstein condensation and atom interferometry. After this, I moved to Florence to do a post-doc, also on a BEC experiment.

A colleague working on the experiment in Florence

This kind of work is very experimental. These experiments are monsters that you have to keep alive by doing a lot of homemade mechanics, optics, and electronics. I thought I would love that, because I used to like working with my hands, but I grew tired of it. I wanted to work more with abstractions. And in addition I am a computer geek: the parts of my job I preferred were related to computers.

This summer

My contract ended at the end of April, and I have not renewed it. I was missing my girlfriend and wanted to find an excuse to come back to Paris. So now I am jobless, living at the expense of my girlfriend. I decided to take some time without a job, as I have the feeling I have been working without stopping for the last few years, not having time to travel and visit the world as I like to. We are planning a three-week trip to Uzbekistan and Kyrgyzstan in two weeks.

After this I am going to devote my summer to hacking. The big news is that I am going to the States. I will spend most of my time in Austin, working for Enthought. I am very excited about this, as I see this as the occasion to learn more about building scientific GUIs with Python. Building usable scientific programs is something that I am passionate about. I will also spend some time at Berkeley, with Fernando Perez, hopefully to work on IPython1. I need to thank Enthought for making this possible for me, as they are providing the money. With some luck, this summer I will be productive on the free software side.

Of course right now I am battling with moving houses, fighting for visas, trying to land back on my feet and organize the summer. I still don’t have my visa for the States, and it is making me nervous. I would really hate to have to cancel my trip to Kyrgyzstan because of visa problems with the States: when I take time off work, I expect to spend it enjoying myself, and not waiting for visas.

The future

So I am quitting atomic physics. I am starting a new adventure in something totally new for me. Starting from October, I will be working with JB Poline and Bertrand Thirion, at Neurospin, on neuroimaging. This work is mostly data processing, even though it has a lot of interplay with the physics of NMR. This is something very new for me and I will have to discover a new field. The good news is that a lot of the work is centered on computers, and one of the core technologies used at Neurospin is Python.

28 Apr

Docs using Sphinx

After IPython and SymPy, Mayavi is now using Sphinx to build its docs. Sphinx is very neat because it allows for high-quality PDF and HTML from the same reStructuredText source. The killer feature is that the resulting HTML pages have a built-in search that works with JavaScript, and thus works on the client without the need for a server.

In addition, the developer is very responsive and dedicated to making Sphinx versatile enough to generate high-quality docs for many packages. As a result many Python projects are switching to Sphinx. First Python itself (that’s what Sphinx was created for), and now more and more others. It seems that Zope is even considering it. One great side effect is that documentation for different Python modules will be consistent, with the same look and feel (although you can tweak Sphinx output if you want).

We don’t have a server serving the html docs yet (it is planned, we just need a bit of time), but you can check out the pdf generated here.

12 Apr

Of packaging, installation and dependencies

I have been struggling for the last few days trying to understand the issues behind packaging and installing the Enthought Tool Suite. I think I have been making progress, though only in my head: no actual code or packages so far are terribly satisfying.

The problem

If you are developing a Python-only program, with only dependencies on the standard library, you have no problems with packaging. You can ship tarballs, MSI installers, eggs, … all this works.

However, if you want to develop a rich program that provides many features in a closely integrated and consistent way to the user, you will have to depend on external packages. I know that many projects work around this by including the external dependencies inside the project, or simply reinventing the wheel. Well, this does not scale. We cannot expect to develop a major scientific tool and community this way. Reuse is the key to scalability, in my opinion. Thus comes the problem: how do we ship our program?

The problem can be very well seen with the Enthought Tool Suite (ETS). The ETS is a suite of many different packages, all pretty much geared towards building interactive scientific applications. In house, Enthought the company (disclaimer: I do not work for Enthought) uses these packages to develop domain-specific applications for customers. They have broken up the suite into a set of small packages, to enable assembling applications by requiring only the features you need. This is important because if you want to use ETS’s 3D plotting package (TVTK or Mayavi), but you want to stick with Matplotlib to do 2D plotting, and not use Chaco, you should be able to download only what you need.

As a result the ETS is made of a set of interdependent packages. Maybe they went a bit too far in the modularity, and there are almost 50 packages. The dependency graph looks like this:

Just to reassure you, the next version of the ETS has a much reduced number of packages, just because some packages were grouped, and the dependency graph is indeed saner:

As you can see, there is a complex dependency graph. So how do you ship this to the user? Another problem that should not be underestimated is: how do you make it easy for people who distribute your projects to package this?

Setuptools

Python has no good answer for this problem, but setuptools does go part of the way. Dependencies in the ETS are declared using setuptools, and installing the ETS strongly relies on setuptools.

Setuptools provides a way of automatically downloading dependencies. However, it is not a full packaging system replacement. The reason I say this is that it does not have the knowledge of a dependency graph: it just downloads packages, introspects them to find their dependencies, and recursively tries to satisfy them by downloading more. Phillip J. Eby (the author of setuptools) has been quite clear that he does not want to write an APT replacement, though people keep getting it wrong and making the equation “easy_install = apt for Python” (IMHO this is due to bad communication on the setuptools webpage).

Moreover, setuptools does not provide an easy-to-use API to extract all the information it has about packages, dependencies, and download URLs. It is thus not trivial to plug packages shipped with setuptools into another package manager like rpm or apt. This is what bothers me most, because it strongly limits the exposure the ETS is getting in distributions (whether they be Linux distributions, or scientific computing “superpacks”). Recently I have had discussions with somebody on how to ship Mayavi in a monolithic distribution he has developed. He agreed to ship setuptools with the distribution, so now I need to give him a list of eggs to provide. There is no obvious way to get this list using setuptools (insert here big big rant). So I thought that an option was to install Mayavi in a virtual environment to track the eggs added, and use this information. However, this person’s internet access was possible only by logging in on dumbed-down servers for security reasons. So we hit a wall. And for me this wall is a wall we keep hitting with setuptools, which does everything for you: the download, the build, the install. It does have flags to control these processes, but it does not expose the information you need to do this without using it. I actually think the reason it does not expose this information is that it does not know it a priori. Looking at the code, it does seem so. In addition, the structure of the packages makes it hard to do.
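To make the "download, introspect, recurse" behaviour concrete, here is a toy model of it. This is only a sketch of the idea described above, not setuptools code: `REMOTE_INDEX`, `fetch` and `install` are invented names, and the package names are made up. The point it illustrates is that the full list of requirements is only known once everything has already been downloaded.

```python
# Hypothetical remote index: the dependencies of a package only become
# visible after the package itself has been fetched and inspected.
REMOTE_INDEX = {
    "app": ["lib_a", "lib_b"],
    "lib_a": ["core"],
    "lib_b": ["core"],
    "core": [],
}


def fetch(name, download_log):
    """Simulate downloading a package and reading its declared dependencies."""
    download_log.append(name)
    return REMOTE_INDEX[name]


def install(name, installed, download_log):
    """Install `name`, discovering its dependencies one download at a time."""
    if name in installed:
        return
    dependencies = fetch(name, download_log)  # nothing is known before this call
    installed.add(name)
    for dependency in dependencies:
        install(dependency, installed, download_log)


installed = set()
download_log = []
install("app", installed, download_log)
print(download_log)
```

In this model there is no step at which one could ask "what will I need?" without triggering the downloads, which is the limitation discussed here.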

From packages to repositories

On the other hand, Dave Peterson, at Enthought, has been working on a tool to allow checking out of the ETS SVN only the projects you are interested in. I played a bit with it, and modified it to generate the dependency graphs. I quickly found out that I actually like this tool much more than setuptools, even though it was pretty much using the same concepts. It took me a while to understand what I like about the tool. It is that it uses a map file to gather all the package and dependency information. As a result, it has the equivalent of a dependency graph. This makes it possible to do the operations I am interested in, e.g. listing all the packages required for installing a given project without actually downloading them.
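As a rough illustration of why a map file helps, the sketch below lists everything a project needs by walking a dependency table, with no download involved. It does not reproduce Dave's tool or its file format: `PROJECT_MAP` and `required_projects` are hypothetical, and although the names echo ETS packages, the dependency edges are invented for the example.

```python
# Hypothetical project map: each project lists its direct dependencies.
# The edges below are invented; they are not the real ETS dependencies.
PROJECT_MAP = {
    "Mayavi": ["TVTK", "Envisage", "Traits"],
    "TVTK": ["Traits"],
    "Envisage": ["Traits"],
    "Chaco": ["Enable", "Traits"],
    "Enable": ["Traits"],
    "Traits": [],
}


def required_projects(project, project_map):
    """Return every project needed to install `project`, itself included."""
    needed = set()
    stack = [project]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already visited; this also guards against cycles
        needed.add(current)
        stack.extend(project_map[current])
    return sorted(needed)


print(required_projects("Mayavi", PROJECT_MAP))
```

Because the whole map is available up front, the list can be handed to somebody else (a packager, say) before a single file is fetched.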

The reason this is possible is that with the ETS we are not dealing with an open set of packages, like PyPI, in which packages can come and go, and no consistency is enforced. We are dealing with one suite of multiple projects that are made to work with each other. The base entity is thus a project set, on which we can make a “project map”.

What Dave has done works fantastically for development, and I would like to push it further for distribution. What we expose to the user can now be a repository, in the sense of APT: a set of packages with consistent inter-dependencies, and a way of easily retrieving this information. The difference between a set of packages and a repository, and the implications of that difference, is not something I had clearly in my mind in the beginning, but it is becoming clearer that having a repository with a project map gives a lot of added value for distributing. I’ll see if I can reuse Dave’s work to build such a tool, but do not hold your breath: I am not willingly in the business of packaging, and will probably not spend enough time on this to make it a good tool.

Edit: Correct Phillip’s name.

05 Apr

Objects, modules and Traits and Envisage

I have been reading an article about a new language paradigm (Erasmus, a modular language for concurrent programming). The authors discuss the limitations of objects in terms of modularity. To sum up their point (and most probably distort it completely), the limitations of objects come from the fact that you can’t be sure what is modifying what: suppose you have a method foo of an object bar that you call in a method of an object baz. You cannot be sure that this method hasn’t modified private attributes of your object baz, as foo could have called a method of your object. This does happen in large code bases. Of course, best practice tries to reduce this to a minimum, but this reduces modularity, and thus limits both code reuse and concurrency (as side effects are not well controlled).

Erasmus’s solution is to adopt a new container, which they call modules rather than objects, and which is based on message passing rather than method calls. These modules live in separate processes and can themselves be made of more conventional code (I am extrapolating a bit from the original article here).

This strikes me as being related to a pattern that I see more and more in my code that uses Traits. The objects deriving from HasTraits have a very easy and cheap way of coupling callbacks to the modification of their attributes. This induces a programming style known as reactive programming that is entirely callback-driven. In addition, this is a nice way of ensuring that the internal state of an object is always consistent. This is a first step to message passing and decoupling: you no longer call methods, you just set attributes and let the object do the rest. The limitation of this model in a large code base is that you have to carry around references to the objects you are interested in, and their attributes. Traits has patterns to help you do this (delegation, namely), but it is still a limitation.
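The foo/bar/baz situation can be reproduced in a few lines of Python. The class and method names below simply mirror the ones used in the paragraph; `reset`, `work` and `_count` are made up for the illustration.

```python
class Bar:
    def __init__(self, owner):
        self.owner = owner

    def foo(self):
        # Looks harmless from the call site, but it calls back into its owner.
        self.owner.reset()
        return "done"


class Baz:
    def __init__(self):
        self._count = 0
        self.bar = Bar(self)

    def reset(self):
        self._count = 0

    def work(self):
        self._count = 5
        self.bar.foo()
        # Nothing in this method touches _count after the assignment above,
        # yet its value has changed by the time we return it.
        return self._count


baz = Baz()
print(baz.work())
```

Reading `work` alone, one would expect it to return 5; it is only by following `foo` into `Bar` and back that the modification of `_count` becomes visible.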
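The "set attributes and let the object do the rest" style can be sketched without Traits at all. The code below is a standard-library imitation of the idea and is not the Traits API: `Observable`, `on_change` and `Rectangle` are names invented for this example.

```python
class Observable:
    """Tiny stand-in for attribute callbacks; this is NOT the Traits API."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the bookkeeping dict.
        object.__setattr__(self, "_listeners", {})

    def on_change(self, name, callback):
        """Register `callback(old, new)` to run when attribute `name` changes."""
        self._listeners.setdefault(name, []).append(callback)

    def __setattr__(self, name, value):
        old = getattr(self, name, None)
        object.__setattr__(self, name, value)
        if old != value:
            for callback in self._listeners.get(name, []):
                callback(old, value)


class Rectangle(Observable):
    def __init__(self, width, height):
        super().__init__()
        self.on_change("width", self._update_area)
        self.on_change("height", self._update_area)
        self.area = 0.0
        self.width = width
        self.height = height

    def _update_area(self, old, new):
        # Keep the derived state consistent whenever an input changes.
        self.area = getattr(self, "width", 0.0) * getattr(self, "height", 0.0)


rect = Rectangle(2.0, 3.0)
rect.width = 4.0  # no method call: setting the attribute is the "message"
print(rect.area)
```

The caller never asks the rectangle to recompute anything; the callback keeps `area` in step with `width` and `height`, which is the consistency property mentioned above.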

This is where the Envisage framework comes into play. Envisage introduces the notion of plugins which provide extension points. These extension points are special traits attributes that are published in a registry (which can be application-wide, or not, in Envisage3). You can query the registry to retrieve these extension points and contribute to them. After that, the traits callback mechanism triggers an action in the plugin contributing the extension point.
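A minimal sketch of this registry idea follows. It is a plain-Python toy and not the Envisage API: `Registry`, `MenuPlugin` and the identifier `example.menu.actions` are all hypothetical, and it assumes that the plugin offering the extension point is the one notified of each contribution.

```python
class Registry:
    """Hypothetical registry of extension points; this is NOT the Envisage API."""

    def __init__(self):
        self._contributions = {}
        self._listeners = {}

    def add_extension_point(self, point_id, listener):
        # The plugin offering the point is told about each new contribution.
        self._contributions.setdefault(point_id, [])
        self._listeners[point_id] = listener

    def contribute(self, point_id, item):
        self._contributions[point_id].append(item)
        self._listeners[point_id](item)

    def get_extensions(self, point_id):
        return list(self._contributions[point_id])


class MenuPlugin:
    POINT = "example.menu.actions"  # made-up identifier

    def __init__(self, registry):
        self.labels = []
        registry.add_extension_point(self.POINT, self._action_added)

    def _action_added(self, action):
        # React to a contribution without knowing who made it.
        self.labels.append(action["label"])


registry = Registry()
menu = MenuPlugin(registry)
# Another plugin only needs the registry and the point's identifier,
# not a reference to the MenuPlugin instance itself.
registry.contribute(MenuPlugin.POINT, {"label": "Open file"})
print(menu.labels)
```

The contributor and the menu plugin share nothing but the registry, which is what removes the need to carry object references around.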

This contribution mechanism could be based on message passing between processes quite easily (although for GUIs it breaks down, because AFAIK you cannot assemble a consistent GUI from different widgets living in different process spaces without using some Xwindows-specific tricks). Of course this does not give me hard guarantees of decoupling and control of the side effects, as a call to a plugin can induce calls to other plugins inside it. This is where best practice comes along: core plugins should be able to run and provide their basic functionality outside of Envisage, as normal objects. Envisage should only be a thin wrapper allowing them to expose this functionality and extend other plugins. This introduces a distinction between objects and method calls, which do not need to be arranged in self-consistent entities and which you use very often, and plugins and extension contributions, which form standalone entities and should be used more sparingly.

Of course Envisage cannot go too far in terms of providing guarantees for decoupling. It gives a mechanism and best practices, and could even help plugin decoupling by having plugins live in different processes, but as long as it does not enforce rules in the semantics of the language, it cannot achieve what projects like Erasmus are trying to do. I do think, however, that it is good to have a look at the work done in these projects to see what we can learn.

PS: Web apps suck! I made a few shortcut mistakes under WordPress, wanted to undo them and hit “Ctrl-R”, which is “redo” under vim, and lost my whole post. I really don’t believe in web apps, amongst other things because they don’t allow me to use vim.