2009-01-05

What would you like to see in a book about Matplotlib?

I received an interesting proposal to author a book on Matplotlib, the powerful 2D plotting library for Python.

While preparing the list of topics, I'd also like to hear your opinion, because different points of view will lead to a better product.

Some basic questions I'd like to ask are:

- what are you using matplotlib for?
- what are the things you like most about matplotlib, that you would want emphasized? And why?
- what are the (basic) things that, when you were beginning to use matplotlib, you wanted to see grouped up but couldn't find?
- what would you like to see in a book about matplotlib?
- what are some of those advanced features that made you yell "WOW!!"?
- what are the things you'd like to explore in matplotlib and never had time to?

Your suggestions are really appreciated :) And wish me good luck!

12 comments:

Fabio E. Tonti said...

Maybe "Matplotlib and Sage"?

Sandro Tosi said...

@Fabio: thanks for your comment. The book is aimed more at introducing Matplotlib and embedding its plots into applications, but I'll propose a chapter about its usage in scientific fields.

If you could be a little bit more specific about what you'd like to see, I'd be pleased :)

Dániel Ábel said...

> what are the (basic) things that, when you were beginning to use matplotlib, you wanted to see grouped up but couldn't find?

Good, well commented, easy to understand (i.e. understandable by reading the code once) examples which provide nice pictures. I.e. I want to look at pictures, note one I like and understand the 5 lines that make it work by skimming the code once (the code will be longer than 5 lines, of course, but the important part shouldn't be that much longer, and that important part should be highlighted so that it is easy to find among the other boilerplate.) Then I want to be able to run that demo for myself (copy-paste the code from the accompanying CD or webpage) and play around with it. Without having to care about the boilerplate.

Even better if this is presented in some simple 'narrative', i.e. 'first we take this large data, here are simple distributions, then we start analysing it, here are more interesting distributions, then we draw timeplots of it, then we fit some functions to some parts, then we plot the _parameters_ of the fits .....'. The examples must still be trivial to understand independently, but arranging it like this might make it easier to emphasize the 'new part' in each plot, to make each plot present a single feature.

Now that I have taken a look at http://matplotlib.sourceforge.net/gallery.html, it might be very much like that. Its biggest shortcoming is that the 'important parts' are not highlighted in the sample code, so understanding the sample code would take multiple readings. Also, it does not have that 'progress of examples' structure, which might be the main reason isolating the 'important part' is difficult.

As far as I can see, matplotlib is used by some by embedding it in a larger program, while others use it interactively, typing commands in ipython. These are two very different use cases, and they require very different APIs (since what is easy to use for one is really unwieldy for the other). matplotlib handles this somewhat (with pylab), but somehow pylab didn't feel perfect for me. (But that's not the point I want to make; maybe I simply didn't grok pylab.)

So: separating these two is needed, but, even more importantly, describing how one morphs into the other would be nice. For example: what does a given pylab feature do behind the scenes? If I find a nice way of doing something with pylab, how can I turn that into something that fits into a larger program (going from interactive to embedded usage)?
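A minimal sketch of that interactive-to-embedded step, assuming a reasonably recent Matplotlib. The data and the title are made up for illustration, and the Agg canvas only stands in for whatever GUI toolkit canvas an application would really use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

xs = [0, 1, 2, 3]
ys = [0, 1, 4, 9]

# Interactive style: pyplot keeps track of a "current" figure and axes
# behind the scenes, so each call acts on whatever is current.
plt.figure()
plt.plot(xs, ys)
plt.title("squares")
implicit_ax = plt.gca()  # the Axes object pyplot was quietly using

# Embedded style: create and hold the objects yourself, no global state.
fig = Figure()
FigureCanvasAgg(fig)  # in an application, a GTK/wx/Qt canvas goes here
ax = fig.add_subplot(111)
(line,) = ax.plot(xs, ys)
ax.set_title("squares")
```

Each pyplot call has a method counterpart on the object it works on (plt.plot becomes ax.plot, plt.title becomes ax.set_title), which is what makes the translation mostly mechanical.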

Also, I think most users won't care much about matlab-similarity; some documentation (for example the pyplot tutorial) I saw on matplotlib appeared to be geared towards people who were used to matlab. I, for one, am looking for a pythonic interactive-plotting framework (i.e. pythonic to use from ipython) so matlab-similarity is not a feature for me (and sounds like an excuse for non-pythonic details). Emphasize the pythonicity of the api and show how to use it in a pythonic way. Leave the matlab-similarity to an appendix / separate section.

Also (although this might be out-of-scope for this book): how can I get a system where the data has metadata attached and that is used to label the plots. I.e. If I have a function that produces datapoints of some f(t) function, I don't want to label the plots with the units, etc. by hand. I want to be able to apply transformations interactively in an ipython prompt, and have all my plots labeled with units etc. automatically.

There are some libraries for python that implement parts of this but the integration is missing.

> what are some of those advanced features that made you yell "WOW!!"?

The contours-with-labels-on-them.

That looks awesome and I didn't see that supported in other plotting libraries (gnuplot, chaco, etc.), and the placement of the labels is, I assume, highly non-trivial.

Is it possible to use such labeling on non-contour plots? I.e. I have several line-plots of x,y data, and instead of using different line-styles, I want to label them, or label some of them, etc.
(I guess making a fake contour-plot would be a possible way to do that, but it sounds hackish)
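One simple way to get labels on ordinary line plots without a fake contour plot is to annotate each line at its last data point. This is only a sketch: it does not do the clever placement that contour labels get, it assumes a reasonably recent Matplotlib, and the curve names and data are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 50)
curves = {"linear": x, "quadratic": x**2, "cubic": x**3}

fig, ax = plt.subplots()
labels = []
for name, y in curves.items():
    (line,) = ax.plot(x, y)
    # Place the label 5 points to the right of the last data point,
    # drawn in the same colour as the line it belongs to.
    text = ax.annotate(
        name,
        xy=(x[-1], y[-1]),
        xytext=(5, 0),
        textcoords="offset points",
        va="center",
        color=line.get_color(),
    )
    labels.append(text)
```

To label only some of the lines, skip the annotate call for the others.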

> what would you like to see in a book about matplotlib?

'what not to use matplotlib for'; i.e. define the niche of matplotlib and refer to alternatives which would be more suitable for usecases matplotlib was not designed for.

Also, how matplotlib works with other python packages, like scipy, numpy, ipython etc. I.e. I would like to see example code for connecting them and be able to copy-paste that code and have all the boilerplate taken care of for me.

samtygier said...

looking forward to the book.

it would be nice to see some recipes for getting data from what you already have to something that can be plotted.

eg)
* list of coordinate pairs
[[x1,y1],[x2,y2],[x3,y3]...]
this needs to be converted to an array of x and an array of y before you can plot it. (there is zip() in python, but maybe it is better to put it into a numpy array and take slices)

* data files
if i have a data file with columns of numbers, and i want to plot col 2 against col 4. (there is load() and save())
what if the data file has column headings?
what if some of the columns are not numbers?
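Both recipes can be sketched with plain Python and NumPy. This assumes numpy.loadtxt as the reader rather than the load() mentioned above; the sample pairs, the column names and the in-memory "file" are invented for illustration.

```python
import io
import numpy as np

# Recipe 1: a list of [x, y] pairs -> separate x and y sequences.
pairs = [[0, 1], [1, 3], [2, 5]]
xs, ys = zip(*pairs)                  # plain Python: two tuples
arr = np.array(pairs)
x_col, y_col = arr[:, 0], arr[:, 1]   # NumPy: slice out the two columns

# Recipe 2: a text file with a heading row and a non-numeric column.
# StringIO stands in for a real file on disk.
text = io.StringIO(
    "name t a b v\n"
    "run1 0.0 1 2 10.5\n"
    "run2 1.0 3 4 11.5\n"
)
# skiprows=1 drops the heading line; usecols picks columns 2 and 4
# (counting from 1), so the text column is never parsed as a number.
col2, col4 = np.loadtxt(text, skiprows=1, usecols=(1, 3), unpack=True)
```

Either pair (xs, ys or col2, col4) can then go straight into a plot call.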

how to deal with backends. this can be a big pain in some cases.
* if you run code somewhere without X, then some of the backends won't work.
* if you have to run on a machine with some old libraries, some backends don't work
* my code often runs for a while, then tries to make a plot, then fails due to a backend problem, and i have to start again.
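A possible way around the last point, sketched under the assumption that file output is all that is needed: select the Agg backend, which needs no X display, and render a throwaway figure before the long computation starts. The helper name check_plotting is hypothetical.

```python
import io
import matplotlib
matplotlib.use("Agg")  # safest to select the backend before importing pyplot
import matplotlib.pyplot as plt


def check_plotting():
    """Render a tiny throwaway figure so backend errors surface early."""
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # forces a full render, in memory
    plt.close(fig)
    return buf.getvalue()


png = check_plotting()  # call this first, then start the long-running work
```

If the backend is broken, this fails within the first second rather than after the computation has finished.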

Sandro Tosi said...

@Dániel: thanks for your long post! I'm trying to reply to all of it here (with no quoting, sorry).

Yeah, the idea is to have the code with its image next to it. Maybe we can even decide to print in the book only the relevant part of the code, without header/footer (while leaving the full source code in the companion archive), but your idea of "highlighting" is interesting.

gallery.html is really complete, and it's useful once you know what you want to achieve ("I want this, and this image seems to be what I'm looking for, let's look at the code") but it's not so good for introductory matters.

The first part of the book is dedicated to learning mpl, with examples, code and images, all run in interactive mode; the second part is dedicated to embedding mpl in applications (be they GTK, wx or even web ones), so there's room for both aspects.

We'll even present the difference between "Matlab" and "OO" style, just for reference, while we'll code in OO style.

All the last suggestions are really interesting, and we'll evaluate how to merge them into the book.

Thanks a lot for taking the time to write such a high-quality reply. Whatever comes to your mind, please let us know, we'll appreciate it.

Sandro Tosi said...

@samtygier: thanks for your comment! It has many interesting points that we will work into the book.

Rezwan said...

Do you have anything in this book for real-time plotting? I have used matplotlib embedded in wxPython before to plot TCP throughput in real time, but I had problems with crashes. My application has two sections: one shows real-time plotting and the other shows a real-time data update every second. It works fine for 5/10 minutes and then it stops updating the data. I used different threads, but anyway I have a hardware background, so I wanted to create this just as a helper tool. Any tips regarding real-time dynamic plotting would be highly appreciated for this book.
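As a general illustration of dynamic plotting (not a diagnosis of the problem above), one common pattern is to keep a fixed-size window of samples and update a single existing line instead of re-plotting. The sketch assumes a reasonably recent Matplotlib and runs headless on the Agg backend; WINDOW, add_sample and the fake readings are hypothetical.

```python
from collections import deque
import matplotlib
matplotlib.use("Agg")  # headless here; a GUI backend would be used in an app
import matplotlib.pyplot as plt

WINDOW = 100  # hypothetical: number of most recent samples kept on screen

fig, ax = plt.subplots()
(line,) = ax.plot([], [])
times = deque(maxlen=WINDOW)   # old samples fall off automatically
values = deque(maxlen=WINDOW)


def add_sample(t, value):
    """Append one measurement and redraw by updating the existing line."""
    times.append(t)
    values.append(value)
    line.set_data(list(times), list(values))
    ax.relim()            # recompute data limits from the new line data
    ax.autoscale_view()   # rescale the axes to those limits
    fig.canvas.draw()     # in a GUI, request the redraw from the GUI thread


for t in range(250):      # stand-in for one reading per second
    add_sample(t, (t * 7) % 13)
```

In a wxPython application the add_sample call would normally be driven by a timer on the GUI thread, with worker threads only handing over the data.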

Sandro Tosi said...

@Rezwan: yes, I show how to do real-time plotting with GTK+, Qt4 and wxWidgets.

You might also like to see the TOC at http://www.packtpub.com/matplotlib-python-development/book

Joseph E. Curtis said...

Just today on the way home from work I was wondering if there was a book about matplotlib that I could purchase.

Please take the time to find a group of diverse and dedicated readers to help you polish and test the final product. Take the step beyond what we can find on the web by including enough details and thoughtful examples that can train new users to get results from start to finish. Remember that the first roadblocks that a new user hits will dictate whether or not they will stay committed to your book and hence to matplotlib. If enough details are left out and the barriers mount, new users will become frustrated and put your book down.

Perhaps the best example of a book that is able to grab a hold of a new user and hold onto them throughout is Rappin & Dunn's "wxPython In Action". They thought through just about every step that one needs to take to generate a full GUI, from start to finish. Seemingly small details of knowledge that most books would gloss over are spread through each chapter. The beauty of their book is that they wrote in a pedagogically sound order. They had a clear idea what a new user would be able to absorb at the time that they discussed the material. Reading their >500 page book was like sitting next to them while I learned the language.

Another sound example is H. P. Langtangen's "Python Scripting for Computational Science". This is an academic-quality book with excellent examples, and it is full of hard-earned advice.

It is nice to have example code, and many of these books often build up to a monolithic project that has a very specific application. You know, the "spreadsheet" or "text editor" examples. Boring. Scientists live for data and data representations. Not bar-charts that you get in Excel. Present examples of scientific data but make sure that the data sets are abstracted appropriately so that readers in diverse scientific fields can see the underlying matplotlib elements.

I'll buy your book and I'd be happy to read a draft early on if it would help.

Sandro Tosi said...

@Joseph: thanks for your comment!

Anyhow, the book has already been published, so what's done is done :)

The editor wanted to target the book mainly at application developers, that is, at embedding Matplotlib in their apps; but the first half of the book is dedicated to getting started with Matplotlib, from the very basics up to some interesting examples. The whole source code of each example is provided and explained, often line by line, so as not to leave any doubt in the reader.

If you're still interested in purchasing the book, you can find the book link in a box at the top of this blog.

Thanks for your interest!

Joseph E. Curtis said...

Hi,

I bought it today at work. Great job!

Joseph

Sandro Tosi said...

@Joseph: thank you!

and to everyone else that bought it :)