I am sorry guys, I feel a strong need to share my frustration with you today. I have discovered yet another infovis library to create the most beautiful visualizations in the world, and instead of being excited I am depressed. That's great, and I really champion the effort of these good guys, but a tough question keeps hammering in my head: why so many libraries and so few tools? Libraries are great and really needed to speed up the development process, but here I perceive a dangerous trend: there are a lot more libraries than real tools written with them!
There are a lot of people out there waiting for our useful tools to come and I think it is time we realize that developing real tools for real people is more important than writing toolkits. Personally, I am totally ready to accept a world with no toolkits and lots of tools.
I look around the web and I cannot find a decent visualization tool freely available, only a few expensive, highly technical commercial tools. As Stephen Few pointed out in his talk at InfoVis a couple of years ago, there is a whole bunch of casual users out there whose job includes the need to analyze data. So, what are we waiting for? Who is expected to build these tools?
I am worried by this short-sighted view and this self-referential culture where infovis people build things for other infovis people, and that's it. We develop libraries and then set up fancy examples to show ourselves and our peers how good we are. OK, this is useful and needed to some extent. It helps build a community, share knowledge, and consolidate good practices. But if we want to go to the next level and let infovis go beyond the toy tool stage, we have to go one step further and embrace the much riskier and tougher question: who will use it?
I see so few examples around that I'm kind of embarrassed to talk about it. Can you list any serious and freely available tool that an average user could use in his or her daily activity? Do we maybe have something that minimally resembles a free Spotfire? We have a myriad of little toy visualizations scattered around the web and nothing in our hands.
There are very rare exceptions. Robert Kosara has recently published his Parallel Sets on his EagerEyes site and plans to take on the burden of maintaining it through the follow-up versions. This is a great thing. Parallel Sets will not solve the analytic problems of the entire world, but it is a step in this direction. Therefore, bravo Robert!
Another tool I've seen around recently is Verifiable, a very nice and well-made tool to create charts directly on the web. Nothing really revolutionary, but what it does, it does well and with an extremely clear interface.
These have the shape of tools made for end users, and this is what we need. C'mon folks, libraries are great, but we have yet to show the entire world what we are able to do. Let's develop tools, tools, tools!!!
Comments (4)
Thanks for the kind words! :)
But still, I wonder: what toolkits are you talking about? All of the open-source ones I know were abandoned a long time ago (e.g., the InfoVis toolkit), or are badly documented and about to be abandoned (prefuse). I completely agree that we need more tools, but we also need a decent, well-documented, well-maintained, open-source toolkit. I don't think we have that.
Verifiable isn't bad, and it actually seems to have a few features that make it more interesting than many of the other sites (small multiples, and I believe even cross-tabs). But as you say, it's all "nothing revolutionary" - all these sites effectively let you create the same four basic chart types that have been around for over 100 years, and in the public consciousness for 40. Treemaps are a bit advanced for many people, but I don't understand what's so difficult about histograms, for example.
Unfortunately, there is basically zero academic value in maintaining a good toolkit, and all those websites (except for Many Eyes) are run by people outside the visualization community who apparently have little knowledge of visualization. I don't know how we can change that, but we need to figure it out.
Posted by Robert Kosara | June 22, 2009 3:16 PM
This is really a great post and a much-needed one. I would like to commend Robert too. We need more efforts like that.
I've started working on the medical imaging side of things, where they too seem to have small scripts but no real software that they distribute freely. NIH has actually started an effort called NITRC (Neuroimaging Informatics Tools and Resources Clearinghouse), which basically funds the development and maintenance of human brain mapping software. They also provide a clearinghouse that allows users to come to their website and find the best or most suitable tool for them. Professors are now interested in developing and, more importantly, maintaining and releasing their software, since the NIH is funding such efforts. Maybe NSF could start something to this effect on a smaller scale to encourage sustained development and maintenance of infovis tools.
Posted by Alark Joshi | June 22, 2009 8:56 PM
As someone who produces a "toolkit" for statistical graphics (ggplot2, http://had.co.nz/ggplot2), I would disagree very strongly that maintaining a good toolkit has zero academic value. Here are four benefits off the top of my head:
1. People actually use your research! I love it when people use my work to better understand their data, and it really is my prime motivation for doing research.
2. When people use your framework or toolkit, they will run into its limitations and complain to you. A few of the complainers will have great ideas for improvement. Both complaints and ideas are great fodder for further research.
3. The discipline of preparing your code (bug-checking, documenting, etc.) makes it easier to use for future-you (i.e., you are less likely to look at your code in 6 months' time and have no idea how it works), for your students (so they can start to do useful research faster), and for your colleagues (so they better understand and cite your work).
4. You will be invited to give workshops. These are a great opportunity to expose a new audience to your work, and give you some extra spending money.
Posted by Hadley Wickham | June 22, 2009 11:26 PM
I'm not saying that toolkits are useless or not needed! On the contrary, I agree with Robert when he says that we still need a solid one. The point is that all these toolkits around are shallow, and if only all this effort could be focused either on creating and maintaining a good one or on developing good tools for people to use, this would be a better world IMHO.
There are a good number of toolkits around. Apart from the InfoVis Toolkit and Prefuse I can remember Flare, BirdEye, Protovis, JavaScript InfoVis Toolkit, and I bet there are many more I cannot remember now. It's frustrating. I want to use tools and let people see what we are able to do and there's nothing around ready to use.
I don't know if there is any academic value in developing strong toolkits, but maybe we should learn from other fields. One notable example is Weka, the data mining library. It comes from a purely academic effort and today it has become a sort of standard. I'd bet the authors have gained a lot in terms of visibility and research. But then I don't know ... what's the recipe? I understand that from a research perspective it is a daunting task.
Posted by Enrico Bertini | June 23, 2009 11:25 AM