
October 2008 Archives

October 9, 2008

Visual Poetry 2008 ... better this time!

A couple of years ago I posted here a rather harsh critique of Visual Poetry 2006, titled "Visual Poetry (mimicking TextArc the bad way?)", in which I argued that data visualization, even when its purpose is beauty and art, must strive to be informative and enthrall people with intriguing bits of information that can be extracted from the screen. Visual Poetry 2008 seems to be a lot more interesting and closer to my idea of informative art.

In my old post I criticized the use of visualization without careful thinking. The post generated some quite interesting discussions. Robert Kosara posted a response on his EagerEyes, "The Visual Mapping of Poetry", and I replied with another post here: "Visual Poetry (part 2): must visualization necessarily convey information?".

Here is my review of Visual Poetry 2008.

Background

From the author, Boris Müller:

"Poetry on the Road is an international literature festival which is held every year in Bremen, Germany. Since 2002 I am commissioned to design a visual theme for the festival. While the theme itself is changing, the underlying idea for the visuals is always the same: All graphics are generated by a computer program that turns texts into images. So every image is the direct representation of a specific text. The design and the development process are a collaboration with the design agency jung und pfeffer."

Description

This is the visualization designed for this year's edition:

visualpoetry08_s.png

It represents multiple texts at the same time and attempts to compare word frequency distributions among them. Each horizontal line is a single poem and each element on the line is a single word. The words are sorted by their frequency in the text and mapped to line width. Each line connects the same word from one poem to another; normally with tapered lines since the same word has different frequencies in different poems. Words which appear only once are represented with an "X".
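The mapping described above can be sketched in a few lines of Python. This is only my reading of the image, not the author's actual program: the function names are hypothetical, I assume that links are drawn between consecutive poems, and I assume that an "X" marks a word that no other poem uses.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Count how often each lowercased word occurs in one poem."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def layout(poems):
    """One row per poem: (word, count) pairs sorted by descending count.

    Ties are broken alphabetically so the order is deterministic.
    """
    rows = []
    for text in poems:
        counts = word_frequencies(text)
        rows.append(sorted(counts.items(), key=lambda item: (-item[1], item[0])))
    return rows


def links(poems):
    """Words shared by consecutive poems (assumption), with both counts.

    The two counts would give the line width at each end, which is
    what produces the tapered lines.
    """
    counts = [word_frequencies(text) for text in poems]
    result = []
    for upper, lower in zip(counts, counts[1:]):
        shared = sorted(set(upper) & set(lower))
        result.append([(word, upper[word], lower[word]) for word in shared])
    return result


def unshared_words(poems):
    """Words used by one poem only (assumed to be the "X" marks)."""
    counts = [word_frequencies(text) for text in poems]
    result = []
    for i, own in enumerate(counts):
        others = set()
        for j, other in enumerate(counts):
            if j != i:
                others.update(other)
        result.append(sorted(set(own) - others))
    return result
```

Calling `layout`, `links` and `unshared_words` on a list of poem strings gives the three ingredients of the picture: the order of the words on each horizontal line, the connections between lines, and the isolated words.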

Critique

What I like

  • Simplicity: compared to Visual Poetry 2006, this mapping is a lot more comprehensible and very simple. A line is a poem, a dot is a word, and the same words in different poems are linked with a line. Since I am a big fan of simplicity, this is the first thing I would like to mention.

  • Beauty: though probably related to simplicity, beauty is a great feature of this image. The chosen color is attractive, the sinuous curves are aesthetically pleasing, and I think there is the right balance between chaos and structure. Not too chaotic to be disturbing, not too structured to be dull.

  • Patterns: I can clearly see some patterns! This is what I meant in my older post. Here I perceive simplicity and beauty and yet I can see some patterns. If you try, for instance, to follow the path of a single word from poem to poem you can see how certain words "sinuously" become very frequent or infrequent. Another pattern: the second poem from the bottom contains quite a lot of "X" words which are not used in other poems. Natural questions would be: "Why does it use so many singular words?" and "What poem is this?"

What is missing

In one word: INTERACTION. Oooh, it's a pity! The only thing we can do with the "interactive" version is to type a word and have it highlighted.

I understand that the original visualization is probably designed only to be printed on paper and not as an interactive artifact. One might argue then that interaction here is just not wanted or possible. Nonetheless, on the project's web page you can find a small interactive demo (see the bottom of the page). My critique here thus refers to this demo exclusively.

  • Hovering: recognition vs. recall is at stake here. I don't want to think of a word first and then search for it in the text; I'd rather discover which words expose interesting patterns. Even a simple added feature like hovering over a line to see which word it is would be great to have.

  • Filtering: a bit more advanced as a feature and not necessarily needed in such a piece of art, but imagine having one simple slider to isolate the words within a certain frequency range. That would be even more fun.

  • Labels: this is a bit more serious. Why not add a few small labels to at least say which poem is which? It would need 5 labels! Only 5: one for each poem. Adding labels to words is obviously more problematic but again, why not show the word associated with a line when hovering with the mouse pointer?
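To show how little the filtering idea would need, here is a minimal sketch of what a frequency slider could do behind the scenes. It is not taken from the demo: `filter_by_frequency` is a hypothetical name, and I assume the word counts of a poem are already available as a dictionary.

```python
def filter_by_frequency(counts, low, high):
    """Keep only the words whose count lies in the inclusive range [low, high].

    `counts` maps each word of a poem to its number of occurrences.
    `low` and `high` stand for the two ends of the imagined slider.
    """
    return {word: n for word, n in counts.items() if low <= n <= high}
```

Moving the slider would simply call this function again with new bounds and redraw only the lines of the words that remain.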

Conclusion

Visual Poetry 2008 is a beautiful piece of art. Simple to understand and still complex enough to attract. It's a pity that some simple basic features were not added, but I can understand they are probably not needed given the context of the picture and that it is primarily conceived to be printed on paper. Well done this time!

October 29, 2008

Charts: OECD Education at a Glance

Inspired by some recent government interventions in the Italian public school system and the widespread protests that followed all around the country, I have designed a few charts to see if I can better understand the issue from the data and communicate some results.

Background

I have used the data from the Organisation for Economic Co-operation and Development (OECD), which is often referred to as one of the most trusted authorities on national education systems. More precisely, the data comes from the OECD report "Education at a Glance".

At the origin of the protest is the reduction of the number of main teachers per class from 3 to 1, with a consequent reduction of public personnel. The government says that having fewer teachers will not affect the quality of the studies and that quite a lot of public money will be saved. The protesters believe that the opposite is true and that the savings should not come from these cuts.

The goal of these charts is not to settle the debate; rather, they offer a very small and focused view of the problem. I just tried to find some hints on two related questions that came to my mind:

  • How efficiently does the Italian system spend its money?
  • Is the proportion of students to teachers the cause of poor performance?

How efficiently does the Italian system spend its money?

The first chart answers the first question, at least partially. The chart is a scatter plot of the OECD data on the efficiency of school systems, based on the following data:

Scientific performance: measured by PISA (Programme for International Student Assessment), defined as "an international study conducted by the OECD which measures how well young adults, at age 15 and therefore approaching the end of compulsory schooling, are prepared to meet the challenges of today's knowledge societies." It is supposed to be a good indication of how well our schools do.

Expenditure per Student: defined as the expenditure per student in equivalent US dollars.


chart1_s.png

I have drawn two lines to divide the space into 4 quadrants with respect to where Italy is placed. Of these quadrants I have highlighted the bottom right because it represents all countries that perform better in terms of the PISA index and spend less. In other words, all the countries in this quadrant not only spend less but also use their money more efficiently, because they produce better students.

The sad truth is that my beloved country performs very badly. Greece and Portugal keep it company, but at least they spend less.
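The quadrant test can also be written down explicitly. The sketch below is only an illustration of the rule "better PISA score and lower expenditure than the reference country": the function name is hypothetical and the numbers in the usage note are made-up placeholders, NOT OECD figures.

```python
def better_and_cheaper(countries, reference):
    """Return the countries that score higher AND spend less than `reference`.

    `countries` maps a name to a (pisa_score, expenditure_per_student) pair.
    This corresponds to the highlighted quadrant of the scatter plot.
    """
    ref_score, ref_spend = countries[reference]
    return sorted(
        name
        for name, (score, spend) in countries.items()
        if name != reference and score > ref_score and spend < ref_spend
    )
```

With placeholder values such as `{"Reference": (475, 8000), "A": (500, 7000), "B": (460, 6000), "C": (520, 9000)}` and `"Reference"` as the reference, only `"A"` is returned: `"B"` spends less but scores lower, and `"C"` scores higher but spends more.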

In order to be sure that these results are not affected by the economic level of the countries, I have also produced a second chart where the expenditure is normalized with respect to GDP (gross domestic product).


chart2_s.png

Unfortunately the result is even worse: Greece and Portugal perform worse, but almost all the other countries are better. From the chart we can also see (in the bottom right) that Finland performs exceptionally well and that New Zealand, the Netherlands and Australia perform very well too while spending less money.
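For readers who want to redo the normalization, one common way is to express the expenditure per student as a percentage of GDP per capita. This is an assumption about the formula, offered only as a sketch, and the function name is hypothetical.

```python
def spend_relative_to_gdp(expenditure_per_student, gdp_per_capita):
    """Expenditure per student as a percentage of GDP per capita.

    Both arguments must be in the same currency unit
    (for example equivalent US dollars).
    """
    if gdp_per_capita <= 0:
        raise ValueError("gdp_per_capita must be positive")
    return 100.0 * expenditure_per_student / gdp_per_capita
```

With this measure, a country that spends 8,000 dollars per student with a GDP per capita of 32,000 dollars (placeholder numbers, not OECD data) gets a value of 25, which can then be compared across richer and poorer countries.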

Is the proportion of students to teachers the cause of poor performance?

Since at the center of the debate is the question of whether more or fewer teachers affect the quality of an education system, I created bar charts comparing the ratio of students to teachers for the countries shown in the scatter plot.

Here are two bar charts, one for primary school and one for secondary school. Again I have highlighted Italy in the charts to make the comparison easy.


barchart1_s.png


barchart2_s.png

As you can see, Italy has one of the lowest ratios in both primary and secondary school, meaning that there are relatively few students for each teacher or, in other words, that teachers are not very overloaded compared to other countries. The comparison with other countries is quite interesting. Finland, the Netherlands and New Zealand (Australia is missing in the data), which are very efficient, as we have seen in the scatter plots above, have considerably higher values than Italy. Can we say then that the number of teachers is at the root of the poor Italian performance? Or can we say that a small number of students per teacher is necessary to produce a school of high quality? I don't know ... but at least the graphics instill some doubts.


Technical Notes

The charts have all been done with Excel. After all, it is always the best and most readily available tool. There is always a bit of a hassle in doing certain things, especially since the defaults are crazy (like strong dark backgrounds), but in the end it works great.

I have used the XY Chart Labeller to reduce label overlaps on the bar charts. This is also a bit cranky but in the end it does its job well.

The annotations on the charts have been done with the graphic tools in Excel and externally within SnagIt, which I use to screen-capture the charts. Yes, I've used screen capture! I know I could use VBScript stuff or similar things to save the charts as images, but it's always kind of a pain and less flexible than just pressing PrintScrn and editing the image.

Disclaimer

With these charts I don't claim to demonstrate anything; it's more an interesting exercise for me to create data graphics and to show how easily we can reason about data that pertains to facts related to our social life.

The charts might show an evident bias towards judging the government interventions appropriate, but this is not my intent. Rather, I would be very curious to see other charts that better clarify the issue and show, with data and graphics, arguments opposite to mine.

Final Reflection

In order to build these charts I have invested very little time (I invested a lot more time writing this post though!). In a few clicks I was able to clarify for myself some things about an issue which is quite hot these days in my home country and which I care about. The same thing might be done by millions of citizens if only they were instructed appropriately. And that would mean having a population of informed people, able to ground their protests on hard data and to communicate their arguments with the vividness of well-done data graphics.

Unfortunately this is still a long way off. Simple techniques like these are never used by politicians or protesters; they prefer to use thousands and thousands of words in place of a few well-done charts. It's a pity for us and it's a pity for them.

5 False Myths of InfoVis

I have just read the post "Are visual analysis tools poised to become pervasive?" on Stephen Few's blog, in which he speaks about flawed infovis principles as reported by Christian Chabot, co-founder and CEO of Tableau Software, during his last InfoVis keynote.

Inspired by these thoughts I have found the courage to speak out about 5 false myths (FMs) I believe we have in infovis.

  • FM1 - InfoVis is about data exploration: I have heard it millions of times since I started reading papers and books on visualization; it is a sort of mantra: "infovis is there to support people in data exploration". I myself have also described infovis in these terms tens of times in papers and reports. But is data exploration a real activity or goal? Nobody really wants to explore data for the sake of it (apart from us infovis geeks who derive pleasure from it). Data exploration tells nothing about the goal of a user and the reason why he is willing to invest time in learning and using an infovis tool. Biologists don't want to explore data, they want to understand how genes react to certain interventions. Security analysts don't want to wander through millions of alarms, they want to spot intruders and react as fast and accurately as possible.

  • FM2 - InfoVis is about discovery: This is another mantra of infovis, repeated millions of times. While it is true that infovis can help discover new facts, its true value does not come from discovery but rather from understanding. The main reason why an infovis tool is useful is that it helps make sense of data and does so in a more efficient way. It lets us efficiently understand what the data has to tell. And its quality can (should) be measured in terms of how effectively and efficiently this process is supported. I remember John Stasko saying something like this in his presentation at BELIV'06: the main activity supported by infovis is to learn about a domain. This is what we mostly want to do with infovis and this is what should be supported.

  • FM3 - InfoVis is about new visualization techniques: InfoVis already has hundreds of techniques available which we can draw from. The real challenges we are confronted with are: 1) to understand how to use and customize the techniques we have now to make them useful for specific problems and people; 2) to understand how to combine different techniques in composite tools able to integrate them and get the best out of their composition (as an example, why has nobody tried, as far as I know, to integrate all those n-dimensional visualizations we have out there?). That said, I am not saying that inventing new techniques does not have its role or that it is a waste of time. I just believe it's time to shift the focus a bit.

  • FM4 - InfoVis is about vision: I have already talked about this point in one of my posts some time ago, titled "the neglected role of interaction in information visualization". InfoVis is by no means only about visual things; it is also about the way we interact with a dynamic display that is able to react as we interact with it. It is this level of interaction that permits us to efficiently manage screen real estate and allows us to reason about a domain. The big challenge for an infovis designer is not only to map data items on the screen in clever ways but also to support, through careful interaction design, the very tasks the tool is designed for. We know quite well the perceptual issues and the design principles needed to design a visual mapping, but when we come to the point of designing interaction we are lost. The only support we have is to draw from simple ideas developed in other designs (hovering, link&brush, dynamic filtering, etc.).

  • FM5 - InfoVis is about the data: We tend to see infovis as a way to support a one-directional channel: from data to our brain. But this view underestimates the role of the knowledge we put into the process. When a user interacts with a visualization, he brings his assumptions, background knowledge and skills, which play a large role in the interpretation of what is seen on the screen. This is the reason why two different people can very likely end up seeing different things in the same visualization. There is another hidden channel that goes in the opposite direction, from the human mind to the data, enriching it with the knowledge that is already in our heads. Notably, infovis tools fall terribly short when there is the need to manage this knowledge and let it play a role in the analysis.

This is what I had to say in the urge to react to the blog post I read.

Pleeease ... feel free to harshly criticize my false myths!!! They are here to be dismantled ... or even better to be enriched by your views :-)

Take care.

-----

UPDATE -- November 3, 2008

I have received some quite interesting references from Christopher Collins related to FM5: "InfoVis is about the data". From his post:

"...there are some good examples from the VAST community where prior knowledge can be explicitly entered into the analysis process (e.g. i2's 'Analyst's Notebook' or IBM's Research's HARVEST project). My U of C colleague Torre Zuk has also done some analysis of how a physician's prior knowledge affects their decision making when presented with a visualization."

Thanks Chris for your references!

About October 2008

This page contains all entries posted to Visuale in October 2008. They are listed from oldest to newest.
