
August 2006 Archives

August 14, 2006

Potential faults of interpreting web log visualizations with linear scales

Jakob Nielsen has a very nice set of simple (but powerful) diagrams showing the popularity of a website's pages (using his own useit.com as a testbed). The common diagram, with pages ranked by popularity on the x-axis and the number of page views on the y-axis, can hide interesting information if plotted on linear scales.

zipf_visualization_logarithmic.gif

The diagrams show exactly the same data, but the one on the right, which uses a log-log scale, tells you something about the website that just couldn't be inferred from the linear one on the left.

It's now clear that we have a drooping tail: the site simply doesn't have enough content to supply the predicted demand at the low end.
Without this fancy log-log plot, we would have never seen the site's potential for increasing traffic by adding large amounts of low-volume content. I'm amazed at how often articles analyzing Web traffic or "long tail"-type businesses use linear plots that fail to show what's really going on.
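To make the point concrete, here is a minimal Python sketch of why the log-log view exposes a drooping tail. The data is entirely made up (it is not Nielsen's), and the function names `loglog_points` and `fit_slope` are my own: an ideal Zipf curve is a straight line in log-log space, so a tail that falls off faster shows up as a steeper slope.

```python
import math

def loglog_points(page_views):
    """Rank pages by popularity and return (log10 rank, log10 views) pairs."""
    ranked = sorted((v for v in page_views if v > 0), reverse=True)
    return [(math.log10(rank), math.log10(views))
            for rank, views in enumerate(ranked, start=1)]

def fit_slope(points):
    """Ordinary least-squares slope of y against x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical data: an ideal Zipf curve (views proportional to 1 / rank).
ideal = [10000 / rank for rank in range(1, 1001)]

# Same curve, but pages ranked beyond 500 get progressively fewer views
# than Zipf predicts: an artificial "drooping tail".
drooping = [v if rank <= 500 else v * (500 / rank) ** 2
            for rank, v in enumerate(ideal, start=1)]

points = loglog_points(drooping)
head_slope = fit_slope(points[:500])   # straight Zipf segment
tail_slope = fit_slope(points[500:])   # steeper segment: the droop

print(f"head slope: {head_slope:.2f}, tail slope: {tail_slope:.2f}")
```

On a linear plot both segments hug the x-axis and look alike; in log-log coordinates the change in slope is obvious, which is exactly the gap between predicted and actual demand that the quote describes.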

There is another related article from Nielsen pushing the analysis a bit further, showing statistics on search engine queries that lead to useit.com and on incoming traffic from other websites. The same rule about scales still holds.

Interestingly enough, you can see from the visualization that queries from Google are disproportionately high and that the distribution of incoming traffic does not drop off at the lower end of the tail.

August 19, 2006

Bootchart: visualizing the Linux boot process

Here is Bootchart, a nice tool for visualizing the performance of the Linux boot process. It meets an important need of all Linux techies (yes, those obsessed with compiling a new kernel at least once a week :-)) who want to understand where boot time goes.

The challenge is to create a single poster showing graphically what is going on during the boot, what is the utilization of resources, how the current boot differs from the ideal world of 100% disk and CPU utilization, and thus, where are the opportunities for optimization.

bootchart.sortreadahead.png

There are sample charts for various Linux distributions:
SUSE | Fedora | Debian

The application collects data during the boot process and passes it to a Java application that renders the chart:

The log tarball is later passed to the Java application for parsing and rendering the data. The CPU and disk statistics are used to render stacked area and line charts. The process information is used to create a Gantt chart showing process dependency, states and CPU usage.

Since the amount of data is very large, some pruning techniques are used:

A typical boot sequence consists of several hundred processes. Since it is difficult to visualize such amount of data in a comprehensible way, tree pruning is utilized. Idle background processes and short-lived processes are removed. Similar processes running in parallel are also merged together.
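The three pruning steps in the quote can be sketched in a few lines of Python. This is only my reading of the description, not Bootchart's actual (Java) code: the `Proc` class, the `min_duration` threshold, and the "same name with overlapping lifetimes" merge rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Proc:
    """Hypothetical process record; not Bootchart's real data model."""
    name: str
    start: float          # seconds since boot
    end: float
    cpu: float = 0.0      # CPU seconds consumed
    children: list = field(default_factory=list)

def merge_siblings(siblings):
    """Merge same-named siblings whose lifetimes overlap (assumed rule)."""
    merged = []
    latest = {}  # name -> most recent merged node with that name
    for proc in sorted(siblings, key=lambda p: p.start):
        prev = latest.get(proc.name)
        if prev is not None and proc.start <= prev.end:
            prev.end = max(prev.end, proc.end)
            prev.cpu += proc.cpu
            prev.children.extend(proc.children)
        else:
            merged.append(proc)
            latest[proc.name] = proc
    return merged

def prune(proc, min_duration=0.3):
    """Drop idle or short-lived leaf processes, then merge similar siblings."""
    kept = []
    for child in proc.children:
        prune(child, min_duration)
        short_lived = (child.end - child.start) < min_duration
        idle = child.cpu == 0.0
        if (short_lived or idle) and not child.children:
            continue  # leaf that adds noise but no information
        kept.append(child)
    proc.children = merge_siblings(kept)
    return proc

# Made-up boot tree for illustration.
init = Proc("init", 0.0, 30.0, cpu=1.0, children=[
    Proc("true", 1.0, 1.05, cpu=0.01),     # short-lived
    Proc("sleepd", 2.0, 30.0, cpu=0.0),    # idle background process
    Proc("udevd", 2.0, 20.0, cpu=1.5),
    Proc("modprobe", 3.0, 5.0, cpu=0.4),   # these two run in parallel
    Proc("modprobe", 4.0, 6.0, cpu=0.3),
])
prune(init)
print([(p.name, p.start, p.end) for p in init.children])
```

In this toy tree, five children collapse to two bars in the Gantt chart: `udevd` and a single merged `modprobe`. The sketch only merges direct siblings in one pass; a real implementation would need to decide how to handle the children of merged nodes.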

August 23, 2006

Solutions in search of problems

There is an interesting interview with John Stasko published recently at mentegrafica. I really liked the point John made about the "solutions in search of problems" style of research in InfoVis. Until recently (and still today) this is how we have produced visualizations: inventing a new technique for the sake of it and then trying to find a practical application, i.e., an appropriate problem.

There is a plethora of examples, especially on the web. And many are pointed out by this blog too, because they are just fun! After all, why are we so attracted to visualizations? You can answer in many ways, but I'm sure that the deepest feeling and the real reason is that they are just cool: beautiful to look at, engaging, colorful, etc. However, I agree with John's point, and I also believe that in order to make a dent in this world (quoting Steve Jobs) we must care about the utility of the things we make and strive to find appropriate solutions for specific people.

The same and similar points were raised at the BELIV'06 workshop we organized in May 2006 in Venice (co-located with the AVI conference). It was a really enjoyable event, and we discussed these issues at length in much the same terms.

Anyway, it's worth noting that this is not unique to InfoVis: it applies to all the sciences that are really engineering (and thus to almost every area of computer science). This is just the way new scientific fields develop. At the beginning, a small niche of people gets interested in the concept and produces new ideas, new prototypes and the rest. Then, when the field becomes more mature, people start asking themselves whether these applications are useful or not. I am glad we are facing these questions today, because it means we are entering a new phase and the field is maturing.

One side effect of this is that it is becoming more and more difficult to publish papers at good conferences (like the IEEE InfoVis Symposium). Reviewers now expect to find strong claims about the utility of the proposed technique, real improvements over related solutions, and some sort of experiment or test that demonstrates the quality of your work in practical settings. Sure, I understand this approach can be questioned in that it might discourage the production of really novel ideas, but this is still the classic evolution of research fields. The same has happened, and is still happening, at Siggraph for example (probably the biggest conference out there), which has a much longer tradition than InfoVis. See for example this guy who retired because he was disgusted with the way papers get reviewed at Siggraph, leaving only this laconic message on his page: "in summer 2006 I will be leaving Stony Brook University ... you are interested to read above my reasons, click here". Similar complaints are heard at the CHI conference too. See these funny arguments made by Henry Lieberman, and Shumin Zhai's response.

So OK, I don't want to push this thing too far ... anyway, I really liked John's expression, and I think I will use it many times in the future to explain why nice ideas are not enough, why beautiful images are not enough (apart from artistic purposes?), and thus why our ultimate purpose as researchers/designers is to help people accomplish their tasks better (more efficiently, more enjoyably, more effectively), quoting F. Brooks's famous The Computer Scientist as Toolsmith (pdf):

If we perceive our role aright, we then see more clearly the proper criterion for success: a toolmaker succeeds as, and only as, the users of his tool succeed with his aid. However shining the blade, however jeweled the hilt, however perfect the heft, a sword is tested only by cutting. That swordsmith is successful whose clients die of old age.

I know some people do not agree with me, but this is the way things progress: debating about different ways to view the world. Enjoy! :-)

About August 2006

This page contains all entries posted to Visuale in August 2006. They are listed from oldest to newest.
