critique Archives

November 4, 2006

Visual Poetry (mimicking TextArc the bad way?)

Today I returned to this post found at infosthetics about the visual theme developed this year for Poetry on the Road, an international literature festival which is held every year in Bremen, Germany.

poetry06_plakat.png

At first, I was really impressed by the beauty of this image, but I couldn't easily grasp its meaning (which is already a bad sign when looking at a new visualization). Now that I've spent some more time on it, I am less and less convinced by its design.

Let me try to summarize how it is made:

  1. Each word appearing in a poem is encoded as a number by assigning "a numerical value to every letter of the alphabet. Adding the values of all letters, one gets a number that represents the overall word. (For example, the number 99 would represent the word "poetry".)"

  2. Each poem is arranged around a ring whose diameter is proportional to the poem's length. "So you can see the short poems in the centre of the poster, while the longer ones form the outer circles."

  3. Each number is depicted by a red ring whose thickness is based on the number of words corresponding to that number "(poetry shares the 99 with words like thought and letters)".

  4. Each word of the poem (a red ring) is connected to the next by grey lines, following the sequence of the text. "So solid lines represent repetitive patterns in the poem."
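The encoding in steps 1, 3 and 4 can be sketched in a few lines. This is only my reading of the description, assuming the simplest possible mapping (a=1 ... z=26); the festival text does not spell out the letter values, although this mapping does give 99 for "poetry", "thought" and "letters". The function names are mine.

```python
from collections import defaultdict

def word_value(word):
    """Sum of letter positions, assuming a=1 ... z=26; other characters are ignored."""
    return sum(ord(c) - ord("a") + 1 for c in word.lower() if "a" <= c <= "z")

def group_by_value(words):
    """Step 3: collect the distinct words that share a number (thicker ring = more words)."""
    groups = defaultdict(set)
    for w in words:
        groups[word_value(w)].add(w.lower())
    return dict(groups)

def transitions(words):
    """Step 4: count consecutive (number, number) pairs; a higher count means a darker line."""
    counts = defaultdict(int)
    values = [word_value(w) for w in words]
    for a, b in zip(values, values[1:]):
        counts[(a, b)] += 1
    return dict(counts)
```

Note that the mapping is many-to-one: once a word has become 99 there is no way to tell whether it was "poetry" or "thought", so a dark line between two rings does not necessarily mean that the same two words were repeated.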

I cannot get any interesting trend or inspiring emergent pattern from it. Visualizations of this kind are nice when you can exclaim: "Aha! Here is something in the text I couldn't really get without a visualization!" Here the only visible trend I can get is the darkest lines, which are supposed to represent repetitive patterns in the poem, a quite standard feature of literature like this.

As often happens, interaction is almost completely neglected. It would be nice to have a way to highlight the repetitive patterns, so that they can be readily explored, or even to select one word and see which repetitive patterns it generates with other words. Also, given that short poems are in the center, where resolution is low, it is not clear whether the darkest lines are a product of extreme overlap or of many repetitive sequences.

Finally, it is not clear why a whole circumference is used when only half of it is used to lay out the small red rings. The reason why it is so, I guess, is that the words are connected by "splines" which are used to avoid crossing the center of the circumference, but that at the same time cannot make a whole circle.

The whole thing reminds me of TextArc, probably the most famous visualization of this kind. But it is incredible how much more informative and cleverly designed (and still beautiful, inspiring, and artistic) TextArc is compared to this one.

Creating beautiful images to impress people is relatively easy, while making visualizations that let people explore, gain profound insights, and see the invisible is far harder and requires a lot more devotion than this.

December 5, 2006

Visual Poetry (part 2): must visualization necessarily convey information?

A good thing about criticizing something is that your critique can be criticized too. Obvious, right?

Robert Kosara posted an interesting response to my post of a few days ago about Visual Poetry, where I argued that even if it was a very pleasing picture, it was nothing really interesting in terms of visualization.

He criticizes my point, observing that in fact it is not a visualization but just a way to create a beautiful image to put on the cover of a book. I found his idea of considering the context in which a visualization is used interesting, and I also liked his parallel between visual mapping and bijective/injective functions. Visual mapping, in this case, can be considered a sort of hash function: you cannot return to the original object, but you can still recognize the unique visual id it generates (as if it were a fingerprint).

I think Robert is right. The purpose was not to analyze data but to create a beautiful and unique image out of it. Given that, however, I still think there is an all-too-common practice of visualizing data just for the sake of it, without worrying too much about potentially better designs. Especially on the web, one can find thousands of little visualizations that are just beautiful and nothing more.

I don't consider myself a visualization purist, nor a great designer, but I still think it is far too easy to get some data and give it "a" shape, whatever it is. Visualization is careful design: resources are extremely scarce and must be used sparingly. In commenting on Visual Poetry I was guided by the frustration I often feel when seeing novel visualizations that are just nice. It feels like a whole body of knowledge is mistreated or just neglected.

There is a second point. The relation between art and visualization is a very interesting one. One can argue that when visualization is used for artistic purposes it need not convey useful information. I am not sure about that, but I am sure the best case for visualization, especially when used for artistic purposes, is when, staring at the picture, one can discover subtle information or relationships, create new associations, and experience one of those rare "aha!" moments that make us vibrate. It stimulates curiosity and exploration ... it's a much richer experience! And an artistic one! Martin Wattenberg's art page is a good example of what I am saying.

In the end, what I really thought about Visual Poetry is that it was a real pity it was designed like that. I strongly felt that, had it been designed a bit differently, it could have retained its beauty and still been informative. The current design reduces this opportunity, wastes resources, and in the end limits the experience a viewer can get out of it.

March 14, 2007

When a treemap turns into a map (and not a tree)

Ever wondered whether a treemap must necessarily be a tree? I don't know ... this might be plain obvious to you but it isn't to me. Only recently did I start thinking that a treemap need not represent hierarchical data; it can simply show a plain n-dimensional dataset or, in other words, a map without a tree structure. And I'm surprised how little interest has been devoted to this aspect.

After all, there are not many other layout algorithms that can fill the whole screen (that is, without any empty space between items) with meaningful data and still use additional visual dimensions such as color, size, texture, etc.

In this respect, ordered treemaps are of particular interest. They employ algorithms that "create rectangles in a visual order that matches the input to the treemap algorithm". Because the objects' placement depends on the order in which they are presented, their position is fixed and carries some additional useful information.

Initial attempts to improve treemaps focused instead on the "squarification" of rectangles, that is, on finding clever strategies to make rectangles as close as possible to squares to improve their readability. Ordered treemaps retain this initial idea and add the constraint of presenting data in order. In a 2001 paper, Bederson, Shneiderman, and Wattenberg proposed and compared various strategies (e.g., slice and dice, squarified, pivot, strip, etc.) and evaluated them by means of quality metrics:

ordered-treemaps.png

  • aspect ratio: how "squarified" the treemap is
  • change: how much change is involved in a data update
  • readability: how hard it is to visually scan the tree in the input order

However, not much emphasis is put on the fact that this kind of treemap can nicely present non-hierarchical data. Nonetheless, some nice examples exist:
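To make the aspect ratio metric concrete, here is a small sketch on flat (non-hierarchical) data. It is not the code from the paper: `slice_and_dice` is the plain one-level version, and `strip_layout` is a simplification of mine that uses a fixed number of items per strip (as far as I understand, the published strip layout chooses the strip breaks adaptively). I also assume the metric is the average of max(w/h, h/w) over all rectangles, and that all sizes are positive.

```python
def slice_and_dice(sizes, width=1.0, height=1.0):
    """One-level slice and dice: vertical slices in input order, widths proportional to size."""
    total = sum(sizes)
    rects, x = [], 0.0
    for s in sizes:
        w = width * s / total
        rects.append((x, 0.0, w, height))  # (x, y, width, height)
        x += w
    return rects

def strip_layout(sizes, per_strip, width=1.0, height=1.0):
    """Simplified strip layout: fill horizontal strips left to right, top to bottom, in input order."""
    total = sum(sizes)
    rects, y = [], 0.0
    for i in range(0, len(sizes), per_strip):
        strip = sizes[i:i + per_strip]
        strip_total = sum(strip)
        h = height * strip_total / total  # strip height proportional to its share of the total
        x = 0.0
        for s in strip:
            w = width * s / strip_total
            rects.append((x, y, w, h))
            x += w
        y += h
    return rects

def mean_aspect_ratio(rects):
    """Average of max(w/h, h/w); 1.0 means every rectangle is a perfect square."""
    return sum(max(w / h, h / w) for _, _, w, h in rects) / len(rects)
```

With sixteen equal items in a unit square, slice and dice gives sixteen thin slivers, while four strips of four give perfect squares, and both keep the input order.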

image002.jpg toptenweb-small.jpg roomformilk-small.jpg

One notable domain where these features are useful is network security, where information must be visually scanned in real time to cope with potentially dangerous network events.

Maintaining the input order means, for example, that in the specific case where rectangles represent network hosts, if they are presented to the algorithm in IP address order, the elements under the same network will fall in similar positions. The stability of the layout is even more important in this case. Because data changes often, and thus the position of the elements can change, having a stable layout means that an observer can easily track items' positions when data is refreshed.

One interesting application of treemaps in network security (with some additional geographical constraints) is Hierarchical Network Maps, where network aggregates such as autonomous systems (backbones) are represented by rectangles whose size represents the number of hosts and whose color represents traffic volume. Here is a screenshot.
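As a small illustration of what "network IP order" means in practice, the sketch below sorts hosts numerically with Python's standard `ipaddress` module before they would be handed to an ordered layout. The helper names and the /24 prefix are my own choices for the example, and I assume IPv4 addresses only.

```python
import ipaddress

def order_hosts(hosts):
    """Sort IPv4 host strings numerically, so hosts in the same network end up adjacent."""
    return sorted(hosts, key=lambda h: int(ipaddress.IPv4Address(h)))

def subnet_of(host, prefix=24):
    """Network a host belongs to, for an assumed prefix length."""
    return str(ipaddress.ip_network(f"{host}/{prefix}", strict=False))
```

A plain string sort would put 10.0.10.1 before 10.0.2.3 and scatter a network's hosts across the layout; the numeric sort keeps them together.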

as-conn-incoming-29-11-05_small.png

However, I couldn't find an example where the rectangles' size (and thus position) changes with time. In HNM the authors claim it is not a good idea because layout stability is of primary importance. Nonetheless, I think that having the objects requiring attention represented as the biggest items on the screen could really improve the analyst's job.

I think there are many ways to elaborate on this idea in the future. After all, when the constraint of having a containment/recursive visual structure is relaxed, there might be tens of other clever criteria to arrange items on the screen.

October 17, 2007

Real-time view of living cities: Wiki City Rome and visualization challenges

This is not the first time I have posted about the visualization of real-time events. I must admit it: I am a great fan of it. Under the terms locative and animations in this blog you can in fact find other posts of this type.

I think there is a clear trend today of visualizing the flood of data streams that are generated every day by us and the objects that surround us, and the thing becomes more and more intriguing when data is displayed on a map in real time, where objects can appear, move, leave traces, and fade away.

This is the case of Wiki City Rome, a real-time view of a living city.

mappreview.jpg

Wiki City Rome was publicly projected on Sept 8, 2007 during the "Notte Bianca". Data obtained from cell phones, GPS devices, and other sources is used to display the dynamic behavior of people, events, transportation systems, etc. In the map one can see the density of active cell phones and other additional information and observe their evolution during that night.

The project is of particular interest to me because (1) it is developed in my city of origin, Rome, which I deeply love, and (2) it is maybe the most mature project I have seen in this area.

The project is part of a larger Wiki City project where the goal is to give people access to the information of a living city through powerful visualizations and to let people act as agents that can influence the evolution of the city through cooperative and competitive behaviors. From the project's web page:

"Although the city already contains several classes of actuators such as traffic lights and remotely updated street signage, a much more flexible actuator would be the city’s own inhabitants."
"Consequently, we are creating a new platform for storing and exchanging data which are location and time-sensitive, making them accessible to users through mobile devices, web interfaces and physical interface objects. This platform enables people to become distributed intelligent actuators, which pursue their individual interests in cooperation and competition with others, and thus become prime actors themselves in improving the efficiency of urban systems."

Among the many challenges related to the project (e.g., data extraction and integration from multiple heterogeneous sources) the ones related to visualization and interaction are of specific interest to this blog.

I see (at least) the following big challenges. For each I add one or more references to potentially useful research:

  • Data density and sampling: with such an amount of data to handle it is mandatory to visualize only a sample of the available information. Unfortunately, selecting the "right" sample is a hard task because the distribution of objects can be very skewed and some areas might become either over- or under-represented. In addition, different levels of sampling should be applied when changing from one zoom level to another. When zooming in on a specific area it is necessary, for example, to retrieve more data to maintain a constant density.


  • Level of detail: most likely the users will want to visualize the data at different zoom levels, as is common in any map navigation, and adapting the level of detail will be necessary to present as much information as possible, depending on how crowded a given area is. A series of visualization studies under the name "semantic zoom" address exactly this problem, but deciding on the appropriate level of detail is not trivial in this case because the data is transient and density can change.

    • Frank, A.U. and Timpf, S., 1994. Multiple Representations for Cartographic Objects in a Multiscale Tree - an Intelligent Graphical Zoom. Computers and Graphics, 18(6): 823-829.


  • Visual feature overloading: one of the explicit goals of the project is to visualize heterogeneous data, that is, data objects pertaining to different semantic entities (e.g., places, people, events, messages, etc.). The problem here is (1) how to distinguish one object from another and (2) how to remember the meaning of a visual item with respect to the entity it represents. The whole theory of preattentive processing can be of great help in selecting the right visual feature. Another useful branch of research is the one related to the creation of Visual IDs, that is, icons that can be easily recognized and are hard to mix up.


  • Change blindness: the problem of change blindness arises in the visual perception of moving objects. Under certain circumstances the human perceptual system misses some changes and is thus blind to them. In vision science there is a long tradition of studies to understand this kind of phenomenon. What matters here is that the visualization of moving objects in real time can be affected by this blindness, and this must be considered in the design of the visualization. Some initial studies exist on how to design visualizations that explicitly cope with this problem, see the reference below.


  • Exposing correlation and causation: this last one is the biggest challenge I see. If the visualization is used to understand how certain events influence some others (correlation) or to observe the consequence of some deliberate actions (causation), the visualization must be able to aid the user in finding these patterns. One obvious method to see these trends is to replay the visualization at different speeds and with different settings, but I don't think this is enough to analyze these data. This is one of those cases where the joint use of visualization and mining techniques can really make the difference. The whole new mindset of Visual Analytics can be a starting point to deal with it.

So ... in summary I think this is a very challenging domain with a lot of research to do soon. I expect to see many more visualizations of this kind in the future, especially once a certain critical mass of users has developed.
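To give a feeling for the first challenge in the list (data density and sampling), here is a minimal sketch of one possible strategy, not something taken from the Wiki City project: overlay a grid on the current viewport and keep at most a fixed number of points per cell. The function name, the grid size and the per-cell cap are all hypothetical parameters of mine.

```python
import random
from collections import defaultdict

def sample_viewport(points, viewport, grid=8, per_cell=5, seed=0):
    """Keep at most `per_cell` points in each cell of a grid laid over the viewport."""
    xmin, ymin, xmax, ymax = viewport
    cells = defaultdict(list)
    for x, y in points:
        if not (xmin <= x < xmax and ymin <= y < ymax):
            continue  # outside the current view
        col = min(grid - 1, int((x - xmin) / (xmax - xmin) * grid))
        row = min(grid - 1, int((y - ymin) / (ymax - ymin) * grid))
        cells[(row, col)].append((x, y))
    rng = random.Random(seed)  # seeded so that a refresh does not reshuffle the display
    sample = []
    for key in sorted(cells):
        members = cells[key]
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample
```

Because the grid is defined relative to the viewport, a crowded area is capped in the overview but reveals more of its points when the user zooms in, which keeps the on-screen density roughly constant. The cap also keeps a skewed cluster from drowning out sparse areas, at the price of hiding how dense the cluster really is, so the true count would have to be encoded in some other way.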

November 19, 2007

Matthew Ericson's InfoVis Keynote

The VIS/InfoVis/VAST conference was, as usual, a great event, with lots of good presentations, events and interesting people to meet. The conference venue was really nice (trivia: the Hyatt Hotel is the place where Governor Schwarzenegger lives when he is in the capital) and I definitely enjoyed some Californian sun and was really pleased to find old and new friends.

Despite the high-level technical program and the abundance of good presentations, my best moments were the InfoVis keynote by Matthew Ericson, from the NY Times, and the VAST keynote by Stephen Few, from Perceptual Edge. I was really happy to see these two guys describing, from very different perspectives, information visualization as done in the real world: for people who don't want to spend excessive effort to understand what a visualization means, and by people who are not traditional visualization researchers/developers. It was a breath of fresh air for my mind.

I'll start with Ericson's talk here in this post. I intend to write something about Few's talk too in another post.

Matthew Ericson described in some detail the work they do at the NY Times to produce effective visualizations that are informative and easy to understand at the same time. I was impressed by the quality of their work and the heterogeneity of the people in the group. And I was also impressed by Mr. Ericson's argument that they consider themselves first of all "journalists" rather than designers. As such, their primary purpose is to tell a story to the reader. Looking at the graphics they produce, it's impossible to remain indifferent: the eye and the mind are immediately engaged by stunning visualizations, complex and simple at the same time. Each piece is extremely rich: annotated with concise and well-placed text notes, with multiple tiny views arranged so that the whole set tells a story, and with pictures and/or diagrams added when and where needed.

I also liked the concept of "honest portrayal". Tufte and others have for a long time warned us about the dangers of visualization; for the very fact that it is so potent in conveying information, it can also be used to send wrong or partial messages. Mr. Ericson goes a little bit further, in my opinion, saying that it is important to always keep an eye open to the fact that visualizations may convey partial truths and, more important, that often a single visualization is not enough to convey the whole picture: it is necessary to present the data from different perspectives. The example of the US 2004 elections made the case clear.

ericson-talk-ex1.png

The picture is not necessarily "wrong" or purposely "false", but still it contains a partial story that can be misinterpreted: the amount of red in the map is enormously higher than the amount of blue because the map encodes only two values: Bush (red) vs. Kerry (blue). But the picture does not tell anything about the margin of votes ... nor about the population density! Here is how the maps have been reworked and assembled in a full story (click on them to see a big picture).
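A toy calculation shows why a winner-colored map can mislead. The three counties below are entirely invented (they are not 2004 election data); they only illustrate how the share of the map painted red can differ from the share of votes, and how the margin is hidden by a two-color encoding.

```python
# (name, area in km2, red votes, blue votes) -- invented numbers, for illustration only
COUNTIES = [
    ("Rural A", 5000, 6000, 4000),
    ("Rural B", 4000, 5500, 4500),
    ("City C", 500, 40000, 60000),
]

def red_area_share(counties):
    """Fraction of the map painted red when each county is colored by its winner."""
    total_area = sum(area for _, area, _, _ in counties)
    red_area = sum(area for _, area, red, blue in counties if red > blue)
    return red_area / total_area

def red_vote_share(counties):
    """Fraction of all votes that actually went to red."""
    red = sum(r for _, _, r, _ in counties)
    blue = sum(b for _, _, _, b in counties)
    return red / (red + blue)

def margins(counties):
    """Winning margin per county as a fraction of its votes (invisible in a two-color map)."""
    return {name: abs(r - b) / (r + b) for name, _, r, b in counties}
```

With these made-up numbers about 95% of the map area is red while red gets about 43% of the votes, which is the kind of gap that the reworked maps address by bringing in margin and population.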


ericson-talk-ex2-small.png
ericson-talk-ex3-small.png

ericson-talk-ex4-small.png

Another interesting aspect of the NY Times people is how fast they can produce these graphics. Ericson explained that they work on a very tight schedule because they have to follow the news while it is hot and cannot wait weeks or months to produce a story.

What remains totally obscure to me is what kind of tools these people use to produce such beautiful and complex graphics in such a short amount of time, especially because the kinds of visualizations, charts, and diagrams they design are not at all trivial, and I would bet that most of the time they have to mix the outputs of various tools. Being able to turn data into pictures in such a short amount of time looks to me like some kind of magic.

In short, what I learned from the talk is that if we want to reach the general public with visualization we have to take care of every detail and present a beautiful, rich, engaging and self-explanatory piece of work. Sure, this does not take into account how people would interact with interactive visualizations when provided with them, but still I have the feeling that the same principles remain: design complex and composite solutions to provide depth and richness and, at the same time, strive like crazy to make them simpler, simpler, and simpler. This is what most of the time people need to reason about their data.

[Talk's slides from Matthew Ericson's website (Zipped PDF)]

May 17, 2008

Can we speak of Vis 2.0? ... Some patterns

Since the term Web 2.0 was born, a number of tangential fields have also utilized the same terminology to indicate how they have been shaped by it. Notable examples are: Business 2.0, Enterprise 2.0 and even Bubble 2.0! :-).

At the risk of seeming a hype follower, here I ask the question: can we speak of Vis 2.0?

vis2.0.jpg

[See the classic Web 2.0 article by Tim O'Reilly to learn more about it]

When I think of Vis 2.0, my intent is to group all the recent visualizations I have seen appearing on the web under a single label. I have the strong feeling that something new is really happening and that the whole domain of visualization is being transformed by the forces acting on the web. Its future shape will depend a lot on them. Think about it: the mere fact that a visualization, once it is designed and developed, can instantly be made public and potentially reach millions of people is a revolution in itself!

Following the tradition of defining Web 2.0 by the observed new patterns vs. old ones, here I provide my personal list of patterns:

  • Web vs. Desktop: The application is distributed over the web and accessible through a simple web browser. No installation, no configuration, no hassle.

  • Communication vs. Exploration (and Discovery): The traditional open-ended task of exploring and "discovering the unexpected" is somewhat subverted. The vast majority of Vis 2.0 applications are meant to communicate something that cannot be seen in raw data but that is quite obvious once visualized. Sure, discovery and exploration are still attractive and can be largely encouraged; nonetheless, this does not seem to be the driving force anymore. In addition, many tools seem to have a clearer goal, a direct connection between task and tool. Traditional InfoVis applications, conversely, have always suffered from the limitation of being able to do anything and nothing at the same time.

  • Many and Diverse vs. Single and Specialized User Base: Users come from many different sources with a whole spectrum of interests and goals (often curiosity). The visualization is there, ready to be observed and used in ways that could not be anticipated by its designer. Compare this to the traditional data analyst, using a very technical tool and spending hours alone figuring out what's in the data. Some systems also allow collaboration and discussion, which is another revolution. The user is not alone and the task is never fully ended. People discuss a topic aided by visualizations. It's the full power of the collective.

  • Small and Targeted vs. Large and General Purpose: If we consider the nature of the tool we have an opposite trend. If the audience is generic and diverse, the tools are small and specific. Forget monolithic desktop applications connected to huge enterprise data warehouses issuing complex queries to data cubes. Here we have thin tools with a single and often very simple and clear purpose, which becomes obvious at first sight. The interaction is often limited, what you see is really the only thing you get, but the purpose is clear. Nothing less, nothing more.

  • Shallow vs. Deep Interaction: The tools appearing on the web often employ very limited interaction techniques. In fact, the tool is not meant to be used for complex tasks: a few clicks and the job is done.

  • Funny and Empathic vs. Cold and Technical: Many of the realizations on the web bring a lot more emotional involvement compared to the traditional tools. This is somewhat paradoxical if we consider the power of visualization to bring not only information but also emotion. The desire to convey emotions is so evident in certain web visualizations that it would be an error to consider them merely sporadic events (see Iraq War Coalition Fatalities for an excellent example).

  • Maps and Charts vs. Fancy Visualizations: A very large segment of web visualizations is realized with maps and simple charts like bar charts, line graphs, sparklines, etc. It's true that there are also many "esoteric" designs out there, but they seem to cover the less useful portion of applications. Traditional InfoVis has instead a clear bias towards always creating new visual designs, often completely useless. This is a very personal consideration, but I strongly believe that we have yet to discover the full potential of simple charts when more clever interaction schemes are attached to them.

  • Scripting and XML vs. Java and DBMS: I'm getting a bit technical here, but even in terms of programming models there are some new trends. Web visualizations are built with lightweight programming models, using technologies like JavaScript and Flex for the UI and remote, asynchronous data access with XML, JSON and similar technologies. Traditional visualizations, built for desktop environments, are written in more solid languages like Java and often retrieve data from complex DBMSs.

One thing I want to clarify is that I'm not necessarily assigning a positive or negative value to these new trends. In fact, I believe there are good and bad things about them.

As an example, I have already said before in this blog that I'm not very excited by the proliferation of badly designed, too simplistic and often useless visualizations on the web. But, it is also true that they have greatly helped spread the word about visualization as an interesting field.

Before closing this post I want to provide a list of examples of Vis 2.0 tools that I consider really great:

we feel fine: for its great emotional value
crimespotting: for its usefulness and advanced interaction
hindsight: for its beauty
hotpads: for their heatmaps
google finance: for its F+C interaction
finviz: for its complexity and simplicity at the same time

That's all folks. I hope you will have something to say about this.

March 24, 2009

Books for practitioners, not designers!

datamining-bi.jpg

I've recently come across this incredibly good book: "Data Mining for Business Intelligence". I was at first a bit skeptical: my academic background naturally led me to wrongly assume a book on applied business intelligence had nothing more to give than the two other respected books I have on the shelf. Wrong wrong wrong!

As I started reading, chapter after chapter, I felt refreshed by a new stream of ideas, as if all those notions I had accumulated year after year could be seen from a new and fruitful perspective. The book is full of applied examples, compact, written in direct and simple language and, above all, it made me finally understand what data mining is and what it is for in the real world. It is the first time I feel I can walk in the shoes of those guys in the trenches who desperately need strong technology to resolve *their* problems.

So, why do I blog this?

Continue reading "Books for practitioners, not designers!" »

About critique

This page contains an archive of all entries posted to Visuale in the critique category. They are listed from oldest to newest.
