Information Visualization
presented by
Edward Tufte
Edward Tufte, the author of the books "The Visual Display of
Quantitative Information," "Envisioning Information," and
"Visual Explanations", gave a one-day presentation on the
concepts behind his works.
He covered four main areas: overall concepts related to
information visualization, display of financial data, user
interface design, and effective presentations. Each of
these topics has a wider scope than its title implies.
General Concepts
First he touched on general concepts relating to visualization
of data. There are two main problems we have to deal with when
representing information, be it on a computer screen, in a
presentation, or in a printout. First, the
multivariate nature of much of the data we wish to represent conflicts
with the two-dimensional nature of our displays; both computer
screens and paper are 2D surfaces, less than optimal for
representing data with more than 2 dimensions.
Secondly, we run into problems of resolution. At a later point
in the presentation, he humorously described a scale of resolution
of display media, ranging from the 'damned overhead projector' at
the bottom of the scale, then to the current-day limited resolution
computer screens, then printed paper, then film, and finally to
the resolution of the real world, as perceived by our eyes. The
problem of resolution is particularly acute for computer screens, with
their limited size and number of pixels available.
After dealing with the two main problems, he touched on his 'five
laws of representing information.' A 19th-century graphic by
Charles Joseph Minard gives an excellent example of how these laws
can be applied (TVDoQI, p. 41).
- The presented information should clearly answer the
question 'Compared to what?'
In C. J. Minard's depiction of the fate of Napoleon's army in Russia
during the devastating winter of 1812, we can
see the comparison between the starting size of 422,000 men and the
final size of the army, a mere 10,000 men.
- The display should show process. The cause and effect should be
clear to the viewer.
Looking at a detail of C. J. Minard's drawing of Napoleon's army, we
can see the cause and effect during the return from Moscow. In the
previous picture we could see that the army was still 100,000 men
strong when Napoleon left Moscow, and in
fact was reinforced by two flanking armies totaling 39,000 men. Here,
we can see that Napoleon returned to his starting point with
only 10,000 men. Minard has
linked the path of the return army to the date and temperature. We
can see that while it was a balmy 0 degrees at Moscow in October when
he began the return trip (Minard used the Réaumur scale, on which 0 is
the freezing point of water), it had dropped to -20 degrees
for the fateful crossing of the Berezina River, where nearly half the
army perished. It dropped as low as -30 degrees, never rising
above the -20 degree mark for the rest of the trip.
- The display must effectively capture multivariate data.
Minard's drawing shows no fewer than six variables: the size of
the army, its location on a two-dimensional surface, the direction
of the army's movement, and for the return trip, the date and
temperature.
- The information should be displayed in a single, unified display.
Multiple displays force the viewer to attempt comparisons while
flipping pages, changing screens, or otherwise attempting to
remember data from one view while looking at another.
In this case, the six dimensions (size, 2D location, direction,
date and temperature) are clearly combined in a single graphic.
It is clear and uncluttered, and after a moment of familiarization,
the viewer can clearly see the relationship between the variables.
- The quality, integrity, and relevance of the information are of
the utmost importance.
Charles Minard had the goal in mind of depicting the horrors of
war. He did not say so anywhere in the graphic; the information
presented there could more than speak for itself. All of the
information he presented was accurate and relevant, and this
increased the quality of the presentation manyfold.
The analytic task should define the display. If something is not relevant
to the analysis being performed, it should be left out. It is equally
important to avoid the 'one damned thing after another' syndrome. Galileo,
in his 1613 work 'History and Demonstrations Concerning Sunspots and Their
Phenomena,' fell prey to this; he had pages and pages of beautiful drawings
of sunspots, one per page.
It was up to Christopher Scheiner,
a contemporary of Galileo's, to use what can be referred to as 'small
multiples' to make a coherent picture of the lot.
Use of the 'small multiples' technique of information display has several
advantages. First, it gives credibility. A number of displays gives the
viewer the idea that the source has lots of information ("this author must
know a lot about sunspots, he has lots of pictures of them"). In addition,
it takes advantage of the viewer's investment in understanding: once the
person viewing has figured out the first of the many duplicated images,
there is no additional investment on their part to understand the remainder.
They can instead concentrate on the information presented, rather than
puzzling out what the picture represents. Another advantage of the small
multiples technique of presenting information has to do with comparisons.
A person can easily compare a number of images if they are all visible at
once; doing a similar comparison when the images are part of a sequence,
with only one or two visible at a time, is much more difficult.
Displaying Financial Data
Some concepts are particular to displaying financial data. They can
be summarized in the following 10 guidelines. Many of them can
be applied to other types of data as well, but they are of particular
interest in the realm of financial information.
- Displays of financial data should assess change, i.e. answer
'compared to what?'
This example is from an analysis of Connecticut Traffic Deaths;
it shows that the 'compared to what?' question is of no mean
importance.
Here we have a display of traffic fatalities, supposedly
linked to stricter enforcement by the police. The problem
is, we have no context in which to evaluate the data.
There are any number of contexts that might be present:
The variation could be normal, and the more vigorous enforcement in
fact had no effect.
The variation could have been a single, atypical instance; in this
circumstance, it is unlikely that the intervention of the police had
any effect.
Or, the enforcement could indeed have had the desired effect, namely
to reduce the number of traffic fatalities.
Now we see the data in context. Note that the 325 or so deaths
reported in 1955 is an extreme value;
this means that the number was most likely to go down in any case, and
it is hard to tell what role the enforcement played.
Comparisons with other states
give a still better context, revealing it was not only Connecticut
that enjoyed a decline in traffic fatalities in the year of the
crackdown on speeding. Perhaps it was an icy winter. Perhaps
indeed the crackdown caused a ripple effect (chuckle). Not
likely, however.
- Variability and deviation are important, and
should be represented.
This weather chart from the New York Times shows 2,220 numbers.
We can not only see the overall trend, but can see the variability.
This lends credence and enables the viewer to double check against
known information.
- Often data need adjustment before being displayed. A display of money
over time must somehow reflect the effects of inflation.
At the top we see raw data for total retail sales in the United
States from 1960 through 1971, which appears to show a clear upward
trend. These raw data are not useful without correction for a
number of factors, which we can see in the lines for Holiday,
Trading day, and Seasonal variation. Finally, at the very bottom,
we see the figures adjusted for inflation. This is the
relevant information to display.
- Always footnote the information.
Conversely, never trust information
that is not footnoted. What correction for inflation was used? Where
did the data come from? This is documentation of the figures, and
allows the viewer to verify the information if desired.
- Financial displays are mostly descriptive. One gentle way to
add causal information is through annotation.
Here we see how annotation can clearly add to the display without
intruding on the information itself. A good design will allow
transparent access to the data and yet provide information not
available from the data.
- A statement of error is necessary.
Often information trades off
timeliness vs. accuracy, or small variations may appear in the
analysis. These should be noted.
- Follow the New York Times and the Wall Street Journal.
If you are displaying financial information, chances are one of
these two publications already does so. Copy them. Not only have
they been doing it for years, but viewers of the information will
understand much more quickly if it is in a familiar form.
Do what the professionals do. Don't try to develop something
from scratch; chances are what you're trying to represent has been
done by someone else. Other good sources are the Government
Statistical Abstract and the Bureau of Justice Statistics' "Report
to the Nation."
- Build a portfolio of excellent examples, and copy them.
The same practice is common in software and architecture, and is just as
valid here as in those fields.
Along the same lines, use tools that fit the job. Use a spreadsheet
to gather numbers. Use a professional graphics tool such as
Adobe Illustrator or Corel Draw to create the final graphic; the
spreadsheet tools' capabilities are not sophisticated enough.
- Understand the issues relating to aesthetics and technique in
designing a display.
"Graphical elegance is often found in
simplicity of design and complexity of data."
- Edward Tufte, "The Visual Display of
Quantitative Information"
Attractive displays of statistical information:
- have a properly chosen format and design
- use words, numbers, and drawing together
- reflect a balance, a proportion, a sense of relevant scale
- display an accessible complexity of detail
- often have a narrative quality, a story to tell about the data
- are drawn in a professional manner, with the technical details
of production done with care
- avoid content-free decoration, including chartjunk (chartjunk
is the gratuitous decoration often present in statistical
displays, such as heavy grids and cross hatching).
From "The Visual Display of Quantitative Information"
- Small multiples are as useful in financial information as in other
types of information display.
A display such as this allows a much faster comparison, as well as
a determination of the context of recent changes. Across-the-board
comparisons are much harder to do with tables of numbers.
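The inflation adjustment called for in the third guideline above can be
sketched in a few lines of Python. This is only an illustration, not
something shown in the presentation: the function name to_real, the price
index, and the sales figures are all hypothetical, chosen so that the
apparent growth comes entirely from rising prices.

```python
def to_real(nominal, price_index, base_year):
    """Restate nominal amounts in constant base-year money.

    nominal and price_index are dicts keyed by year. Both are
    hypothetical inputs; a real display would footnote their source
    and the index used for the correction.
    """
    base = price_index[base_year]
    return {year: amount * base / price_index[year]
            for year, amount in nominal.items()}


# Made-up figures: sales and prices both rise 10% a year.
nominal_sales = {1: 100.0, 2: 110.0, 3: 121.0}
price_index = {1: 100.0, 2: 110.0, 3: 121.0}

real_sales = to_real(nominal_sales, price_index, base_year=1)
for year in sorted(real_sales):
    print(year, nominal_sales[year], round(real_sales[year], 1))
```

In this invented case the raw series climbs 21% over three years while the
adjusted series stays flat, which is the kind of difference the adjusted
line at the bottom of the retail sales display is meant to reveal.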
User Interface Design
There are two common problems with the user interface of software.
The first common problem is that software often replicates the bureaucracy
of whatever organization created it; the organization is inflicting
its own world-view and politics on the user. In this situation, the user
is subjected to numerous unwanted and undesired screens before getting
to where they want to be. Splash screens, introduction screens, and
copyright notice screens are all examples of this imposition.
A second problem is that user interface designs often directly reflect the
binary nature of computer design, resulting in the dreaded 'menu-tree'.
The hapless user will descend the tree, and then have to traverse
up and down repeatedly to move to a desired location or get from one
place to another. In addition, such a hierarchy is easy to get lost in,
and results in a frustrated "how do I get out of here" response.
The goal is to have a FLAT interface. Forget the goodies, we want
content. Compare the following two interfaces, one for an information
system at an art museum, the other for a photographer index.
Here we have a screen with flat access to our data. About
10% of the space is taken up with administrative controls,
leaving 90% of the space free to show content.
On the other hand, this screen uses only 18% of the space
for relevant information, namely the photographers and
their work, with 82% of the space taken up by administrative
controls or by nothing at all.
The amount of content should always be maximized; measure the number of
characters of data versus control, and the area devoted to control buttons
and the like compared to the area devoted to data display.
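The area measurement can be sketched as follows. This is a rough
illustration under stated assumptions, not a method given in the
presentation: the function name content_share is hypothetical, content
regions are assumed to be non-overlapping rectangles, and the layout
numbers are invented to reproduce the 18% figure quoted above.

```python
def content_share(screen_w, screen_h, content_rects):
    """Fraction of the screen area that shows data.

    content_rects is a list of (width, height) pairs for the regions
    that display content; they are assumed not to overlap. Everything
    else (controls, borders, empty space) counts against the design.
    """
    screen_area = screen_w * screen_h
    content_area = sum(w * h for w, h in content_rects)
    return content_area / screen_area


# Invented layout: a 100 x 100 screen with one 60 x 30 content panel.
share = content_share(100, 100, [(60, 30)])
print(f"content: {share:.0%}, everything else: {1 - share:.0%}")
```

The same ratio can be computed for characters instead of area by counting
the characters of data and of control text on the screen.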
Making Effective Presentations
Presentations are themselves a form of information display, and while
they should not consist solely of charts and other statistical exhibits,
many of the qualities of an effective data display are also present in
an effective presentation. Making a presentation has a lot in common with
teaching as well. There are 13 basic aspects to making an effective
presentation:
- Show up early.
Not only will this give you time in case something
unexpected happens, but it will give you a chance to speak to the
people individually as they arrive. If there are attendees you wish
to sit together (or apart), you'll have an opportunity to place them
as desired. Something good is bound to happen.
- Keep the audience's attention.
One good way to do this is by
giving them a tiny overview. Tell them what the problem is you are
addressing, who cares, and what your solution is. This gives them
a context in which to evaluate the information you present.
Another somewhat more risky way is the 'stumblebum' approach. You
make an intentional but obvious error at the beginning of your
presentation, and let the audience catch it. The hope is that the
audience will watch intently for future errors, the idea being
that you do not make any other errors.
- To explain a complicated concept, use PGP (Particular, General,
Particular).
Choosing one instance, you explain what the significance
of that particular point is. Then you give an overview of what all
of the information signifies, then you describe another specific
point.
For example, we can see in this display that the high temperature
at the end of February was
lower than the normal low for that time of year, about 20 degrees
below the normal high, and that the low was almost 20 degrees colder
than the normal low.
Overall, however, we can see that such variation is not unusual, and
in fact variations of about 20 degrees outside the norm happen about
six times a year. Even with this variation, we can see that the
average temperature for the year was pretty much as expected, and
that the overall trend followed the normal curve of highs and lows.
December 25th, the low for the year, was another instance where the
temperature fell 20 degrees or more outside the norm. In fact, were
it not for this particularly cold Christmas day, the previous low
would have held the record for the year. It is also interesting
to note that the low for the year falls over a month away from the
average low for the year.
- One law of presentations: ALWAYS give the audience something
tangible to take with them.
A piece of paper is something they can
take away with them; it gives them something to refer to later. If
they don't come away with something tangible, it's almost as if the
presentation never happened.
- Find out what the audience reads.
Not only does this get some
response, but it gives you an idea of what they are interested in,
and may let you focus your presentation or avoid explaining
information they already understand.
- If you're thinking of using an overhead, think again.
And if you're still thinking of using an overhead, read this again.
- Audiences are precious. Act that way.
Nothing will turn off an
audience faster than an arrogant or patronizing attitude.
- Humor is important, but you have to be careful.
Humor is a two-edged sword. Some level of humor will relax the
audience and get their attention. Inappropriate humor
is worse than no humor at all, and too much humor can turn the
presentation into a carnival.
- Avoid masculine pronouns.
Do not say "So he goes to the menu, and he chooses
from his selections, and his next window appears." Mixing singular
and plural in speech is formally correct, as in "When the user goes
to the menu and makes a selection, their window will appear." The
user/their combination is proper.
- Questions are important.
People will often judge your entire presentation on how you answer
their question. If you are evasive, people will notice, so if you
do not know, say so. Also, it is acceptable to handle long or
involved questions offline: "Yes, I have that information here. If
you would like to come up afterwards, I can show you."
Also, it's important to avoid having your presentation taken over by
a persistent questioner. One good way to handle this is with a
statement at the beginning along the lines of "I have about 20 minutes
of material here. We can get through this, and then I can answer
questions at the end." Then, when the interrupting questioner
speaks up, you can appeal to the agreement: "Well, I have
about 15 minutes of material here..."
This will avoid the issue of being led by questions from a particular
attendee. Such interruptions reflect poorly on the questioner, since
a presentation is not a place to show off one's knowledge, but they
also indicate that the presenter has let things get out of hand.
On the other hand, if you have a question and answer period and don't
get any responses, consider having a plant to get things moving.
- If you believe something, make sure your audience knows you believe
it.
Not only will it give you credibility, but it will avoid the
opposite problem, where your audience does not think you believe
what you are presenting.
- Finish early.
As with arriving early, this allows something good
to happen. Perhaps you can answer some questions. Perhaps someone
who did not want to speak up in front of everyone will approach
you privately. At the very least, keep in mind all those times you
walked away from a presentation thinking, "Gee, I wish the speaker
had gone on for another half an hour."
- Practice, practice, practice.
Not only will you sound more relaxed
and confident, but it will allow you to pay attention to the
audience and their responses, rather than to your material.