4B. Visualization

(JDrucker 9/2013)

Information visualizations are used to make quantitative data legible. They are particularly useful for large amounts of information and for revealing patterns in the data in a condensed form. Compare these two versions of the same information, in a table and in a chart:

Vis_1

Vis_2

All information visualizations are metrics expressed as graphics.

The implications of this simple statement are far-reaching: anything that can be quantified, that is, given a numerical value, can be turned into a graph, chart, diagram, or other visualization through computational means. All parts of the process, from creating quantified information to producing visualizations, are acts of interpretation. Understanding how graphic formats impose meaning, or semantic value, is crucial to the production of information visualization. Any sense that “data” has an inherent “visual form” is an illusion. We can take any data set and put it into a pie chart, a continuous graph, a scatter plot, a tree map, and so on. The challenge is to understand how an information visualization creates an argument and then to make use of the graphical format whose features serve your purpose.

Many information visualizations are the “reification of misinformation.”

Data creation, as we noted in an earlier lesson on the topic, depends on parameterization. To reiterate, the basic concept is that anything that can be measured, counted, or given a metric or numerical value can be turned into data. This, of course, is the concept that all data is capta, that it is not “given” but “made” in the act of being captured. The concept of parameterization is crucial to visualization because the ways in which we assign value to the data will have a direct impact on the ways it can be displayed. Visualizations have a strong rhetorical force by virtue of their graphic qualities, and can easily distort the data/capta. All visualizations are interpretations, but some are more suited to the structure of a given data set than others.
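
As a rough illustration of how parameterization shapes what can be displayed, the sketch below counts the same raw measurements with two different bin widths. The ages are invented for the example, and the helper names (bin_counts, text_bars) are hypothetical, not part of any tool mentioned in this lesson.

```python
from collections import Counter

# Hypothetical raw measurements (ages in a group); invented for illustration.
ages = [18, 19, 19, 21, 22, 24, 29, 30, 31, 38, 41, 59]

def bin_counts(values, width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

def text_bars(counts, width):
    """Render the counts as plain-text bars, one line per bin."""
    return [f"{start:>2}-{start + width - 1:<2} {'#' * n}"
            for start, n in counts.items()]

by_decade = bin_counts(ages, 10)   # five narrow bins
by_twenty = bin_counts(ages, 20)   # three wide bins

for line in text_bars(by_decade, 10):
    print(line)
print()
for line in text_bars(by_twenty, 20):
    print(line)
```

Nothing about the ages changes between the two charts. Only the decision about how to count them changes, and the second chart shows one dominant group where the first shows a more even spread.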

(For example, if you are showing the results of opinion polls in the United States, the choice of whether to show the results by coloring the area inside the boundaries of the states or by a scatter plot or some other unit scaled to population size will be crucial. If you are getting information about the outcome of an election, then the graphic effect should take the entire state into account; but if you are looking at consumer preferences for a product, then the population count and even location are significant; if you are trying to track an epidemic, then transportation networks as well as population centers and points of contact are important.)
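
The difference between counting states and counting people can be sketched in a few lines. The four "states," their populations, and the poll shares below are all invented; the point is only that the two ways of counting can support opposite impressions.

```python
# Hypothetical poll results: (state, population in millions, share favoring option A).
# All names and numbers are invented for illustration.
poll = [
    ("Large", 30.0, 0.40),
    ("Mid", 6.0, 0.55),
    ("Small", 1.0, 0.60),
    ("Tiny", 0.5, 0.65),
]

# What a map colored state by state emphasizes: how many states favor A.
states_for_a = sum(1 for _, _, share in poll if share > 0.5)

# What a population-based display emphasizes: how many people favor A.
total_population = sum(pop for _, pop, _ in poll)
weighted_share = sum(pop * share for _, pop, share in poll) / total_population

print(f"States favoring A: {states_for_a} of {len(poll)}")
print(f"Population-weighted share favoring A: {weighted_share:.1%}")
```

Here a state-colored map would show A leading in three of four states, while a display scaled to population would show A with less than half of the people. Which one is appropriate depends on the question being asked.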

What is being counted? What values are assigned? What will be displayed?

In many cases, the graphic image is an artifact of decisions made about the design, not a reflection of the data. (For example, if you are recording the height of students in a class, making a continuous graph that connects the dots makes no sense at all. There is no continuity of height between one student and another.)
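
One way to see that the connected line is an artifact is to notice that its shape depends entirely on the order in which the students are listed. The sketch below uses invented names and heights, and a hypothetical helper called segments, to compare the rises and falls a connecting line would draw under two orderings.

```python
# Hypothetical class data (heights in cm); names and values are invented.
heights_cm = {"Ana": 160, "Ben": 178, "Cy": 152, "Dee": 171}

def segments(order):
    """Rise or fall that a connecting line would draw between neighbors."""
    values = [heights_cm[name] for name in order]
    return [b - a for a, b in zip(values, values[1:])]

alphabetical = sorted(heights_cm)                    # order by name
by_height = sorted(heights_cm, key=heights_cm.get)   # order by height

print("Alphabetical:", alphabetical, segments(alphabetical))
print("By height:   ", by_height, segments(by_height))
```

The alphabetical line zigzags and the sorted line climbs steadily, yet no student's height has changed. A bar per student carries the same values in either order without implying anything between them.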

Some basics:

  • The distinction between discrete and continuous data is one of the most significant considerations in choosing a design.
  • If you are showing change over time or across any other continuous variable, then a continuous graph is the right choice.
  • If you are using a graph that shows quantities with area, use it for percentages of a whole. If you scale a circle by tying the metric to its radius, the area grows with the square of the radius, and you introduce a radical distortion into the relation of the elements.
  • The way in which you label and order your graphic elements will make some arguments more immediately evident. If you want to compare quantities, be sure they are displayed in proximity.
  • The use of labels is crucial and their design can either aid or hinder legibility.
  • Keep in mind that many visualizations, such as network diagrams, arrange the information for maximum legibility on screen. They may not be using proximity or distance in a semantically meaningful way.
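
The distortion described above for circles sized by radius can be checked with a little arithmetic. The sketch below is a minimal illustration, assuming two invented quantities where one is twice the other; the function names are hypothetical.

```python
import math

def circle_area(radius):
    return math.pi * radius ** 2

def radius_from_value_naive(value):
    # Distorting choice: the metric drives the radius directly.
    return float(value)

def radius_from_value_by_area(value):
    # Area-true choice: the metric drives the area, so radius = sqrt(value / pi).
    return math.sqrt(value / math.pi)

small, large = 10, 20  # hypothetical quantities; large is twice small

naive_ratio = (circle_area(radius_from_value_naive(large))
               / circle_area(radius_from_value_naive(small)))
fair_ratio = (circle_area(radius_from_value_by_area(large))
              / circle_area(radius_from_value_by_area(small)))

print(f"Data ratio: {large / small:.1f}")
print(f"Area ratio when radius follows the value: {naive_ratio:.1f}")
print(f"Area ratio when area follows the value: {fair_ratio:.1f}")
```

When the radius follows the value, a quantity that doubles is drawn with four times the area. Scaling the radius by the square root of the value keeps the areas in the same proportion as the data.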

For more information about the basics, see Many Eyes and the Tableau whitepaper (on CCLE).

Exercise: The chapter from Calvin Schmid describes eight different kinds of bar charts:

  • Simple bar chart
  • Bar and symbol chart
  • Subdivided bar chart
  • Subdivided 100 per cent bar chart
  • Grouped bar chart
  • Paired bar chart
  • Derivation bar chart
  • Sliding bar chart

What are their characteristics, for what kind of data are they useful, and can you draw an example of each?

Which one would you use to keep track of 1) classroom use, 2) attention span, 3) food supplies, 4) age comparisons/demographics in a group?

Exercise: For what kind of data gathered in the classroom would you use a column chart?

Tools that are part of your conceptual, critical, and design set:

Elements, scale, order/sequence, values/coordinates, graphic variables

Exercise: The Lie Factor (http://www.datavis.ca/gallery/lie-factor.php)

Which of these issues is contributing to the “lie-factor” in each case: legibility, accuracy, or the argument made by the form? What is meant by a graphic argument?
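
For reference while working through the examples, the lie factor is usually attributed to Edward Tufte and is commonly defined as the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below follows that common definition; the function name and the numbers are invented for illustration and are not taken from the linked gallery.

```python
def relative_change(before, after):
    """Size of an effect, as a fraction of the starting value."""
    return (after - before) / before

def lie_factor(data_before, data_after, shown_before, shown_after):
    """Effect shown in the graphic divided by the effect in the data."""
    return (relative_change(shown_before, shown_after)
            / relative_change(data_before, data_after))

# Hypothetical case: the data rise from 100 to 150 (a 50% increase),
# but the bars drawn for them grow from 2 cm to 8 cm (a 300% increase).
factor = lie_factor(100, 150, 2, 8)
print(f"Lie factor: {factor:.1f}")
```

A value near 1 means the graphic shows the change in about the same proportion as the data. Values well above 1 mean the graphic exaggerates the change, and values well below 1 mean it understates it.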

Exercise: Take one of these data sets through a series of Many Eyes visualizations.

Which make the data more legible? Less?

  • United States AKC Registrations
  • Sugar Content in Popular Halloween Treats

Takeaway:

Information visualizations are metrics expressed as graphics. They allow large amounts of (often complex) data to be depicted visually and efficiently, in ways that reveal patterns, anomalies, and other features of the data. They also contain much historical and cultural information in their “extra” or “superfluous” elements; that is, the form of a visualization is also information.

Required reading 4B:

* Plaisant, Rose, et al. “Exploring Erotics in Emily Dickinson’s Correspondence with Text Mining and Visual Interfaces”

Study questions for 4B:

  1. Calvin Schmid and Many Eyes offer useful advice on what form of data visualization to use for different kinds of data. Referring to their work, describe a data visualization that will work for your group project. How would you make it useful if you were to scale up to hundreds of objects?
  2. If you were to pick a visualization from Michael Friendly’s timeline to use for your project, which would it be and why?
