As Jacobs (1990) was able to show, graphical displays are clearly superior to tabular presentations for many tasks. Thus the basic question of "Should a graph be used?" posed by Bertin (1974) can, at least for certain tasks, be answered in the affirmative. The more favorable visual perception in a graph makes complex relations easier to recognize and enables more demanding questions concerning a data set to be answered.
The data become more complex when several data sets are to be displayed in one presentation. In this case graphical presentations are often necessary in order to make it possible for certain relations to be recognized within a reasonable amount of time. Whereas earlier research often concentrated on the analysis of relatively simple tasks using different graph formats (e.g. Croxton & Stein 1932, Culbertson & Powers 1963, Feliciano et al. 1963), the current experiments are aimed at examining the trends of several data sets at an advanced experimental level. This involves testing the influence of graph arrangement and graph type on how well graphs are understood, in order to find empirically based answers to a further question posed by Bertin (1974): "What type of graph should be used?"
There is a large number of questions that could be asked when extensive amounts of data are presented in a graphical display and they undoubtedly cannot all be examined in an experimental situation. One disadvantage of many studies is that only a few more or less arbitrary tasks are taken into account at all, which, at the end of the experiment, can lead to the suggestion that certain graph formats are better than others. What is really important is that the "right questions" are asked in connection with the presentation of data and in particular that the interaction between the graphical presentation and the question asked is taken into consideration.
In this experiment we shall above all examine questions (or tasks) that appear to be particularly suitable for graphical presentations (as opposed to tables). As Coll, Thyagaranjan & Chopra (1991), Coll (1992) and Coll, Coll & Thakur (1994) have shown, graphs are consistently superior to tables in "relational info questions". Bertin (1974) distinguishes three groups of questions in connection with graphical presentations:
One of the main concerns of this research is finding questions concerning graphical presentations that are interesting from a theoretical point of view and relevant in practice. To this end tasks will be developed that relate to groups of elements (2) and in particular to data sets in their entirety (3). Answering these questions requires the use of a special type of perceptual approach (graphical chunks) that is rendered easier or more difficult depending on the graph type and the graph display. We are mainly interested in establishing whether tasks are solved more quickly due to a better perception of the individual conditions in a graph and what reason can be given for the improved perception. In the long term this could lead to new approaches to the problem, which would allow applied research to be linked to "pure research", e.g.: "How are certain visual variables grouped together, e.g. bars added up, or bar sizes compared? Which perceptual processes occur automatically and which require further selective searching? How can the process of answering a question be divided into steps?"
If there were no kind of interaction between the graph type and the task, one graphical form of presentation would suffice. But in practice there are many different kinds of graphical presentations in use and any graphics program that wishes to be taken seriously will include the appropriate options. Thus it is desirable to find out which is the ideal graphical presentation for a particular task.
Nevertheless, often one would also like to consider a data set from different perspectives and ask several questions concerning the same data. In the computer age this problem could, of course, be solved easily by using diagnostics, whereby according to the task concerned, the data are displayed in their ideal presentation form.
Another strategy would be to choose a graph format that allows most of the questions asked to be answered satisfactorily and which would thus, on average, be better than other presentations. Occasionally it is difficult to know what kind of task to set in connection with a particular graph, and it is the graph itself that sparks off ideas for possible questions. These graphs in particular must allow several tasks to be solved. Above all, this requires the skill on the part of the viewers to be able to recognize relations that they would not have noticed in the absence of a graph.
The present research is interested in both questions, in the strengths and weaknesses of the individual graph formats as well as the all-round qualities of certain graph formats. This approach means that tasks are set that cannot necessarily be answered in an optimal fashion using the graph types displayed.
If several data sets are to be displayed at one time, each set can be presented in a separate graph (juxtaposition) or all sets can be shown in one single graph (superposition). For these two types of graph display, bar charts and line graphs were a logical choice as graph types (see also the graph variants in the experiments).
A particular graph type can appear in many different forms and the question remains as to what extent all these variants fulfill the basic definition of the graph type concerned. When exactly can we speak of a bar graph? Can one still refer to a graph as a simple line graph when the values of the individual levels of X are specifically indicated next to each point on the line, or should this kind of graph be called a "point-line graph"? The graphs examined in these experiments have been kept simple and should correspond to a large extent to the basic form of each graph type. In addition, they have been designed to allow several tasks to be solved as well as possible. Numbering the levels of X is, for instance, vital if particular levels are to be compared. We endeavored to ensure the comparability of the graph variants because "unless the design of all displays is optimal conclusions drawn from the comparisons have to be regarded as tentative only" (Meyer, Shinar & Leiser 1996).
Examples of the graphs used for each task are always shown (usually the originals). These examples all fulfill the particular experimental requirements and represent, strictly speaking, possible examples because the graphs shown in the experiments were generated according to certain rules, including random processes, in order to be able to test as many different graphs as possible.
The basic types of graph display (cf. Bertin 1974, p. 109) are as follows:
B allows several display variations. Occasionally juxtaposition was tested using two special forms of arrangement: horizontal and vertical arrangement. These two types of arrangement were then compared. However, this distinction was not made for most tasks for pragmatic reasons and usually "normal arrangement", as seen below, was chosen:
| Number of data sets | normal arrangement | horizontal arrangement | vertical arrangement |
| 2 | 1 2 | 1 2 | 1 2 |
| 4 | 1 2 3 4 | 1 2 3 4 | 1 2 3 4 |
| 8 | 1 2 3 4 5 6 7 8 | - | - |
To ensure that the data sets displayed in superposition could be easily viewed and distinguished from each other, a visual variable enabling the best possible viewing was to be used. We thus chose color, which is seen by many in the field as one of the best ways of coding several data sets (cf. Bertin 1974, p. 99, Schutz 1961b, Casali and Gaylin 1988, p. 41, Travis 1991, p. 122). Because the software and hardware at our disposal did not allow us to choose colors freely, we did not endeavor to select the optimal colors from a physiological point of view. Instead, certain aspects were simplified and standardized (e.g. the first data set was always blue, the second always red; cf. the many examples).
One particular problem, which cannot be discussed further at this point, is enabling a fair comparison between superposition and juxtaposition. In our view it was absolutely vital to maintain the same relationship between the ordinate and the abscissa for all the graphs.
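The constraint of a constant relationship between ordinate and abscissa can be illustrated with a small calculation. The following is only a minimal sketch under our own assumptions, not the procedure actually used in the experiments: the function name `panel_size`, the canvas dimensions and the aspect value are hypothetical, and margins, axis labels and gaps between panels are ignored.

```python
def panel_size(canvas_w, canvas_h, rows, cols, aspect):
    """Return (width, height) of the largest panel that fits a grid cell.

    The panel keeps height / width == aspect, so every graph has the same
    ordinate-to-abscissa relationship regardless of the arrangement.
    """
    cell_w = canvas_w / cols          # space available per panel, horizontally
    cell_h = canvas_h / rows          # space available per panel, vertically
    width = min(cell_w, cell_h / aspect)
    return width, width * aspect


# Hypothetical 800 x 600 canvas with a height/width ratio of 0.75.
superposition = panel_size(800, 600, rows=1, cols=1, aspect=0.75)  # one shared graph
horizontal = panel_size(800, 600, rows=1, cols=2, aspect=0.75)     # two graphs side by side
vertical = panel_size(800, 600, rows=2, cols=1, aspect=0.75)       # two graphs stacked
```

In this sketch the juxtaposed panels are smaller than the single superposed graph but keep the same proportions, which is one reason why a fair comparison between the two display types is not trivial.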
All experiments are based on the principle of repeated measurement, in which each subject works through all the experimental conditions. The order of trials within each task was always determined at random, but the tasks themselves always appeared in the same order.
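The ordering principle described above (fixed task order, random trial order within each task, every subject seeing every condition) could be sketched as follows. This is an illustration only: the function `build_session`, the task names, the condition labels and the number of trials per condition are hypothetical and are not taken from the original experimental software.

```python
import random


def build_session(tasks, conditions, trials_per_condition, seed=None):
    """Build the trial list for one subject in a repeated-measures design.

    Tasks keep their given order; within each task, every condition occurs
    trials_per_condition times and the trials are shuffled.
    """
    rng = random.Random(seed)  # a seed makes a subject's order reproducible
    session = []
    for task in tasks:  # the task order is never shuffled
        trials = [(task, cond)
                  for cond in conditions
                  for _ in range(trials_per_condition)]
        rng.shuffle(trials)  # random trial order within the task
        session.extend(trials)
    return session


# Hypothetical example: two tasks, graph type crossed with graph display.
tasks = ["task_1", "task_2"]
conditions = [("bar", "juxtaposition"), ("bar", "superposition"),
              ("line", "juxtaposition"), ("line", "superposition")]
session = build_session(tasks, conditions, trials_per_condition=2, seed=1)
```

Each subject would receive a separately shuffled list, so that order effects within a task are spread across conditions while all subjects still work through the tasks in the same sequence.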
The construction principle of the experimental conditions is highly complex and can only be outlined here:
This approach enabled many different groups of data to be tested in the experiment, forming a sound basis on which to make generalizations based on the findings. At the same time, the internal validity of the experimental design was guaranteed by the repeated measurements and the series of randomization processes that were carried out. In contrast, we accepted that our chosen procedure meant forgoing a reduction of the error variance and would lead to a loss of statistical efficiency of the experimental design.
Before the actual test phase began, the experimental subjects were given sufficient instructions as to the particular type of task to be completed and were shown the corresponding graphs (cf. the sample instructions). Then the subjects were required to complete at least two examples, although they could try out as many examples as they liked and were always instructed not to begin the actual experiment until they were quite sure what was required of them.
The subjects first saw a specific question appear on the screen, e.g.: "What kind of trend does the red data set display?" The next instruction told them to answer the question as fast as possible, while ensuring that the answer was also correct. Pressing the space bar made the question disappear and the graphical presentation appear in its place. Pressing the space bar again made the graph disappear and indicated that the subjects had decided on their answer. They then gave their answer, usually by clicking with the mouse or typing in a short answer.
| Steps | specific (implicit) instruction | remarks |
| 1 | Read the question without seeing the graph | Memorize the question, e.g.: "Do two data sets have the same trend?" |
| 2 | Press the space bar to make the graph appear | Start timing response time |
| 3 | Look at the graph | Try to answer the question while viewing the graph |
| 4 | Press the space bar to remove the graph from the screen | Stop timing response time |
| 5 | Indicate the answer | Click or write the answer |
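The five steps in the table can be read as a simple timing loop in which only the interval between the two space-bar presses counts as response time. The sketch below is a hedged illustration, not the original experimental program: `run_trial` and its callback parameters (`display`, `wait_for_space`, `get_answer`, `clock`) are hypothetical names, and screen output and keyboard input are replaced by stand-ins so that the example runs without special hardware.

```python
import time


def run_trial(question, display, wait_for_space, get_answer,
              clock=time.perf_counter):
    """Run one trial and return the response time and the answer."""
    display(("question", question))  # step 1: question shown, no graph yet
    wait_for_space()                 # step 2: first space-bar press
    display(("graph", None))         #         graph replaces the question
    start = clock()                  #         timing starts
    wait_for_space()                 # steps 3-4: viewing ends with second press
    response_time = clock() - start  #         timing stops
    display(("blank", None))         #         graph removed before answering
    answer = get_answer()            # step 5: click or short typed answer
    return {"question": question,
            "response_time": response_time,
            "answer": answer}


# Simulated run: a fake clock and canned input replace screen and keyboard.
ticks = iter([10.0, 12.5])
screen_log = []
result = run_trial(
    "What kind of trend does the red data set display?",
    display=screen_log.append,
    wait_for_space=lambda: None,
    get_answer=lambda: "rising",
    clock=lambda: next(ticks),
)
```

Because the graph is removed before the answer is given, the time needed for clicking or typing does not enter the measured response time.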
Please note: the experimental subjects did not have to label or name any of the data sets or graphs. Thus questions such as "Which data set does the blue line represent?" or "What data set does graph 2 show?" were not asked. We asked direct questions about such things as "the blue line" or "graph 1" because visual perception, not cognitive orientation, was our main concern.