A comparative study of linear and region based diagrams

There are two categories of objects that spatial information science investigates: actual objects and their spatial properties, as in geography, and abstract objects that are employed metaphorically, as in visual languages. A prominent example of the latter is diagrams that model knowledge of some domain. Different aspects of diagrams are of interest, including their formal properties and how human users work with them, for example, with diagrams representing sets. The literature about diagrammatic systems for the representation of sets shows a dominance of region-based diagrams like Euler circles and Venn diagrams. The effectiveness of these diagrams, however, is limited because region-based diagrams become quite complex for more than three sets. By contrast, linear diagrams are less prevalent but enable the representation of a greater number of sets without getting cluttered. Cluttered diagrams exhibit inherent complexity due to overlapping objects, irrelevant details, or other factors that impinge upon their legibility. This study contrasts both types of diagrammatic systems and investigates whether the performance of users differs between the two kinds of diagrams. A significant difference can be shown regarding the number of diagrams that can be drawn within a fixed period of time and regarding the number of errors made. The results indicate that linear diagrams are more effective because they are more restrictive and because region based diagrams show much clutter due to overlapping, coincident, and tangentially touching contours, as well as an overwhelming number of empty zones. Linear diagrams are less prone to errors and do not suffer from clutter.


Introduction
A subfield of spatial information science is about the use of space to represent information. Geometrical objects, such as lines or regions, and the way they are spatially arranged, primarily in two-dimensional space, encode information. This allows the creation of diagrams as spatial information carriers. Hierarchies, processes, and other relationships are made explicit and can be easily communicated by means of diagrams. Diagrams are not to be confused with visualizations, which are the result of mapping distributions of data to spatial displays. Diagrammatic representations go one step further by defining both a formal syntax and semantics, much as textual languages do [24].
Diagrammatic representations are omnipresent in many different areas in order to summarize data, to illustrate relationships, or to communicate ideas. Indeed, a colorful mix of diagrammatic systems exists [2,3,28]. In several studies the employment of diagrams and other kinds of external representations, such as route maps, architectural drawings, flow diagrams, organizational charts, and economic graphs, has been investigated [29]. According to these studies, external representations bring a number of advantages: they enlarge human memory and offload internal representations to shareable spaces, relieving the mind, which is limited in the information and operations it can keep track of [30]. The benefits of diagrams have been shown in different studies [4,12,18,21].
There are two complementary views investigating diagrammatic systems, namely the formal point of view and that of cognitive psychology. On the one hand, so-called well-formedness conditions concern the syntax of diagrams and determine their appearance. On the other hand, it is of interest how well cognitive abilities cope with such formally motivated conditions. This concerns perceptual abilities in particular, inasmuch as well-formedness conditions are concerned with geometric constraints. It is the purpose of the presented work to compare two different kinds of diagrammatic systems and to investigate how specific well-formedness conditions impact the competence of users dealing with those diagrammatic systems.
The motivation for this kind of investigation derives from the employment of diagrams in many different application areas. This concerns especially such general domains as the representation of sets. When modelling knowledge in engineering or in teaching, diagrammatic representations for sets are commonplace, and the question arises: how do different representations for sets compare to each other?
This paper is structured as follows. First, two types of diagrams are contrasted which depict sets and their relations. Afterwards, an experiment is described which investigates differences between both kinds of diagrams. A discussion analyses the results of the experiment, in particular by showing how well-formedness conditions are maintained or violated. Finally, the conclusion is drawn that diagrams that are rather restricted in the way they are to be drawn and that have a simple, plain layout seem to be preferable.

Diagramming sets
Diagrams are frequently utilized for illustrating sets and their relations. They are employed early on at school in teaching the theory of sets. Such diagrams are widespread in various areas in order to show relations between sets, syllogisms, and statistical data. Most such diagrams are based on Euler circles [7] or Venn diagrams [31,32], and several variations and extensions exist [26]. What all those diagrams have in common is that they represent sets by means of regions in the two-dimensional plane. Topological relations among regions, such as the proper part relation, partial overlap between regions, and their disconnectedness, correspond to the representation of subsets, set intersections, and disjoint sets, respectively. These approaches are all referred to as "region based diagrams." Region based diagrams are so widespread that they might seem to be the only way to diagram sets. However, there are other diagrams that represent sets in the two-dimensional plane by means of lines: in [1] sets are represented by curved lines within two-dimensional maps, and crossing lines depict intersections, as overlapping zones do in conventional diagrams. A simpler depiction is chosen by [19], who presented a way of reasoning with interval values by diagramming intervals as straight line segments. Similarly, the approach of [4] represents concepts of the theory of probability with subsets of possible outcomes being arranged in parallel on horizontal lines. A way of syllogistic reasoning by straight segments is presented by [23] and [6]. Such linear representations have their roots in the work of Gottfried Wilhelm Leibniz, who was presumably the first to employ linear diagrams [17]. In his diagrams the various segments of a line represent the various parts of a proposition. According to [25], Leibniz favored linear diagrams over intersecting circles. Recently, the formal syntax of linear diagrams and their set-theoretic semantics have been introduced [10,11].

Diagramming containment relations
One possible reason for the predominance of region based diagrams is their direct representation of set-theoretic concepts. It is compelling to draw a curve c_B within another closed two-dimensional curve c_A in order to illustrate that set B (represented by c_B) is a subset of set A (represented by c_A). There is an obvious relationship between the set-theoretic concept of a subset and its diagrammatic representation as a closed curve contained within another closed curve. In the field of semiotics, this concerns the iconicity of the representation [14]: it refers to the relationship between the feature of a concept (the subset relation) and a representation of it (two curves, one contained within the other). Simultaneously, that representation bears some resemblance to the presented concept: the containment relation is found at the conceptual level as well as at the level of the representation.
Containment relations are not as directly represented by relations between lines, unless they are lying on each other. But then they are difficult to distinguish in a diagram. Therefore, other conventions have been proposed, such as the parallelism of lines representing sets containing the same elements [4,10,19]. Figure 1 shows both types of diagrams, which illustrate the same relations between two sets A and B, both being contained in the universe of the natural numbers between one and six.
The differences between both types of diagrams become particularly apparent when drawing them. While one has to align all sets with respect to the universe in both cases, one additionally has to take care of all relations between the sets when constructing region based diagrams. This is not necessary for linear diagrams, in which all sets can be drawn without taking care of the relations between them. When inspecting the diagrams, there is another difference in reading off containment relations: linear diagrams require the user to identify which sections are in parallel, while for region based diagrams one has to identify containment relations among curves.

The grammar of diagrams
A diagrammatic representation is defined by a number of geometric well-formedness conditions, which are chosen so as to satisfy syntactical rules of the given representation. While such a formal approach is relevant to provide a consistent representation, there is only little research about how well those geometric conditions can be dealt with by users [8].
An optimal set of conditions would help the user avoid errors in the construction and inspection of diagrams. Note that such well-formedness conditions are defined in the context of specific formalizations of diagrammatic representations as part of their formal syntax. Different formalizations can be chosen for the same type of diagram. Accordingly, there might be different well-formedness conditions for region based diagrams. None of them can claim to represent the better choice.
However, user studies can be carried out in order to indicate how different types of diagrams compare: either by instructing subjects to stick to specific well-formedness conditions or by letting different subjects use different diagram types in order to solve the same tasks. The latter approach can especially be used in order to find out which diagram type is preferable for specific tasks. If the users are not aware of the well-formedness conditions, those conditions can be analysed implicitly. This is what we will do below in our experiment.
Another issue concerns the closeness of the relationship between representation and concept. In the field of semiotics, this concerns the aforementioned iconicity of the representation [14]: the more features there are which support a close correspondence between concept and representation, the higher the iconicity of that representation. This corresponds to what [22] has introduced as the mode of correspondence: namely the different degrees, ranging from the literal to the non-literal, to which a more or less direct correspondence of features between a concept and a representation of it exists.
While the well-formedness conditions concern the geometrical syntax of a diagrammatic system, the semantic level is defined via its mode of correspondence, irrespective of how strong it is. Indeed, provided that there are geometrical features that do not correspond to any conceptual features, there are geometrical details without any meaning. Such irrelevant geometrical features are distracting, and the term clutter has been coined for such irrelevant details [13].

Well-formedness conditions for region based diagrams
According to [8], a typical set of well-formedness conditions for region based diagrams, more concretely Euler diagrams, is as follows:
• A1 Contours are simple closed curves.
• A2 Contours are not coincident.
• A3 Contours do not tangentially touch.
• A4 No more than two contours meet at any single point.
• A5 Each zone is a minimal region.
By a region we mean an arbitrary set of points in the plane. Then, the formal notion of a minimal region refers to a region that is a connected component of the plane, and a zone is a union of minimal regions [26]. This is of relevance in the present discussion because a subset is to be represented by a zone, and subsets which are represented by a number of different disconnected regions, violating condition A5, are to be distinguished. An example would be a diagram that contains an intersection like A∩B∩C two times or even more often. The experiment below will show that subjects do not take much care of well-formedness condition A5.

Well-formedness conditions for linear diagrams
A possible set of well-formedness conditions for linear diagrams is:
• B1 All segments are straight.
• B2 All segments parallel along the horizontal of the image plane.
As, in general, segments could also be sections of a curved line, the first item becomes relevant. The second one describes the entire structure of a linear diagram, while the last item permits two or more sets to be depicted which are in a partial overlap relation, however, without the existence of an ordering of all elements so that both sets yield a connected segment.
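The role of the element ordering can be made concrete with a minimal sketch. It assumes that a set is drawn as one segment per maximal run of adjacent elements in the chosen ordering of the universe; the function name `segments` and the example sets are illustrative and not part of the formalization cited above.

```python
def segments(members, universe_order):
    """Split a set into maximal runs of elements that are adjacent in the ordering.

    Each run corresponds to one straight line segment in the set's row.
    """
    runs, current = [], []
    for element in universe_order:
        if element in members:
            current.append(element)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


# Illustrative sets (assumption): A = {1, 2, 3}, B = {2, 3, 5}
numeric_order = [1, 2, 3, 4, 5, 6]
print(segments({1, 2, 3}, numeric_order))  # one connected segment
print(segments({2, 3, 5}, numeric_order))  # two segments: B is disconnected
# A different ordering of the universe makes B connected as well
print(segments({2, 3, 5}, [1, 2, 3, 5, 4, 6]))
```

Under the numeric ordering, B needs two segments in its row, whereas the reordered universe yields a single segment for both sets. For larger collections of sets such an ordering need not exist, which is why disconnected segments have to be permitted.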

Research thesis
Although region based diagrams are omnipresent in textbooks and wherever sets are visualized, this paper questions their convenience because of their inherent complexity. In order to test which well-formedness conditions subjects stick to automatically, participants in an experiment had to draw several diagrams without being told of any such conditions. Instead, subjects were just shown a simple example diagram from which they could infer what the diagrams to be drawn should look like. The processes applied when retrieving relevant information from the example are not investigated. Instead, the experiment focuses on the transformation of a given propositional description of a number of sets into a diagrammatic depiction of those sets. It is assumed that the initial example provides sufficient information to guide the participants. Our thesis is that working with linear diagrams is less prone to errors than working with region based diagrams.

Hypothesis
Despite the omnipresence of region based diagrams in the context of teaching and despite their appealing nature, our hypothesis is that the construction and use of linear diagrams is simpler than the construction of region based diagrams. This assumption is based on the higher complexity of the well-formedness conditions of region based diagrams and on the diversity of irrelevant information contained in those diagrams: this concerns the shapes of curves, their sizes, their positions as far as the relevant relations among curves are considered, and the positions of the elements within each of the sets. By contrast, linear diagrams are more restrictive. There is only the ordering of sets from the top to the bottom which can be chosen arbitrarily by the user, as can the order of elements in the universe of discourse.
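To give a rough sense of how few free choices remain in a linear diagram, the following sketch counts only the two discrete choices named above, the order of the rows and the order of the elements. The function name `layout_choices` and the example sizes are assumptions made for illustration; the continuous freedoms of region based diagrams (shape, size, position of contours) have no comparable finite count.

```python
import math


def layout_choices(num_sets, num_elements):
    """Count the row orderings times the element orderings of a linear diagram."""
    return math.factorial(num_sets) * math.factorial(num_elements)


# Illustrative sizes (assumption): two sets over a universe of five elements
print(layout_choices(2, 5))
```

For two sets over five elements this gives 2! · 5! = 240 layouts, all of which share the same row-and-column structure.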
By determining the number of diagrams that can be constructed within a fixed period of time, differences in the complexity of constructing both types of diagrams are investigated. Additionally, differences in the number of mistakes might also indicate differences in the complexity of both types of diagrams. As both types of diagrams can be employed for representing sets, their use can be compared directly, in particular when applied to the same sets and relationships.

Participants
The n = 27 participants were all students who were recruited from a pool of volunteers who had participated in an undergraduate course in image processing at the University of Bremen (degree programme: Informatics 12, Systems Engineering 10, Digital Media 5). They consisted of 20 males and 7 females, whose ages ranged between 19 and 36 years (M = 24.19 years, SD = 4.19 years). One female student was excluded from the evaluation, as she did not follow the instructions correctly.

Procedure
Participants were shown an example together with a sample solution. The sample solution showed a region-based diagram for group A (n1 = 12 participants), while group B was presented a linear diagram (n2 = 14 participants). The only difference between the instructions of both groups was that the participants of group A were told that a set is to be depicted by means of a closed curve, while the other participants were told that a set is to be depicted by means of a line segment which can potentially be disconnected. Following the sample solution, the participants were instructed to solve a number of problems for 18 different examples, in the order the examples were printed on a task sheet. Each example was confined to showing a number of sets in a propositional way. The examples became gradually more complex, in the sense that the number of sets and the number of elements in the sets increased from example to example. After 20 minutes the investigator interrupted the participants, who had not been told beforehand how much time they had. The time available was chosen so that it was hardly possible to solve the problems for all examples within that period.

Experimental set-up
Both groups received the same initial example, namely A = {1, 2, 3} and B = {2, 3, 5}. They were asked to solve the following problems:
• (a) Draw a diagram with these sets. Each set is to be drawn by a closed curve. Each number is only allowed to show up once in the diagram.
Figure 2 shows both solutions for the initial example that were presented to the participants. Each participant was only shown the problem description and solution for that participant's group. After the participants had comprehended the initial example, the instructor told them to solve the four problems for each of the 18 examples in the given order on the task sheets, making a total of 4 · 18 = 72 tasks. The order on the task sheets reflects the complexity of the examples in terms of the number of sets and the number of their elements involved.
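As a rough textual stand-in for the linear solution of the initial example, the following sketch prints one row per set below a header row of elements. It is not the figure shown to the participants: the universe 1 to 5, the numeric ordering, the single-character elements, and the function names `render_row` and `render_linear_diagram` are assumptions made for illustration.

```python
def render_row(members, universe):
    """Draw a set as dashes under its elements, joining adjacent elements."""
    parts = []
    for i, element in enumerate(universe):
        if i > 0:
            # Fill the gap between two columns only if both neighbours belong to the set
            joined = element in members and universe[i - 1] in members
            parts.append("-" if joined else " ")
        parts.append("-" if element in members else " ")
    return "".join(parts).rstrip()


def render_linear_diagram(sets, universe):
    """Return a header row of elements followed by one row per set."""
    lines = ["   " + " ".join(str(e) for e in universe)]
    for name, members in sets.items():
        lines.append(f"{name}: {render_row(members, universe)}")
    return "\n".join(lines)


# Initial example from the task sheet; the universe 1..5 is an assumption
print(render_linear_diagram({"A": {1, 2, 3}, "B": {2, 3, 5}}, [1, 2, 3, 4, 5]))
```

In the printed diagram, A appears as one connected segment under 1 to 3, while B consists of a segment under 2 and 3 and a separate one under 5. The column without any dash (element 4) is the kind of missing number that has to be read off from the diagram.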

Results
A total of 110 and 166 diagrams were drawn in group A and group B, respectively. On average, participants dealt with 5 to 12 of the examples in group A (M = 9.17 [...] 3 shows that the members of group B, whose participants had to draw linear diagrams, solved more problems for each of the eighteen examples than the members of group A. To test the significance of the difference between both groups, the t-test is applied (a normal distribution of both underlying populations of the number of examples dealt with can approximately be assumed given the distributions within both samples, and both samples have almost the same size): H1 stipulates that there is a significant difference [...]). However, according to the t-test the probability of that difference is larger than before, given that H0 holds: p(t(24) ≥ 2.35) = 0.01367 < 0.05. The reason is that the standard deviation among the participants in group B is much larger than when considering all examples including those with flawed solutions.
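The reported tail probability can be checked approximately with a small sketch. It assumes a one-sided test and 12 + 14 − 2 = 24 degrees of freedom, and it integrates the density of Student's t-distribution numerically with Simpson's rule; the function names are illustrative, and the original analysis may well have used a statistics package instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the mass above zero is 0.5
    return 0.5 - total * h / 3


df = 12 + 14 - 2  # group sizes n1 = 12 and n2 = 14
p = t_upper_tail(2.35, df)
print(df, round(p, 4))
```

The result should lie close to the reported 0.01367 and thus below the 0.05 threshold; small deviations are to be expected because the test statistic is only given to two decimal places.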
In group A everyone made at least one error, while there are two participants in group B, one male and one female student, who made no errors but who both dealt with only ten of the eighteen examples. There was only one male participant in group B who dealt with all examples and who made only one mistake. That mistake was made in the very first example. In fact, within group B most errors were made in the first example (first red square in Figure 4), while this is not the case for group A, for which the number of mistakes tended to increase with the complexity of the examples.
Figure 4 shows the average number of errors made for each example, together with the standard deviation among the participants. For the region based diagrams (group A) the number of errors increases from the first example towards the last one. By contrast, the average error rate remains similar across all examples for the linear diagrams. Additionally, the standard deviation among the participants is larger in group A than in group B.
Four kinds of errors have been distinguished: (a) that the diagram contains a mistake (a label or an element has been omitted, or the set has been drawn in a wrong relation to other sets), (b) that the declaration of sets contains mistakes (the participants had to mark those sets with the letters "k" and "g" which contained the smallest or largest element, respectively), (c) that the declaration of the intersection of sets "A" and "B" is wrong (which the user had to mark within the diagram), and (d) that the missing numbers were not determined correctly (the participants had to name all natural numbers between "k" and "g" which were not members of any of the given sets).

Discussion
The increasing number of errors within the region-based group coincides with the difficulty that increases with the number of sets to be considered in a region based diagram. That the error rate is quite similar for the linear diagrams across all examples shows that linear diagrams with more or larger sets are not more difficult to construct. A reason for this difference is that the integration of a set into a region-based diagram requires the consideration of all sets already present in that diagram. By contrast, each new set can be added to a linear diagram without attending to the sets already present. That most of the errors in group B were made for the first example might be due to the need of the participants to get used to this unfamiliar kind of diagram. The standard deviation among the participants is larger in group A, indicating that the expertise in drawing region based diagrams varies more among participants than the expertise in drawing linear diagrams. In the region-based group participants made between 1 and 22 errors, while in the other group the number of errors ranges between 0 and 13.
The participants were told to read off all required information (problem descriptions (b) to (d)) from the drawn diagrams. In order to prevent participants from remembering the required information from the construction task (a), a future user study should additionally look at a group that is exclusively instructed to solve inspection tasks, such as those posed by problems (b) to (d). However, the present purpose is to identify an overall difference regarding the manipulation of region-based as opposed to linear diagrams. This includes dealing with construction and inspection tasks simultaneously when employing both types of diagrams.
The results show a significant difference in the construction of both types of diagrams. In particular, many well-formedness conditions are violated by the region-based diagrams, while linear diagrams are consistently drawn in the sense of their well-formedness conditions, though the participants are not aware of those conditions.
The group of participants is not gender-balanced. There are more male participants (20) than females (8−1). The mean of the number of examples dealt with by females in group A is 7.5 and in group B 11. In contrast, 10 examples have been dealt with on average by the males of group A and 13.2 of the examples by the males of group B. These differences between males and females lie within the standard deviation of solved tasks within the whole groups. We conjecture that gender specific conclusions require a gender-balanced group with more participants.

General discussion
Several observations can be made when glancing through the drawn diagrams. As region based diagrams of different subjects show a great many variations, there is much to say about those diagrams. Linear diagrams have a comparatively simple layout. This shows the first obvious distinction between both types of diagrams, namely that region based diagrams drawn by different participants look quite different, while linear diagrams are very similar (cf. Figure 5 and Figure 6). The reason is that there are fewer restrictions for region based diagrams (concerning the shapes, sizes, and precise positions of contours, as well as their overall layout). There are many fewer possibilities left to the users who employ linear diagrams. Region based diagrams enable users to draw them more the way they prefer, or rather, the way they are able to transpose a propositional description of a number of sets into such a diagram. Linear diagrams limit the possibilities the users have.

Observations of region based diagrams
It is clearly visible that there are many empty zones contained in almost all of the diagrams. For the 110 region based diagrams, there are altogether 1002 empty zones, making about 9 empty zones per diagram on average. For example, the diagram of Subject 2 in Figure 5 contains 7 empty zones, and that of Subject 4 contains even more than 20 empty zones. Those subjects who avoided empty zones drew coincident contours instead, as did Subject 10, who drew angular diagrams with overlapping outlines (Figure 5).
Empty zones are not forbidden by the well-formedness conditions, since empty zones are frequently meant to represent empty set intersections. However, regarding the original definition of Euler circles, empty sets are to be avoided, and regarding Venn diagrams, zones are to be shaded out in order to represent empty sets. Obviously, empty zones contribute considerably to clutter: they occupy space within a diagram and introduce dispensable contours. However, for most of the examples it would be more difficult to find a solution that avoids empty zones.
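How quickly empty zones accumulate can be illustrated with a minimal sketch. It assumes a Venn-style layout in which every combination of contours overlaps, so that n contours produce 2^n − 1 zones inside at least one contour; the sets and the function name `zone_contents` are hypothetical and not taken from the task sheets.

```python
from itertools import combinations


def zone_contents(sets, universe):
    """Map every combination of set names to the elements lying in exactly those sets."""
    names = sorted(sets)
    zones = {combo: set()
             for r in range(1, len(names) + 1)
             for combo in combinations(names, r)}
    for element in universe:
        key = tuple(name for name in names if element in sets[name])
        if key:  # elements outside all sets belong to no zone counted here
            zones[key].add(element)
    return zones


# Hypothetical example with three sets
example = {"A": {1, 2, 3}, "B": {2, 3, 5}, "C": {3, 4}}
zones = zone_contents(example, range(1, 6))
empty = sorted(combo for combo, content in zones.items() if not content)
print(len(zones), empty)
```

Here two of the seven zones, the parts shared only by A and C and only by B and C, carry no element. Since the number of zones doubles with every additional contour while the number of elements grows slowly, empty zones dominate for larger examples. The sketch counts each combination once, so drawings that repeat a zone would contain even more empty regions.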
The only well-formedness condition which is satisfied for all diagrams is A1, that is, all contours are simple closed curves (apart from imprecise drawings such as set C of Subject 26 in Figure 5). There are many pairs and even triples of coincident contours (A2) and also a number of tangentially touching contours (A3). Due to the complexity of the drawings, it is difficult to count these cases, let alone to always tell A2 and A3 apart. There are approximately 370 cases in which A2 or A3 is violated, making 3.36 violations per diagram on average. There are also some cases in which more than two contours intersect at a point (A4), but these cases are even more difficult to count reliably. Finally, there are also diagrams in which A5 is violated, that is, not all zones are minimal regions. This applies to two or more empty zones in a diagram which all represent the very same intersection.
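The per-diagram averages quoted for the region based diagrams follow directly from the reported totals, as the following short check shows. The variable names are chosen for illustration only; the figures are those given in the text.

```python
diagrams_group_a = 110       # region based diagrams drawn in group A
empty_zones_total = 1002     # empty zones counted over all of these diagrams
a2_a3_violations = 370       # approximate number of A2 or A3 violations

empty_zones_per_diagram = empty_zones_total / diagrams_group_a
violations_per_diagram = a2_a3_violations / diagrams_group_a

print(round(empty_zones_per_diagram, 2), round(violations_per_diagram, 2))
```

The rounded values are 9.11 empty zones and 3.36 violations of A2 or A3 per diagram.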
There are further characteristics not related to the well-formedness conditions, but which obviously do contribute to the clutter of diagrams: many sets have been drawn as concave contours. It is impossible in many cases to draw more than four sets by means of convex shapes, for instance, set E of Subject 18 in Figure 5. In fact, sets which are to be added later in a diagram are frequently concave, since they have been drawn in the end according to the order of sets within the example at hand. Thereby, they had to be integrated within an assemblage of sets already present. Provided that all possible intersections are not empty, only for up to three sets can they be represented by means of circles, and for four sets they can be represented by means of ellipses. However, if there is no empty intersection for five or more sets, it is impossible to draw a diagram without concave shapes [5]. Looking at the contours, it seems that all participants tried to get along with convex shapes. Concave shapes were introduced as soon as it seemed impossible to draw convex shapes.
Yet another property of many diagrams is the spatial distribution of the elements contained within the same set or intersection. Elements of the same set are not necessarily close to each other. There are even cases in which elements of other subsets lie in between due to the concavity of curves. For example, set E of Subject 22 in Figure 5 is concave and contains the elements {1, 3, 4}; however, element 5 of set C is closer to 3 than 1 is, although 3 and 1 are in the same set.
Given the many intersections in single diagrams, it is frequently very difficult to follow the contours representing a specific set. For more than four sets, it is rarely the case that the contours of specific sets or their intersections can be clearly separated by the eye from other contours. Ambiguities arise because of several overlapping, coincident, and touching contours, or due to actually overlapping sets.
Finally, with an increasing number of sets it becomes difficult to clearly annotate the sets by labels. It is sometimes not clear to which sets the labels are assigned. Sometimes a label is closer to another curve than to the one representing the corresponding set, or there are two sets equally far from a label.

Observations of linear diagrams
Linear diagrams are free of intersections. This seems to be a fundamental distinction from region based diagrams and avoids clutter because of the aforementioned ambiguities. Intersections are a constitutive characteristic of region based diagrams as they represent the intersection of sets. By contrast, linear diagrams represent intersections by means of parallel segments. The user has to take care of the correct intersections when drawing region based diagrams, while the sets are just to be aligned with respect to the distribution of the elements of the universe in the case of linear diagrams. It is an open issue whether the effort is higher for linear or region based diagrams to read off the elements which are contained in specific set intersections.
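Reading an intersection off a linear diagram can be sketched as a column-wise comparison of rows. The sketch assumes that each row is stored as a list of flags, one per element of the universe, marking where a segment is drawn; the names `occupancy` and `read_intersection` are illustrative, and the sets are those of the initial example.

```python
def occupancy(members, universe):
    """Return one flag per column telling whether the row has a segment there."""
    return [element in members for element in universe]


def read_intersection(row_a, row_b, universe):
    """Collect the elements under which both rows carry parallel segments."""
    return [e for e, in_a, in_b in zip(universe, row_a, row_b) if in_a and in_b]


universe = [1, 2, 3, 4, 5]  # assumed universe for the initial example
row_a = occupancy({1, 2, 3}, universe)
row_b = occupancy({2, 3, 5}, universe)
print(read_intersection(row_a, row_b, universe))

# A further set is added as a new row; the existing rows stay untouched
row_c = occupancy({3, 4}, universe)
print(read_intersection(row_a, row_c, universe))
```

The first call yields the elements 2 and 3, which matches the set-theoretic intersection of A and B. The row for the additional, hypothetical set C is built without any reference to the rows already present, whereas a new contour in a region based diagram has to be routed through all existing ones.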
Users are much more restricted when drawing linear diagrams than with region based diagrams. This is the reason why linear diagrams look more similar to each other than region based diagrams do (cf. Figure 6). In particular, there is no option as to how the shapes of sets are allowed to look, because each set is represented by straight segments (B1). Therefore, the user is more confined when employing linear diagrams. But we speculate that this is not perceived as an annoying factor; rather it is helpful in order to draw a diagram in a straightforward way.
The greatest difficulty is to draw disconnected line segments so that they clearly belong to the same row in the diagram. Some participants tend to draw parts of disconnected line segments at a slight angle (the second part of set E of Subject 0 in Figure 6), and the more disconnected elements there are, the more difficult it is to maintain a straight direction within a row (sets C to E of Subject 0 in Figure 6). In other words, for some users it is difficult to maintain B2, namely that all segments parallel along the horizontal of the image plane. For this reason, a single participant introduced auxiliary lines which separate the elements of the universe for half of the diagrams he drew (Subject 17 in Figure 6). More generally, [...] Another problem arises as soon as rows representing different sets are too far away from each other when they are to be compared. If auxiliary lines are not used, as they were by Subject 17, it is sometimes difficult to see clearly which segments are all parallel to each other.
One of the problems was to mark the intersection of sets A and B in each diagram. The different solutions can be comprehended quite well in Figure 7. While the region-based diagram gets even more cluttered when marking specific subsets, for the linear diagram a new segment is introduced which shows the intersection (last row). In this case, the aforementioned disadvantage of rows being too far away from each other becomes apparent. For the linear diagram in Figure 7 it is difficult to determine all parallel segments: element 3 is not clearly contained in set B, nor is it clearly parallel to the last row which represents the intersection A ∩ B.

Conclusions
Among others, the presented diagrams are relevant in the context of teaching, visual data mining, and the design of complex systems. Independent of the application, the question about the effectiveness of particular diagrammatic representations arises. We assume that diagrammatic systems are more effective when well-formedness conditions are sufficiently restricted, because less restricted systems leave many decisions about the realization of diagrams to the users. But this leads to diagrams that also show much irrelevant detail, that is, those characteristics of a diagram that lack relationships to the concepts to be represented (in other words, characteristics that are not iconic). By contrast, restrictions avoid irrelevant information, and therewith, they avoid clutter.
On the other hand, restrictions might require a higher degree of precision when drawing diagrams. Region based diagrams are not restricted regarding the shapes of curves and their location as long as all intersections are properly dealt with. The user does not have to take care of specific objects, their shapes, sizes, or locations. In contrast, linear diagrams require users to draw their diagrams more carefully. Nevertheless, the resulting diagrams show a simple and clear layout.
This discussion of more or less restricted diagrams should not be confused with the employment of diagrams in design and other creative contexts [20]. Diagrams of the kind dealt with in the current paper and those diagrams employed in design are two different categories. The former are devoted to a particular class of objects and their relations. They should be depicted in a clear way, avoiding any ambiguities. By contrast, diagrams in design should even allow some degree of ambiguity, inasmuch as sketched diagrams in this context are meant to stimulate creativity and to try out different depictions. A short overview article that discusses different categories of diagrams is provided by [9].
It has been shown that linear diagrams are as effective as Euler circles and more effective than Venn diagrams in syllogistic reasoning [23]. However, that study is confined to a few propositions requiring just a small number of segments to be related, whereas the presented experiment has shown the effectiveness of linear diagrams applied to more than three sets (instead of propositions). This study, however, is only a first attempt to look at the differences between both types of diagrams. There are further issues to be looked at, for example, that the considered sets are sets of numbers, and as such, have an intrinsic order. For group B, though not being instructed this way, the participants had a simple choice, namely to order all elements numerically. The question arises whether it would make a difference if the items had no intrinsic order. Moreover, another case for linear diagrams occurs when leaving out the list of elements in the top row, such as in the systems of [11,15], who confine their investigations to the relationships between parallel segments. Can subjects deal equally effectively with general relationships in linear diagrams?
The objects the users had to deal with in the study are natural numbers. The question arises in which way they influence the performance of users, because in diagrams natural numbers are frequently represented by dots, while intervals are frequently represented by segments. In the present study, natural numbers are represented by dots neither in linear nor in region based diagrams; in the former case they are represented by segments, while in the latter case they are directly written into regions. Whether the users have difficulties representing natural numbers by means of segments instead of dots has not been investigated. But even if the results are biased due to this difficulty, the users who employed linear diagrams nevertheless performed better than the users of the other group, and this despite region based diagrams usually being more familiar to students. Moreover, for linear diagrams segments are even employed for both single objects and sets of them. The results show that this has no negative impact on the performance.
However, the employment of natural numbers could have another influence concerning inspection problems: having used the natural order of natural numbers during the construction of a diagram, advantages for any inspection task emerge in the case of linear diagrams, as the order helps in finding particular elements. Given the small number of objects considered in the present study, this influence might not be of relevance but should be taken into account when dealing with larger sets of objects. In fact, that the order of elements can be maintained within linear diagrams might indicate that linear diagrams are in particular the better choice as far as natural numbers are to be dealt with. On the other hand, as far as pairs of natural numbers are to represent points on two-dimensional surfaces, region based diagrams could be of advantage if they are aligned to the domain of interest, instead of defining arbitrary regions to represent tuples of natural numbers.
Linear diagrams show some similarities to pixel oriented diagrams, such as those proposed in [27] in the context of visual data mining. The authors apply those diagrams to social network graphs, with the diagrams showing adjacency matrices that replace the depictions of complex graphs. One essential argument for that kind of representation is the avoidance of clutter in large graphs which show an overwhelming number of intersecting edges between nodes. The layout of linear diagrams resembles matrices. The fundamental similarity between the work of [27] and linear diagrams is that there is a line by line and column by column reading for both diagrams. Additionally, clutter is avoided in both cases since both kinds of diagrams get along without overlapping objects. Similar to [27] are the diagrams of [16], which have also been invented for the purpose of visual data mining.
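The matrix-like layout of a linear diagram can be illustrated with a minimal sketch. It is not part of the presented experiment; the function name `linear_diagram` and the ASCII rendering are assumptions made purely for illustration. The top row lists the elements of the universe in numerical order, and each further row represents one set by horizontal segments drawn under its elements.

```python
def linear_diagram(sets, universe=None):
    """Render sets as a text-based linear diagram (illustrative sketch).

    `sets` maps a set name to a Python set of elements. The first returned
    line lists the universe; each following line shows one set as segments
    ('=') placed in the columns of the elements it contains.
    """
    if universe is None:
        # Assumption: elements have an intrinsic order, as the natural
        # numbers used in the study do.
        universe = sorted(set().union(*sets.values()))
    # One column per element, wide enough for the longest label plus a gap.
    col_w = max(len(str(e)) for e in universe) + 1
    label_w = max(len(name) for name in sets)

    header = " " * label_w + " " + "".join(str(e).ljust(col_w) for e in universe)
    lines = [header]
    for name, members in sets.items():
        # Adjacent member columns merge into one continuous segment.
        cells = ("=" * col_w if e in members else " " * col_w for e in universe)
        lines.append(name.ljust(label_w) + " " + "".join(cells))
    return lines


if __name__ == "__main__":
    for line in linear_diagram({"A": {1, 2, 3}, "B": {3, 4}}):
        print(line)
```

Reading the output column by column shows which sets contain a given element; reading it row by row shows the extent of each set. No two segments overlap, whatever the number of sets.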
A significant difference between region-based and linear diagrams has been shown in the present experiment. In [27] for pixel oriented diagrams, it is argued that the represented activities are clearly visible as there are no overlapping objects. Similarly, [13] state in the context of Euler diagrams that those diagrams where most pairs of contours intersect tend to appear more cluttered than those where most pairs are disjoint. We share this argumentation and speculate that the avoidance of overlapping objects gives good reason for the priority of linear diagrams over region based diagrams, in particular as soon as more than three intersecting sets are involved in a diagram. More generally, this is a strong argument for diagrams that avoid intersecting geometrical objects and that have a simple plain layout.
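The observation about intersecting pairs of contours can be made concrete with a rough proxy. The sketch below is an assumption for illustration only: it is neither the clutter measure used in [13] nor a measure applied in the present experiment, and the name `crossing_pair_ratio` is hypothetical. It counts the pairs of sets that properly overlap, that is, share elements without one containing the other, since such pairs require crossing contours in a region-based diagram.

```python
from itertools import combinations


def crossing_pair_ratio(sets):
    """Fraction of set pairs that properly overlap (illustrative proxy).

    Two sets properly overlap if they share at least one element and
    neither is a subset of the other. Disjoint or nested pairs can be
    drawn with non-crossing contours and are therefore not counted.
    """
    pairs = list(combinations(sets.values(), 2))
    if not pairs:
        return 0.0
    crossing = sum(
        1 for a, b in pairs
        if (a & b) and not a <= b and not b <= a
    )
    return crossing / len(pairs)


if __name__ == "__main__":
    example = {"A": {1, 2}, "B": {2, 3}, "C": {2}}
    # Only A and B properly overlap; C is nested in both A and B.
    print(crossing_pair_ratio(example))
```

A ratio near 1 suggests that a region-based drawing of these sets needs many crossing contours, whereas a linear diagram of the same sets still uses one row per set.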
Future studies, however, should also reveal how the aforementioned drawbacks of linear diagrams, such as difficulties in drawing them precisely enough, influence their employment and limit their application to an upper bound number of sets to be considered.On the one hand, it might turn out that subjects prefer region based diagrams for fewer than four sets; on the other hand, linear diagrams might show their benefits just as soon as a larger number of sets is to be taken into account.

Summary
The presented experiment reveals that subjects have less difficulty with linear than with region based diagrams. This has been shown for the construction of diagrams with up to ten sets and a total of 46 elements. Both the number of diagrams drawn and the number of mistakes made indicate the superiority of linear over region based diagrams. It is assumed that this is mainly due to weakly restricted well-formedness conditions of region based diagrams, in particular those that lead to overlapping objects as well as coincident and touching contours, and thus, to clutter in diagrams. This explanation coincides with conclusions drawn in other studies about diagrammatic representations which find a significant reason for clutter in overlapping objects, and thereby, the introduction of ambiguities.

Figure 1: Left: a region based diagram; right: a linear diagram.

Figure 2: Left: the example solution for group A; right: that one for group B.

Figure 3: The number of examples which have been dealt with. Group B with the linear diagrams (red squares) created more diagrams than the region based group (blue circles).

Figure 4: For all 18 examples the average error for all participants is shown together with the standard deviation among the participants.