Generative Design Inspiration for Glyphs with Diatoms

We introduce Diatoms, a technique that generates design inspiration for glyphs by sampling from palettes of mark shapes, encoding channels, and glyph scaffold shapes. Diatoms allows for a degree of randomness while respecting constraints imposed by columns in a data table: their data types and domains as well as semantic associations between columns as specified by the designer. We pair this generative design process with two forms of interactive design externalization that enable comparison and critique of the design alternatives. First, we incorporate a familiar small multiples configuration in which every data point is drawn according to a single glyph design, coupled with the ability to page between alternative glyph designs. Second, we propose a small permutables design gallery, in which a single data point is drawn according to each alternative glyph design, coupled with the ability to page between data points. We demonstrate an implementation of our technique as an extension to Tableau featuring three example palettes, and to better understand how Diatoms could fit into existing design workflows, we conducted interviews and chauffeured demos with 12 designers. Finally, we reflect on our process and the designers' reactions, discussing the potential of our technique in the context of visualization authoring systems. Ultimately, our approach to glyph design and comparison can kickstart and inspire visualization design, allowing for the serendipitous discovery of shape and channel combinations that would have otherwise been overlooked.


INTRODUCTION
Inspiration for novel visualization design can come from many sources. In this paper, we present a novel technique for drawing inspiration from the data itself, revealed through the use of a generative design process in combination with interactive design externalization.
We concentrate on design inspiration for glyphs: small visual objects comprised of multiple marks, where the visual properties of a mark correspond with values from a single data point [11,25]. Our work is motivated by the prevalence of glyphs in communicative visualization and the lack of support in existing tools for designing and constructing glyphs. Their design involves choices about the number of distinct marks, the relative positioning of said marks, and the encoding channels used to convey data values, choices that impact how viewers visually discriminate the glyphs. However, glyph design is not solely about perceptual concerns. Designers also consider aspects such as visual symmetry, the presence of emergent patterns and figurative associations, as well as how these aspects interact with the semantics of the underlying data. These aspects of glyph design are difficult to specify a priori. In light of these difficulties, we allocate glyph generation to a constrained sampling process, one capable of producing a continuous sequence of candidate glyph designs, whereupon the designer becomes a curator [30], tasked with identifying promising designs and excluding less promising ones based on their own abstract and ineffable criteria.
Our primary contribution is Diatoms, a technique that encapsulates the defining characteristics of generative design [30] (repetition, randomness, and logic) to provide glyph design inspiration by sampling from palettes of mark shapes, encoding channels, and glyph scaffold shapes (Figure 1-left). To review and navigate between alternative glyph designs, our sampling process is supported by two modes of interactive design externalization. The first is a familiar small multiples configuration in which every data point is drawn according to the same design specification (Figure 1-right), coupled with the ability to page between alternative glyph designs. The second mode is what we refer to as a small permutables design gallery, in which a single data point is drawn according to each of the generated glyph designs (Figure 1-center), coupled with the ability to page between data points. Our secondary contributions include observations from interviews with 12 designers, to whom we demonstrated an implementation of our technique as an extension to Tableau that featured three example palettes. Finally, we discuss the potential of our technique in terms of how it might be integrated into interactive authoring systems, so as to connect the process of inspiration gathering with bespoke visualization construction.

BACKGROUND AND PRIOR RESEARCH
We draw upon visualization research and practice as well as adjacent domains' incorporation of design externalization and generative design.

Inspiration for Visualization Design
Recent interactive visualization construction tools allow people to craft bespoke visualizations beyond conventional statistical charts. These tools include Lyra [64], Data Illustrator [47], Charticulator [59], and most recently StructGraphics [80]. Whether the goal is to realize mash-ups of existing chart types or to draw xenographics ("weird but (sometimes) useful charts" [45]), the output of these authoring tools typically serves a specific communicative intent [42], where transferability of the output to other datasets is not as critical as novelty and memorability [12]. This communicative intent stands in contrast to that of other visualization construction environments [29], where the output is to be used for analyzing data and should generalize across datasets and use cases. A common critique of bespoke visualization construction tools [65] is that they are authoring tools, not design tools, in that they do not provide any design inspiration or support. These tools assume that people already have a particular design in mind when approaching them; absent a preconceived design, they face a blank canvas. With Diatoms, we address this missing step in visualization construction by providing design inspiration via a sampling-based process coupled with a comparative display of design alternatives, albeit with a focus on glyph-based visualization. We distinguish inspiration from recommendation, in that we associate the former with a desire to produce a novel visualization to support a communicative intent while the latter addresses analytical intents, exemplified by projects like Show Me [51] and Voyager [87].
Inspiration from others. Sources of visualization design inspiration vary across individuals and communities of practice. The public-facing work of others is one such source, particularly given the volume of work appearing in news media, at practitioner conferences, within visualization communities on Twitter and Reddit, or on the #share-inspiration channel of the Data Visualization Society's Slack [20]. Some seek inspiration among others working with the same tools or languages. For example, D3.js developers can find inspiration on indexed repositories such as Bl.ock Builder [36] or bl.ocksplorer [61], while among the Tableau community, people seek out inspiration on Tableau Public [75]. While it is possible to download or fork specific example implementations from these repositories or extract elements from published vector-based charts with tools like SVG Crowbar [79], two issues arise: first, differences between the dataset used in an example implementation and the dataset at hand can hinder the use of the example as a starting point for design; second, the new design can be derivative of the original on which it was based.
Visual metaphors. Other sources of visualization design inspiration are the visual languages used in other media, from abstract art to musical notation and engineering diagrams [50]. The natural world is also a boundless source for visualization design inspiration: we point to examples of botanical motifs evoking trees [19,41] and flowers [72], geological motifs evoking sedimentary layers [83], celestial motifs evoking constellations [14], or biomimetic motifs [23] evoking the behavior of flocks or swarms [6]. Even the human face [17] and body [32] have served as inspiration for visualization design. However, whether drawing inspiration from other visual media or from the natural world, the structure and distributions of values in the dataset at hand may not be congruent with a particular visual metaphor. In light of this, our approach attempts to generate visualization design inspiration independent of external influences.
As we discuss in Sections 4 and 5.5, our sampling-based approach does not preclude the serendipitous recognition of visual motifs found in nature or other media. In our case, the naming of the Diatoms technique came about when our approach generated patterns reminiscent of the eponymous microscopic algae, such as those depicted in the inset image at left [62].

Glyph-Based Visualization
While new techniques and sources for design inspiration are needed for all forms of visualization, we restrict our scope to glyph-based visualization. In a 2013 survey, Borgo et al. [11] describe a glyph as a small visual object that depicts attributes of a data record. Fuchs et al. [25] elaborated upon this definition, describing glyphs as instances where single data points are encoded individually by assigning their dimensions to one or more marks and their visual variables.
The use of glyphs in practice. Our scope reflects the prevalence of glyphs in bespoke communicative visualization by practitioners; we point to examples in information design award showcases [4,71], in visualization trade journals [39,63], in celebrated collections like Dear Data [49] or Data Sketches [13,88], or in featured collections on Tableau Public [38,43]. We are also motivated by the versatility of glyphs. Though often arranged in grid configurations, glyphs also appear in variants of other charts, with examples appearing in embellished bar charts [72], maps [86], tile maps [89], and node-link diagrams [2]. The opportunity for higher information density with glyphs also allows for their use in compact arrangements in tables [58], as inline wordscale elements in bodies of text [8,27], or in small display contexts, such as the popular iOS Activity app [5] and its glyphs comprised of three concentric ring marks.
Drawing glyphs. Despite their prevalence and versatility, glyphs are tedious to construct using interactive tools. For instance, Tableau experts have described how they iterate between multiple tools [68], echoing findings from prior interview studies [10,57]; they draw mark shapes in applications like Illustrator [3] or Procreate [66], transform vector graphics with tools like coördinator [7], and impute and densify data with spreadsheet or scripting tools. The tedium and difficulty of glyph construction persists among bespoke visualization authoring tools; to create a small multiples arrangement of glyphs in Charticulator [59], one must undergo a recursive process of exporting a single glyph and importing this nested chart in a new Charticulator instance [65].
Difficulties associated with constructing glyphs also reflect the immensity of their design space. Beyond the initial selection of mark shapes and encoding channels, designers must be conscious of how the perception of mark shape, fill color, and mark size can interact [70], and they must weigh the effects of juxtaposing or superimposing marks on Gestalt perception within and between glyphs. Accordingly, there are varying approaches to glyph design, including experimenting with visual metaphors from nature or other media [49], appropriating semantically-related figurative elements and frames [16,40], and constructing taxonomies of abstract mark arrangements [52]. With our sampling-based approach to glyph design, we relieve designers of these decisions, at least early in the design process, asking them to instead act as curators of generated designs.

Design Externalization
Spatially juxtaposing alternative designs is common across visual disciplines. Scholars studying the history of design employ this technique to study variations across designers, historical periods, and geography, facilitating conversation among collaborators and inspiring new research questions [22]. Meanwhile, designers collect and visually juxtapose design inspiration produced by others, whether on physical walls or virtual whiteboards. In this paper, we are concerned with designers employing this technique to discuss and critique their own designs.
Automating design externalization. While it is possible to arrange design variants manually, automated design externalization can facilitate in-the-moment comparison of variants as they are produced. This automation is well-suited for drawing and photo editing applications [46,53,78], where an interactive parameter space can quickly produce many variants. Interactive design externalization can facilitate browsing and presenting design alternatives [15] as well as a sequential exploration of a parameter space [44], either by individuals or by teams of designers engaged in a process of collaborative critique [55].
Externalization for visualization design. In the context of visualization, there are precedents for juxtaposing a sequence of visualization artifacts in analysis tools for the purpose of analytical provenance and auditing [31], but examples of externalization in visualization design are less common. One precedent is the process undertaken by the news graphics team of The New York Times [18], where an automated script generates a screenshot with every git commit to a project, appending this screenshot to a thumbnail gallery of visualization design variants. While this approach may not easily afford quick comparisons of designs with minute differences or of charts that are ideally suited for full-screen viewing, it may be appropriate for comparing alternative designs of small glyphs. Another precedent for externalization in visualization design can be found in Schroeder and Keefe's Visualization-by-Sketching interface [67], wherein a designer can vet a candidate mark design by interactively generating a gallery that illustrates how the mark may manifest in different conditions, such as by varying mark density and size. We incorporate ideas similar to both precedents in Diatoms, from comparing across glyph design choices to comparing alternative sizes and positions of a particular glyph design.

Generative Design
A common characteristic of conventional design is the step-by-step and often manual process of realizing a new idea. In visualization design, this captures the trial-and-error process of associating visual encoding channels to attributes of data, which could be informed by an understanding of graphical perception or by domain conventions. In contrast, generative design suggests an alternative approach that fixates on the identification of configurable parameters and the refinement of rules for generating many ideas, especially those that exhibit emergent yet desirable traits that cannot be easily specified beforehand. With respect to glyph-based visualization, these traits could include strong Gestalt associations or memorable figurative associations. Groß et al. [30] describe how, in generative design, traditional craftsmanship recedes into the background, and abstraction and information become the new principal elements. The formalization of abstract rules typically involves elements of repetition, logic, and randomness. Artists and designers have applied generative processes in many domains; we point to examples in interface design [74], architecture [28], urban planning [69], industrial design [54], and even music [24]. Despite its prevalence in these fields, generative design is under-utilized in visualization research.
Generative design for visualization. Data art and data visualization are ideal application scenarios for generative design [34]. The visual nature of the artifacts produced allows for repetition to manifest both temporally and spatially, the latter being conducive to externalization of alternative designs in design galleries [46,53]. Designers and artists must consider how to balance randomness with logical decisions that bind data types and values with visual properties.
One noteworthy recent example of interactive design externalization coupled with a generative process is Morph [21], a tool for generating visual art from tabular data; Morph seeds the design space with a familiar statistical chart and applies random mutations to visual encoding channels, resulting in a visual branching of increasingly mutated designs, which fans outward with additional selections. Like Morph, we also embrace randomness to an extent with Diatoms, sampling from encoding channel palettes for each mark in a glyph.

THE DIATOMS TECHNIQUE
The purpose of Diatoms is to quickly generate design inspiration for glyphs, and this generation process is coupled with interactive design externalization for comparing promising designs. We break down the Diatoms technique into a process of designating conjunction and repeat associations in the data, sampling from shape and channel palettes to generate alternative glyph designs, and finally comparing and curating these designs in small multiples and small permutables viewing modes. Throughout this section, we refer to Figure 2 and an example featuring an urban mobility dataset [9] to explain aspects of the technique, while the supplemental video [1] illustrates its interactive aspects.
We assume tabular data expressed in wide format such as in Figure 2.1. Given such a table, Diatoms will draw a glyph for each row. We first decide which columns to include in the glyph designs, and we optionally group columns that share semantic associations into column sets. Otherwise, each set contains a single column.

Sampling from Shape and Encoding Palettes
We assign mark shapes and encoding channels via a constrained random sampling process. The example shape and channel palettes featured in Figures 1-5 represent one possible starting point in the glyph design space; others are certainly possible and should be left to the discretion of the designer. In Section 4, we reflect on our experimentation with palette options that preceded those featured in this section.
Mark shape palette. We assign a unique shape to each column set, and each shape can only be assigned once within a single glyph design. Our example shape palette shown in Figure 2.3 contains eight polygon shapes and one wave shape, meaning that a single glyph design can contain at most nine unique mark shapes. Our inclusion of the wave serves as a counterpoint to the more salient filled polygons. We chose a sine wave following experimentation with straight line marks, where we determined there to be an insufficient number of salient encoding channels compatible with the latter, whereas the former provided us with frequency and amplitude in addition to length. Other wave shapes are worth considering for future palettes, such as square and sawtooth waves. Most of the polygon shapes in this palette are symmetric and familiar, reminiscent of the mark shape palettes in Tableau. Since we include mark rotation in our example encoding channel palette, we introduced asymmetry to this mark shape palette in two ways: by including houndstooth and drop shapes and by adding a directional pip to all marks: a small white circle to indicate the rotation of a shape, akin to a level indicator on a physical dial.
Encoding channel palette. We also assign encoding channels to columns via sampling. Given the small number of options in this example palette (shown in Figure 2.4), we allow for channels to be sampled by more than one column set. We include one categorical channel (mark color) and an equal number of quantitative channels for polygons (alpha, size, rotation) and waves (frequency, amplitude, length). In Section 4, we discuss our prior experimentation with several other encoding channels (e.g., position, distortion, stroke properties), along with rationale for why we omitted them from this palette.
Scaffold shape palette. Finally, each glyph design is randomly assigned a scaffold shape. Our example palette (Figure 2.5) has two linear and six polygon shapes; the latter includes a spiral as a contrast to the simple and symmetric shapes that fill out the rest of the palette. Scaffolds are organizing principles for marks, which are placed at equally-spaced intervals along a scaffold following column set order. We additionally randomize the distance of marks from the periphery of the scaffold, which we refer to as a scaffold's gravity (Figure 2.6). Design A has a spiral scaffold with weak gravity, while design B has a triangle scaffold with medium gravity.
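The constrained sampling described above can be illustrated with a minimal Python sketch. This is not the authors' implementation (which is written in JavaScript); it only mirrors the stated rules: a unique shape per column set, channels that may be reused across sets, and a randomly chosen scaffold and gravity. Most shape and scaffold names, the "strong" gravity level, the `sample_design` function, and the column set data structure are assumptions introduced for illustration.

```python
import random

# Only houndstooth, drop, hexagon, star, and wave are named in the text;
# the remaining shape names are placeholders (assumptions).
POLYGON_SHAPES = ["circle", "square", "triangle", "diamond",
                  "hexagon", "star", "houndstooth", "drop"]
WAVE_SHAPES = ["wave"]
QUANT_CHANNELS = {
    "polygon": ["alpha", "size", "rotation"],
    "wave": ["frequency", "amplitude", "length"],
}
# Two linear and six other scaffolds; only spiral and triangle are named in the text.
SCAFFOLDS = ["horizontal line", "vertical line", "circle", "triangle",
             "square", "pentagon", "hexagon", "spiral"]
GRAVITIES = ["weak", "medium", "strong"]  # "strong" is an assumed third level

# Column set designation from the urban mobility example (Figure 2.2).
URBAN_MOBILITY_SETS = [
    {"designation": "conjunction",
     "columns": [("REGION", "categorical"), ("AREA", "quantitative"),
                 ("POPULATION", "quantitative")]},
    {"designation": "repeat",
     "columns": [("BIKE SCORE", "quantitative"), ("TRANSIT SCORE", "quantitative"),
                 ("WALK SCORE", "quantitative")]},
]


def sample_design(column_sets, rng):
    """Sample one glyph design: shapes, channels, scaffold, and gravity."""
    # Shapes are sampled without replacement, so each set gets a unique shape.
    # rng.sample raises ValueError if there are more sets than shapes.
    shapes = rng.sample(POLYGON_SHAPES + WAVE_SHAPES, k=len(column_sets))
    marks = []
    for cset, shape in zip(column_sets, shapes):
        family = "wave" if shape in WAVE_SHAPES else "polygon"
        quant = [name for name, kind in cset["columns"] if kind == "quantitative"]
        cat = [name for name, kind in cset["columns"] if kind == "categorical"]
        if cset["designation"] == "repeat":
            if cat:
                raise ValueError("repeated marks can only encode numerical values")
            # One shared channel; color is reserved for telling the columns apart.
            shared = rng.choice(QUANT_CHANNELS[family])
            encodings = {name: shared for name in quant}
        else:
            if len(cat) > 1:
                raise ValueError("color is the only categorical channel")
            # Each column in a conjunction gets its own channel on a single mark.
            channels = rng.sample(QUANT_CHANNELS[family], k=len(quant))
            encodings = dict(zip(quant, channels))
            encodings.update({name: "color" for name in cat})
        marks.append({"shape": shape, "designation": cset["designation"],
                      "encodings": encodings})
    return {"scaffold": rng.choice(SCAFFOLDS), "gravity": rng.choice(GRAVITIES),
            "marks": marks}


if __name__ == "__main__":
    print(sample_design(URBAN_MOBILITY_SETS, random.Random()))
```

Calling `sample_design` repeatedly yields a sequence of candidate designs for curation; passing a seeded `random.Random` makes a run reproducible.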

Fig. 2. Column set designation (2) for a data table [9] (1), as well as sampling outcomes with respect to mark shape (3), encoding channel (4), scaffold shape (5), and scaffold gravity (6) for alternative glyph designs A and B. On the right, we added the red annotations to cropped sections of the corresponding small multiples configurations from Figure 1 (right) to highlight the encodings and underlying values. Figure annotations: Column set #1 (descriptive attributes of cities): these columns do not share a data type; a conjunction designation will assign three encoding channels to a single mark. Column set #2 (mobility scores for cities): these columns share a data type and domain; a repeat designation will result in three marks that share an encoding channel.
Column sets. The cardinality and type of a column set determine the number and appearance of marks that get drawn. Figure 3 shows variations on a column set designation, with four alternative glyph designs produced for each. Designating one column per set may not yield interpretable designs after more than a few columns, such as in the example of Figure 3.1. While one solution is to use only a subset of columns, such as in Figure 3.2, we can also combine columns into sets. The example designation in Figure 2.2 features two column sets, and these sets represent two types of associations between columns.
Column sets with conjunction designations. The first column set contains descriptive aspects of cities: REGION, AREA, and POPULATION. This set will correspond with a single mark exhibiting a conjunction encoding [84], where each column corresponds with a unique encoding channel. In glyph design A, this set is assigned a drop shape, while it is assigned a hexagon shape in glyph design B. For the drop mark in glyph design A, REGION, AREA, and POPULATION are respectively assigned to color, size, and rotation. For the hexagon mark in glyph design B, these columns are assigned to color, rotation, and alpha level. The blue marks in
Column sets with repeat designations. BIKE SCORE, TRANSIT SCORE, and WALK SCORE share a common numerical domain from 0 to 100; combining them in a column set with a repeat designation means that each column will be assigned a unique mark, but each mark will repeat a shared shape and channel combination, so as to promote a direct comparison of values following the Gestalt principle of similarity. For a column set with a repeat designation, we distinguish the set's columns using color. Unlike the colors corresponding with REGION in the first column set, these colors do not reflect categorical values appearing in table cells. This repeat designation could be thought of as a pivot transformation on these columns, from a wide data format to a long one, where column names become categorical values amenable to encoding with a channel such as color. Color assignments are unique, meaning that the number of categories across categorical columns plus the number of columns appearing in sets with repeat designations should be fewer than the number of distinguishable colors. In glyph design A, the amplitudes of the purple, brown, and pink wave marks correspond with the three mobility scores, while in glyph design B, the scores correspond with the rotation of the star marks.
The grey marks in Figure 3.3 and the repeated marks in Figure 3.4 illustrate the difference between a conjunction and repeat designation for these columns, with the latter being the same designation used in Figure 2.
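The pivot reading of a repeat designation and the color uniqueness constraint can be sketched in a few lines of Python. The function names and the two example rows are hypothetical; the row values are made-up placeholders and are not taken from the urban mobility dataset [9]. Only the column names come from the text.

```python
def pivot_repeat(row, repeat_columns):
    """Wide-to-long pivot: each repeated column becomes a (column, value) record,
    so the column name can be encoded with a categorical channel such as color."""
    return [{"column": name, "value": row[name]} for name in repeat_columns]


def colors_needed(rows, categorical_columns, repeat_columns):
    """Distinct categories across categorical columns plus one color
    for every column in a set with a repeat designation."""
    n_categories = sum(len({row[c] for row in rows}) for c in categorical_columns)
    return n_categories + len(repeat_columns)


def fits_color_budget(rows, categorical_columns, repeat_columns, n_colors):
    """Follows the text's strict wording: the count should be fewer than
    the number of distinguishable colors."""
    return colors_needed(rows, categorical_columns, repeat_columns) < n_colors


# Placeholder rows with made-up values (not real data).
ROWS = [
    {"CITY": "City A", "REGION": "North", "BIKE SCORE": 60,
     "TRANSIT SCORE": 70, "WALK SCORE": 80},
    {"CITY": "City B", "REGION": "South", "BIKE SCORE": 40,
     "TRANSIT SCORE": 50, "WALK SCORE": 90},
]
REPEAT_COLUMNS = ["BIKE SCORE", "TRANSIT SCORE", "WALK SCORE"]

if __name__ == "__main__":
    print(pivot_repeat(ROWS[0], REPEAT_COLUMNS))
    print(colors_needed(ROWS, ["REGION"], REPEAT_COLUMNS))
```

With these placeholder rows, two regions and three repeated columns call for five colors, so under the strict reading a palette would need at least six distinguishable colors.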

Small Multiples and Small Permutables
Given a column set designation, sampling across the three palettes can generate many possible glyph designs, and the number of possible permutations of mark shape, channel, and scaffold shape grows as designations include more columns. We therefore propose two ways to view and navigate between alternative designs. The first viewing mode is a familiar small multiples grid configuration in which a glyph for each data point is drawn according to the same glyph design, such as in the two grids of 12 glyphs for designs A and B in Figure 1 (right). In our implementation of Diatoms, we provide the means to page between small multiples configurations of alternative glyph designs. The second mode is a design gallery of glyph alternatives which we refer to as a small permutables grid configuration, in which one data point is drawn multiple times, once for each alternative glyph design, such as in Figure 1 (center).
Fig. 3. Variations on a column set designation with the urban mobility dataset [9], with four glyph designs shown for each designation in a small permutables configuration. Variant 4 uses the column set designation used in Figure 2.
Whereas navigation in small multiples mode pages between alternative designs, in small permutables mode it pages between data points. Clicking on a glyph selects it, which serves to preserve context when toggling between the two viewing modes.
Curation. Diatoms seeds the inspiration process with an initial set of alternative glyph designs. In our current implementation, we generate five as a starting point. At any time, we can generate and append additional glyph designs as desired, or we can cull uninspiring ones.
Legends on demand. In the supplemental video [1], we show how mousing over a glyph reveals a tooltip-based legend indicating the correspondences between marks and data values, a consolidated version of the red annotations added to Figure 2 (right). Legends are critical for glyph-based visualization, particularly when glyphs are presented to their intended viewing audience. Legends may be less critical for rapid design iteration, where unpromising glyph designs can be dismissed without scrutinizing their legends. Our current approach is a compromise. On the one hand, a designated and always-visible legend would be compatible with our small multiples viewing mode, where every glyph shares a design. On the other hand, our small permutables viewing mode would require a unique legend for every glyph design; an always-visible legend for every design, shown adjacent to each design or shown as annotations akin to Figure 2 (right), would result in excessive visual clutter, impeding design comparison and iteration.
Resizing and positioning candidate designs. Beyond navigation and curation, we anticipate other ways in which designers may want to assess alternative glyph designs. First, viewing the glyphs at different sizes allows us to determine a particular design's suitability for different viewing contexts, such as placing glyphs within a table or a small display context. Second, we may want to break the grid to view the glyphs according to a customized spatial configuration; any selected glyph can be dragged to a new location in the canvas, and when used in conjunction with glyph resizing, this functionality could be used to assess the viability of glyphs for placement in a scatterplot, a tile map, or a symbol map such as in Figure 4. In small permutables mode, this spatial reconfiguration could be used to group or rank alternative designs, so as to compare a subset of promising designs side by side.
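The difference between the two viewing modes comes down to which index is held fixed and which is paged. A minimal Python sketch, assuming rows and designs are held in plain lists and that paging wraps around (both are assumptions, as are the function names), might look like this:

```python
def small_multiples(rows, designs, design_index):
    """One design, every data point: paging changes the design."""
    design = designs[design_index % len(designs)]  # wrap-around paging (assumed)
    return [(row, design) for row in rows]


def small_permutables(rows, designs, row_index):
    """One data point, every design: paging changes the data point."""
    row = rows[row_index % len(rows)]  # wrap-around paging (assumed)
    return [(row, design) for design in designs]


if __name__ == "__main__":
    rows = ["row 0", "row 1", "row 2"]   # stand-ins for table rows
    designs = ["A", "B"]                 # stand-ins for sampled glyph designs
    print(small_multiples(rows, designs, 0))
    print(small_permutables(rows, designs, 0))
```

Each returned pair identifies one grid cell to draw. Curation then amounts to appending to or removing from the list of designs before the grid is rebuilt.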

Implementation of Diatoms
We demonstrate an implementation of Diatoms in Tableau Desktop [76] via its Extensions API [77]. This choice of implementation allowed us to defer the tasks of data shaping, data type inferencing, and exploratory analysis to Tableau, such that Diatoms only ingests a tidy data table that requires no further filtering or aggregating. Finally, this implementation context allows for an instance of Diatoms to be combined with other Tableau content in a dashboard, including other instances of Diatoms, as depicted in Figure 5. We implemented Diatoms in JavaScript and used p5.js [81] to render the glyphs to a canvas element.
Fig. 6. Snapshots from our experimentation: scaffold and mark permutations (1); position- and repetition-based encodings with a small palette of mark shapes (2); parameterized shape encodings (3); mark shape palettes comprised predominantly of asymmetric shapes (4); mark stroke color and weight encodings (5); and colored pips for marks corresponding with column sets having repeat designations (6).

TECHNIQUE EVOLUTION AND PALETTE EXPERIMENTATION
We reflect on our experimentation with generative processes and on the alternative mark shape, encoding channel, and scaffold shape palettes that preceded the example palettes featured in Figures 1-5. Figure 6 contains snapshots from our experimentation, encompassing a period of six months, during which time the authors met weekly to critique the generated designs and the palettes that we sampled from.
Scaffold and mark permutations. Inspired by the prevalence of glyphs with marks arranged in circular or radiating patterns (Section 2.2), we initially experimented with generating permutations of circular scaffolds populated using a simple palette of circle, square, and line marks. Even before we bound data to these marks, arrangements of placeholder marks such as those shown in Figure 6.1 conveyed the variety of possible designs, even with a small number of marks.
Position and repetition as encoding channels. The glyph designs shown in Figure 6.2 capture our experimentation with central and peripheral mark positioning and non-circular scaffolds. We also assessed position-based encodings, where data values would determine the position of marks relative to the scaffold. However, anticipating the potential deployment of glyphs in contexts where the position of the glyph itself is meaningful, such as on a map like the one in Figure 4, we omitted intra-glyph position from subsequent encoding channel palettes in favor of an even spacing of marks relative to their scaffold.
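As a rough illustration of even spacing, the following Python sketch places marks at equal angular intervals on a circular scaffold. The `place_marks` function is hypothetical, and modeling gravity as a number between 0 and 1 that pulls marks inward from the periphery is an assumption; the text only states that gravity governs the distance of marks from the scaffold's periphery.

```python
import math


def place_marks(n_marks, radius=1.0, gravity=0.0):
    """Return (x, y) positions for marks spaced evenly around a circle.

    gravity is an assumed parameter in [0, 1]: 0 keeps marks on the
    periphery, larger values pull them toward the scaffold's center.
    """
    if not 0.0 <= gravity <= 1.0:
        raise ValueError("gravity must be between 0 and 1")
    r = radius * (1.0 - gravity)
    positions = []
    for i in range(n_marks):
        angle = 2.0 * math.pi * i / n_marks  # equal angular steps
        positions.append((r * math.cos(angle), r * math.sin(angle)))
    return positions


if __name__ == "__main__":
    for x, y in place_marks(4, radius=1.0, gravity=0.25):
        print(round(x, 3), round(y, 3))
```

Because mark positions depend only on the mark count and the scaffold, position remains free to carry meaning at the level of the whole glyph, for example on a map.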
Small multiples and parameterized shape encodings. Though displayed in a grid formation, we generated the scaffolds in Figure 6.1 and the glyphs in Figure 6.2 from small p5.js sketch programs [82] that afforded a serial comparison of designs. Furthermore, these sketch programs used a single example data point to generate one design per run cycle. Realizing the need to compare glyph designs across data points, we began testing ideas in small multiples configurations using Tableau as a backend, which allowed us to test multiple data points and a variety of datasets. Figure 6.3 shows our experimentation with parameterized shape encodings, such as by modulating the position and number of vertices in a polygon or the parameters of curves. This led to the emergence of shapes exhibiting degrees of 'spikiness' and those evoking patterns of malignant growth, where glyphs would infringe upon the space of their neighbors. We found shape modulation encodings to be visually dominant, diminishing the salience of other marks and encoding channels present in a glyph. Moreover, the underlying data values driving the modulation of shapes were not readily apparent, leading us to omit shape modulation from subsequent encoding channel palettes.
Asymmetric marks and figurative associations. The use of rotation as an encoding channel prompted us to consider palettes comprised predominantly of asymmetric shapes. Though these palettes yielded memorable designs and figurative associations, such as birds and fish heads in Figure 6.4, shapes such as moons and arrowheads were imbued with cultural connotations, evoking petroglyphs and other motifs unrelated to the data. Seeing these patterns prompted us to reconsider palettes of abstract symmetric shapes and another way to signify rotation: adding circular 'pips' to marks. Our subsequent mark shape palettes that combined symmetric and asymmetric shapes still elicited figurative associations, albeit natural rather than cultural ones, with the former inspiring us to adopt the Diatoms moniker for our technique.
Mark stroke encodings. In an effort to consider broader encoding channel palettes, we considered mark stroke color and weight as additional categorical and numerical channels, respectively. Though the former provided another categorical channel beyond mark fill color, Figure 6.5 shows how mark stroke channels were difficult to interpret; they were dependent upon the shape and size of the mark, and when resizing the glyphs or viewing them in a smaller viewport, it became difficult to differentiate mark stroke widths.
Distinguished pips for repeated marks. Another way to distinguish marks corresponding to a column set with a repeat designation is the coloring of mark pips, as shown in Figure 6.6. As with mark stroke encodings, colored pips were insufficiently salient, leading us to color the marks themselves, as described in Section 3.1. Since color is the only categorical encoding channel in the palette featured in Figures 1-5, this design choice means that repeated marks can only encode numerical values. To encode multiple categorical values, a palette would require additional categorical channels, such as textured fill patterns or stroke styles.

INTERVIEW STUDY
To better understand how Diatoms might integrate into design workflows, we conducted open-ended interviews and chauffeured demos [48] with twelve designers. These hour-long interviews focused on the potential utility of generative design processes and what additional functionality designers would require for connecting design inspiration with visualization authoring. Our approach represents an effort to collect candid reactions from designers in the absence of an established evaluation methodology; between previous experiments assessing the perceptual efficacy of particular glyph designs [25] and design reproduction studies assessing the potential of interactive visualization authoring tools [60], we argue that existing methods cannot be used to study the early and divergent stages of visualization design.
Participants. We recruited seven design students (P1-P7: 6 ♀, 1 ♂) and five experienced designers (P8-P12: 1 ♀, 4 ♂) who work with visualization. The former group were enrolled in a graduate course in information design and visual cognition, and we interviewed them as they were completing an assignment in glyph design.
Format. We introduced the Diatoms concept and implementation to all interviewees prior to speaking with them. For the students, this took place during a guest lecture, while we sent an extended version of our supplemental video [1] to the professional designers. We began the interviews by discussing the interviewees' experience with design and visualization tools, their sources of inspiration, and their current workflows. We then provided a choice of three small datasets (city mobility data [9], film metadata from IMDb [35], or socioeconomic data from Gapminder [26]). After giving the interviewees an opportunity to ask us clarifying questions about their chosen dataset or the concept underlying Diatoms, we began a chauffeured demo to elicit their observations and questions about Diatoms' output.
For each demo, we used the same palettes as those featured throughout Figures 1-5. We initially constrained the range of possible outputs by demonstrating the technique with fewer data columns and no set designations (such as in Figure 3.2). After discussing this initial output, we added more data columns and introduced repeat and conjunction set designations (such as in Figure 3.3-3.4). Our interviews were seeded with a set of questions that probed into why interviewees found some glyph designs to be promising and others not, as well as how they would further refine the more promising glyph designs. As all interviews took place via video chat and screen-sharing, the first author 'drove' Diatoms in response to interviewees' observations and critiques of specific glyph designs or their requests to generate new ones; the other authors observed and took notes. We recorded screen capture video and audio from these interviews; the following discussion reflects our thematic analysis of the transcripts and notes from these interviews.

General Impressions
Overall, designers and students alike were generally receptive to our technique. P10's comments capture this sentiment: "conceptually, [Diatoms] is an awesome idea, it speaks to more of the playful elements that people like experimenting with." Diatoms could "speed up the process of generating ideas" [P1], allowing designers to "do some rapid sketching, [. . . with] different sizes, shapes and colors right off the bat" [P2]. P9 explained that the time typically required to experiment with encodings meant that he would not be able to assess multiple options; he saw Diatoms generate alternatives in seconds that would otherwise take hours.
Beyond accelerating the early design phase, P3 suggested that Diatoms could also help novice designers: "this is good for someone who isn't as creative, [. . . ] they can generate something easily and not put much thought into it". As one who does not visualize data programmatically, P3 also saw Diatoms as a way of providing designers with creative alternatives that would have previously required coding.
Finally, two interviewees described the unlikeliness of discovering a particular glyph design independent of Diatoms. Some mark, encoding channel, and mark placement combinations would have been otherwise overlooked because the combination suggested a violation of visualization design convention. For instance, a glyph with strong gravity will bunch the marks together in the center of a scaffold and can result in partial mark occlusion; despite this, P6 noted how one such glyph design generated using the urban mobility dataset appropriately evoked the density of cities. Even if the first impressions of a design suggest a violation of convention or a lack of perceptual clarity, P11 saw the advantage in giving these imprecise designs an opportunity to inspire. Many designs are "often not clear or straightforward ideas, [and] even though 99 are wrong; the messiness can be a virtue;" he later added that "there are a lot of bad hits, but once in a while you stumble upon something that really works."

Comparing and Winnowing Glyph Designs
In each interview, we asked about the utility of our two modes for comparing glyph designs. The concept of small permutables resonated with designer P8 in particular, who saw a parallel in his side practice of brand logo design, where he would arrange alternative logos to see which ones register and which ones should be discarded. Upon demonstrating the ability to page through data points, he stated that this viewing mode "makes it clear how the [values] are changing". He saw small permutables as a starting point for narrowing down one's scope, before toggling to a small multiples mode to see all of the data points drawn according to the selected glyph design.
First impressions can be deceiving. We found that reactions to particular glyph designs depended on the viewing mode in which they were first encountered. As only one instance of each glyph design is shown in small permutables mode at any one time, a single design may show promise until it is applied to every row of data in small multiples mode. This happened multiple times to P7, who changed her mind about a design immediately upon switching modes, citing an inability to discriminate between glyphs corresponding to different data points. We also observed the converse reaction, where P4 did not react positively to glyph designs first seen in small permutables mode until he saw all of the data points represented in a small multiples configuration.
The best of both modes. Upon demonstrating both viewing modes and the ability to toggle between them, we received suggestions on how we could integrate the two modes within a single display.
While one solution could be to spatially juxtapose the two viewing modes (such as in Figure 1), P11 suggested a hybrid viewing mode in which a small subset of data points are shown for each design; he specifically pointed to a section of our explanatory video in which we arranged four small multiples screenshots within a single display, each featuring the same 12 data points drawn according to a different glyph design. Similarly, P8 suggested a way to select a shortlist of three or four glyph designs from the existing small permutables mode, and this selection would allow designers to 'dial in' to a more focused comparison mode, such as the one described by P11.
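To make the distinction between the viewing modes concrete, the following Python sketch expresses each mode as a selection of (design, data point) pairs to draw. All function and parameter names are hypothetical and do not reflect our Tableau extension; the hybrid function is only one possible reading of the suggestions from P8 and P11.

```python
def small_multiples(designs, rows, design_index):
    """Every data point drawn with one design; paging changes design_index."""
    design = designs[design_index % len(designs)]
    return [(design, row) for row in rows]

def small_permutables(designs, rows, row_index):
    """One data point drawn with every design; paging changes row_index."""
    row = rows[row_index % len(rows)]
    return [(design, row) for design in designs]

def hybrid(designs, rows, shortlist, n_rows):
    """A shortlist of designs, each drawn for the same small subset of rows.

    Returns one list of (design, row) pairs per shortlisted design.
    """
    subset = rows[:n_rows]
    return [[(designs[i], row) for row in subset] for i in shortlist]
```

Under this framing, a shortlist of three or four designs shown against a dozen data points sits between the two existing modes: it trades breadth of designs for the ability to judge how well each design discriminates between data points.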

Observations on Mark Shape and Channel Sampling
Every interviewee commented on our example palettes of mark shapes and encoding channels. When discussing the various combinations assigned via sampling, some conversations turned to steering or overriding the results of this sampling process. For instance, P10 expressed a desire to "play around with the palettes," meaning a way to modulate the amount of variance in subsequently generated glyph designs, such as by weighting the sampling in favor of certain mark shapes and encoding channels.
Balancing symmetry, familiarity, and salience. A recurring topic of discussion pertained to the asymmetric shapes in our example mark shape palette. For instance, P7 indicated that the drop shape evoked location pins in mapping applications, particularly when rotated such that its tapered end points downwards, and this association could be misleading depending on the underlying data. Meanwhile, P1 rightly pointed out that the houndstooth shape "is not a common shape that people are familiar with," and due to its unique geometry, P2 noted that it "seems to have more meaning" than other shapes.
The wave shape also drew comments. It is the sole non-polygon shape in our example mark shape palette, and both P1 and P4 remarked that wave marks were difficult to interpret when they were partially occluded by polygon marks due to a strong scaffold gravity. However, both saw this shape as novel and promising; P1 suggested using it more judiciously, such as in glyph designs comprised only of wave marks.
Though we deliberately included a mix of asymmetric and unusual shapes in our example mark shape palette, these comments suggest a need to exclude or reduce the likelihood that certain shapes get assigned to marks, or to preclude specific combinations of shape and encoding channel. On the other hand, other interviewees [P3, P4, P5, P6, P8, P9] supported the idea of designer-defined mark shape palettes, comprised of shapes that could evoke semantic associations with the underlying data or shapes that are merely 'extravagant' and 'fun' [P3].
Adding options to the channel palette. Of the numerical encoding channels for polygons in our example palette, size differentiation is the most salient. Commentary on other channels in our example palette recalled our earlier experimentation described in Section 4, such as P1's suggestion that we could encode values into marks' stroke weights or, in reference to the star shape, we could map values to the number of polygon vertices. Beyond channels that we had previously considered, P11 urged us to consider other channels, such as the fullness of a mark's fill and the inclusion of textured fill patterns.
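The suggestions to weight the sampling (P10) and to exclude or reduce the likelihood of certain shapes could be realized in many ways. The following Python sketch shows one of them, under stated assumptions: the palette is a dictionary of shape names and relative weights, the weights shown are invented for illustration, and sample_shape is a hypothetical helper rather than part of our implementation.

```python
import random

# Hypothetical weights: larger values make a shape more likely to be sampled.
PALETTE_WEIGHTS = {
    "circle": 3.0,
    "square": 3.0,
    "star": 2.0,
    "drop": 1.0,
    "wave": 1.0,
    "houndstooth": 0.5,
}

def sample_shape(palette_weights, excluded=(), rng=random):
    """Draw one shape name, honoring relative weights and exclusions."""
    candidates = {
        shape: weight
        for shape, weight in palette_weights.items()
        if shape not in excluded and weight > 0
    }
    if not candidates:
        raise ValueError("no shapes left to sample from")
    shapes = list(candidates)
    weights = [candidates[shape] for shape in shapes]
    return rng.choices(shapes, weights=weights, k=1)[0]
```

For example, calling sample_shape(PALETTE_WEIGHTS, excluded={"drop"}) would never assign the drop shape, which a designer might want when its resemblance to a location pin would be misleading. Precluding specific shape and channel combinations would require a further check that this sketch does not attempt.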
The mark rotation channel continues to be a challenge, as our use of a circular pip elicited some confusion. P2 and P3 both commented that the default position of pips varied across mark shapes, and while marks with different shapes correspond with unique column sets whose values are not directly comparable, this is not evident at first glance. Another recurring source of confusion was the inclusion of pips irrespective of whether the rotation channel was assigned to the mark. Our rationale for the universality of pips was consistency: while shapes and encoding channel combinations may differ across marks, each would exhibit this common defining characteristic, akin to how all biological diatoms have a nucleus. P2 and P3's comments suggest that this uniformity conflicts with the occasional use of the pip to signify mark rotation.
Post-sampling mark and channel overrides. After demonstrating the sample-based assignment of mark shapes and encoding channels, we asked our interviewees to select a promising glyph design and tell us if and how they would refine it. Both P10 and P11 offered suggestions relating to an ability to override specific assignments. P10 suggested the ability to select a mark within a single glyph design and swap out its shape or its encoding channels without affecting the rest of the design. He further suggested the ability to modify the output range of a selected encoding channel, such as constraining the minimum and maximum sizes of a mark or the range of possible mark fill colors. This hypothetical override control for a single mark suggests the need for on-demand widgets such as in-context sliders as described by Webb et al. [85]. Alternatively, P11 suggested a shelf and pill interface similar to Tableau [73], in which assignments to a selected mark could be overwritten by dragging alternative mark or channel pills to a shelf.
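As a rough illustration of P10's suggestion to constrain the output range of a selected encoding channel, the Python sketch below maps a data value linearly from a column's domain to a channel's output range, and overrides that range for one mark without touching the others. The dictionary layout of a mark specification, the column name, and all numeric ranges are assumptions made for this example only.

```python
import copy

def scale_value(value, domain, output_range):
    """Linearly map value from a data domain to a channel output range."""
    lo, hi = domain
    out_lo, out_hi = output_range
    if hi == lo:
        return (out_lo + out_hi) / 2  # degenerate domain: use the midpoint
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the domain
    return out_lo + t * (out_hi - out_lo)

def override_range(mark, channel, output_range):
    """Return a copy of a mark spec with one channel's output range replaced."""
    updated = copy.deepcopy(mark)
    updated["channels"][channel]["range"] = tuple(output_range)
    return updated

# Hypothetical mark specification produced by sampling.
mark = {"shape": "star", "channels": {"size": {"column": "runtime", "range": (4.0, 40.0)}}}
narrowed = override_range(mark, "size", (10.0, 20.0))
```

With a domain of (0, 10), a value of 5 maps to a size of 22.0 under the sampled range and to 15.0 after the override, while the original mark specification is left unchanged. An in-context slider could drive the arguments to override_range.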

Observations on Scaffolds and Mark Arrangements
We explained the concept of a glyph scaffold and a scaffold's gravity, indicating that both were randomly determined. In general, we noted that glyph designs with either extremely strong or extremely weak gravities tended to be ignored or dismissed as unpromising by interviewees; an exception was P4, who expressed an appreciation for designs with strong gravities: "I like that all of the shapes are drawn in the center; I start seeing them as a whole, that everything has its center".
Opinions varied in terms of what a desirable set of options should be for a scaffold shape palette. For instance, P9 expressed a preference for polygon scaffolds that exhibited mark placement symmetry, while P3 and P8 expressed a preference for simple scaffold shapes over hexagons and spirals. The former noted that "it's easier to register the differences when the spatial organization is basic" and the latter indicated a preference for the simplicity of a vertical linear scaffold.
Refining the scaffolds. As with mark and channel assignments, interviewees suggested ways to refine the scaffolds after they are assigned, such as modifying their colors [P2, P6, P11] or sizes relative to the marks superimposed on them [P5, P6, P9]. P4 and P5 both expressed an interest in reining in randomness associated with a scaffold, such as by binding its color to a value from the corresponding data point. Similarly, when we suggested the possibility of binding the gravity of a scaffold to a data value, so that glyphs could be differentiated by the proximity of their marks, P8 agreed that this too would be worth experimenting with.
Mark arrangement hierarchies. P5 and P10 asked us about the placement and ordering of marks relative to a scaffold. Upon explaining this process (Section 3.1), it was evident that both wanted to manipulate these initial placements. P10's suggestion was the ability to establish a visual hierarchy of marks, such that one mark is more salient than the others; this mark could be noticeably larger than the others or placed more centrally within the scaffold while other marks orbit around the scaffold's periphery. Creating this visual hierarchy need not take place after sampling; in discussion with P5, we realized that this assignment of focal and peripheral marks could take place during column set assignment.
Once a hierarchy is defined, we could assign less salient encoding channels to peripheral marks; for instance, both P2 and P4 commented on the relative subtlety of the alpha level channel relative to the more salient size channel, with P4 indicating that she would relegate the former channel to marks that were less central to the visual hierarchy. Finally, the incorporation of a visual hierarchy could add clarity to the use of the rotation channel for peripheral marks; P10 suggested that rather than rotate these marks relative to absolute cardinal directions, they could be rotated toward or away from some other point of reference, such as a central focal mark.
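P10's idea of rotating peripheral marks relative to a focal mark, rather than relative to absolute cardinal directions, can be sketched as follows in Python. The function names are hypothetical, angles are in degrees measured counterclockwise, and the y axis is assumed to point upward; in a screen coordinate system with y pointing downward the sign of the angle would flip.

```python
import math

def rotation_toward(mark_xy, focal_xy):
    """Baseline angle (degrees) that points a mark at the focal mark."""
    dx = focal_xy[0] - mark_xy[0]
    dy = focal_xy[1] - mark_xy[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

def rotation_away(mark_xy, focal_xy):
    """Baseline angle (degrees) that points a mark away from the focal mark."""
    return (rotation_toward(mark_xy, focal_xy) + 180.0) % 360.0

def encoded_rotation(mark_xy, focal_xy, data_angle):
    """Add a data-driven rotation on top of the focal-relative baseline."""
    return (rotation_toward(mark_xy, focal_xy) + data_angle) % 360.0
```

In this sketch, a data-driven angle of zero always means that the mark points at the focal mark, regardless of where the mark sits on the scaffold, which may make rotation easier to read than a baseline tied to the top of the display.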

Semantic and Figurative Associations
Associating the visual properties of a glyph with the semantics of the underlying data or with figurative motifs was another line of questioning that we pursued. We suggested that some associations could be planned prior to sampling, while other associations occur serendipitously after sampling: a post-hoc recognition of emergent visual phenomena.
Associations by design. P6, P8, and P11 offered several examples of how designer-defined palettes of shape and channel options could evoke aspects of the dataset. As a designer who has worked on map-based visualization projects, P8 suggested the use of Diatoms for generating weather-related glyphs, so that incorporating color palettes that are conventional in weather maps could result in glyphs that trigger meteorological associations. Similarly, P11 mentioned his preference for what he referred to as the 'visceral meanings' of soft color palettes and organic shapes, particularly when drawing glyphs that represent people. Considering the urban mobility dataset, P6 suggested incorporating the shape of a city's geographic footprint into the glyph's scaffold, so as to reinforce the association with intra-city travel. A contrasting view came from P9, who saw planned semantic relationships as a bonus. Upon seeing the output of Diatoms, he remarked that "you can get a lot of inspiration from what is already here, [although] importing shapes would work in certain situations."
Emergent associations. On several occasions, interviewees serendipitously recognized certain shape, channel, and scaffold combinations that evoked figurative elements that aligned nicely with the underlying data. Examining glyphs generated for the film dataset, P3 noted how certain combinations of shapes along a horizontal scaffold were reminiscent of film cameras.
Similarly, P4 spotted a design where the length of a wave mark corresponded with a film's runtime, evoking a physical filmstrip; in another design, a rotated circle mark and its pip suggested an analog clock, which was deemed appropriate for conveying a time-related value. Some associations were less overt, such as in the context of the urban mobility dataset, where P7 spotted skyscrapers comprised of square marks arranged in vertical scaffolds, or where P5 saw frequency-varying wave marks superimposed over square marks, a combination that evoked either the density or frenetic activity of a city.
Some emergent figurative associations are unrelated to the data. After using the term 'personality' to describe a particular design, P12 explained that with Diatoms, "you are creating a new entity, a new organism," citing the project's biological namesake. Later in the interview, she would refer to her favourite glyph design as 'Bob,' noted for his eyes and wavy limbs (see inset at left).

Study Limitations
While our study provided us with designers' impressions of our technique and how it might be incorporated into their workflows, we cannot make any claims regarding the efficacy of the glyph designs that Diatoms ultimately inspires. Diatoms can provide alternative glyph designs as starting points for designers, who may in turn incorporate selective aspects of these designs into their final visualization design. For instance, they may iteratively adjust the glyph design and augment it with helpful annotations and legends, while simultaneously integrating other sources of design inspiration. A longitudinal diary study of designers who incorporate Diatoms into their workflow could cast light on the efficacy of the technique, such as by capturing the lineage of a Diatoms-inspired glyph design. When designers publish or disseminate their work, we can additionally assess the efficacy of their final glyph designs among their intended viewing audiences. Ultimately, our experience suggests that this area of research is ripe for methodological innovation, one in need of new methods and metrics for evaluating visualization design inspiration techniques.

DISCUSSION
We see Diatoms as being part of a larger effort aimed at expanding the vocabulary of visualization design choices and combinations. Echoing Johnson et al. [37], we see this effort as a way to avoid converging on a local maximum, a point where most programmatically-generated visualization exhibits a common aesthetic, one with a limited potential to evoke a range of affective responses from viewers.
Randomness and designer agency. To paraphrase P11 from our study, random sampling from palettes of marks and channels can at times result in the serendipitous discovery of a promising design; however, in many cases, it will not lead to a useful expansion of the vocabulary for visualization design choices. Consequently, we must temper randomness and provide designers with the agency to curate and constrain the output of a technique such as Diatoms, such as by providing the ability to cull unpromising designs or the ability to incrementally add data columns or column set designations to a glyph design specification. Beyond assuming the role of a curator, another potential way to restore designer agency is to assign them the role of a breeder: as inspiring precedents, we look to how Morph [21] mutates an encoding, or how House et al. [33] incorporated genetic algorithms into flow visualization design, and we envision similar approaches being applied to explore specific paths through the glyph design space.
Designer-defined palettes. We see the ability to define project-specific palettes of shapes and encoding channels as a way to promote designer agency with a sample-based technique such as Diatoms. Beyond palettes of custom polygons, paths, and colors, we envision the incorporation of external sources of visual imagery.
For instance, DataQuilt [90] allows visualization designers to extract mark shapes and fills from regions of photographs, while artifact-based rendering [37] involves appropriating the visual texture and shape from 3D scans of small objects, such as small clay sculptures. These techniques could serve as a potential precursor to the palette sampling that Diatoms performs, wherein a designer provides palettes of shapes, colors, and textures extracted or scanned from images that are semantically related to the dataset at hand.
From design inspiration to visualization authoring. Diatoms takes us closer to uniting the processes of visualization design and visualization authoring. As existing visualization construction tools assume that people already have a design in mind prior to using them [65], Diatoms could inform these designs as a part of these tools.
We see different possible trajectories for integrating Diatoms into visualization construction workflows. Our proof-of-concept implementation of Diatoms in Tableau represents one possible pipeline, from data shaping and exploratory data analysis to publishing communicative glyph-based information graphics on Tableau Public [75]. However, to complete this pipeline, designers would require the ability to refine, position, and format glyphs before publishing them.
Alternatively, we envision the integration of Diatoms' shape and channel sampling process into recent bespoke construction tools [65] as an alternative to drag-and-drop data binding interactions. Moreover, these tools could benefit from the addition of interactive design externalization and of small permutables in particular: the ability to see the same data point visualized many different ways across a single display. For example, this externalization could manifest as a peripheral or collapsible view akin to StructGraphics' gallery of saved templates [80]. This automated externalization could reduce the need to manually capture and arrange design artifacts in external tools.
Finally, we also foresee a use for Diatoms in multi-tool workflows [10], whereby a standalone Diatoms implementation could export promising yet incomplete glyph designs as vector images that could be further refined using illustration tools or visualization libraries.

CONCLUSION AND FUTURE WORK
In this paper, we introduced Diatoms, a technique for inspiring glyph design through the use of a sample-based generative process combined with interactive design externalization. We demonstrated the technique via an extension to Tableau, which featured three example palettes to sample from: one for mark shapes, one for encoding channels, and one for glyph scaffold shapes. We also reflected on the evolution of the technique and our experimentation with palettes that preceded those used in the study and featured in Figures 1-5. Finally, we collected responses to the technique from a group of information design students and professional designers, which suggested ways by which they could incorporate it into visualization design workflows.
Looking to the future, we hope that our research motivates others in our research community to consider incorporating generative design processes and interactive design externalization into visualization construction workflows. We also encourage designers and researchers to experiment with mark shape, encoding channel, and scaffold shape palettes beyond those featured in this paper.
Our interviewing of information design students led us to consider the pedagogical potential of Diatoms. A collaborative review of glyphs generated by this technique could complement existing activities that encourage divergent thinking about visual encoding, such as sketching two quantities in as many ways as one can think of [56]. Diatoms could be used to introduce concepts like conjunction encoding, the relative effectiveness of integral and separable visual channels [84], and Gestalt principles of perceptual grouping. Beyond visual perception, the technique could also be used to explore the impedance match between the semantics of a data column and its visual representation within a glyph. Finally, we hypothesize that an interactive small permutables viewing mode could facilitate both types of pedagogical discussion.
Beyond design inspiration, we pose a speculative question that looks beyond the scope of this paper: could the Diatoms technique be used as a tool for exploratory data analysis? In our study, we did not ask interviewees about any new insights into the data that they realized during our brief chauffeured demonstration, as our focus was on the potential utility of the technique for inspiring early glyph design. We therefore leave it to future work to ask whether the combination of a constrained random sampling of marks and channels with the ability to rapidly page between alternative glyph designs could reveal patterns in the data that were previously overlooked.