
Learning from graphic designers – using grids as a scaffolding for automatic print layout

Eamonn O’Brien-Strain and Jerry Liu

Hewlett-Packard Labs, 1501 Page Mill Rd., Palo Alto, CA 94304, USA

ABSTRACT

We describe an approach for automatically laying out content for high quality printed formats such as magazines or brochures, producing an aesthetically pleasing layout that correctly conveys the semantic structure of the content and elicits the desired experiential affect in the reader. The semantic structure of the content includes the reading order graph, the association of illustrations with referring paragraphs, and the preservation of perceived text hierarchies. We appropriate a popular conceptual tool used by graphic designers called the grid. A well-designed grid will cause a pleasing uniformity through all the pages of a publication while still allowing flexibility in the layout of each page. In the space of different automatic layout systems, our approach is somewhere between template-based techniques and generative techniques, with the aesthetics determined by the combination of the grid and a generative algorithm. One consequence of using the grid is that it greatly reduces the space of possible layouts from a high-dimensional continuous space to a discrete space. Using a simple greedy algorithm, our first results are promising.

Keywords: document, layout, composition, grid

1. INTRODUCTION

1.1. Motivation

Recent advances in printing and display technology open up the possibility of democratizing the production of high quality documents such as magazines and marketing brochures suitable for printing or for display on new generations of tablet computers and e-book readers. Increasing production of these “long tail” documents by individuals and small businesses promises to be disruptive for the publishing and printing industry, providing business opportunities in many areas such as digital printing, web-based publishing services, and personalized printed advertising.

We see two ways we can help consumers and small businesses overcome barriers to long tail publishing:

1. Content generation: Make it easier to author new content or repurpose existing content.

2. Composition: Allow unskilled users to create professional-looking layouts.

This paper addresses composition, but we should mention one particular approach to content generation, which is to allow repurposing of material from a web site. Our observation is that typical high quality print design is quite different from typical Web design. Therefore, we decompose the web pages into their fundamental content, which we then recompose and automatically lay out in accordance with the conventions of print design. The net result is greatly enhanced quality and flexibility of printing from the Web.

1.2. Automatic Layout

We need automated layout technologies that minimize the required effort and skill, while maximizing the resulting variability and aesthetic quality. Typical professional tools used to lay out such documents, for example Adobe InDesign, require too much graphic design skill to be suitable for the average user. Office tools such as Microsoft Word or PowerPoint are easier to use but still require some graphic design skill and tend to produce amateurish results because of limitations in the tools. Only an automated layout tool can reduce the required effort and skill enough to be usable by the average person.

The important requirements of an automatic layout tool are:

• It should somehow encapsulate the aesthetics of graphic design.

• It should handle mixed textual and graphical content, with no restrictions on quantity of text or aspect ratio of images.

• It should produce a wide variety of different styles of output.

• It should produce a layout that correctly conveys the semantic structure of the content. This includes the reading order structure and the preservation of perceived text hierarchies.

• It should produce a document that elicits the desired experiential affect in the reader. This might be a particular mood or personality, or an association with some brand identity.

1.3. Automatic Layout Trade-Offs

Current automated tools trade off aesthetics against flexibility, while we require both. Typically, tools that automatically produce layouts of high aesthetic quality do so at the cost of restricting the number and size of images and the amount of text that they can deal with. Conversely, tools that are flexible enough to take a wide variety of input often produce results that are lacking in aesthetics.

One type of tool is a static template system containing a library of professionally designed templates that encapsulate the graphic design aesthetics. These can produce professional-looking results because a skilled graphic designer can custom design all the properties of the pages other than the particular images and text that the user will ultimately insert. The drawback is limited variability in the layouts generated by the tool, along with restrictions on aspect ratios of images and quantities of text, because there is a practical limit on how many templates the library designers can create and among which the end-users can choose.

Another type of tool is a generative algorithm, for example BRIC [3] or de Oliveira’s system [5], that encapsulates the aesthetics implicitly in the procedures and metrics of its algorithm. For example, the system described by de Oliveira uses a generative algorithm to produce newspaper-style column-based layouts, taking as input a single linear reading-path sequence. Systems like this can produce wider variability in the layouts they generate than a static template system and can deal with a much wider variety of input content, handling arbitrary image aspect ratios or arbitrary quantities of text. However, the cost of this flexibility is a certain uniformity in style inherent in the algorithms: the layouts differ from one another, but they all share a similar general style.

Ideally we would like a tool that is as flexible and produces at least as much variability in its layouts as a generative algorithm but with the aesthetic quality of a static template system. One approach that attempts to transcend the flexibility-aesthetics trade-off is that of Jacobs et al. [4], who describe a “grid-based” layout system. It has a library of constraint-based page templates, used by a “paginator” to assign content to pages and by a layout engine to produce a layout meeting the constraints of the templates. It is grid-based insofar as the templates are specified using a grid. It offers more variability than a static template system because the templates can adapt to different page aspect ratios; however, its variability is limited by the availability of professionally designed templates.

1.4. The Grid

A popular conceptual tool used by graphic designers is “the grid”, a pattern of horizontal and vertical guides that constrain the alignment of text and graphics. A well-designed grid will cause a pleasing uniformity through all the pages of a publication while still allowing flexibility in the layout of each page.

The grid has maintained a consistent influence over the graphic design community even as its popularity has waxed and waned over the years. At times during the twentieth century, for example during the height of the Bauhaus modernist movement, the grid’s rectilinear imperatives dominated not just graphic design but other visual arts such as architecture and the fine arts. At other times, elite designers have chafed at the constraints of the grid and sought to express themselves more freely. Nevertheless, grid-based designs are widespread in commercial graphic design, as you will see if you leaf through popular magazines or look at marketing brochures. Grids have been widely used particularly in applications where a consistent, identifiable style is important.

In the analog era, graphic designers physically cut out text and graphics and glued them on sheets printed with a grid in a blue color invisible to the photo-reproduction process. Today, professional graphic designers use software layout tools such as InDesign, Quark, and Scribus that support grid-based design methodologies.

1.5. Layout Aesthetics

One of the greatest challenges of this work is to define exactly the “aesthetic” properties that we are trying to achieve in our layouts. Many artists and graphic designers would indeed claim that it is impossible to quantify in a machine-understandable way what is an essentially human and mysterious quality. Nevertheless, cognitive psychologists and evolutionary biologists assume that aesthetics is a property of how the brain perceives, one that probably arose because it conferred some adaptive advantage in our evolutionary past [6]. Therefore, aesthetics should be amenable to quantitative analysis, at least in principle.

Several researchers have attempted to tackle the particular domain of layout aesthetics. One particularly applicable thread of research is that of Harrington et al. [7], who created a layout aesthetic model that yields a single numerical score as a non-linear combination of nine individual measures. These measures come from a survey of the graphic design literature, and include alignment, regularity, uniform separation, balance, and proportion.

Of course, in many practical applications what is more important is not a “high art” aesthetic but rather the effectiveness of the layout in conveying the desired information and eliciting the desired response from the reader. Therefore, an ultimate measure would depend on the goal. For example, if the application is to lay out an advertising brochure, then the measure of effectiveness would be how the reader responds to the intended “call to action” that should be clearly expressed in well-designed marketing material.

2. OUR SOLUTION

2.1. Approach

Our approach is somewhere between static template-based techniques and generative techniques, with the aesthetics determined by the combination of the grid and a generative algorithm. Instead of a professionally designed template library, our system uses a “scaffolding” library containing professionally designed grids, as well as other design elements such as color palettes, fonts, and graphic embellishments. A single grid can generate a variety of layouts. This means that a modest expenditure of time by a graphic designer to create a relatively small library of grids allows us to generate a wide variety of different-looking printed outputs.

One consequence of using a grid is that it greatly reduces the space of possible layouts. A general generative algorithm operates in a high-dimensional space that includes many continuous dimensions such as the many x, y, height, and width values of the text and graphical elements. In contrast, an algorithm working within the constraints of a grid operates in a much smaller discrete space, resulting in a smaller combinatorial optimization problem.

2.2. Page Semantics

One input to our layout system is a description of the content in the “page semantics” format defined as part of a broader automated publishing architecture. This consists of stylistically tagged text and illustrations organized into parallel and branching reading paths that specify the user reading order.

Figure 1. Page Semantics Data Model

We designed the page semantics model to be as simple as possible while still containing all the textual and graphical elements and interrelationships that could affect the final layout.

• The core data class is the “chunk”, which has a semantic classification (a “meaning”) as well as some text or a bitmap. This is the fundamental unit that the layout algorithm deals with. Examples of chunks are paragraphs, headers, captions, and images.

• These chunks are arranged into ordered sequences called “reading paths”, which represent segments of the primary reading order through the document.

• A collection of such top-level “backbone” reading paths makes up the page semantics passed to the automatic layout.

• Each chunk in the backbone can optionally have a subsidiary “rib” reading path attached to it for associated content.
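The data model above can be sketched in a few lines. The following is a minimal illustration only, not the format used by our system: the class and field names (`Chunk`, `ReadingPath`, `PageSemantics`, `meaning`, `rib`) and the `walk` helper are hypothetical, chosen to mirror the terms in the list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Chunk:
    """Fundamental layout unit: a semantic class plus some text or a bitmap."""
    meaning: str                          # e.g. "paragraph", "header", "caption", "image"
    text: Optional[str] = None
    bitmap: Optional[bytes] = None
    rib: Optional["ReadingPath"] = None   # optional subsidiary path for associated content


@dataclass
class ReadingPath:
    """Ordered sequence of chunks: one segment of the reading order."""
    chunks: List[Chunk] = field(default_factory=list)


@dataclass
class PageSemantics:
    """Collection of top-level backbone reading paths handed to the layout."""
    backbones: List[ReadingPath] = field(default_factory=list)


def walk(path, depth=0):
    """Yield (depth, chunk) in reading order, visiting a chunk's rib right after it."""
    for chunk in path.chunks:
        yield depth, chunk
        if chunk.rib is not None:
            yield from walk(chunk.rib, depth + 1)


# A header and a paragraph on the backbone; the paragraph carries a rib
# holding the illustration it refers to and that illustration's caption.
figure = Chunk("image", bitmap=b"placeholder")
caption = Chunk("caption", text="A four-column grid.")
body = Chunk("paragraph", text="Grids constrain alignment.",
             rib=ReadingPath([figure, caption]))
doc = PageSemantics([ReadingPath([Chunk("header", text="The Grid"), body])])

order = [(depth, chunk.meaning)
         for path in doc.backbones
         for depth, chunk in walk(path)]
```

Attaching the rib to the referring chunk, rather than storing illustrations in a separate list, is one way to keep the association between an illustration and its paragraph available to the placement step.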

2.3. Style

Another input to the automatic publishing system is a set of parameters that determine which layout library components to use. These components include font styles, color palettes, and graphical embellishments. We do not examine these in more detail in this paper because they are generally orthogonal to the layout aspect of the design. Nevertheless, they have an important impact on the perception of professional quality, and we provide a mechanism to control them.

2.4. Placement Algorithm, in General

Using the grid is only part of the solution: we also need a generative algorithm to make the actual layout placement decisions within the constraints of the grid. We distinguish between the main body text of the main reading paths and all other content. We allocate the main body text to the main text columns, where it flows from column to column and from page to page. We assign other text blocks, as well as illustrations, to some of the grid cells, causing the main body text to flow around them.
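As a rough illustration of body text flowing around allocated cells, the sketch below computes the column segments that remain free once some cells are taken. It is an assumption-laden simplification: the function name `free_runs`, the `(column, row)` cell addressing, and the example occupancy are all hypothetical, and real text flow also depends on line breaking and hyphenation.

```python
def free_runs(cols, rows, occupied):
    """Return free vertical runs as (col, first_row, last_row) tuples.

    Runs are listed column by column, left to right, and top to bottom
    within each column, which is the order in which body text would flow.
    """
    runs = []
    for col in range(cols):
        start = None
        # One extra iteration past the last row closes a run that reaches the bottom.
        for row in range(rows + 1):
            free = row < rows and (col, row) not in occupied
            if free and start is None:
                start = row
            elif not free and start is not None:
                runs.append((col, start, row - 1))
                start = None
    return runs


# Four columns by two rows: one block covers the top of columns 0 and 1,
# another covers the bottom cell of column 3.
occupied = {(0, 0), (1, 0), (3, 1)}
segments = free_runs(4, 2, occupied)
```

Each returned run corresponds to one text frame; linking the frames in list order gives the column-to-column flow described above.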

Figure 2. Example of allocating content in a grid, shown as four successive steps of initial grid, image allocation, text column flow, and final layout

In general, such an algorithm would operate as shown in Figure 2. Here you see a simple example of a grid that contains four columns crossed by a horizontal guide to create eight cells. The algorithm allocates grid positions for two images, shown as blue rectangles. Note how the edges of the images are constrained to align with grid guides, so that they occupy an integer number of cells. Note also that when cells are at the edge of the grid the image may extend to the edge of the page, thus achieving “full bleed”, which is a common feature of modern graphic design. After it assigns these floating blocks, the algorithm then flows the main text into the remaining available parts of the columns.

In practice there are many more details, such as handling the different text chunk types (for example headers), more complex reading path structures such as sidebars, and multi-page issues such as pagination and allowing content to span across a two-page spread.

This general description of a placement algorithm also does not explain how the algorithm actually decides which grid cells to assign to each chunk. There are a number of approaches to making the placement decision. One is to approach it as an optimization problem of maximizing some numerical metric. For example, we could use the Harrington measures described in the Introduction. Here we can take advantage of the grid to simplify the calculation of this metric: we can ignore some components, such as alignment, because the grid takes care of them, and other measures are cheaper to compute because the grid transforms the problem from the continuous domain to a relatively small discrete domain. Given this formulation, we can tackle the problem using standard optimization techniques such as simulated annealing and genetic algorithms.

Another approach is to eschew the goal of a globally optimal solution and instead use a greedy algorithm that fills pages successively using some simple heuristics.
This is indeed our initial approach, as we wanted to prove out the general framework before spending a lot of time investigating more complex placement algorithms.

2.5. Placement Algorithm, an Initial Proof-of-Concept

In our first solution, we used a first-fit algorithm to allocate grid cells for all text and graphical blocks except the main body text. The unallocated cells on each page are filled with the main body text flowing in the basic column structure of the grid.

You can think of our algorithm as operating like the Tetris video game. For each successive item, we estimate the number and configuration of grid cells it will need. This is analogous to the shape of a falling Tetris block that must find a slot on the page where it will fit. The algorithm looks for an opening in the available unallocated cells on the current page into which this shape will fit. If there are multiple such places, it chooses one that maximizes a metric. For now, we are not using the Harrington metric but one that takes account of the semantics of the content. It tends to keep successive items in the reading path going top-to-bottom and then left-to-right. It also has a “gravity” component that tends to move some text, such as titles, up the page and other text, such as footnotes, down the page. If there is no available space for the shape on the current page, the algorithm creates a new page and allocates the item to cells there.

2.6. Implementation

We use the open-source Scribus page layout program for rendering the layout to PDF and dealing with text flow and hyphenation. Scribus is an interactive program, but we were able to use its scripting capabilities to turn it into a server that exposes its features over a simple web service protocol.

We implemented the core algorithms in the Scala language running in a Java environment. We provide our layout solution as a Java library for use in Java applications, typically web applications. The library implements a generic API for layout algorithms that take the page semantics representation of the content and generate a PDF of the rendered layout.
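The Tetris-like placement of Section 2.5 can be sketched as follows. This is an illustrative simplification written in Python rather than the Scala of our implementation, and every name in it (`Page`, `score`, `place_all`, the item tuples) is hypothetical. In particular, the scoring function is an assumed stand-in that models only the gravity component and a leftmost-column tie-break; it omits the reading-order term of the metric described above.

```python
from itertools import product


class Page:
    """One page of the grid, tracking which (col, row) cells are taken."""

    def __init__(self, cols, rows):
        self.cols, self.rows = cols, rows
        self.used = set()

    def cells(self, col, row, w, h):
        return [(c, r) for c in range(col, col + w) for r in range(row, row + h)]

    def fits(self, col, row, w, h):
        inside = col + w <= self.cols and row + h <= self.rows
        return inside and all(cell not in self.used
                              for cell in self.cells(col, row, w, h))

    def openings(self, w, h):
        """All top-left positions where a w-by-h block of cells is free."""
        return [(c, r) for c, r in product(range(self.cols), range(self.rows))
                if self.fits(c, r, w, h)]

    def occupy(self, col, row, w, h):
        self.used.update(self.cells(col, row, w, h))


def score(col, row, gravity):
    # Assumed metric: gravity -1 pulls an item up the page, +1 pulls it down;
    # among equally good rows, prefer the leftmost column.
    return (gravity * row, -col)


def place_all(items, cols, rows):
    """items: (name, width_in_cells, height_in_cells, gravity) in reading order."""
    pages = [Page(cols, rows)]
    placed = {}
    for name, w, h, gravity in items:
        if w > cols or h > rows:
            raise ValueError(f"{name} is larger than the grid")
        spots = pages[-1].openings(w, h)
        if not spots:                       # no opening: start a new page
            pages.append(Page(cols, rows))
            spots = pages[-1].openings(w, h)
        col, row = max(spots, key=lambda s: score(s[0], s[1], gravity))
        pages[-1].occupy(col, row, w, h)
        placed[name] = (len(pages) - 1, col, row)
    return placed


items = [("title", 4, 1, -1), ("photo", 2, 2, -1),
         ("footnote", 1, 1, +1), ("banner", 4, 1, -1)]
placed = place_all(items, cols=4, rows=3)
```

In this example the banner finds no free four-cell row on the first page and therefore opens a second page. Cells left unallocated on each page would then receive the flowing body text.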

Figure 3. An example three-page output of our system. Note how content can span across a two-page spread.

3. NEXT STEPS

The current greedy allocation algorithm can produce sub-optimal results, particularly for more complex page semantics. We plan to explore a variety of other approaches, including the global optimization approaches described previously.

We already handle many features of modern layout design, such as sidebars, drop capitals, and items that span multi-page spreads. We would also like to add the “layering” of graphical and textual content that is common in professional designs. This would allow muted illustrations behind text, or text in a suitable contrasting color on top of non-salient regions of photographs.

Currently the styling input to the algorithm comes from the pre-designed library. We would like to allow some of the styling input to come from the original source material. This would support, for example, a use model in which we extract the “branding” of a web page (fonts, colors, or graphics) for use in the printed document.

REFERENCES

[1] Lupton, E. and Phillips, J. C., [Graphic Design the New Basics], Princeton Architectural Press, New York (2008).

[2] Samara, T., [Making and Breaking the Grid], Rockport Publishers, Beverly MA (2002).

[3] Atkins, C., "Adaptive photo collection page layout," 2004 International Conference on Image Processing, ICIP '04, 2897-2900 (2004).

[4] Jacobs, C., Li, W., Schrier, E., Bargeron, D. and Salesin, D., "Adaptive grid-based document layout," ACM Transactions on Graphics (TOG) 22(3), 838-847 (2003).

[5] de Oliveira, J. B., "Two algorithms for automatic page layout and possible applications," Multimedia Tools and Applications 43(3), 275-301 (2009).

[6] Dutton, D., [The Art Instinct: Beauty, Pleasure, and Human Evolution], Bloomsbury Press, New York (2008).

[7] Harrington, S. J., Naveda, J. F., Jones, R. P., Roetling, P. and Thakkar, N., "Aesthetic measures for automated document layout," Proceedings of the 2004 ACM Symposium on Document Engineering - DocEng '04, 109 (2004).