Non-photorealistic Rendering of Images as Evolutionary Stained Glass Daniel Ashlock Mathematics and Statistics University of Guelph, Guelph, Ontario Canada, N1G 2W1
[email protected] Balu Karthikeyan Mechanical Engineering Iowa State University Ames, Iowa USA 50010
[email protected] Kenneth Mark Bryden Mechanical Engineering Iowa State University Ames, Iowa USA 50010
[email protected]
ABSTRACT
Non-photorealistic rendering is a broad class of techniques for creating art from digital pictures. One or more digital filters are applied to create an apparent pencil sketch, watercolor, or, in this study, a design for stained glass. A collection of points that are the centers of a weighted Voronoi tiling is evolved to minimize the variance, across tiles, of the variance in luminance within each tile. The average color within each tile is computed. A fractal model of stained glass is then run to create a stained glass texture with a similar average color to that in the tile. Tile boundaries are rendered black, providing the “lead” enclosing the stained glass panes. The stained glass textures are then applied within their corresponding tiles to yield a final image. Evolution of the tile centers is a challenging problem with an expensive fitness evaluation. On the order of 500-3000 real parameters representing the tile centers and their associated weights are optimized. A modified evolution strategy is used to perform this optimization.
I. INTRODUCTION
One method of creating digital art is to start with a digital image and use computational procedures to modify it. The general name for this class of techniques is non-photorealistic rendering [7], [6]. These techniques can be used to render an image so that it appears to be a pen-and-ink drawing of the scene in the original image [13]. It is also possible to generate apparent watercolors [4], [8]. In this study an evolutionary algorithm is used as part of a non-photorealistic rendering pipeline to create designs for stained glass from digital images. One of the images used is a photograph; three others are synthetic images of geometric objects.

Other computational intelligence techniques besides evolutionary algorithms have been used to create digital art. Artificial ants, for example, have been used as agents for performing non-photorealistic rendering [14]. Care must be used in the design of algorithms for non-photorealistic rendering [10]. The rendering pipeline presented here is quite slow and much of the discussion treats places where algorithmic efficiency may be gained. One step in the algorithm, assignment of pixels to particular pieces of glass in the stained glass tiling, requires global comparison of pixels in the image with center points of the pieces of glass; this global comparison is not computationally efficient.

An important part of the rendering pipeline presented here is the creation of textures that mimic stained glass. The creation of textures is itself a complex process [5]. Genetic programming [2] has been used to create textures [11]. The stained glass textures used in this study are part of an ongoing project on texture synthesis within the Ashlock lab.

The remainder of this paper is organized as follows. Section II gives the details of the stained glass rendering pipeline. It gives the definition of weighted Voronoi tiles and the fitness function used to search for effective image segregations. Section III gives the details of the stained glass texture creation. Section IV gives the design of the evolutionary experiments used. Section V gives the experimental results. Section VI makes suggestions for improving the rendering pipeline, particularly in the matter of speed, and suggests generalizations of the research performed.
II. THE RENDERING PIPELINE
This section gives the stained-glass rendering pipeline. An image is selected. A weighted Voronoi tiling is then evolved to segregate the image into tiles, using a fitness function that tends to choose tiles that respect natural boundaries within the image. Once the tiling is located, stained glass textures that match the average color of a tile are generated. Tile boundaries are rendered in black while tile interiors are colored with the stained glass textures. The original images used in this study are shown in Figure 5.
A. Weighted Voronoi Tilings
A Voronoi tiling [12] is a division of a finite subset of the plane into convex polygonal tiles. A Voronoi tiling is generated by placing a set of tile centers into the plane. A tile associated with a given tile center p consists of all points in the plane closer to p than to any other tile center. The boundary between two tiles that meet is always a segment of the perpendicular bisector of the line segment joining the two tile centers. For many applications this is a convenient way to subdivide the plane. Images, however, often contain objects with curved
boundaries. To deal with this common feature of images, sets of weighted Voronoi tiles are used to segregate images.
Fig. 1. A standard Voronoi tiling of a square (upper image) and a weighted Voronoi tiling (lower image). Contrast the edge curvature in the two images. Note that when a weighted tiling is used one tile can be completely contained within another.

A weighted Voronoi tiling differs from a standard Voronoi tiling in that each tile center has a weight $\omega$ associated with it. This weight is multiplied by the distance from a tile center to a point in the plane when deciding tile membership. A large weight makes distance to a tile center more expensive, shrinking the size of the tile. A small weight makes the distance to a tile center cheap, increasing the size of the associated tile. Examples of standard and weighted Voronoi tilings of a square are shown in Figure 1.

The boundaries between weighted Voronoi tiles are no longer line segments. Examine the comparison of squared distances made for two tile centers $(x_i, y_i)$ with weights $\omega_i$, $i = 1, 2$, for a point $(u, v)$ whose tile membership is to be decided:

$$\omega_1^2\left[(x_1 - u)^2 + (y_1 - v)^2\right] < \omega_2^2\left[(x_2 - u)^2 + (y_2 - v)^2\right] \qquad (1)$$

If $\omega_1 = \omega_2$ then all quadratic terms in the comparison cancel. If, however, $\omega_1 \neq \omega_2$ then the boundary is a quadratic curve. Note that comparing squared distances yields the same result as comparing distances and is computationally cheaper.

The goal is to find a weighted Voronoi tiling, specified by a vector of $3n$ values $(x_1, y_1, \omega_1, x_2, y_2, \omega_2, \ldots, x_n, y_n, \omega_n)$, that yields a good segregation of an image for a stained glass design. The notion of good is to minimize the variance, across tiles, of the variance of the pixel luminance within each tile. Luminance in a red-green-blue (RGB) image is a scalar grayscale value that is thought to capture the brightness of the image. For a pixel with color values r, g, and b,

$$\mathrm{luminance}(r, g, b) = 0.299r + 0.587g + 0.114b \qquad (2)$$
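The membership comparison of Equation (1) and the luminance of Equation (2) can be illustrated with a short sketch. This is a minimal illustration and not the authors' implementation: the function names `luminance` and `assign_tiles`, the use of NumPy, and the choice of pixel coordinates (column index as x, row index as y) are all assumptions made here.

```python
import numpy as np

def luminance(rgb):
    """Equation (2): luminance of an (..., 3) array of RGB values."""
    rgb = np.asarray(rgb, dtype=float)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def assign_tiles(height, width, centers, weights):
    """Label every pixel with the index of its weighted Voronoi tile.

    centers: n pairs (x, y); weights: n values of omega.
    Each pixel goes to the center minimizing omega^2 * squared distance,
    which is the squared comparison of Equation (1).
    Assumption: x is the column index and y is the row index.
    """
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    v, u = np.mgrid[0:height, 0:width]           # v = row (y), u = column (x)
    best_cost = np.full((height, width), np.inf)
    labels = np.zeros((height, width), dtype=int)
    for i, ((x, y), w) in enumerate(zip(centers, weights)):
        cost = w ** 2 * ((x - u) ** 2 + (y - v) ** 2)
        closer = cost < best_cost                # strict: ties keep the earlier tile
        labels[closer] = i
        best_cost[closer] = cost[closer]
    return labels
```

The loop touches every pixel once per tile, so its cost is the product of the number of pixels and the number of tiles. With equal weights the function reduces to a standard Voronoi assignment; raising one weight shrinks that tile.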
Suppose that a weighted Voronoi tiling is used to segregate an image. If the sample variance of the luminance within each tile is roughly the same then each tile contains a similar amount of information. Minimizing the variance of the tile luminance variance achieves this goal of placing the same amount of information in each tile. This fitness function also encourages the segregation to respect natural boundaries within the image. Consider a tile that crosses a boundary between two parts of the image with distinct colors or textures. This tile will have a very high variance of luminance. Evolution using a fitness function that minimizes the variance of variance (VOV) across the tiles will favor shrinking or eliminating such tiles, yielding a segregation of the image that, as a side effect, respects natural object boundaries within the image. For these reasons fitness in this study is chosen to be VOV, which is to be minimized.
B. The Tile Center EA
In order to minimize the VOV fitness function, a form of evolution strategy (ES) [3] with µ = 1 and λ = 4 is used. A population of five vectors of length 3n specifying weighted Voronoi tilings with n tiles is generated initially. In each iteration of the algorithm, variations of the most fit member of the population replace the others. Variation is accomplished in two steps. First a vector of 3n values in the range [−1, 1] is created. This vector is normalized to make it a unit vector. This vector is then scaled by a Gaussian random variable with mean zero and variance 0.02. The vector is then added to the vector specifying the weighted Voronoi tiling. The variance value was chosen in a series of preliminary studies on the yang-yin image; the comparison to choose the best value was performed by inspection of the resulting stained glass images. This first variation step makes a small move in an arbitrary direction within the space of weighted Voronoi tilings.
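The VOV fitness and the first variation step described above can be sketched as follows. This is a hedged sketch under stated assumptions, not the authors' code: the names `vov_fitness`, `first_variation`, and `es_generation` are hypothetical, the outer variance is taken as a population variance (the text does not say which form is used), tiles with fewer than two pixels are skipped, and only the first of the two variation steps is applied.

```python
import numpy as np

def vov_fitness(lum, labels, n_tiles):
    """Variance, across tiles, of the sample variance of luminance within each tile."""
    lum = np.asarray(lum, dtype=float).ravel()
    labels = np.asarray(labels).ravel()
    tile_vars = []
    for t in range(n_tiles):
        values = lum[labels == t]
        if values.size >= 2:       # assumption: skip tiles too small for a sample variance
            tile_vars.append(values.var(ddof=1))
    # Assumption: population variance for the outer "variance of variance".
    return float(np.var(tile_vars)) if len(tile_vars) >= 2 else 0.0

def first_variation(parent, rng, variance=0.02):
    """Small move in a random direction: unit vector times a Gaussian scale."""
    direction = rng.uniform(-1.0, 1.0, size=parent.shape)
    direction /= np.linalg.norm(direction)           # normalize to a unit vector
    step = rng.normal(0.0, np.sqrt(variance))        # mean 0, variance 0.02
    return parent + step * direction

def es_generation(population, fitness, rng):
    """Keep the most fit (lowest-fitness) vector; replace the rest with its variants."""
    best = min(population, key=fitness)
    return [best] + [first_variation(best, rng) for _ in range(len(population) - 1)]
```

In a full run, `fitness` would decode a vector of 3n values into tile centers and weights, recompute tile membership for every pixel, and then call `vov_fitness`; that recomputation is the expensive part of each evaluation.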
The second variation step is intended to eliminate high- and low-weight tiles. The ten highest-weight tiles are selected. In order of descending weight they are matched with the ten lowest-weight tiles. For each of the high-weight tiles, the center of a corresponding low-weight tile is moved to a position near the center of the high-weight tile. The new position of the tile center being moved is chosen with a bivariate Gaussian random variable centered on the center of the high-weight tile under consideration. The Gaussian has a variance of 10.0, yielding changes on a reasonable scale given that images 800 pixels across are used in this study. The weight of the high-weight tile is divided in half and this halved weight also replaces the weight of the former low-weight tile. Essentially the low-weight tile center is removed and a new tile center is placed near the center of the high-weight tile to replace it; the old high-weight tile and the new tile then split the weight of the high-weight tile.

The first variation step represents undirected search in tiling space. The second is specifically designed to eliminate problematic tiles. A high-weight tile tends, once evolution has been running for a while, to be in a high-variation region of the image. Reducing the weight and enlarging the tile while placing another tile in the same high-variation region causes the search to work harder on resolving the difficult parts of the image. If the source of high variation is a natural boundary then this form of variation helps the algorithm to place the tiles in a manner that respects the boundary. See Figure 6 for an example of the impact of the second variation operator.

In an initial study the EA was not asked to minimize the VOV fitness function. Instead of the variance of luminance, the sum of the variances of the individual red, green, and blue color planes was used.
Since variation within the image may be low in one color and high in another, this fitness function came with built-in conflicts. In essence the algorithm was performing three-criterion optimization by summing the three relatively independent components represented by the red, green, and blue variance. Changing from sum-of-individual-color-variance to the aggregate color, measured as luminance, substantially improved the appearance of the final stained glass designs.
III. STAINED GLASS TEXTURE GENERATION
An iterated object placement fractal is created by taking an object and repeatedly placing it, additively, within a drawing region. An example of this type of fractal is the Gaussian hills that appear in [9], for which the object is a step function with Gaussian-distributed vertical displacements placed along a randomly placed line. A square is repeatedly divided by randomly generated lines that intersect the square. On one side of each line the height of all locations changes by a single value sampled from a Gaussian distribution. In general an object is a collection of vertical displacements associated with positions (x, y) within the drawing area. The object is placed at random (x, y) positions and its vertical displacement summed with any already present. The vertical direction used in this study is “more color”, not upwards.
Fig. 2. A plot of the crater function, used to generate the fundamental object that generates the stained glass texture. (Plot axes: both horizontal axes run from −3 to 3; the vertical axis runs from 0 to 40.)

Fig. 3. An example of the stained glass texture.
Thus, while the algorithm used generates a fractal landscape, the heights of the resulting relief map are exported as color intensities rather than vertical displacements. The mathematical formula used to generate the vertical displacements of the fundamental object that generates the stained glass textures is:

$$f(x, y) = \frac{30}{1 + |R - x^2 - y^2|} \qquad (3)$$
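Equation (3), together with the sampling and placement procedure this section describes (a 200 pixel square sampled over −20 ≤ x, y ≤ 20, a threshold of 4, and 200 wrapped placements), can be sketched as below. This is an illustrative sketch and not the authors' code: the function names are hypothetical, `R` is left as a parameter because its value must be read from the text, and rounding to 8-bit values after the linear rescale is an assumption.

```python
import numpy as np

def sample_crater(R, size=200, extent=20.0, threshold=4.0):
    """Sample f(x, y) = 30 / (1 + |R - x^2 - y^2|) on a size-by-size grid.

    Returns (rows, cols, heights) for the grid cells where f >= threshold,
    so the object is stored once and reused for every placement.
    """
    coords = np.linspace(-extent, extent, size)
    x, y = np.meshgrid(coords, coords)
    f = 30.0 / (1.0 + np.abs(R - x ** 2 - y ** 2))
    rows, cols = np.nonzero(f >= threshold)
    return rows, cols, f[rows, cols]

def stained_glass_texture(avg_rgb, crater, rng, size=200, placements=200):
    """Sum randomly placed craters into three color planes, then rescale to 0..255."""
    rows, cols, heights = crater
    avg_rgb = np.asarray(avg_rgb, dtype=float)
    planes = np.zeros((size, size, 3))
    for _ in range(placements):
        dr, dc = rng.integers(0, size, size=2)    # random offset of the crater
        r = (rows + dr) % size                    # wrap at the plane edges
        c = (cols + dc) % size
        subset = rng.integers(1, 8)               # 1..7: nonempty subset of the 3 planes
        for ch in range(3):
            if (subset >> ch) & 1:
                # Displacements are scaled by the tile's average color value.
                planes[r, c, ch] += heights * avg_rgb[ch]
    peak = planes.max()
    if peak > 0:
        planes *= 255.0 / peak                    # largest value becomes 255
    return np.rint(planes).astype(np.uint8)       # assumption: round to 8-bit values
```

A call such as `stained_glass_texture((200, 50, 50), sample_crater(R=20.0 / 3.0), np.random.default_rng(0))` would produce a reddish texture; the value passed for `R` in that call is an assumed reading of the paper's value.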
This function is called the crater function; a plot of the function appears in Figure 2. The crater is sampled at all points for which −20 ≤ x, y ≤ 20 and this sampling is mapped onto a 200 pixel square area. The value of $R = \frac{20}{3}$ is chosen to size the crater into the sampling area. The function is sampled once and all values (x, y, f(x, y)) for which f(x, y) ≥ 4 are saved. The number 4 represents a trade-off between recording enough of the vertical change to capture the essential shape of the crater function and the need to keep the data structure storing the object from being too large. Since the values stored will be summed into a texture during its generation hundreds of times,
keeping the number of values stored as small as is consistent with acceptable appearance of the texture is important. A “paint chip” of stained glass textures in several shades is shown in Figure 4. In color these shades are an equally spaced set of hues from red to magenta; even in black and white the ability to create multiple shades of stained glass texture is apparent.

The sampled crater function is used to create the stained glass textures in the following manner. A texture must have an average color close to that of the tile within the picture that it will fill. A collection of three color planes, red, green, and blue, encompassing a 200 × 200 area are all initialized to zero. The crater object is placed at random within this area 200 times, wrapping at the edges of the color planes. When an instance of the crater is placed, a nonempty subset of the three color planes is chosen, uniformly at random, and the vertical displacements are added into the color planes selected. These vertical displacements are scaled by the average color values within the tile. This scaling yields the correct ratio of red:green:blue, but the values are typically far too large for the eight bits per color used. Once all craters have been placed, the vertical values are linearly scaled so that the largest value is 255.
IV. EXPERIMENTAL DESIGN
For each of the images, the ES was run for 100 generations. This small number of generations was deemed acceptable for two reasons. First, a few runs were extended to 500 generations and fitness did not improve after generation 100. Second, fitness evaluation requires re-computation of Voronoi tile membership. The images used are up to 800 × 800 pixels with as many as 1000 tiles, for a total of 640,000,000 quadratic comparisons per fitness evaluation: the algorithm is slow. Four images were used: a photograph of daffodils and synthetic images of a surface of genus 3, a surface of genus 1 (a torus), and a yang-yin symbol.
Twenty evolutionary runs were performed for each image; examples of the outcomes are shown in Section V. Runs for the daffodil and yang-yin symbol were made with 480 tiles, the torus was run with 1000 tiles, and the surface of genus three was run with 480 and 1000 tiles for comparison.
V. RESULTS
Selected stained glass designs for each of the images are shown in Figure 7. The images for the daffodil look more like the original image from a distance (at small image size). This is a property that real stained glass designs often have. The second variation technique, which removes tiles with extreme weight values in the weighted Voronoi tilings, was constructed after examination of a series of runs performed for the yang-yin image without it. The ability of the algorithm to find natural boundaries appeared to improve substantially when the second form of variation was used. Figure 6 shows an image created with and without the second variation technique.

Fig. 6. The upper image was generated using both forms of variation while the lower image does not use the variation method that eliminates tiles with extreme weight values. Note the smoother boundaries in the upper image.

In all of the evolutionary runs there was a rapid initial drop (improvement) in the fitness of the best structure. Improvement of the best structure sometimes stagnated as early as the 20th updating of the ES but was also sometimes delayed until after the 90th step. While preliminary runs were used to choose the cutoff of 100 updatings of the population within the ES, it is still a goal to see if longer runs yield better results. A prerequisite to useful experimentation of this type is improvement of the algorithm speed.
VI. DISCUSSION
The technique for designing stained glass layouts worked acceptably well. The images look considerably better in color and are available in color by e-mail from the first author on request. The part of the stained glass design pipeline most in need of improvement is the evolutionary algorithm. Exploration and maturation of the technique require experimentation, which is difficult when the time to produce one stained glass design is measured in minutes.

Fig. 4. A selection of seven equally spaced hues of the stained glass texture. The spacing is in the sense of hue-saturation-value (HSV) color values.

Fig. 5. The original images used: Daffodils, Genus-3 Surface, Torus, Yang-Yin.

Fig. 7. Selected glass designs: Daffodil (480 tiles), Daffodil (480 tiles), Genus 3 (1000 tiles), Genus 3 (480 tiles), Yang-Yin (480 tiles), Torus (1000 tiles).

The use of pre-computed craters for the creation of stained glass textures enormously reduces the time required to render an image. In initial studies, when each crater was generated individually, the rendering step was slower than the evolutionary computation step. Re-computation of tile membership for all pixels is the computational bottleneck in rendering the image. This step has time proportional to the product of the number of pixels and the number of tiles. A number of approaches could be used to speed this step.
1) Divide-and-conquer. An initial run of the EA to segregate the image into a small number of tiles, followed by subsequent segregation of those tiles as images in their own right, would substantially speed the algorithm. The increase in the number of evolutionary runs that could be performed may well enhance the quality of the best image located. If not, the divided image could be reunited, taking the union of the sets of Voronoi tile centers and using the resulting set of tile centers for a small number of generations to “finish” the segregation.
2) Incremental segregation. The second variation operator implemented for the stained glass design pipeline leaves the number of tiles constant. It removes a low-weight tile in order to place a tile near a high-weight tile. If, instead, the algorithm started with a small number of tiles then tiles could simply be added near high-weight tiles. This variation of the technique has the advantage that the average number of tiles processed over the course of evolution will be roughly half (slightly more) the final number. If some sort of stopping condition based on acceptable variance of variance of luminance were implemented then evolution might terminate quickly in some runs with a relatively small number of tiles.
3) Local tile re-computation. When the position and weight of all tile centers are changed there is no choice more efficient than the re-computation of tile membership for all pixels. When single tiles are varied, as in the second variation technique, only pixels in the vicinity of the modified tile center need to be recomputed. It may be possible to speed the algorithm by exploiting this locality of changing tile membership.

The rapid leveling of fitness for the ES used to optimize tile centers, together with the failure of any two images to look the same, suggests that the fitness landscape being searched by the evolutionary algorithm is quite rugged.
Final fitness varied quite a bit and low (good) fitness correlated with designs that respected natural boundaries within the image. The large variation (not shown) in quality of stained glass designs produced for each starting image suggests that the optima differ substantially not only in fitness but in result quality (which is assessed by human inspection of the results). Considering this, it seems that a better initialization than random initialization is possible. The following procedure is suggested. Transform the pixel data so that a pixel at position (i, j) with color values (r, g, b) becomes, instead, a point in the unit 5-hypercube by normalizing each of the five values associated with the points to lie in the range 0-1. Perform k-means clustering [1] on these transformed points with the number of clusters set to the desired number of tiles. K-means clustering is given in Algorithm 1. To generate a set of weighted Voronoi tile centers, use the mean pixel position of the pixels in a cluster as the tile center (x, y) and set the variance of the luminance within the cluster as the initial weight.

Algorithm 1: k-means
Input:
1) A set S of points in R^n
2) A desired number k of clusters
3) A bound B on the number of cycles permitted
Output: A category function C : S → {0, . . . , k − 1}.
Details:
Choose k distinct points in S as initial cluster centers.
Repeat
    Assign each point to the cluster whose center it is closest to, breaking ties at random*.
    Recompute cluster centers as the average of all points in the cluster.
Until (no points change their cluster assignment or B cycles have occurred+)
Report the assignment of points to clusters as C.
* For real-valued data such ties seldom occur.
+ For real-valued data B is seldom required.
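The proposed initialization can be sketched by combining Algorithm 1 with the five-dimensional pixel transform. This is an untested-in-the-paper suggestion, so the code below is only a sketch under assumptions: the names `kmeans` and `initial_tiling` are hypothetical, ties go to the lowest cluster index rather than being broken at random, empty clusters are dropped, 8-bit color values are assumed, and the full distance matrix is built in memory, which is practical only for small images.

```python
import numpy as np

def kmeans(points, k, max_cycles, rng):
    """Algorithm 1: returns a cluster label in 0..k-1 for each point."""
    points = np.asarray(points, dtype=float)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.full(len(points), -1)
    for _ in range(max_cycles):                      # bound B on the number of cycles
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)               # assumption: ties go to lowest index
        if np.array_equal(new_labels, labels):
            break                                    # no point changed cluster
        labels = new_labels
        for c in range(k):
            members = points[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels

def initial_tiling(image, k, rng, max_cycles=50):
    """Cluster pixels as (x, y, r, g, b) in the unit 5-cube; return rows (x, y, weight)."""
    image = np.asarray(image, dtype=float)           # assumption: values in 0..255
    h, w, _ = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    xs, ys = cols.ravel(), rows.ravel()
    colors = image.reshape(-1, 3)
    features = np.column_stack([xs / max(w - 1, 1), ys / max(h - 1, 1), colors / 255.0])
    labels = kmeans(features, k, max_cycles, rng)
    lum = colors @ np.array([0.299, 0.587, 0.114])
    tiling = []
    for c in range(k):
        member = labels == c
        if member.any():                             # assumption: drop empty clusters
            # Center = mean pixel position; weight = luminance variance in the cluster.
            tiling.append((xs[member].mean(), ys[member].mean(), lum[member].var()))
    return np.array(tiling)
```

One design question the text leaves open is what to do with a cluster of constant color: its luminance variance, and hence its initial weight, is zero, so a floor on the weight would probably be needed before using the result as a tiling.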
It is intuitive that this initialization technique would yield better initial designs than the random initialization used. Placing both color and position information into the points being clustered will bias the selection toward compact regions that avoid boundaries between areas of the image with different colors. The initial weights will also have values roughly appropriate to the amount of color variation present near their tile center. Testing of this initialization technique is a potential next step for this line of investigation.
A. Other Textures
The stained glass texture used in this study is one of a large number of textures that can be achieved by iterated object placement. Changing the fractal texture algorithm can yield very different sorts of non-photorealistic renderings of the evolved segregations given in this study. Examples of other iterated object placement textures are given in Figure 8.

Fig. 8. Examples of other types of iterated object placement textures.
VII. ACKNOWLEDGMENTS
The first author would like to thank the University of Guelph Department of Mathematics and Statistics for its support of this research.
REFERENCES
[1] D. A. Ashlock, E. Y. Kim, and L. Guo. Multi-clustering: avoiding the natural shape of underlying metrics. In C. H. Dagli et al., editor, Smart Engineering System Design: Neural Networks, Evolutionary Programming, and Artificial Life, pages 453–461, 2005.
[2] Wolfgang Banzhaf, Peter Nordin, Robert E. Keller, and Frank D. Francone. Genetic Programming: An Introduction: On the Automatic Evolution of Computer Programs and Its Applications. Morgan Kaufmann, San Francisco, 1998.
[3] Hans-Georg Beyer. Theory of Evolution Strategies. Springer, New York, 2001.
[4] Cassidy J. Curtis, Sean E. Anderson, Joshua E. Seims, Kurt W. Fleischer, and David H. Salesin. Computer-generated watercolor. Computer Graphics, 31(Annual Conference Series):421–430, 1997.
[5] D. S. Ebert, F. K. Musgrave, D. Peachey, K. Perlin, and S. Worley. Texturing and Modeling: a procedural approach. Academic Press, San Diego, 1998.
[6] B. Gooch and A. Gooch. Non-Photorealistic Rendering. A. K. Peters Ltd., 2001.
[7] J. Lansdown and S. Schofield. Expressive rendering: A review of nonphotorealistic techniques. IEEE Computer Graphics and Applications, 15(3):29–37, 1995.
[8] E. Lum and K. Ma. Non-photorealistic rendering using watercolor inspired texture and illumination. In Proceedings of the 9th Pacific Conference on Computer Graphics and Applications, pages 322–331, 2001.
[9] Benoit Mandelbrot. The fractal geometry of nature. W. H. Freeman and Company, New York, 1983.
[10] L. Markosian, M. A. Kowalski, D. Goldstein, S. J. Trychin, J. F. Hughes, and L. D. Bourdev. Real-time nonphotorealistic rendering. In Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 415–420, 1997.
[11] F. K. Musgrave. Genetic textures. In Texturing and Modeling: a procedural approach, chapter 15. Academic Press, San Diego, 1998.
[12] Okabe, Boots, Sugihara, and Chiu. Concepts and Applications of Voronoi Diagrams. Wiley, New York, second edition, 2000.
[13] M. P. Salisbury, M. T. Wong, J. F. Hughes, and D. Salesin. Orientable textures for image-based pen-and-ink illustration. In Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 401–406, 1997.
[14] Yann Semet, Una-May O'Reilly, and Frédo Durand. An interactive artificial ant approach to non-photorealistic rendering. In Proceedings of GECCO 2002, pages 188–200, 2002.