
A Unified Conceptual Framework for Geographical Optimization Using Evolutionary Algorithms

Ningchuan Xiao
Department of Geography, The Ohio State University

During the last two decades, evolutionary algorithms (EAs) have been applied to a wide range of optimization and decision-making problems. Work on EAs for geographical analysis, however, has been conducted in a problem-specific manner, which prevents an EA designed for one type of problem from being used on others. In this article, a formal, conceptual framework is developed to unify the design and implementation of EAs for many geographical optimization problems. The key element in this framework is a graph representation that defines the spatial structure of a broad range of geographical problems. Based on this representation, four types of geographical optimization problems are discussed and a set of algorithms is developed for problems in each type. These algorithms can be used to support the design and implementation of EAs for geographical optimization. Knowledge specific to geographical optimization problems can also be incorporated into the framework. An example of solving political redistricting problems is used to demonstrate the application of this framework.

Key Words: evolutionary algorithms, geographical optimization problems, location-allocation analysis, political redistricting, spatial representation.


Annals of the Association of American Geographers, 98(4) 2008, pp. 795–817. © 2008 by Association of American Geographers. Initial submission, November 2006; revised submission, July 2007; final acceptance, October 2007. Published by Taylor & Francis, LLC.

Geographers and researchers from related disciplines have devoted a significant amount of attention to the development of solution methods for geographical optimization problems. These problems can be found in the literature of locational analysis (Rushton 1988; Densham and Rushton 1996), natural resource management (Murray and Church 1995; Hof and Bevers 1998), nature reserve selection (Church, Stoms, and Davis 1996), regionalization and political redistricting (Williams 1995), spatial data mining and exploratory analysis (Han, Kamber, and Tung 2001), and spatial decision making in public and private sectors (Armstrong et al. 1991; Ghosh and Harche 1993; Bennett, Xiao, and Armstrong 2004). A common feature shared by these geographical optimization problems is that they require a search for configurations of discrete spatial entities and activities that satisfy one or more objectives.1

Geographical optimization problems are often computationally intensive to solve because the number of feasible solutions may increase exponentially with the input size. This property, known as NP-completeness,2 makes it impractical to use exact solution approaches (e.g., linear programming) to search for optimal solutions in a time frame that is acceptable for many real-world applications. Suggestions have been made to replace the requirement of finding global optimal solutions by guaranteeing the closeness of solutions found to the optimum (Vazirani 2001). Such approximation methods, however, are often difficult to design and their theoretical proximity to an optimum may not be proven.

To solve geographical optimization problems effectively and efficiently, a number of heuristic approaches have been developed. Such approaches, however, cannot guarantee that the solutions found will be close to an optimum. Nevertheless, they can be used to find high-quality solutions that are near optimal or optimal (Cooper 1964).3 Recent developments have been aimed toward the design of general and flexible approaches to solving a broad range of optimization problems. These general methods, called metaheuristics (Osman and Kelly 1996; Gendreau and Potvin 2005) or modern heuristics (Reeves 1993), are often based on a metaphor of natural processes (e.g., biological evolution) and they typically include ant colony systems (Dorigo, Maniezzo, and Colorni 1991), evolutionary algorithms (Bäck, Fogel, and Michalewicz 1997; De Jong 2006), simulated annealing (Kirkpatrick, Gelatt, and Vecchi 1983), and tabu search (Glover and Laguna 1997).

Among these metaheuristic approaches, evolutionary algorithms (EAs) have shown great promise for generating solutions to large and difficult optimization problems and have been successfully used across a variety of application domains (Goldberg 1989; Forrest 1993; Houck, Joins, and Kay 1996; Bäck, Fogel, and Michalewicz 1997; Xiao, Bennett, and Armstrong 2007). The purpose of this article is to develop a formal, conceptual framework that can be used to unify the design and implementation of EAs for solving many geographical optimization problems. This research is motivated by the current status of using EAs as a problem-specific approach to solving geographical optimization problems, meaning that algorithms designed for some particular problems are not necessarily suitable for other problems. In this article, I demonstrate that it is possible to develop a conceptual framework to consolidate the patchwork quilt of algorithms and representation strategies in the domain of geographical optimization problems.

More details of the research background are discussed in the next section, which also provides a survey of EAs and their application in geographical problem solving. It is important to note that although the framework developed here is flexible, it may not be suitable for all types of geographical problems. The following section focuses on the design of a conceptual framework for the formulation of four types of optimization problems in geographical analysis. After the algorithms designed for the framework are presented, this framework is demonstrated using an example of solving political redistricting problems. Finally, I conclude this article by positioning it in the context of computational geography and discuss extensions of and limitations to this framework.

Background and Problem Statement


The 1960s saw three similar, but independent, conceptual developments in Germany and the United States that contributed to the emergent field of EAs, as they are called today. These developments include evolutionary programming (L. J. Fogel 1962), evolution strategies (Rechenberg 1965), and genetic algorithms (Holland 1962). Active interactions among these groups began in the 1980s, which led to the formation of new branches of research such as genetic programming (Koza 1992). The study of these algorithms collectively is called evolutionary computation and is now an area of intensive interdisciplinary research with a substantial literature established during the past two decades (see Goldberg 1989; Michalewicz 1996; Bäck, Fogel, and Michalewicz 1997; De Jong 2006).

An EA can generally be regarded as a computer program that simulates evolutionary processes in which the Darwinian notion of natural selection is maintained. An EA typically starts by randomly initializing a population of individuals, each of which is an encoded solution to the problem being addressed. In genetic algorithms, a solution is normally encoded as a string of bits (see Figure 1A), although other EA flavors may use real numbers or integers to represent solutions.

Figure 1. An example evolutionary algorithm (EA). A binary representation is used for a problem that maximizes $f = x_1 + 2x_2 - x_3^2$, where $x_1$, $x_2$, and $x_3$ are integer variables between 0 and 7. A binary string of length 9 is used to represent a solution in which each three-bit substring encodes an integer variable. (A) An encoded solution of $x_1 = 3$, $x_2 = 1$, and $x_3 = 3$ (011001011). (B) A one-point crossover operation for recombination. (C) A mutation operation. (D) An example procedure of the EA where a shaded box is used to illustrate solutions in a generation. The population size in this example is 6. Each individual is marked by a letter (A–F) identifying the individual in the population and a digit indicating the generation (0–2). Results of evolutionary operations performed during each iteration are enclosed in unshaded boxes. Individuals with a highlighted bit are manipulated by the mutation operation.
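
The encoding and the two operators described in Figure 1 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the article's implementation: the function names (`decode`, `fitness`, `one_point_crossover`, `mutate`) are hypothetical, and the crossover point and the mutated position are passed in explicitly rather than drawn at random so that the example is reproducible.

```python
BITS_PER_VAR = 3  # each variable is an integer between 0 and 7


def decode(bits):
    """Split a 9-bit string into the three integers (x1, x2, x3)."""
    return tuple(int(bits[i:i + BITS_PER_VAR], 2)
                 for i in range(0, len(bits), BITS_PER_VAR))


def fitness(bits):
    """Objective of Figure 1, to be maximized: f = x1 + 2*x2 - x3**2."""
    x1, x2, x3 = decode(bits)
    return x1 + 2 * x2 - x3 ** 2


def one_point_crossover(parent_a, parent_b, point):
    """Exchange the tails of two parents after position `point`."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def mutate(bits, position):
    """Flip the single bit at `position`."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


# Parents B0 and E0 of generation 0 in Figure 1D.
child_a, child_b = one_point_crossover("110111001", "010110001", 5)
print(decode("011001011"))            # the solution of Figure 1A
print(child_a, fitness(child_a))
print(child_b, fitness(child_b))
print(mutate("001011001", 8), fitness(mutate("001011001", 8)))
```

With a crossover point of 5 (an assumed value; the figure does not state it), the two children and their objective values agree with the first two recombination results listed for generation 0 in Figure 1D.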

After initialization, each individual is evaluated and assigned a fitness value based on its objective function value. Individuals that exhibit high fitness values are likely to be selected as parent solutions. A new generation of individuals is then created from these parent solutions using recombination operations (Figure 1B). A small proportion of the offspring may be randomly chosen to undergo a mutation operation (Figure 1C) that introduces new solutions to the population. The processes of evaluation, selection, recombination, and mutation continue until a termination condition such as a maximum number of generations is met. Figure 1D illustrates the first three steps (or generations) of an EA used for solving a simple problem. The overall procedure of a typical EA can be formally outlined as follows:
Algorithm EA. {General procedure}
1. t := 0
2. Initialize population P(t)
3. repeat until a termination criterion is satisfied
4.     Evaluate all individuals in P(t)
5.     Generate offspring from P(t)
6.     Copy offspring to P(t + 1)
7.     t := t + 1
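
The general procedure above might be written as a small, problem-independent loop in Python. The sketch below rests on several assumptions that the outline leaves open: binary tournament selection is used to pick parents, the termination criterion is a fixed number of generations, each pair of parents yields one child, and the names `evolve`, `random_bits`, `count_ones`, `crossover`, and `flip_one` are hypothetical. The toy objective (counting the ones in a 9-bit string) is there only to exercise the loop.

```python
import random


def evolve(init, fitness, recombine, mutate, pop_size=6,
           generations=30, mutation_rate=0.1, rng=None):
    """Generic EA loop; the comments refer to the steps of Algorithm EA."""
    rng = rng or random.Random()
    population = [init(rng) for _ in range(pop_size)]        # step 2
    for _ in range(generations):                             # step 3
        scores = [fitness(ind) for ind in population]        # step 4

        def pick():
            # Binary tournament: the fitter of two random individuals.
            a, b = rng.sample(range(pop_size), 2)
            return population[a] if scores[a] >= scores[b] else population[b]

        offspring = []
        while len(offspring) < pop_size:                     # step 5
            child = recombine(pick(), pick(), rng)
            if rng.random() < mutation_rate:
                child = mutate(child, rng)
            offspring.append(child)
        population = offspring                               # steps 6 and 7
    return max(population, key=fitness)


# Toy problem used only for illustration: maximize the number of ones.
def random_bits(rng, n=9):
    return "".join(rng.choice("01") for _ in range(n))


def count_ones(bits):
    return bits.count("1")


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def flip_one(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]


best = evolve(random_bits, count_ones, crossover, flip_one,
              rng=random.Random(42))
print(best, count_ones(best))
```

Because the four operators are passed in as functions, the same loop could in principle be reused with problem-specific initialization, recombination, and mutation operations.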

The use of EAs in geographical problem solving has been investigated by a number of researchers during the last two decades. Although the initial application of EAs in solving the p-median problem was unsatisfactory (Hosage and Goodchild 1986), successful results were obtained by other researchers such as Bianchi and Church (1993), Dibble and Densham (1993), Estivill-Castro and Torres-Velázquez (1999), Jaramillo, Bhadury, and Batta (2002), and Alp, Erkut, and Drezner (2003), who employed more effective representation strategies. The use of EAs has been extended to a wide range of geographical optimization problems. Hobbs (1996) demonstrated the use of an EA as a spatial clustering method. Bennett, Xiao, and Armstrong (2004) applied EAs to search for landscapes that exhibit interesting social, economic, and environmental values. Brookes (2001), Xiao, Bennett, and Armstrong (2002), and Xiao (2006) discussed the development of EAs that can be used to search for contiguous spatial clusters. Van Dijk, Thierens, and De Berg (2002) and Armstrong, Xiao, and Bennett (2003) successfully used EAs in the domain of cartography and spatial analysis. EAs have also been applied to planning problems where a variety of spatial constraints must be addressed (Krzanowski and Raper 1999; Huang, Cheu, and Liew 2004; Bação, Lobo, and Painho 2005; Li and Yeh 2005; Mooney and Winstanley 2006).

These applications, although mostly successful, have been mainly designed in a problem-specific manner that prevents the algorithms developed for one problem from being used on others. This situation, to some extent, contradicts the spirit of metaheuristic methods that aim to unify problem-solving approaches. Although some recent efforts have been made to design a general framework for solving optimization problems with spatial components, they still focus on a particular problem domain. Krzanowski and Raper (2001), for example, discussed the design of an EA that can be used to solve a class of set-covering problems. To address this issue, a theoretical framework is developed in this article for the purpose of unifying the design and implementation of EAs for solving a broader range of geographical optimization problems. The fundamental structure of this new framework is based on an understanding that many geographical optimization problems can be placed into several categories and design principles for evolutionary operations can be developed for each problem type.

Geographical Optimization Problems

Optimization problems addressed in the geography and related literature can be categorized using a variety of criteria (Scott 1971; Brandeau and Chiu 1989; Krarup and Pruzan 1990; Hamacher and Nickel 1998). Most previous classifications are based on the mathematical aspects of problem formulation, such as the calculation of objective function values, the number of objectives, and whether a network representation is used. Although these classifications are useful and can help researchers and students understand location and other optimization models, they are mainly suitable for categorizing location-allocation problems. The classification scheme for geographical optimization problems considered in this article is extended from previous work by Scott (1971, 4), who distinguishes between two types of spatially structured combinatorial problems: network or graph-theoretic problems, and grouping and partitioning problems. In this research, geographical optimization problems (similar to the term spatially structured combinatorial problems used by Scott) are represented using graph theory and, more important, they are categorized according to the characteristics of spatial constraints.4

A Typology of Geographical Optimization Problems

Generally, two broad classes of geographical optimization problems can be identified: those that require the partitioning (or grouping) of spatial entities and those that require the selection of a subset of spatial entities. These classes can be further differentiated according to whether the partitioning or selection is spatially constrained.

The first type of problem includes selection problems without spatial constraints, where a subset of spatial entities is selected to satisfy one or more objectives. There is no specific spatial requirement about how these selected entities should be arranged in space. Typical examples are the p-median (Hakimi 1965) and set-covering (Schilling, Jayaraman, and Barkhi 1993) problems.

For the second type, selection problems with spatial constraints, entities selected must comply with some spatial constraints. Site selection problems, for example, enforce the constraint that selected entities must be contiguous (see, for example, Cova and Church 2000; Williams 2002; Xiao 2006). Conversely, harvest problems in forestry management ensure that selected plots must not be adjacent (Hof and Bevers 1998; Murray and Church 1995).

Partitioning problems without spatial constraints are the third type of problem, in which each spatial entity is assigned a value with the goal of finding a combination of values for all entities such that a set of objectives can be optimized. A typical example is the study of optimal landscapes discussed by Bennett, Xiao, and Armstrong (2004).

The fourth problem type refers to partitioning problems with spatial constraints. For this type of problem, space must be partitioned such that spatial constraints are satisfied. Typical examples include the political redistricting problem where each district must be contiguous (Williams 1995; Altman 1998a).

It should be pointed out that the preceding typology is designed for the purpose of formulating geographical optimization problems in an EA domain.
Although there is no doubt that different typologies can be created, the value of the classification approach discussed here is demonstrated in the remainder of this article. One may argue that a selection problem is a special case of a partitioning problem, in which spatial units are partitioned into two subdivisions with one formed by selected entities and the other by unselected entities. When spatial constraints are required, however, they normally do not apply to the unselected units. Consequently, it is necessary to distinguish selection and partitioning problems. Also important is the distinction between the types of problem with and without spatial constraints. One can argue that spatial constraints are similar to other constraints because they can be formulated using linear programming techniques, but spatial constraints are different from nonspatial ones because heuristic methods that handle spatial constraints are significantly different from the methods without such requirements. Therefore, when EAs are used to solve these problems, it is necessary to give spatial constraints unique considerations.

Formulating Geographical Optimization Problems Using Graph Theory

Graph theory provides a flexible approach to specifying the relationships among spatial entities and has been widely used in the optimization literature (see, for example, Scott 1971; Evans and Minieka 1992). In this research, graphs are key components to formulate geographical optimization problems in the context of EAs. Based on this formulation, a formal classification of the types of geographical optimization problems is discussed in the next section.

A graph is defined as $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is a set of $n$ vertices, $E$ is a set of edges with each edge consisting of two vertices, and edge $(v_i, v_j) \in E$ if and only if vertices $v_i$ and $v_j$ are directly connected (Diestel 2000, 2). Using this notation, a vertex can be regarded as a spatial entity in a study area, and $E$ defines the spatial structure of all vertices (see Figure 2A).

Besides the graph, two additional sets that describe the attributes of vertices and edges also can be defined. The vertex attribute set is $A = \{a_1, a_2, \ldots, a_n\}$, where $a_i$ ($1 \le i \le n$) contains a set of attributes for the $i$th vertex in $V$. The basic attribute for an edge is the distance (or other metric of spatial interaction) between the vertices at the two ends. This distance can be extended to a set $D$ that depicts the connection (e.g., distance or transportation cost) between any two vertices. $D$ is essential for solving many spatial problems. Intuitively, $D$ can be regarded as a matrix and $d_{ij} \in D$ is the measure of cost between any two vertices, $i$ and $j$, in $V$.

Given graph $G$ that represents a geographical optimization problem, a feasible solution to the problem is denoted as another graph $G' = (V', E')$, where $V'$ is a set of vertices that forms the solution, and $E'$ depicts the spatial structure of the solution. For selection problems, $V'$ is a subset of $V$. For partitioning problems, $V'$ contains the same vertices as $V$, but those vertices that belong to one particular partitioning subdivision can be grouped as a unique subset. Thus, $V'$ can be denoted as a set of $p$ subdivisions $\{U_i\}$, where
$U_i$ is the $i$th subdivision ($1 \le i \le p$). The spatial relationship between vertices in $\{U_i\}$ is defined by $E'$, which is often a subset of $E$. It should be noted that a subdivision need not be spatially contiguous. Instead, in some cases, partitioning can refer to an aspatial classification, in which each unit (represented as a vertex) is assigned an integer that indicates a particular class (see, for example, Armstrong, Xiao, and Bennett 2003; Bennett, Xiao, and Armstrong 2004).

In summary, the input of a geographical optimization problem consists of sets $G$, $D$, and $A$. The goal of solving such a problem is to find a solution $G'$ that, without loss of generality, minimizes a set of $k$ objective functions: $F = (f_1, f_2, \ldots, f_k)$, where each objective function can be denoted as $f_i : G' \times G \times D \times A \to \mathbb{R}$ ($1 \le i \le k$).

Figure 2. A graph representation for geographical optimization problems. In each row, the left figure is a hypothetical map that shows the distribution of the spatial units, and the figure in the middle is an illustration of the corresponding graph, which is formally denoted using the notations on the right. (A) A graph with ten vertices and a set of edges. (B) An example selection problem without spatial constraints (three vertices are selected). (C) An example problem that selects a contiguous set of three vertices. (D) A partitioning problem of two subdivisions (shaded and unshaded) without spatial constraints. (E) A partitioning problem that requires two contiguous subdivisions.
Notation shown for each panel:
(A) $V = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$; $E = \{(1,2), (1,7), (1,10), (2,3), (2,5), (3,4), (3,5), (3,6), (4,6), (4,9), (5,6), (5,7), (6,7), (6,8), (6,9), (7,8), (7,10), (8,9), (8,10)\}$
(B) $V' = \{2, 6, 9\}$; $E' = *$
(C) $V' = \{3, 6, 7\}$; $E' = \{(3,6), (6,7)\}$
(D) $V' = \{\{1,2,4,5,6,8\}, \{3,7,9,10\}\}$; $E' = *$
(E) $V' = \{\{1,2,5,7,8,10\}, \{3,4,6,9\}\}$; $E' = \{(1,2), (1,7), (1,10), (2,5), (3,4), (3,6), (4,6), (4,9), (5,7), (6,9), (7,8), (7,10), (8,10)\}$

Graph Theory–Based Formal Problem Types

The four problem types already discussed can be formally denoted based on the relation between $V'$ and $V$ and the characteristics of $E'$. Two kinds of relations between $V'$ and $V$ can be recognized:

• $V' \subset V$. $V'$ is a subset of $V$. Selection problems belong to this group (see Figures 2B and 2C, where three vertices are selected).
• $|\bigcup U_i| = |V|$. In this case, all vertices in $V$ are used to construct a feasible solution and therefore $V'$ has the same size as $V$, although vertices in $V'$ may be assigned to different subdivisions or categories (denoted as $U_i$). Partitioning problems have this characteristic (see Figures 2D and 2E, where two subdivisions are created).

For the relations between $E'$ and $E$, it is known that $E$ defines the spatial relations among all vertices in $V$, and that, equivalently, $E'$ confines the spatial relations among vertices in a solution. One can identify the following two relations between $E'$ and $E$:

• $E' = *$. For problems without spatial constraints, spatial relations among vertices in a solution are not needed and it is unnecessary to specify the explicit contents of $E'$. Note that $E' = *$ means that $E'$ is not needed in the solution representation (or we do not care), but it does not imply that a spatial relation among vertices in $V'$ does not physically or logically exist (see Figures 2B and 2D).
• $E' \subset E$. Selection or partitioning problems with spatial constraints belong to this category. In the example of Figure 2C, the selected three vertices are contiguous. In the example of Figure 2E, two contiguous subdivisions are created.

Table 1 lists a set of Greek letters that can be used to represent the relation types discussed here.

Table 1. Relations between $V'$ and $V$ and between $E'$ and $E$

Relation         Code   Meaning
$V'$ and $V$            Subset: $V' \subset V$
$V'$ and $V$     E      Equal size: $|\bigcup U_i| = |V|$
$E'$ and $E$            Do not care: $E' = *$
$E'$ and $E$            Subset: $E' \subset E$

Using this notation, each type of problem can be identified by a combination of two conditions: the relation between $V'$ and $V$, and between $E'$ and $E$. Each combination is denoted by a string of two letters delimited by a slash (/):

•  / : selection problems without spatial constraints
•  / : selection problems with spatial constraints
• E/ : partitioning problems without spatial constraints
• E/ : partitioning problems with spatial constraints

Because all geographical optimization problems require the arrangement of spatial units, it is unrealistic to have a type of $V' = *$. Additionally, if $E'$ is identical to $E$, the solution will have exactly the same structure as the input, which makes the problem trivial to solve. Therefore, the problem type of $E' = E$ is not included in the typology. Finally, if a dot (·) is used to denote all possible conditions, one can specify the following general problem types:

•  /·: selection problems
• E/·: partitioning problems
• ·/ : problems with spatial constraints
• ·/ : problems without spatial constraints
• ·/·: all geographical optimization problem types
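
The graph formulation of this section can be made concrete with the ten-vertex example of Figure 2. The Python sketch below is only an illustration under stated assumptions: the helper names (`build_adjacency`, `induced_edges`, `is_contiguous`, `is_selection`, `is_partition`) are hypothetical, contiguity is taken to mean that the selected vertices form a connected subgraph of $G$, and $E'$ for a constrained problem is computed as the edges of $E$ whose two ends fall in the same subset.

```python
from collections import deque

# Input graph G = (V, E) of Figure 2A.
V = set(range(1, 11))
E = {(1, 2), (1, 7), (1, 10), (2, 3), (2, 5), (3, 4), (3, 5), (3, 6),
     (4, 6), (4, 9), (5, 6), (5, 7), (6, 7), (6, 8), (6, 9), (7, 8),
     (7, 10), (8, 9), (8, 10)}


def build_adjacency(edges):
    """Map each vertex to the set of vertices it is directly connected to."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


ADJ = build_adjacency(E)


def induced_edges(vertices, edges):
    """Edges of E whose two end points both lie in `vertices`."""
    return {(u, v) for u, v in edges if u in vertices and v in vertices}


def is_contiguous(vertices, adj):
    """Breadth-first search restricted to `vertices`."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w in vertices and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == vertices


def is_selection(solution, all_vertices):
    """Selection problems: V' is a proper subset of V."""
    return set(solution) < all_vertices


def is_partition(subdivisions, all_vertices):
    """Partitioning problems: the subdivisions U_i jointly use every vertex once."""
    total = sum(len(u) for u in subdivisions)
    return total == len(all_vertices) and set().union(*subdivisions) == all_vertices


print(is_selection({2, 6, 9}, V), is_contiguous({2, 6, 9}, ADJ))   # Figure 2B
print(is_contiguous({3, 6, 7}, ADJ), induced_edges({3, 6, 7}, E))  # Figure 2C
```

Keeping `E` (or the adjacency derived from it) outside the individual solutions means a solution only has to carry vertex identifiers; spatial relations can be looked up when a constraint needs to be checked.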

EA Design for Geographical Optimization Problems


Using the graph representation developed in the previous section, a set of principles for the design of EAs can be specified for each type of geographical optimization problem. The purpose here is not to develop universal operations that can be immediately plugged in to solve all problems. Instead, the focus is placed on the general principles that can be applied (and possibly extended) to specific problems. An example of using a subset of these principles is presented in the next section.

Encoding Strategies

Although the use of a binary representation is useful for many optimization problems (Goldberg 1989), it has become common for researchers to choose or design a representation technique that is natural to the problem being addressed (Falkenauer 1994; Michalewicz 1996; D. B. Fogel and Angeline 1997). Geographical optimization problems can be conveniently encoded using graphs. For selection problems, where $V'$ is a subset of $V$, it is not necessary to record the location of each vertex in a solution. Instead, the unique identification number of each vertex in an individual solution can be directly used. In a p-median problem, for example, each individual contains an array of $p$ integers that represent the facility nodes. For partitioning problems, because $|V'| = |V|$, a string of $n$ integers can be used to represent feasible solutions; the value of the $i$th element of the string indicates the subdivision assigned to the corresponding spatial entity.

For problems with spatial constraints, edge information must also be stored so that the spatial constraints can be effectively formulated and maintained. There are different approaches to storing edge information. In some previous studies, edge information is explicitly recorded for each vertex in an individual EA solution (Xiao, Bennett, and Armstrong 2002). This strategy, however, may be inefficient because whenever a solution is changed (e.g., some vertices in the solution are modified), edge information for all vertices in that solution must be accordingly updated. This issue can be addressed by making edge information, or $E$, available to the entire EA as a global variable. While $E$ is available, it is unnecessary to store edge information for each individual vertex redundantly.
Consequently, the data structure for problems with spatial constraints remains a string of integers (as identification numbers).

Spatial Constraint Handling

Three general strategies of constraint handling can be identified from the EA literature. The first approach uses a specifically designed encoding method such that infeasible solutions will not occur during a solution process. A decoding method is needed to translate the encoded information to a feasible solution. This method has been used to solve the traveling salesperson problem (Grefenstette et al. 1985). Although this represents an elegant way of handling constraints, it is impractical to design a general encoding method that can be used for many geographical optimization problems.

The second type of constraint handling method is based on a penalty function that can be used to decrease the fitness values for infeasible individual solutions such that they are unlikely to be included in the next generation (Michalewicz 1996). This approach has been especially effective for numerical optimization problems. Many geographical optimization problems, however, often have a large number of infeasible solutions in an EA population, which makes it ineffective to use penalty functions to promote feasibility that is exhibited by only a small number of individuals (see Bergey, Ragsdale, and Hoskote 2003).

The third type of constraint handling method relies on various algorithms that will only create feasible solutions, sometimes through a repair mechanism that converts infeasible solutions to feasible ones (Michalewicz 1996). Previous research has found this method to be flexible and effective when applied to geographical problems (Xiao, Bennett, and Armstrong 2002; Bergey, Ragsdale, and Hoskote 2003; Xiao 2006). Although the implementation of this approach can be problem dependent, with the graph representation defined it is possible to design algorithms for classes of problems that share common properties. The EA framework for solving geographical optimization problems discussed in this article is based on this approach.

Design of Initialization Strategies

For selection problems, an initialization operation (called Algorithm I1) can be designed based on the use of an accretion procedure in which a feasible solution is constructed from a set of seed vertices. In this algorithm, and others that follow, variable $p$ is used to denote the number of vertices to be selected, or the number of subdivisions to be partitioned, and set $V'$ is always used to denote the set of vertices in a feasible solution.
Algorithm I1. {Accretion for selection}
Input: $V$, $E$, $p$
Output: $V'$
1. $i := 0$, $V' := \emptyset$, $V_1 := V$
2. repeat until $i = p$
3.     Randomly select a vertex ($v$) from $V_1$
4.     Add $v$ into $V'$
5.     Update $V_1$ so that it only contains eligible vertices
6.     $i := i + 1$
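
Algorithm I1 might be sketched in Python as follows, using a 4 × 4 grid of units numbered 1 to 16 row by row. Several details are assumptions made for illustration rather than part of the algorithm as stated: adjacency is rook adjacency, the constraint is chosen by a hypothetical `constraint` argument (`"contiguous"`, `"non_adjacent"`, or `None`), and a `ValueError` is raised if no eligible vertex remains before $p$ vertices are selected, a case the outline does not address.

```python
import random


def grid_adjacency(rows, cols):
    """Rook adjacency for a grid numbered 1..rows*cols, row by row."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c + 1
            adj[v] = set()
            if r > 0:
                adj[v].add(v - cols)
            if r < rows - 1:
                adj[v].add(v + cols)
            if c > 0:
                adj[v].add(v - 1)
            if c < cols - 1:
                adj[v].add(v + 1)
    return adj


def eligible(adj, selected, constraint):
    """Vertices that can join `selected` without violating the constraint."""
    unselected = set(adj) - selected
    if constraint == "contiguous" and selected:
        return {v for v in unselected if adj[v] & selected}
    if constraint == "non_adjacent":
        return {v for v in unselected if not (adj[v] & selected)}
    return unselected  # no spatial constraint, or first seed vertex


def accrete_selection(adj, p, constraint=None, rng=None):
    """Algorithm I1: grow a feasible selection of p vertices by accretion."""
    rng = rng or random.Random()
    selected = set()                                   # V' := empty set
    candidates = eligible(adj, selected, constraint)   # V1 := V
    while len(selected) < p:
        if not candidates:
            raise ValueError("no eligible vertex left; try again")
        v = rng.choice(sorted(candidates))             # step 3
        selected.add(v)                                # step 4
        candidates = eligible(adj, selected, constraint)  # step 5
    return selected


adj = grid_adjacency(4, 4)
print(sorted(eligible(adj, {6}, "contiguous")))
print(sorted(accrete_selection(adj, 3, "contiguous", random.Random(1))))
print(sorted(accrete_selection(adj, 3, "non_adjacent", random.Random(1))))
```

Only the `eligible` function depends on the kind of spatial constraint, so other constraints could be supported by replacing it while the accretion loop stays the same.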

Figure 3. Initialization for (A) a constraint that requires all selected vertices to be contiguous and (B) a requirement that does not allow selected entities to be adjacent. For both (A) and (B), the process starts from the left figure and ends at the right, as shown by the arrows. A black circle represents a selected vertex, and a gray circle represents an eligible vertex maintained in $V_1$.

In Algorithm I1, a vertex is called eligible if it can be added into V′ without violating spatial constraints. Here, a set V1 is maintained to contain only eligible vertices. Figure 3 illustrates example initialization strategies for two kinds of spatial constraints. When no spatial constraint is required, every unselected vertex is eligible and V1 will contain all unselected vertices in V.

A second initialization operation (Algorithm I2) can be designed to generate initial feasible solutions to partitioning problems ( E /). Step 3 of Algorithm I2 will yield V′ = {Ui}, where Ui (i = 1, . . . , p) contains one and only one unique vertex that serves as the seed of the ith subdivision. The specific procedure used in step 4 may vary for different problems, although the general principle of the algorithm still holds and the concept of eligible vertices can also be applied. Figure 4 shows an example of this procedure for problems such as political redistricting. When no spatial constraint is specified, step 4 can be implemented in a random fashion with each vertex randomly assigned to a subdivision (see Bennett, Xiao, and Armstrong 2004).

Algorithm I2. {Accretion for partitioning}
Input: V, E, p
Output: V′
1. i := 0, V′ := ∅
2. Randomly select p vertices from V
3. Add each of the p vertices to a subdivision (Ui) in V′
4. Assign each of the remaining n − p vertices in V to a subdivision
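One possible reading of Algorithm I2 for the contiguity case (as in Figure 4) is sketched below in Python. The article leaves step 4 open, so the frontier-based assignment used here is an assumption, as are the helper names `grid_graph`, `is_contiguous`, and `accretion_partition` and the dictionary that maps each vertex to a subdivision index.

```python
import random


def grid_graph(rows, cols):
    """Rook-adjacency graph on a grid; vertices are numbered 1..rows*cols by row."""
    adj = {r * cols + c + 1: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c + 1
            if c + 1 < cols:
                adj[v].add(v + 1)
                adj[v + 1].add(v)
            if r + 1 < rows:
                adj[v].add(v + cols)
                adj[v + cols].add(v)
    return adj


def is_contiguous(adj, cells):
    """True if the non-empty vertex set `cells` forms one connected piece."""
    cells = set(cells)
    if not cells:
        return False
    start = next(iter(cells))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in cells and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == cells


def accretion_partition(adj, p, rng=random):
    """Algorithm I2 sketch with contiguity: returns {vertex: subdivision index}."""
    seeds = rng.sample(sorted(adj), p)                 # step 2
    label = {seed: i for i, seed in enumerate(seeds)}  # step 3
    while len(label) < len(adj):                       # step 4
        # Eligible vertices: unassigned and adjacent to an existing subdivision.
        frontier = sorted({u for v in label for u in adj[v] if u not in label})
        if not frontier:
            raise ValueError("graph is not connected; some vertices are unreachable")
        u = rng.choice(frontier)
        # Join one of the subdivisions that u touches, which preserves contiguity.
        label[u] = rng.choice(sorted({label[v] for v in adj[u] if v in label}))
    return label


if __name__ == "__main__":
    adj = grid_graph(4, 4)
    plan = accretion_partition(adj, 2)
    for d in range(2):
        print(d, sorted(v for v in plan if plan[v] == d))
```

Because a vertex only ever joins a subdivision it already touches, every subdivision stays contiguous throughout the process, so no separate repair step is needed in this sketch.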

Design of Recombination Operations

The recombination operations reported in the EA literature share a similar behavior: the components of two selected parent individuals are exchanged to create new child individual solutions (cf. Figure 1B). For geographical optimization problems, an overlay and repair approach can be used to recombine two individual solutions and to create new ones. In this approach, an overlay operation is conducted first to combine the vertices of two solutions into a temporary set, which normally will contain more vertices than a feasible solution (see examples that follow).


Figure 4. Initialization for a partitioning problem that requires all subdivisions (two in this example) to be contiguous. The four figures show the progress of the process (indicated by the arrows) starting from two seed vertices (6 and 11), each assigned to a different subdivision.

A repair procedure is therefore needed to form a feasible solution. For selection problems ( /), the overlay operation will result in a superset that contains at least one feasible solution. The repair process can be developed by iteratively selecting vertices from the superset and adding them to a partial solution until a feasible solution is created. A recombination operation, called Algorithm R1, based on this mechanism is outlined as follows:
Algorithm R1. {Overlay and repair: selection}
Input: V, E, V1, V2, p
Output: V′
1. V′ := ∅, V3 := V1 ∪ V2
2. repeat until |V′| = p
3.    Randomly select a vertex (v) from V3
4.    Create a partial solution using v and V′
5.    if the partial solution does not violate spatial constraints then
6.       V′ := V′ ∪ {v}
7.       Remove v from V3
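A compact Python sketch of Algorithm R1 is given below for illustration. The function name `overlay_repair_select` and the optional `feasible` predicate are hypothetical, and the sketch adopts one reading of step 7, in which a vertex leaves the overlay set only once it has been accepted. Filtering the overlay set for vertices that pass the constraint check and then choosing one at random is used in place of repeated sampling; the two are equivalent in distribution.

```python
import random


def overlay_repair_select(parent1, parent2, p, feasible=None, rng=random):
    """Algorithm R1 sketch: build a p-vertex child from the union of two parents.

    `feasible(partial)` is a hypothetical predicate that returns True when a
    partial solution respects the spatial constraints; None means no constraint.
    """
    pool = set(parent1) | set(parent2)  # step 1: overlay (V3)
    child = set()                       # V'
    while len(child) < p:               # step 2
        # Steps 3 to 5: keep only vertices whose addition passes the check.
        ok = sorted(v for v in pool if feasible is None or feasible(child | {v}))
        if not ok:
            raise ValueError("no vertex in the overlay can extend the partial solution")
        v = rng.choice(ok)
        child.add(v)                    # step 6
        pool.discard(v)                 # step 7
    return child


if __name__ == "__main__":
    print(overlay_repair_select({1, 2, 3}, {3, 4, 5}, 3))
```

A contiguity predicate built on the graph (V, E) could be passed as `feasible`; it is left out here to keep the example short.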

For some selection problems, however, the overlay and repair procedure may not always yield solutions that are different from the parent solutions. For example, in cases when the two parent solutions V1 and V2 do not have overlapping vertices (i.e., V1 ∩ V2 = ∅), the repair procedure will return either V1 or V2. To address this issue, Xiao (2006) developed a local search approach in which new solutions are created based on a single (instead of two) individual solution. Although the mechanism of such an operation seems to be similar to a mutation method (described in the next section), it performs in a different fashion. First, a local search is conducted based on the fitness values of individuals. That is, individuals with high fitness values have a high chance of being modified using the local search method. Mutation operations, however, are often performed regardless of fitness values. Second, a local search is often designed to improve individual solutions in terms of their objective function values, whereas mutation operations tend to introduce randomness into an existing solution that may not necessarily be improved. This type of operation is also called an asexual crossover or transposition in which new (better) solutions are created based on a single solution (see Simões and Costa 2000).


Figure 5. Overlay and repair for a partitioning problem (p = 3) that requires contiguity of each subdivision. A cell here represents a vertex or a spatial unit and an edge exists between two cells if they are adjacent and on the same row or column. Numbers are used to uniquely identify each cell. (A) A hypothetical parent solution; (B) another hypothetical parent solution; (C) the result of overlay that creates five subdivisions; (D) a possible result of a repair procedure.

For partitioning problems with spatial constraints ( E / ), a set of new distinct subdivisions will be generated after the overlay operation. Figure 5, for example, shows the result of overlaying two solutions in (A) and (B) to yield an intermediate solution as shown in (C), which can be repaired as shown in Figure 5D. This operation is outlined in Algorithm R2. For the particular example in Figure 5, step 1 will generate a set V3 = {{1,2}, {3,5,6}, {4}, {7,8}, {9}} and the merging process (steps 4–6) merges the subdivisions of {1,2} and {4}, and of {7,8} and {9}.
Algorithm R2. {Overlay and repair based on merging}
Input: V, E, V1, V2, p
Output: V′
1. V3 := {Ui | Ui is a subdivision after overlaying V1 and V2}
2. V′ := ∅
3. if |V3| = p, then V′ := V3 and stop
4. repeat until |V3| = p
5.    Randomly select two subdivisions, Ui and Uj, from V3
6.    Merge Ui and Uj if doing so does not violate the constraints
7. V′ := V3
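The overlay and merging steps of Algorithm R2 can be sketched in Python as shown below. The two parent partitions are hypothetical; they were chosen only so that their overlay reproduces the set V3 quoted in the text for Figure 5. The helper names are also assumptions, and the sketch picks a random pair among the subdivisions that touch each other, which is equivalent to drawing random pairs and rejecting those that would break contiguity.

```python
import random
from itertools import product


def grid_graph(rows, cols):
    """Rook-adjacency graph on a grid; vertices are numbered 1..rows*cols by row."""
    adj = {r * cols + c + 1: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c + 1
            if c + 1 < cols:
                adj[v].add(v + 1)
                adj[v + 1].add(v)
            if r + 1 < rows:
                adj[v].add(v + cols)
                adj[v + cols].add(v)
    return adj


def components(adj, cells):
    """Split a set of vertices into its connected components."""
    remaining, parts = set(cells), []
    while remaining:
        stack = [remaining.pop()]
        part = set(stack)
        while stack:
            v = stack.pop()
            for u in adj[v] & remaining:
                remaining.remove(u)
                part.add(u)
                stack.append(u)
        parts.append(frozenset(part))
    return parts


def overlay(adj, part1, part2):
    """Step 1: pairwise intersections of subdivisions, split into contiguous pieces."""
    return [c for a, b in product(part1, part2)
            for c in components(adj, set(a) & set(b))]


def touching(adj, a, b):
    """True if some vertex of subdivision a is adjacent to a vertex of b."""
    return any(adj[v] & b for v in a)


def overlay_and_merge(adj, part1, part2, p, rng=random):
    """Algorithm R2 sketch: overlay two partitions, then merge down to p pieces."""
    pieces = overlay(adj, part1, part2)
    while len(pieces) > p:                                   # steps 4 to 6
        pairs = [(i, j) for i in range(len(pieces))
                 for j in range(i + 1, len(pieces))
                 if touching(adj, pieces[i], pieces[j])]
        if not pairs:
            raise ValueError("no adjacent subdivisions left to merge")
        i, j = rng.choice(pairs)
        merged = pieces[i] | pieces[j]
        pieces = [s for k, s in enumerate(pieces) if k not in (i, j)] + [merged]
    return pieces                                            # step 7


if __name__ == "__main__":
    adj = grid_graph(3, 3)
    parent_a = [{1, 2, 4}, {3, 5, 6}, {7, 8, 9}]  # hypothetical parent
    parent_b = [{1, 2}, {3, 5, 6, 9}, {4, 7, 8}]  # hypothetical parent
    print(sorted(sorted(s) for s in overlay(adj, parent_a, parent_b)))
    print(sorted(sorted(s) for s in overlay_and_merge(adj, parent_a, parent_b, 3)))
```

Splitting each intersection into connected components is an extra safeguard added in this sketch, because the intersection of two contiguous subdivisions is not guaranteed to be contiguous.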

Algorithm R2 also can be used to solve partitioning problems without spatial constraints ( E / ) by simply merging Ui and Uj in step 6 without checking constraint violation. A more flexible recombination approach, called Algorithm R3, for this type of problem can be designed to assign a vertex to one of the subdivisions in its parent solutions.

Algorithm R3. {Recombination: partitioning without spatial constraints}
Input: V, E, V1, V2, p
Output: V′
1. V′ := {Ui | Ui = ∅, 1 ≤ i ≤ p}
2. for each vertex v in V
3.    Randomly set i to be one of v's subdivisions in V1 and V2
4.    Add v to Ui in V′

Design of Mutation Operations

A simple, straightforward way of introducing randomness into a current EA population is to replace an existing solution with a new one created using an initialization operator. This method, called Algorithm M1, is described next.

Algorithm M1. {Mutation: reinitialization}
Input: V, E, V′
Output: V′
1. Create a new solution V′ using Algorithm I1 or I2

A more complicated mutation approach is to create a new solution based on an existing one (see Algorithm M2). For this type of algorithm, a mechanism is designed to modify the morphology of an existing solution by exchanging some of its vertices either with unselected vertices (for selection problems) or among different subdivisions (for partitioning problems). Here a set V2 is maintained to contain all moveable vertices in V′. A vertex is moveable if it can be removed from V′ without creating a partial solution that violates spatial constraints, if any. A repair procedure is needed subsequently to create a feasible solution. For problems without spatial constraints, all vertices in V′ are moveable and V2 is identical to V′.

Algorithm M2. {Mutation: morphing}
Input: V, E, V′
Output: V′
1. Find all moveable vertices in V′ and put them in V2
2. Randomly select a vertex (v) from V2
3. UPDATE(V′, v)

Step 1 of Algorithm M2 yields a set of all moveable vertices in V′. In step 3, function UPDATE is given in a general form without further specifications; this is because each specific problem may have different requirements and it is impossible to have a one-size-fits-all algorithm. In essence, however, function UPDATE(V′, v) repairs V′ by replacing v with a new vertex (selection problems) or assigning v to a new subdivision (partitioning problems); this function also guarantees that the spatial constraint, if any, is not violated. Let us consider the solution in Figure 3A (the figure to the right) as an example. Here, because we have V2 = {7, 10}, we can remove either vertex 7 or 10 and the resulting partial solution, {6, 10} or {6, 7}, respectively, still satisfies the contiguity constraint. If vertex 7 is removed, vertices 2, 3, 5, 9, 10, 11, and 14 become eligible vertices, each of which can be added into the partial solution (caused by the removal of vertex 7) to form a feasible solution.

Algorithm M2 can be tailored for problems with or without spatial constraints. Moreover, it is possible to devise randomized versions of Algorithm M2 specifically for problems without spatial constraints (/ ). These algorithms, M3 for selection problems and M4 for partitioning problems, as outlined next, swap the memberships of two randomly selected vertices.
Algorithm M3. {Mutation for selection without spatial constraints}
Input: V, E, V′
Output: V′
1. Randomly select a vertex (v) from V, v ∉ V′
2. Randomly select a vertex (v′) from V′
3. Remove v′ from V′
4. Add v into V′

Algorithm M4. {Mutation for partitioning without spatial constraints}
Input: V, E, V′
Output: V′
1. Randomly select two subdivisions Ui and Uj
2. Randomly select a vertex (vi) from Ui
3. Randomly select a vertex (vj) from Uj
4. Assign vi to subdivision Uj
5. Assign vj to subdivision Ui
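The two swap mutations can be illustrated with a short Python sketch. The function names and data structures (a set of selected vertices for M3, a vertex-to-subdivision dictionary for M4) are assumptions made for this example; both functions return a new solution and leave the input untouched.

```python
import random


def mutate_selection_swap(all_vertices, selected, rng=random):
    """Algorithm M3 sketch: swap one selected vertex with one unselected vertex."""
    selected = set(selected)
    v_in = rng.choice(sorted(set(all_vertices) - selected))  # step 1: v not in V'
    v_out = rng.choice(sorted(selected))                     # step 2: v' in V'
    return (selected - {v_out}) | {v_in}                     # steps 3 and 4


def mutate_partition_swap(label, rng=random):
    """Algorithm M4 sketch: exchange two vertices between two subdivisions.

    `label` maps each vertex to the index of its subdivision.
    """
    i, j = rng.sample(sorted(set(label.values())), 2)             # step 1
    vi = rng.choice(sorted(v for v, d in label.items() if d == i))  # step 2
    vj = rng.choice(sorted(v for v, d in label.items() if d == j))  # step 3
    mutated = dict(label)
    mutated[vi] = j                                               # step 4
    mutated[vj] = i                                               # step 5
    return mutated


if __name__ == "__main__":
    print(mutate_selection_swap(range(1, 10), {1, 2, 3}))
    print(mutate_partition_swap({1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}))
```

Both operations preserve the size of the selection and the size of every subdivision, which is why they suit problems where only membership, not spatial structure, matters.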

Incorporating Problem-Specific Knowledge

Knowledge about solutions to geographical optimization problems has proven to be useful in the development of traditional heuristic methods (see Densham and Rushton 1991, 1992). Such knowledge can also be incorporated into the design of EAs for solving geographical optimization problems. A variety of approaches can be used in all stages of EA development (representation, initialization, and evolutionary operations). More specifically, in this article, I categorize potential incorporation strategies into two types.

The first type of incorporation approach can be implemented on a macro level in which knowledge about the entire problem and its solutions is used in an EA. An example of using macro-level problem-specific knowledge is the design of representation strategy in solving the p-median problem with EAs (or genetic algorithms in this specific case). An early application of genetic algorithms to the p-median problem by Hosage and Goodchild (1986) showed relatively poor performance, partly due to the representation strategy of using a binary string with a length equal to the number of all demand nodes. Since then, significant improvements were made by researchers when they changed the representation of the problem by using an integer string with the length equal to the number of facilities to locate (Bianchi and Church 1993; Dibble and Densham 1993). Studies along this research line have demonstrated that EAs can be successfully used to find high-quality solutions to the p-median problem (Estivill-Castro and Torres-Velázquez 1999; Alp, Erkut, and Drezner 2003; Mladenovic et al. 2007). Another example of incorporating macro-level problem-specific knowledge into EAs is the use of existing solutions to a problem. Bennett, Xiao, and Armstrong (2004), for example, found that solutions generated by other methods can be used as part of an initial EA population and can improve the overall EA performance.

The second type of approach to incorporating problem-specific knowledge is often implemented on a micro level because of the use of existing heuristic methods in the design of evolutionary operations in EAs. This idea follows the trend of hybridization between EAs and other search algorithms that are often derived from classic problem-specific algorithms (Fox 1993; Anderson 1996; Preux and Talbi 1999). Previous research has shown that the use of a hybridization strategy in EAs can greatly improve EA performance (Krzanowski and Raper 1999; Ruiz-Andino et al. 2000; Lin, Hwang, and Wang 2001). In the algorithms designed in this article, many steps are executed in a random fashion. For example, step 3 of Algorithm R1 specifies that a vertex is randomly selected from a set. This operation, however, can be designed in a more heuristic way by utilizing problem-specific knowledge. Many researchers have imposed a greedy operation such that this process will only improve the solution being considered. The next section discusses the implementation of some knowledge-based EA operations. Although the use of problem-specific knowledge in EA design has shown generally positive results, hybridization must be executed with care, especially for large-size problems with a great number of local optimal solutions. The randomness used in EAs underlines a sense of emergence, a hope that high-quality solutions will emerge from a random start (Holland 1975, 1998). Excessive use of problem-specific knowledge may decrease the ability of the hybrid algorithm to escape local optima.

Applications
The algorithms already described provide a guideline for the design and implementation of EAs to solve the four types of geographical optimization problems. In general, an EA designed using this framework can be denoted as a 4-tuple:

$$\mathrm{GEA} = \langle C, I, R, M \rangle,$$

where C is / , / , E / , or E / referring to the representation and encoding strategy, I is a set of initialization algorithms, R is a set of recombination methods, and M is a set of mutation methods. Table 2 summarizes these algorithms and lists example EAs for each problem type. For each of the evolutionary operations, it is possible to have multiple implementations. For example, both M1 and M2 can exist in an EA, but only one of them will be (randomly) chosen at one time.

A variety of applications can be used to demonstrate the use of the framework set forth in this article. An example of using this framework to solve partitioning problems without spatial constraints is illustrated in the work of Bennett, Xiao, and Armstrong (2004), where the combination of land use types is sought to satisfy objectives such as maximizing environmental benefit and minimizing public investment. Xiao (2006) employed some major concepts of this framework (e.g., graph representation, recombination, and mutation) for selection problems with spatial constraints in a case study of site search problems where a contiguous set of land parcels must be identified. For selection problems without spatial constraints, recent EA implementations in solving the p-median problem employed representation strategies and evolutionary operations that are similar to what has been discussed here.

In this section, the use of the EA framework is demonstrated by applying it to a partitioning problem with spatial constraints ( E / ). More specifically, political redistricting problems are addressed. Redistricting problems are critical in the political system of the United States and solving them requires subdivision of a region (e.g., a state) into a number of districts that are as equal in population as possible. Two fundamental criteria (i.e., contiguity and population equality) are required by law, although many states may have additional requirements such as compact districts. A feasible redistricting plan must be contiguous, meaning that the spatial units of a district form an entire region and one can walk between any two points in the district without leaving it. Each district, of course, also must have at least one spatial unit. Political redistricting problems have been studied from various perspectives in the geography and related literature (Morrill 1976, 1981; Williams 1995; Eagles, Katz, and Mark 2000). This type of problem has been considered as a typical NP-complete problem (Altman 1998a);5 exact solution approaches are generally inefficient and many heuristic methods have been developed (Williams 1995). In practice, redistricting plans are often created using an interactive computer program such as a geographical information system (GIS) that allows users to modify the membership of each spatial unit and monitor the demographic impact when a change to a plan is made (Altman, MacDonald, and McDonald 2005).6 Although these tools are useful, they typically rely on manual, interactive processes that may not sufficiently guide practitioners toward high-quality solutions. The focus of this section is the application and evaluation of the EA framework in solving a particular type of problem.
Table 2. EA operations for different geographical optimization problem types

Problem type                                  Initialization   Recombination   Mutation      Example EAs
Selection, with spatial constraints           I1               R1              M1, M2        ⟨/ , I1, R1, {M1, M2}⟩
Selection, without spatial constraints        I1               R1              M1, M2, M3    ⟨/ , I1, R1, M3⟩
Partitioning, with spatial constraints        I2               R2              M1, M2        ⟨E/ , I1, R1, M2⟩
Partitioning, without spatial constraints     I2               R2, R3          M1, M2, M4    ⟨E/ , I1, R1, {M2, M4}⟩

Note: EA = evolutionary algorithm.

To keep the discussion relatively simple, the application concentrates on population equality, the primary concern of political redistricting; district compactness is not explicitly considered.7 Being aware of the variety of existing solution approaches in the literature (see Morrill 1981; Williams 1995), I note that a full discussion of using EAs in this area warrants a longer and more detailed article, especially in a multiobjective context where EAs have been widely used (Deb 2001; Xiao, Bennett, and Armstrong 2007). An EA that can be used to find high-quality redistricting plans can also be extended to include more criteria such as compactness and minority representation. The problem has population equality as the single objective function:

$$\min\; y = 100 \cdot \frac{1}{P} \sum_{j=1}^{r} \left| p_j - \bar{P} \right|,$$

where $P$ is the total population, $r$ is the number of districts to be created, $\bar{P}$ is the ideal population for all districts (computed as the rounded integer value of $P/r$), and $p_j$ is the population size of the $j$th district. In other words, the goal of solving the problem is to minimize the overall population deviation of each district from the ideal population.

EA Implementation for Political Redistricting Problems

The EA used to solve political redistricting problems can be denoted as ⟨E/ , I2, R2, (M1, M2)⟩. The procedure outlined in Algorithm I2 (illustrated in Figure 4) is used to create initial redistricting plans. For recombination operations, two versions are created, both based on Algorithm R2 (see also Figure 5). The first version of recombination operation is a straightforward application of Algorithm R2 where step 5 is randomly executed. In the second version, problem-specific knowledge is incorporated in step 5 in Algorithm R2. Here, instead of randomly choosing two subdivisions after overlay to merge, the algorithm (called R2K, details provided later) randomly chooses one (step 5.1) and then chooses another subdivision that has the smallest population (step 5.2). In this way, the merging process (step 6) is encouraged to produce new subdivisions with low population deviation from each other to pursue the goal of population equality.

Algorithm R2K. {Overlay and repair based on merging, with knowledge}
Input: V, E, V1, V2, p
Output: V′
1. V3 := {Ui | Ui is a subdivision after overlaying V1 and V2}
2. V′ := ∅
3. if |V3| = p, then V′ := V3 and stop
4. repeat until |V3| = p
5.1   Randomly select a subdivision, Ui, from V3
5.2   Choose the subdivision with the smallest population (Uj) from the neighboring subdivisions of Ui
6.    Merge Ui and Uj
7. V′ := V3

The mutation operations used here also have two versions: random and knowledge-based. The random version is designed using the guideline specified in Algorithm M2 and is called M2R.

Algorithm M2R. {(Random) mutation for redistricting}
Input: V, E, V′
Output: V′
1. Find all moveable vertices in V′ and put them in V2
2. Randomly select a vertex (v) from V2
3. Randomly assign v to one of its neighboring districts

The knowledge-based mutation operation is called M2K, also designed based on Algorithm M2. Algorithm M2K differs from M2R in terms of how a moveable vertex is chosen in step 2. Here, the moveable vertex with the smallest population is chosen. In this way, the overall population of each district will be least disturbed when a new solution with a different spatial configuration is created.

Algorithm M2K. {(Knowledge-based) mutation for redistricting}
Input: V, E, V′
Output: V′
1. Find all moveable vertices in V′ and put them in V2
2. Select from V2 the vertex (v) that has the smallest population
3. Randomly assign v to one of its neighboring districts
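The random and knowledge-based mutations differ only in how the moveable vertex is picked, which the following Python sketch tries to make explicit. It is an illustration under assumptions rather than the article's code: a plan is stored as a vertex-to-district dictionary, a vertex counts as moveable only if it borders another district and its removal leaves its own district non-empty and contiguous, and the names `moveable_vertices` and `mutate_redistricting` are hypothetical.

```python
import random


def grid_graph(rows, cols):
    """Rook-adjacency graph on a grid; vertices are numbered 1..rows*cols by row."""
    adj = {r * cols + c + 1: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c + 1
            if c + 1 < cols:
                adj[v].add(v + 1)
                adj[v + 1].add(v)
            if r + 1 < rows:
                adj[v].add(v + cols)
                adj[v + cols].add(v)
    return adj


def is_contiguous(adj, cells):
    """True if the non-empty vertex set `cells` forms one connected piece."""
    cells = set(cells)
    if not cells:
        return False
    start = next(iter(cells))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in cells and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == cells


def moveable_vertices(adj, label):
    """Step 1: boundary vertices whose removal keeps their district contiguous."""
    districts = {}
    for v, d in label.items():
        districts.setdefault(d, set()).add(v)
    movers = []
    for v, d in label.items():
        borders_other = any(label[u] != d for u in adj[v])
        rest = districts[d] - {v}
        if borders_other and rest and is_contiguous(adj, rest):
            movers.append(v)
    return sorted(movers)


def mutate_redistricting(adj, label, pop=None, rng=random):
    """M2R when `pop` is None; M2K when `pop` maps each vertex to its population."""
    movers = moveable_vertices(adj, label)
    if not movers:
        return dict(label)                       # nothing can move; return a copy
    if pop is None:
        v = rng.choice(movers)                   # M2R, step 2
    else:
        v = min(movers, key=lambda u: pop[u])    # M2K, step 2
    targets = sorted({label[u] for u in adj[v]} - {label[v]})
    mutated = dict(label)
    mutated[v] = rng.choice(targets)             # step 3
    return mutated


if __name__ == "__main__":
    adj = grid_graph(3, 3)
    plan = {1: 0, 2: 0, 4: 0, 3: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2}
    pop = {v: 10 * v for v in adj}               # made-up populations
    print(mutate_redistricting(adj, plan))            # random version
    print(mutate_redistricting(adj, plan, pop=pop))   # knowledge-based version
```

Since the moved vertex is adjacent to its receiving district and its old district is checked before the move, both districts remain contiguous after the mutation.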

Based on the preceding implementations, the EA for political redistricting, named GEAPR, is outlined next. The output of GEAPR is a set of solutions {(V, E)}, one of which is the best solution found by the EA, that is, the one with the lowest objective function (population deviation) value. GEAPR can be implemented in two versions: random and knowledge based. The difference is in steps 5.2 and 5.3. For the random version, Algorithm R2 is used in step 5.2 for recombination and the mutation operation in step 5.3 is executed by randomly choosing between Algorithms M1 and M2R. The knowledge-based version uses Algorithm R2K in step 5.2 and randomly chooses between M1 and M2K in step 5.3.



Algorithm GEAPR. {EA for a partitioning problem with spatial constraints}
Input: V, E
Output: {(V, E)}
1. t := 0
2. Initialize population P(t) using Algorithm I2 based on the E / encoding
3. repeat until a user-specified termination criterion is met
4.    Evaluate each individual in P(t)
5.    Generate offspring of P(t) by
5.1      Selecting parent solutions from P(t)
5.2      Using a recombination operation to create new individuals
5.3      Applying a mutation operation
6.    Copy the new individuals to P(t)
7.    t := t + 1
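The overall loop of GEAPR, together with the fitness scaling and proportional selection used in this section, can be sketched generically in Python. Everything named here (`scaled_fitness`, `run_ea`, the operator callbacks, and the toy integer problem in the usage example) is an assumption for illustration. Two details are not specified in the text and are filled in with labeled assumptions: when all objective values are equal, every individual receives the same fitness, and when recombination is skipped the first parent is copied.

```python
import random


def scaled_fitness(objectives):
    """Fitness (y_max - y) / (y_max - y_min) for a minimization problem."""
    y_max, y_min = max(objectives), min(objectives)
    if y_max == y_min:
        return [1.0] * len(objectives)  # assumption: equal fitness when all tie
    return [(y_max - y) / (y_max - y_min) for y in objectives]


def run_ea(init, objective, recombine, mutate, pop_size=50, iterations=100,
           p_recombine=0.95, p_mutate=0.05, rng=random):
    """Generational EA loop; returns the best individual seen in any generation."""
    population = [init(rng) for _ in range(pop_size)]            # step 2
    best = min(population, key=objective)
    for _ in range(iterations):                                  # step 3
        fitness = scaled_fitness([objective(s) for s in population])  # step 4
        offspring = []
        while len(offspring) < pop_size:                         # step 5
            # Step 5.1: roulette-wheel (proportional) selection of two parents.
            a, b = rng.choices(population, weights=fitness, k=2)
            # Step 5.2: recombine with high probability, else copy a parent
            # (assumption; individuals are treated as immutable values here).
            child = recombine(a, b, rng) if rng.random() < p_recombine else a
            # Step 5.3: mutate with small probability.
            if rng.random() < p_mutate:
                child = mutate(child, rng)
            offspring.append(child)
        population = offspring                                   # steps 6 and 7
        best = min(population + [best], key=objective)
    return best


if __name__ == "__main__":
    # Toy stand-in problem: find an integer close to 42.
    best = run_ea(
        init=lambda rng: rng.randint(0, 1000),
        objective=lambda x: abs(x - 42),
        recombine=lambda a, b, rng: (a + b) // 2,
        mutate=lambda x, rng: x + rng.choice([-1, 1]),
    )
    print(best)
```

For redistricting, `init`, `recombine`, and `mutate` would be replaced by implementations of Algorithms I2, R2 or R2K, and M1 with M2R or M2K, and `objective` by the population deviation.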
Figure 6. Redistricting of twenty-five spatial units in data set Grid1 into two (A), three (B), and four (C) districts. The population of each spatial unit is shown. Thick black lines indicate the boundaries between districts.

In step 4 of GEAPR, the objective function value is calculated for each solution and then the fitness value is computed as $(y_{\max} - y)/(y_{\max} - y_{\min})$, where $y_{\max}$ and $y_{\min}$ are the maximum and minimum objective function values of all solutions in the current EA population, respectively, and $y$ is the objective function value of the individual solution being evaluated. In this application, a proportional selection approach (Goldberg 1989) is utilized in step 5.1 where the probability of the $i$th individual to be selected is calculated as $f_i / \sum_{j=1}^{N} f_j$, with the denominator being the sum of fitness values of all solutions in a population and $f_i$ the fitness of the $i$th solution. The actual selection process can be described as throwing a ball onto a roulette wheel, where solutions with high fitness values occupy a big sector and thus have a high chance of being selected. Each time, the algorithm selects two individuals (step 5.1) and, with a high probability (95 percent here), a recombination operation is used to create a new individual (step 5.2). Then, the new individual will have a small chance (5 percent here) of undergoing a mutation operation (step 5.3). This process continues until the number of new solutions equals the number of solutions in the current generation. At the end of each iteration, the newly created population is used to replace the current population and the process repeats (steps 6 and 7).

Computational Experiments

Two types of data sets are used to test GEAPR. The first type of data set includes three population grids called Grid1, Grid2, and Grid3 of different sizes (see the first two columns of Table 3); each cell in a grid represents a hypothetical spatial unit (a county or census block, for example). The population of each cell is randomly determined and is shown in the following maps. The total population for each data set is shown in the third column of Table 3. Each data set is used to define three redistricting problems with different numbers of districts (r), as listed in the fourth column of Table 3. In the same table, the population size and number of iterations used in GEAPR are listed in the fifth and sixth columns, respectively. These are often problem-specific EA parameters and must be determined based on a method of trial and error. The values adopted here have been determined to be reasonable in terms of their impact on EA performance during numerous prior experiments. To evaluate the performance of GEAPR, it is necessary to know the closeness of the best solution found to the global optimal solution. Although it is difficult to display the spatial configuration of the global optimal solution, it is possible to derive the objective function value of the theoretical global optimal solutions.8 These values are listed in the seventh column of Table 3. The CPU time reported in the table shows that knowledge-based approaches are slightly more efficient for most of the cases. This can be explained by examining Algorithms R2 and R2K. In Algorithm R2 (for the random version), the repair process (steps 4, 5, and 6) does not guarantee that merging two randomly selected subdivisions will not violate the contiguity constraint, and the algorithm may go through a large number of these operations until a feasible solution can be created. Algorithm R2K, however, does not have this problem as each subdivision picked in step 5.1 will have at least one neighboring subdivision and merging can be conducted. The random and knowledge-based versions of GEAPR are executed ten times for each redistricting problem, and the minimum, median, and maximum objective function values found during the ten runs are reported in Table 3.
The maps of the best solutions found are shown in Figures 6, 7, and 8. The results clearly demonstrate that GEAPR can be used to find high-quality solutions to the experimental problems used here. Knowledge-based approaches can possibly find good solutions more quickly (see the columns marked Last in Table 3) and, over ten independent runs, have a better chance to find good (if not optimal) solutions. Note that the theoretical optimal objective function values were not reached for two of the problems (Grid3 with r = 10 and 20), although we should realize that these theoretical optimal solutions may not exist. Figure 9 shows the performance of GEAPR for Grid1 with four districts (r = 4). The general trend of decreasing minimal, median, and maximal objective function values of each iteration suggests the effectiveness of the EA in improving current solutions. The irregular curve marked Max in the figure indicates the introduction of new, random solutions to the population in each iteration using the mutation operations.

Figure 7. Redistricting of 100 spatial units in data set Grid2 into three (A), four (B), and five (C) districts. The population of each spatial unit is shown. Thick black lines indicate the boundaries between districts.

The second type of experiment is based on the Iowa congressional redistricting case using the 2000 census data. The Iowa constitution provides that the counties shall not be split.9 Subsequently, the ninety-nine counties are used to create five congressional districts to minimize the deviation from the ideal population. A parameter setting of GEAPR similar to that of Grid2 with r = 5 is adopted here. In this particular case, the total population is 2,926,324 and the theoretical global optimal solution should have an objective function value of 0.0003. However, it is difficult to prove that such a solution would exist with the real data. Figure 10A shows the best solution found by the EA using the knowledge-based version, and Figure 10B shows the officially adopted redistricting plan. It can be noted that

Table 3. Test data and results

Test problems and EA parameters
Data     n          Population(a)   r     Popsize(b)   Iterations(c)   Opt(d)
Grid1    5 × 5      384             2     50           500             0
                                    3     50           500             0
                                    4     50           500             0
Grid2    10 × 10    2,952           3     50           1,000           0
                                    4     50           1,000           0
                                    5     50           1,000           0.07
Grid3    25 × 40    74,663          5     50           1,500           0.0027
                                    10    50           1,500           0.0040
                                    20    50           1,500           0.0040

Random EA
Data     r     Min(e)    Median(f)   Max(g)    Last(h)   Time(i)
Grid1    2     0         0           0         5         7
         3     0         0           1.04      84        7
         4     0         1.56        4.16      141       7
Grid2    3     0         0           0.07      210       91
         4     0         0.07        0.20      429       83
         5     0.07      0.18        0.41      519       81
Grid3    5     0.0027    0.0039      0.0230    1,039     21,506
         10    0.0228    0.0790      2.4952    1,319     20,195
         20    0.1594    0.4861      4.8605    1,387     19,755

Knowledge-based EA
Data     r     Min       Median      Max       Last      Time
Grid1    2     0         0           0         4         7
         3     0         0           1.56      33        6
         4     0         0.52        1.56      73        6
Grid2    3     0         0           0         207       73
         4     0         0           0.07      290       66
         5     0.07      0.11        0.14      618       64
Grid3    5     0.0027    0.0043      0.0113    862       19,496
         10    0.0174    0.0308      0.0440    1,185     18,901
         20    0.0710    0.0978      0.2404    1,292     19,832

Note: EA = evolutionary algorithm.
(a) The total population of all spatial units in a data set.
(b) The number of individual solutions used in GEAPR.
(c) The total number of iterations of GEAPR.
(d) The theoretically possible optimal objective function value.
(e) The objective function value of the best solution found in the ten runs.
(f) The median of the best objective function values found in the ten runs.
(g) The objective function value of the worst solution found in the ten runs.
(h) Average last iteration when a better solution is found.
(i) Average CPU time used for each run (in seconds).
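The values in the Opt column of Table 3 appear to be consistent with a simple calculation that ignores contiguity and the indivisibility of spatial units: split the total population as evenly as possible among the r districts and evaluate the population deviation objective. The Python sketch below works through that calculation. It is an assumption about how the theoretical optimum can be obtained, not a reproduction of the article's own derivation, and the function names are hypothetical.

```python
def population_deviation(district_pops):
    """Objective y = (100 / P) * sum_j |p_j - ideal|, with ideal = round(P / r)."""
    total = sum(district_pops)
    # Note: Python rounds exact halves to the nearest even integer.
    ideal = round(total / len(district_pops))
    return 100.0 / total * sum(abs(p - ideal) for p in district_pops)


def theoretical_optimum(total, r):
    """Objective value of the most even integer split of `total` into r districts."""
    base, extra = divmod(total, r)
    pops = [base + 1] * extra + [base] * (r - extra)
    return population_deviation(pops)


if __name__ == "__main__":
    for name, total, rs in [("Grid1", 384, (2, 3, 4)),
                            ("Grid2", 2952, (3, 4, 5)),
                            ("Grid3", 74663, (5, 10, 20))]:
        for r in rs:
            print(name, r, round(theoretical_optimum(total, r), 4))
```

Rounded to the precision used in the table, this yields 0 for all Grid1 problems and for Grid2 with r = 3 and 4, 0.07 for Grid2 with r = 5, and 0.0027, 0.0040, and 0.0040 for Grid3 with r = 5, 10, and 20. Whether a contiguous plan attaining these values exists is a separate question.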

A Unied Conceptual Framework for Geographical Optimization Using Evolutionary Algorithms

811

Figure 8. Redistricting of 1,000 spatial units in data set Grid3 into five (A), ten (B), and twenty (C) districts. The population of each spatial unit is shown. Thick black lines indicate the boundaries between districts.

Figure 9. The performance of the random version of GEAPR when used to solve the problem with twenty-five spatial units and four districts. The minimal, median, and maximal objective function values in each iteration are marked as Min, Median, and Max, respectively. The optimal solution is found at iteration 67. [Plot: objective function value (vertical axis) against iteration (horizontal axis, 0 to 500).]

the solution found by the EA has a higher population equality, although the shape may be considered to be less compact. The point here is not to suggest a new redistricting plan. However, it can be observed through this application that EAs are useful in generating interesting, high-quality alternative plans for consideration in the decision-making process.

Discussion and Conclusions

The rapid development of computing techniques during the past several decades has encouraged an optimistic view toward the computational issues that face geographers (see, for example, Dobson 1983). However, researchers must bear in mind that the ultimate limitation on computation is the inherent complexity of the problem to be solved, rather than the speed of computer systems (Garey and Johnson 1979; Armstrong 1993). Developing new methods to overcome the shortcomings of existing algorithms has always been a motivation that leads to progress in optimization research. This research echoes geographers' recent interests in computational science (see Openshaw 1994; Fotheringham 1998; Armstrong 2000). This article addresses issues that are fundamental to geographical optimization in particular, and GIScience in general, including conceptualization of geographical problems, spatial representation, and algorithm analysis. In a broader view, this research represents a starting point on a quest to establish a unified framework for geographical optimization problems. Although the focus is placed on the construction of a framework, the algorithm guidelines discussed here are useful for many particular geographical problems and an application of this framework is also discussed.

It is also worthwhile to note possible limitations of the approach discussed here. The typology does not distinguish between problems with a fixed number of spatial entities to locate (e.g., the p-median problem) and those that may require a variable number of spatial entities (e.g., the set-covering problem). The latter is especially important for problems such as spatial cluster detection in point data sets (Openshaw and Perrée 1996). Nevertheless, the graph representation discussed here is flexible and can be applied to guide the design of new types of spatial EAs not discussed in the previous sections. To incorporate such new encoding types, however, additional algorithms may be needed for initialization, recombination, and mutation operations. As a future research topic, techniques used in variable-length EAs (see Srikanth, George, and Warsi 1995; Wu and Banzhaf 1999) can be used to implement EAs for geographical optimization problems that require solutions with variable lengths.

EAs do not come without drawbacks, however. A major issue of EAs is their computational load. Because EAs are population-based computer programs, simultaneously manipulating individual solutions across multiple generations requires a significant amount of computing resources. A straightforward way to speed up EA performance is to exploit the implicit parallel nature of EAs; a variety of parallel models have been developed in the literature (see, for example, Cantú-Paz 2000; Xiao and Armstrong 2003). Xu et al. (2003) suggested five speed-up strategies that utilize a new representation scheme and a set of efficient selection operations, which can be used to reduce the overall search space and therefore expedite the search process. Another type of approach to addressing the computational burden is to hybridize EAs with other heuristics that can help find high-quality solutions quickly. Researchers have discussed possible ways to apply other metaheuristic approaches in EAs. For example, Bergey, Ragsdale, and Hoskote (2003) discussed the use of simulated annealing in a genetic algorithm to expedite the search for optimal solutions to an electrical power redistricting problem. More comprehensive approaches to incorporating a variety of metaheuristic methods

Figure 10. Redistricting for Iowa where ninety-nine counties are subdivided into five congressional districts. (A) The best solution found by GEAPR, with an objective function value of 0.0045 and a total absolute deviation from the ideal population of 131 persons. (B) The official plan adopted by Iowa in 2000, with an objective function value of 0.0080 and a total absolute deviation from the ideal population of 235 persons. Numbers printed on the maps show the population of each county.

have also been discussed (see, for example, Anderson 1996; Preux and Talbi 1999).

The algorithms developed in this article have their roots in the heuristic optimization literature. For example, the essence of the add, drop, and interchange algorithms developed to solve the location-allocation problems (Kuehn and Hamburger 1963; Feldman, Lehrer, and Ray 1966; Teitz and Bart 1968) can be found in most of the algorithms developed in this article. These traditional approaches are placed in an EA context as a way to create solutions and handle spatial constraints. This hybrid approach has been common in solving geographical optimization problems using EAs (see, for example, Reeves 1997; Estivill-Castro and Torres-Velázquez 1999; Krzanowski and Raper 1999; Nalle, Arthur, and Sessions 2002).

As a final note, although the literature has been generally supportive of EA applications, a number of authors have also cautioned against overoptimistic views of EAs (Dowsland 1996; Ross 1997). To fine-tune an EA, for example, the user and designer must make a series of decisions about EA parameters (such as the population size and the probabilities of recombination and mutation operations). In addition, the success of EAs often relies on the use of diversification methods so that individual solutions do not concentrate on a few good solutions and the search avoids being trapped in local optima; these methods are often sensitive to their parameters (Goldberg 1989). These issues have been discussed in the literature with respect to EA performance (De Jong 2006); the understanding of their impact on EA performance for spatial problems, however, is limited (Xiao 2006) and needs to be further studied. Although the focus of this article is placed on the more geographical aspects of EA design, being aware of implementation issues will help researchers develop a clear picture of the problem-solving landscape as they continue to design better solution strategies for geographical optimization problems.
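The hybridization idea mentioned in this discussion, placing a simulated annealing rule inside an EA, can be illustrated with a minimal sketch. It is not taken from Bergey, Ragsdale, and Hoskote (2003) or from GEAPR. The function names `sa_accept`, `hybrid_mutation`, `toy_cost`, and `flip_one`, the six unit populations, the geometric cooling rate, and all parameter values are assumptions made for illustration, and the toy objective ignores contiguity constraints.

```python
import math
import random


def sa_accept(current_cost, candidate_cost, temperature, rng):
    """Metropolis acceptance rule for a minimization problem (illustrative)."""
    if candidate_cost <= current_cost:
        return True  # an improvement (or tie) is always accepted
    if temperature <= 0.0:
        return False  # frozen: behaves like plain hill climbing
    # A worse candidate is accepted with probability exp(-delta / T).
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)


def hybrid_mutation(solution, cost, neighbor, temperature, rng):
    """Mutate one individual; keep the mutant only if the SA rule accepts it."""
    candidate = neighbor(solution, rng)
    if sa_accept(cost(solution), cost(candidate), temperature, rng):
        return candidate
    return solution


def toy_cost(assignment):
    """Hypothetical objective: population imbalance between two districts."""
    populations = [50, 70, 60, 90, 40, 80]  # made-up unit populations
    d0 = sum(p for p, d in zip(populations, assignment) if d == 0)
    d1 = sum(p for p, d in zip(populations, assignment) if d == 1)
    return abs(d0 - d1)


def flip_one(assignment, rng):
    """Hypothetical neighbor move: reassign one randomly chosen unit."""
    i = rng.randrange(len(assignment))
    mutated = list(assignment)
    mutated[i] = 1 - mutated[i]
    return mutated


if __name__ == "__main__":
    rng = random.Random(42)
    individual = [0, 0, 0, 1, 1, 1]
    temperature = 50.0
    for _ in range(200):
        individual = hybrid_mutation(individual, toy_cost, flip_one, temperature, rng)
        temperature *= 0.95  # geometric cooling (assumed schedule)
    print(individual, toy_cost(individual))
```

In this sketch the temperature controls how readily worse mutants survive, so it plays a role similar to the diversification parameters discussed above: a high temperature keeps the search exploratory, and as the temperature approaches zero the rule accepts only mutants that are no worse than their parents.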

Acknowledgments
I thank Marc P. Armstrong and Iris Hui for their valuable comments. Comments from Mei-Po Kwan and anonymous reviewers are also acknowledged.

Notes
1. The term geographical optimization problem is loosely used in this article. In general, such a problem requires a search for the configuration of a set of discrete spatial entities. In other words, a solution to this type of problem exhibits a spatial pattern that can be displayed on a map. A solution to a p-median problem (Hakimi 1965), for example, represents a spatial configuration where the demand in each geographical unit is served by its nearest facility and a service area map can be drawn accordingly. Some optimization problems with spatial components (see, for example, Leung, Li, and Xu 1998) may not necessarily have this characteristic and therefore are not directly considered in this article, although research incorporating these problems will be an interesting future topic.
2. If a problem can be solved in a time frame that is polynomial with respect to input size, the problem is placed into class P and is typically regarded to be easy to solve. For problems in class NP-complete (NP stands for nondeterministic polynomial), however, polynomial time algorithms have not been developed and it is likely that such algorithms do not exist. See Garey and Johnson (1979) and Cormen et al. (2001, ch. 34) for a more formal discussion of this issue.
3. Geographical optimization problems often contain important social, economic, and political factors that are difficult, if not impossible, to place in a mathematical formulation. Optimal solutions to mathematically well-formulated problems will become nonoptimal when the unmodeled factors or objectives are taken into account. Therefore, near optimal (or second best) solutions to a problem may be favorable to decision makers (Simon 1960; Brill 1979; Hopkins 1984).
4. It is important to distinguish between spatial constraints and other constraints that may have spatial implications. In this article, spatial constraints refer to explicit requirements on the topological configuration of spatial units. For many applications, some constraints may have spatial implications but they do not require a particular configuration (or pattern) of spatial units. For example, in the work of Bennett, Xiao, and Armstrong (2004), it is required that the total area of particular land use types should not exceed 25 percent of the overall size of the study area. This requirement, however, does not confine the topological relationship between spatial units and therefore is not considered as a spatial constraint in the context of this research.
5. For a redistricting problem of partitioning n spatial units into r districts, the exact number of possible redistricting plans is difficult to compute. However, we know that the upper bound of this number is a Stirling number of the second kind, defined as $S(n, r) = \frac{1}{r!} \sum_{i=0}^{r} (-1)^{i} \left[ \frac{r!}{(r-i)!\,i!} \right] (r-i)^{n}$ (see Even 1973, 60), which occurs when each unit is adjacent to all other units. The lower bound of the number of possible plans occurs when the units are least connected, meaning that the units are arranged as a line and, except for the units at both ends, each unit has only two adjacent units; in this case, there are $S_1(n, r) = \frac{(n-1)!}{(n-r)!\,(r-1)!}$ possible plans. For the example of Iowa redistricting, where n = 99 and r = 5, the total number of possible redistricting plans is between $3.6 \times 10^{6}$ and $1.3 \times 10^{67}$.
6. Interactive redistricting software tools are available in many GIS packages such as an ArcView extension (http://www.esri.com/software/arcview/extensions/districting) and a commercial package called Maptitude for Redistricting (http://www.caliper.com/mtredist.htm). Both URLs were last accessed on 18 December 2007.
7. Although compactness is a critical factor, studies have not shown a necessary relation between the shape of redistricting plans and gerrymandering, a process intended to favor a particular political party or interest group (Taylor and Johnston 1979), and different compactness measures may lead to different conclusions (Altman 1998b).
8. If we assume all spatial units are adjacent to each other, the optimal solution occurs when the population difference between any two districts is not greater than 1. Accordingly, the optimal objective function value can 1 be calculated as 100 P | P r P |. When the total population can be evenly divided by r, the theoretical optimal solution should have an objective function value of zero. It is important to stress the assumption used here because such a theoretical optimal objective function value may not exist in the spatial connectivity of a real data set.
9. Iowa Code 42.4(1)(b).
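The two bounds quoted in Note 5 for the Iowa example can be checked numerically. The short sketch below is an illustration added for that purpose and is not part of the original analysis; the function names `stirling2` and `line_partitions` are assumptions, and the computation relies only on exact integer arithmetic from Python's standard library.

```python
from math import comb, factorial


def stirling2(n, r):
    """Stirling number of the second kind via the inclusion-exclusion sum."""
    total = sum((-1) ** i * comb(r, i) * (r - i) ** n for i in range(r + 1))
    return total // factorial(r)  # the alternating sum is divisible by r!


def line_partitions(n, r):
    """Ways to cut a line of n units into r contiguous, nonempty districts."""
    return comb(n - 1, r - 1)  # equals (n-1)! / ((n-r)! (r-1)!)


if __name__ == "__main__":
    n, r = 99, 5  # Iowa: ninety-nine counties, five districts
    print(f"lower bound: {line_partitions(n, r):.1e}")
    print(f"upper bound: {stirling2(n, r):.1e}")
```

Run as a script, this prints 3.6e+06 for the lower bound and 1.3e+67 for the upper bound, which agrees with the range stated in Note 5.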


References
Alp, O., E. Erkut, and Z. Drezner. 2003. An efficient genetic algorithm for the p-median problem. Annals of Operations Research 122:21-42.
Altman, M. 1998a. Districting principles and democratic representation. Ph.D. thesis, California Institute of Technology, Pasadena, CA.
Altman, M. 1998b. Modeling the effect of mandatory district compactness on partisan gerrymanders. Political Geography 17 (8): 989-1012.
Altman, M., K. MacDonald, and M. McDonald. 2005. From crayons to computers: The evolution of computer use in redistricting. Social Science Computer Review 23 (3): 334-46.
Anderson, E. J. 1996. Mechanisms for local search. European Journal of Operational Research 88:139-51.
Armstrong, M. P. 1993. On automated geography! The Professional Geographer 45 (4): 440-41.
Armstrong, M. P. 2000. Geography and computational science. Annals of the Association of American Geographers 90 (1): 146-56.
Armstrong, M. P., G. Rushton, R. Honey, B. T. Dalziel, P. Lolonis, S. De, and P. J. Densham. 1991. Decision support for regionalization: A spatial decision support system for regionalizing service delivery systems. Computers, Environment and Urban Systems 15:37-53.
Armstrong, M. P., N. Xiao, and D. A. Bennett. 2003. Using genetic algorithms to create multicriteria class intervals for choropleth maps. Annals of the Association of American Geographers 93 (3): 595-623.
Bação, F., V. Lobo, and M. Painho. 2005. Applying genetic algorithms to zone design. Soft Computing 9:341-48.
Bäck, T., D. B. Fogel, and Z. Michalewicz, eds. 1997. Handbook of evolutionary computation. New York: Oxford University Press/IOP.
Bennett, D. A., N. Xiao, and M. P. Armstrong. 2004. Exploring the geographic ramifications of environmental policy using evolutionary algorithms. Annals of the Association of American Geographers 94 (4): 827-47.
Bergey, P. K., C. T. Ragsdale, and M. Hoskote. 2003. A simulated annealing genetic algorithm for the electrical power districting problem. Annals of Operations Research 121:33-55.
Bianchi, G., and R. Church. 1993. A non-binary encoded GA for a facility location problem. Working paper, Department of Geography, University of California, Santa Barbara.
Brandeau, M. L., and S. S. Chiu. 1989. Overview of representative problems in location research. Management Science 35 (6): 645-74.
Brill, E. D., Jr. 1979. The use of optimization models in public-sector planning. Management Science 25 (5): 413-22.
Brookes, C. J. 2001. A genetic algorithm for designing optimal patch configurations in GIS. International Journal of Geographical Information Science 15 (6): 539-59.
Cantú-Paz, E. 2000. Efficient and accurate parallel genetic algorithms. Boston: Kluwer Academic.
Church, R. L., D. M. Stoms, and F. W. Davis. 1996. Reserve selection as a maximal covering location problem. Biological Conservation 76:105-12.
Cooper, L. 1964. Heuristic methods for location-allocation problems. SIAM Review 6:37-54.
Cormen, T. H., C. E. Leiserson, R. L. Rivest, and C. Stein. 2001. Introduction to algorithms. 2nd ed. Cambridge, MA: MIT Press.
Cova, T. J., and R. L. Church. 2000. Contiguity constraints for single-region site search problems. Geographical Analysis 32 (4): 306-29.
Deb, K. 2001. Multi-objective optimization using evolutionary algorithms. Chichester, U.K.: Wiley.
De Jong, K. A. 2006. Evolutionary computation: A unified approach. Cambridge, MA: MIT Press.
Densham, P. J., and G. Rushton. 1991. Designing and implementing strategies for solving large location-allocation problems with heuristic methods. Technical Report 91-10, National Center of Geographic Information and Analysis, Santa Barbara, CA.
Densham, P. J., and G. Rushton. 1992. A more efficient heuristic for solving large p-median problems. Papers in Regional Science: The Journal of RSAI 41 (3): 307-29.
Densham, P. J., and G. Rushton. 1996. Providing spatial decision support for rural public service facilities that require a minimum workload. Environment and Planning B: Planning and Design 23:553-74.
Dibble, C., and P. J. Densham. 1993. Generating interesting alternatives in GIS and SDSS using genetic algorithms. In GIS/LIS 93, ed. C. Dibble and P. J. Densham, 180-89. Minneapolis, MN: ACSM, ASPRS, AM/FM International, AAG, URISA.
Diestel, R. 2000. Graph theory. New York: Springer-Verlag.
Dobson, J. E. 1983. Automated geography. The Professional Geographer 35 (2): 135-43.
Dorigo, M., V. Maniezzo, and A. Colorni. 1991. Positive feedback as a search strategy. Milan, Italy: Dipartimento di Elettronica, Politecnico di Milano.
Dowsland, K. A. 1996. Genetic algorithms - A tool for OR? Journal of the Operational Research Society 47:550-61.
Eagles, M., R. S. Katz, and D. Mark. 2000. Controversies in political redistricting: GIS, geography, and society. Political Geography 19:135-39.
Estivill-Castro, V., and R. Torres-Velázquez. 1999. Hybrid genetic algorithm for solving the p-median problem. In Second Asia-Pacific conference on simulated evolution and learning, SEAL 98, ed. X. Yao, 18-25. Berlin: Springer-Verlag.
Evans, J., and E. Minieka. 1992. Optimization algorithms for network and graphs. 2nd ed. New York: Marcel Dekker.
Even, S. 1973. Algorithmic combinatorics. New York: Macmillan.
Falkenauer, E. 1994. A new representation and operators for genetic algorithms applied to grouping problems. Evolutionary Computation 2 (2): 123-44.

Feldman, E., F. A. Lehrer, and T. L. Ray. 1966. Warehouse location under continuous economies of scale. Management Science 12:670-84.
Fogel, D. B., and P. J. Angeline. 1997. Guidelines for a suitable encoding. In Handbook of evolutionary computation, ed. T. Bäck, D. B. Fogel, and Z. Michalewicz, C1.7:1-2. New York: Oxford University Press/IOP.
Fogel, L. J. 1962. Autonomous automata. Industrial Research 4:14-19.
Forrest, S. 1993. Genetic algorithms: Principles of natural selection applied to computation. Science 261:872-78.
Fotheringham, A. S. 1998. Trends in quantitative geography II: Stressing the computational. Progress in Human Geography 22 (2): 283-92.
Fox, B. L. 1993. Integrating and accelerating tabu search, simulated annealing, and genetic algorithms. Annals of Operations Research 41:47-67.
Garey, M. R., and D. S. Johnson. 1979. Computers and intractability: A guide to the theory of NP-completeness. San Francisco: Freeman.
Gendreau, M., and J.-Y. Potvin. 2005. Metaheuristics in combinatorial optimization. Annals of Operations Research 140 (1): 189-213.
Ghosh, A., and F. Harche. 1993. Location-allocation models in the private sector: Progress, problems, and prospects. Location Science 1 (1): 81-106.
Glover, F., and M. Laguna. 1997. Tabu search. Boston: Kluwer Academic.
Goldberg, D. E. 1989. Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley.
Grefenstette, J. J., R. Gopal, B. Rosmaita, and D. van Gucht. 1985. Genetic algorithms for the traveling salesman problem. In Proceedings of the First International Conference on Genetic Algorithms and Their Applications, ed. J. J. Grefenstette, 160-66. Hillsdale, NJ: Erlbaum.
Hakimi, S. L. 1965. Optimum distribution of switch centers in a communication network and some related graph theoretic problems. Operations Research 13:462-75.
Hamacher, H. W., and S. Nickel. 1998. Classification of location models. Location Science 6:229-42.
Han, J., M. Kamber, and A. K. H. Tung. 2001. Spatial clustering methods in data mining: A survey. In Geographic data mining and knowledge discovery, ed. H. Miller and J. Han, 188-217. London: Taylor & Francis.
Hobbs, M. 1996. Spatial clustering with a genetic algorithm. In Innovations in GIS 3, ed. D. Parker, 85-93. London: Taylor & Francis.
Hof, J., and M. Bevers. 1998. Spatial optimization for managed ecosystems. New York: Columbia University Press.
Holland, J. 1962. Outline for a logical theory of adaptive systems. Journal of the ACM 3:297-314.
Holland, J. 1975. Adaptations in natural and artificial systems. Ann Arbor: University of Michigan Press.
Holland, J. 1998. Emergence: From chaos to order. Reading, MA: Perseus Books.
Hopkins, L. D. 1984. Evaluation of methods for exploring ill-defined problems. Environment and Planning B 11:339-48.
Hosage, C. M., and M. F. Goodchild. 1986. Discrete space location-allocation solutions from genetic algorithms. Annals of Operations Research 6:35-46.

Houck, C. R., J. A. Joines, and M. G. Kay. 1996. Comparison of genetic algorithms, random restart and two-opt switching for solving large location-allocation problems. Computers & Operations Research 23 (6): 587-96.
Huang, B., R. L. Cheu, and Y. S. Liew. 2004. GIS and genetic algorithms for HAZMAT route planning with security considerations. International Journal of Geographical Information Science 18 (8): 769-87.
Jaramillo, J. H., J. Bhadury, and R. Batta. 2002. On the use of genetic algorithms to solve location problems. Computers & Operations Research 29:761-79.
Kirkpatrick, S., C. D. Gelatt Jr., and M. P. Vecchi. 1983. Optimization by simulated annealing. Science 220:671-80.
Koza, J. R. 1992. Genetic programming: On the programming of computers by means of natural selection. Cambridge, MA: MIT Press.
Krarup, J., and P. M. Pruzan. 1990. Ingredients of locational analysis. In Discrete location theory, ed. P. B. Mirchandani and R. L. Francis, 1-54. New York: Wiley.
Krzanowski, R. M., and J. Raper. 1999. Hybrid genetic algorithm for transmitter location in wireless networks. Computers, Environment and Urban Systems 23:359-82.
Krzanowski, R. M., and J. Raper. 2001. Spatial evolutionary modeling. Oxford, U.K.: Oxford University Press.
Kuehn, A. A., and M. J. Hamburger. 1963. A heuristic program for locating warehouses. Management Science 9:643-66.
Leung, Y., G. Li, and Z.-B. Xu. 1998. A genetic algorithm for the multiple destination routing problems. IEEE Transactions on Evolutionary Computation 2 (4): 150-61.
Li, X., and A. G.-O. Yeh. 2005. Integration of genetic algorithms and GIS for optimal location search. International Journal of Geographical Information Science 19 (5): 581-601.
Lin, Y.-C., K.-S. Hwang, and F.-S. Wang. 2001. Coevolutionary hybrid differential evolution for mixed-integer optimization problems. Engineering Optimization 33:663-82.
Michalewicz, Z. 1996. Genetic algorithms + data structures = evolution programs. Berlin: Springer-Verlag.
Mladenovic, N., J. Brimberg, P. Hansen, and J. A. Moreno-Perez. 2007. The p-median problem: A survey of metaheuristic approaches. European Journal of Operational Research 179 (3): 927-39.
Mooney, P., and A. Winstanley. 2006. An evolutionary algorithm for multicriteria path optimization problems. International Journal of Geographical Information Science 20 (4): 401-23.
Morrill, R. L. 1976. Redistricting revisited. Annals of the Association of American Geographers 66:548-56.
Morrill, R. L. 1981. Political redistricting and geographic theory. Washington, DC: Association of American Geographers.
Murray, A. T., and R. L. Church. 1995. Measuring the efficacy of adjacency constraint structure in forest planning models. Canadian Journal of Forest Research 25:1416-24.
Nalle, D. J., J. L. Arthur, and J. Sessions. 2002. Designing compact and contiguous reserve networks with a hybrid heuristic algorithm. Forest Science 48 (1): 59-68.
Openshaw, S. 1994. Computational human geography: Towards a research agenda. Environment and Planning A 26:499-505.

Openshaw, S., and T. Perrée. 1996. User-centred intelligent spatial analysis of point data. In Innovations in GIS 3, ed. D. Parker, 119–34. London: Taylor & Francis.
Osman, I. H., and J. P. Kelly. 1996. Meta-heuristics: Theory and application. Boston: Kluwer.
Preux, P., and E.-G. Talbi. 1999. Towards hybrid evolutionary algorithms. International Transactions in Operational Research 6 (6): 557–70.
Rechenberg, I. 1965. Cybernetic solution path of an experimental problem. Translation No. 1122. Farnborough, Hants, U.K.: Royal Aircraft Establishment.
Reeves, C. R., ed. 1993. Modern heuristic techniques for combinatorial problems. Oxford, U.K.: Blackwell.
Reeves, C. R. 1997. Genetic algorithms for the operations researchers. INFORMS Journal on Computing 9 (3): 231–51.
Ross, P. 1997. What are genetic algorithms good at? INFORMS Journal on Computing 9 (3): 260–62.
Ruiz-Andino, A., L. Araujo, F. Sáenz, and J. Ruz. 2000. A hybrid evolutionary approach for solving constrained optimization problems over finite domains. IEEE Transactions on Evolutionary Computation 4 (4): 353–72.
Rushton, G. 1988. Location theory, location-allocation models, and service development planning in the third world. Economic Geography 64 (2): 97–120.
Schilling, D. A., V. Jayaraman, and R. Barkhi. 1993. A review of covering problems in facility location. Location Science 1 (1): 25–55.
Scott, A. J. 1971. Combinatorial programming, spatial analysis, and planning. London: Methuen.
Simões, A., and E. Costa. 2000. Using genetic algorithms with sexual or asexual transposition: A comparative study. In Proceedings of the 2000 Congress on Evolutionary Computation (CEC 2000), 1196–203. Washington, DC: IEEE Computer Society.
Simon, H. A. 1960. The new science of management decision. New York: Harper & Row.
Srikanth, R., R. George, and N. Warsi. 1995. A variable-length genetic algorithm for clustering and classification. Pattern Recognition Letters 16:789–800.

Taylor, P. J., and R. J. Johnston. 1979. The geography of elections. London: Croom, Helm & Penguin.
Teitz, M. B., and P. Bart. 1968. Heuristic methods for estimating the generalized vertex median of a weighted graph. Operations Research 16:955–61.
van Dijk, S., D. Thierens, and M. De Berg. 2002. Using genetic algorithms for solving hard problems in GIS. GeoInformatica 6 (4): 381–413.
Vazirani, V. V. 2001. Approximation algorithms. Berlin: Springer.
Williams, J. C. 2002. A zero-one programming model for contiguous land acquisition. Geographical Analysis 34 (4): 330–49.
Williams, J. C. 1995. Political redistricting: A review. Papers in Regional Science 74 (1): 13–40.
Wu, A. S., and W. Banzhaf. 1999. Introduction to the special issue: Variable-length representation and noncoding segments for evolutionary algorithms. Evolutionary Computation 6 (4): iii–vi.
Xiao, N. 2006. An evolutionary algorithm for site search problems. Geographical Analysis 38 (3): 227–47.
Xiao, N., and M. P. Armstrong. 2003. A specialized island model and its application in multiobjective optimization. In Genetic and evolutionary computation – GECCO 2003, ed. E. Cantú-Paz, J. A. Foster, K. Deb, L. D. Davis, R. Roy, U.-M. O'Reilly, H.-G. Beyer, et al., 1530–40. Berlin: Springer-Verlag.
Xiao, N., D. A. Bennett, and M. P. Armstrong. 2002. Using evolutionary algorithms to generate alternatives for multiobjective site search problems. Environment and Planning A 34 (4): 639–56.
Xiao, N., D. A. Bennett, and M. P. Armstrong. 2007. Interactive evolutionary approaches to multiobjective spatial decision making: A synthetic review. Computers, Environment and Urban Systems 30:232–52.
Xu, Z. B., K. S. Leung, Y. Liang, and Y. Leung. 2003. Efficiency speed-up strategies for evolutionary computation: Fundamentals and fast-GAs. Applied Mathematics and Computation 142:341–88.

Correspondence: Department of Geography, 1036 Derby Hall, 154 N. Oval Mall, The Ohio State University, Columbus, OH 43210, e-mail: xiao.37@osu.edu.
