What to Feed a gerrymander Control 1421 Mathematics Contest in Modeling February 12, 2007
What to Feed a Gerrymander Control # 1421 Mathematics Contest in Modeling February 12, 2007
abstract Gerrymandering, the practice of dividing political districts into winding and unfair geome- tries, has a deleterious effect on democratic accountability and participation. Incumbent politicians have an incentive to create districts to their advantage(California in 2000, Texas in 2003 )so one proposed remedy for gerrymandering is to adopt an objective, possibly com- puterized, methodology for districting. We present two computationally efficient algorithms for solving the districting problem by modeling it as a Markov decision process rewarding traditional measures of district"goodness": equality of population, continuity, preservation of county lines, and compactness of shape. Our Multi-Seeded Growth Model simulates the creation of a fixed number of districts for an arbitrary geography by "planting seeds"for districts and specifying particular growth rules. The result of this process is refined im mensely in our Partition Optimization Model which uses stochastic domain hill-climbing to make small changes in district lines that improve goodness. We include, as an extension,an optimization to minimize projected inequality in district populations between redistrictings As a case study, we implement our models to create an unbiased, geographically simple dis- tricting of New York using tract-level data from the 2000 Census. We conclude with an open letter to members of the New York State Assembly
Abstract Gerrymandering, the practice of dividing political districts into winding and unfair geometries, has a deleterious effect on democratic accountability and participation. Incumbent politicians have an incentive to create districts to their advantage (California in 2000, Texas in 2003) so one proposed remedy for gerrymandering is to adopt an objective, possibly computerized, methodology for districting. We present two computationally efficient algorithms for solving the districting problem by modeling it as a Markov decision process rewarding traditional measures of district “goodness”: equality of population, continuity, preservation of county lines, and compactness of shape. Our Multi-Seeded Growth Model simulates the creation of a fixed number of districts for an arbitrary geography by “planting seeds” for districts and specifying particular growth rules. The result of this process is refined immensely in our Partition Optimization Model which uses stochastic domain hill-climbing to make small changes in district lines that improve goodness. We include, as an extension, an optimization to minimize projected inequality in district populations between redistrictings. As a case study, we implement our models to create an unbiased, geographically simple districting of New York using tract-level data from the 2000 Census. We conclude with an open letter to members of the New York State Assembly
Control 1421 Page 1 out of 35 What to Feed a gerrymander Team 1421 1 What is gerrymandering Gerrymandering is the division of an area into political districts that give special ad- vantages to one group. Frequently, this involves concentrating "unfavorable"voters in few districts to ensure that "favorable"voters will win in many more districts. In order to squeeze all of the unfavorable voters into a few districts, gerrymandering creates snaky and odd shaped regions. The eponymous label was created when politician Elbridge Gerry pio- eered this technique in early 19 n Century and his opponents claimed the districts resembled salamanders igure 1: The original"Gerry-mander"from the Boston Centinel (1812) 1.1 Basic Terminology Packing- Forcing a disproportionately high concentration of a particular group into one district to lessen their impact in nearby districts Cracking Spreading out members of some group in several districts in order to reduce their impact in each of these districts Forfeit district-a district where group A packs the members of group B so that group B wins this district but loses several surrounding districts which B may have von with a different districting scheme
Control # 1421 Page 1 out of 35 What to Feed a Gerrymander Team 1421 1 What is Gerrymandering? Gerrymandering is the division of an area into political districts that give special advantages to one group. Frequently, this involves concentrating “unfavorable” voters in a few districts to ensure that “favorable” voters will win in many more districts. In order to squeeze all of the unfavorable voters into a few districts, gerrymandering creates snaky and odd shaped regions. The eponymous label was created when politician Elbridge Gerry pioneered this technique in early 19th Century and his opponents claimed the districts resembled salamanders. Figure 1: The original “Gerry-mander” from the Boston Centinel (1812) 1.1 Basic Terminology • Packing - Forcing a disproportionately high concentration of a particular group into one district to lessen their impact in nearby districts. • Cracking - Spreading out members of some group in several districts in order to reduce their impact in each of these districts. • Forfeit district - A district where group A packs the members of group B so that group B wins this district but loses several surrounding districts which B may have won with a different districting scheme
Control 1421 Page 2 out of 35 Wasted Vote-A vote cast by a member of group A in a district where A is already assured victory so voting has no bearing on the result. In general, the group with more wasted votes is made worse off by a districting plan 1.2 Why is it so bad? Politicians today still gerrymander federal and state-level electoral districts and the public outcry is still strongly negative. Before we set out to eliminate this practice we should discuss why gerrymandering is considered problematic First off, gerrymandering reduces electoral competition within districts since cracking packing makes elections uncompetitive. Further, incumbent representatives are in no real danger of losing elections so they do not campaign vigorously which can lead to lower voter turnout. Exacerbating the problem, incumbents' increased advantage means they are less incentivized to govern based on their constituents' interests so democratic accountability and engagement mutually deteriorate Gerrymandering also presents the practical problem that it is difficult to explain to voters why district shapes are so labyrinthine. Some districts connect demographically similar but geographically distant regions using thin filaments such as the district depicted in Figure 2 "Niceness "of district shape almost always takes a back seat to political and racial concerns when districts are being created. Example: In the 2000 California realignment, Democrats and Republicans united to design incumbent -favoring districts which resulted in the reeled- tion of all of the 153 involved legislators in 2004. How can one argue that this is in voters However, it should be noted that gerrymandering can be considered appropriate in specific situations. For instance, the Arizona Legislature gerrymandered a division between the historically hostile Hopi and Navajo tribes even though the Hopi reservation is entirely surrounded by the Navajo reservation Congressional District 4 Figure 2: A present-day gerrymander, the Illinois 4th congressional district (The two "earmuffs"are connected by a narrow band along Highway 294
Control # 1421 Page 2 out of 35 • Wasted Vote - A vote cast by a member of group A in a district where A is already assured victory so voting has no bearing on the result. In general, the group with more wasted votes is made worse off by a districting plan. 1.2 Why is it so bad? Politicians today still gerrymander federal and state-level electoral districts and the public outcry is still strongly negative. Before we set out to eliminate this practice we should discuss why gerrymandering is considered problematic. First off, gerrymandering reduces electoral competition within districts since cracking/- packing makes elections uncompetitive. Further, incumbent representatives are in no real danger of losing elections so they do not campaign vigorously which can lead to lower voter turnout. Exacerbating the problem, incumbents’ increased advantage means they are less incentivized to govern based on their constituents’ interests so democratic accountability and engagement mutually deteriorate. Gerrymandering also presents the practical problem that it is difficult to explain to voters why district shapes are so labyrinthine. Some districts connect demographically similar but geographically distant regions using thin filaments such as the district depicted in Figure 2. “Niceness” of district shape almost always takes a back seat to political and racial concerns when districts are being created. Example: In the 2000 California realignment, Democrats and Republicans united to design incumbent-favoring districts which resulted in the reelection of all of the 153 involved legislators in 2004. How can one argue that this is in voters’ best interests? However, it should be noted that gerrymandering can be considered appropriate in specific situations. For instance, the Arizona Legislature gerrymandered a division between the historically hostile Hopi and Navajo tribes even though the Hopi reservation is entirely surrounded by the Navajo reservation. Figure 2: A present-day gerrymander, the Illinois 4th congressional district. (The two “earmuffs” are connected by a narrow band along Highway 294.)
Control 1421 Page 3 out of 35 1.3 The legality of gerrymandering We should be clear on one point: though gerrymandering is objectionable to many, it is legal around the country. Interestingly, the Voting Rights Act of 1965 which eliminated poll taxes and other discriminatory voting policies may have inadvertently increased the prevalence of gerrymandering. One interpretation of the Act was that it mandated nondiscriminatory election results which led to a strange reversal of vocabulary where creating "ma jority minority"districts was considered beneficial. These gerrymandered districts were packed with minorities which guaranteed minority representation in Congress However, in Shaw v. Reno(1993), and later in Miller v. Johnson(1995), the Supreme Court ruled that racial/ethnic gerrymanders were unconstitutional. Nevertheless, Hunt v Cromatrie approved of a seemingly racial gerrymandering since the motivation was mostly partisan rather than racial. The recent case League of United Latin American Citizens v Perry (June 2006) upheld the position that states are free to redistrict as often as they like o long as these redistrictings follow are not purely racially motivated. 2 Assumptions and Notation 2.1 What can we consider when districting? We have compiled the following list of possible factors one might consider is districting a State. The list is ranked with factors we consider more important or legitimate at the top 1. Population equality between districts (legally mandated) 2. Continuity of districts(legally mandated, excepting islands) 3. Respect for legal boundaries(counties, city limits, townships) 4. Respect for natural geographic boundaries 5. Compactness of district shapes 6. Respect for man-made boundaries(highways, parks, etc. 7. Respect for socio-economic similarity of constituents 8. Similarity to past district boundaries 9. Partisan political concerns 10. Desire to make districts(un)competitive 11. Racial/ethnic concerns 12. Desire to protect (or unseat)incumbent politicians We consider only the top seven factors in our model. Factors 9-12 are all related to political or racial concerns which our model is specifically designed to ignore. The case SC State Conference of Branches v. Riley(1982)ruled that past districts(Factor 8)are legitimate tool for creating new districts but we choose to ignore past districts since they are heay avily biased by Factors 9-12
Control # 1421 Page 3 out of 35 1.3 The legality of gerrymandering We should be clear on one point: though gerrymandering is objectionable to many, it is legal around the country. Interestingly, the Voting Rights Act of 1965 which eliminated poll taxes and other discriminatory voting policies may have inadvertently increased the prevalence of gerrymandering. One interpretation of the Act was that it mandated nondiscriminatory election results which led to a strange reversal of vocabulary where creating “majorityminority” districts was considered beneficial. These gerrymandered districts were packed with minorities which guaranteed minority representation in Congress. However, in Shaw v. Reno (1993), and later in Miller v. Johnson (1995), the Supreme Court ruled that racial/ethnic gerrymanders were unconstitutional. Nevertheless, Hunt v. Cromatrie approved of a seemingly racial gerrymandering since the motivation was mostly partisan rather than racial. The recent case League of United Latin American Citizens v. Perry (June 2006) upheld the position that states are free to redistrict as often as they like so long as these redistrictings follow are not purely racially motivated. 2 Assumptions and Notation 2.1 What can we consider when districting? We have compiled the following list of possible factors one might consider is districting a State. The list is ranked with factors we consider more important or legitimate at the top. 1. Population equality between districts (legally mandated) 2. Continuity of districts (legally mandated, excepting islands) 3. Respect for legal boundaries (counties, city limits, townships) 4. Respect for natural geographic boundaries 5. Compactness of district shapes 6. Respect for man-made boundaries (highways, parks, etc.) 7. Respect for socio-economic similarity of constituents 8. Similarity to past district boundaries 9. Partisan political concerns 10. Desire to make districts (un)competitive 11. Racial/ethnic concerns 12. Desire to protect (or unseat) incumbent politicians We consider only the top seven factors in our model. Factors 9-12 are all related to political or racial concerns which our model is specifically designed to ignore. The case SC State Conference of Branches v. Riley (1982) ruled that past districts (Factor 8) are a legitimate tool for creating new districts but we choose to ignore past districts since they are heavily biased by Factors 9-12
Control 1421 Page 4 out of 35 2.2 Geography and similar characteristics The US Census Bureau provides a great deal of data on legal, natural, and man-made boundaries as well as socio-economic similarity of regions. In each census, the United States is broken up into several degrees of accuracy, the smallest of which are: blocks(40 people on average), block groups(1500 people), and tracts(4500 people) We follow the practice in Young(1988) by districting based on a maximum level of resolution which in our Case Study( Section 5) is census tracts. Notational note: we refer to the smallest unit of division generally as a tract A reference from the Caliper Corporation describes tracts in the following quotation Census tract boundaries normally follow visible features, but may follow go ernmental unit boundaries and other non-visible features, and they always nest within counties. Census tracts are designed to be relatively homogenous units at the time the users established thenr? s, economic status, and living conditions For these reasons we believe that units at the tracts size(or less)are acceptably small and nous to use as a base unit. Further, tracts are completely contained within counties so we can easily check whether or not a district breaks county integrity 2. 3 Notation Define m to be the number of census tracts. and n the number of districts We denote our districts by D,1≤j≤m, and our tracts by T,1≤l≤m. Denote the set of all tracts by I=TihI<I<m; we call this a State. Denote the set of all districts at a particular time by A=(DiJ1sisn. We call this a partition for the State 2.3.1 Adjacency Define the symmetric relation Tp N Tg for tract pairs(Tp, Tq) which are adjacent. Define the function d(Ti) to be the district to which the tract Ti belongs. We also naturally extend the definition of d to sets of tracts Define the neighbor set of tract Ti by ar(T1)=TpErIT Tp) to be the set of all census acts neighboring Ti, and define ap(ti)=d(ar(i) to be the set of all districts containing ghbors of Ti. Every tract borders at least one other tracts, so ar(Ti) and ap(li) have 2.3.2 Borders Define the border of district D; as aDi=iT E Dilap(Ti)+(Di which is the set tracts in Di that are adjacent to at least one district other than D. The interior of district D; is Ii= DiaDi, the set of census tracts in D; whose neighbors are all in D;. Denote the total number of tracts in district D; as mi= Dil the number of border tracts as b;=laDl The frontier of D; is denoted Fi=UTED, ar(TD)\Di, i.e. the set of all tracts outside of D i that border the boundry tracts of D
Control # 1421 Page 4 out of 35 2.2 Geography and similar characteristics The US Census Bureau provides a great deal of data on legal, natural, and man-made boundaries as well as socio-economic similarity of regions. In each census, the United States is broken up into several degrees of accuracy, the smallest of which are: blocks (40 people on average), block groups (1500 people), and tracts (4500 people). We follow the practice in Young (1988) by districting based on a maximum level of resolution which in our Case Study (Section 5) is census tracts. Notational note: we refer to the smallest unit of division generally as a tract. A reference from the Caliper Corporation describes tracts in the following quotation: Census tract boundaries normally follow visible features, but may follow governmental unit boundaries and other non-visible features, and they always nest within counties. Census tracts are designed to be relatively homogenous units with respect to population characteristics, economic status, and living conditions at the time the users established them. For these reasons we believe that units at the tracts size (or less) are acceptably small and homogenous to use as a base unit. Further, tracts are completely contained within counties so we can easily check whether or not a district breaks county integrity. 2.3 Notation Define m to be the number of census tracts, and n the number of districts. We denote our districts by Di , 1 ≤ j ≤ n, and our tracts by Tl , 1 ≤ l ≤ m. Denote the set of all tracts by Γ = {Tl}1≤l≤m; we call this a State. Denote the set of all districts at a particular time by ∆ = {Di}1≤j≤n. We call this a partition for the State. 2.3.1 Adjacency Define the symmetric relation Tp ∼ Tq for tract pairs (Tp, Tq) which are adjacent. Define the function d(Tl) to be the district to which the tract Tl belongs. We also naturally extend the definition of d to sets of tracts. Define the neighbor set of tract Tl by aT (Tl) = {Tp ∈ Γ|Tl ∼ Tp} to be the set of all census tracts neighboring Tl , and define aD(Tl) = d(aT (Tl)) to be the set of all districts containing neighbors of Tl . Every tract borders at least one other tracts, so aT (Tl) and aD(Tl) have cardinality at least one for all Tl . 2.3.2 Borders Define the border of district Di as ∂Di = {Tl ∈ Di |aD(Tl) 6= {Di}} which is the set tracts in Di that are adjacent to at least one district other than Di . The interior of district Di is Ii = Di\∂Di , the set of census tracts in Di whose neighbors are all in Di . Denote the total number of tracts in district Di as mi = |Di | the number of border tracts as bi = |∂Di |. The frontier of Di is denoted Fi = (∪Tl∈DiaT (Tl))\Di , i.e. the set of all tracts outside of Di that border the boundry tracts of Di
Control 1421 Page 5 out of 35 2.3.3 Counties We denote a county as C; and the set of all counties as A. Districts can(and often do) break county boundaries but tracts are contained entirely within counties so we can think of a county as a set of districts. Districts are also sets of tracts so we interpret the set intersection Di n C as the set of tracts in both district Di and county C;. From this, we define c(Di)=C lDinCit0 to be the set of counties which overlap with D 2.3.4 Population Let the population of our State be P and we denote the optimal district size, 5, as p. We use the function p( to generally denote the population of an object, for instance P(Ti) and P(Ci) are the populations of tract Ti and county C;, respectively. Due to frequent use, we use the shorthand p;=P(Di)for the population of districts Table 1 is a useful reference of these numerous definitions Table 1: Variables and their meanings Variable Definition Number of congressional districts D The in district(1≤i≤m) Set of all districts in a State, a partition Number of census tracts T The ltn tractin(1≤l≤m) Set of all tracts in a State d(Ti) District to which tract Ti belongs Tracts Tp and Tq are adjacent (T) Set of districts containing tracts neighboring Ti OD Border of d tracts that neighbor another district Ii Interior of Di, tracts with do not neighbor another district Number of tracts in D b Number of tracts in aD F Set of all tracts outside of D, that border aD C he jth county C(Ti) The county to which tract Ti belongs C(Di) The set of counties containing district D; Total population of the State P Average population of a district p() Population of an arbitrary object Shorthand for P(Di), population of district D
Control # 1421 Page 5 out of 35 2.3.3 Counties We denote a county as Cj and the set of all counties as Λ. Districts can (and often do) break county boundaries but tracts are contained entirely within counties so we can think of a county as a set of districts. Districts are also sets of tracts so we interpret the set intersection Di ∩ Cj as the set of tracts in both district Di and county Cj . From this, we define c(Di) = {Cj |Di ∩ Cj 6= ∅} to be the set of counties which overlap with Di . 2.3.4 Population Let the population of our State be P and we denote the optimal district size, P n , as ¯p. We use the function p(·) to generally denote the population of an object, for instance p(Tl) and p(Cj ) are the populations of tract Tl and county Cj , respectively. Due to frequent use, we use the shorthand pi = p(Di) for the population of districts. Table 1 is a useful reference of these numerous definitions. Table 1: Variables and their meanings Variable Definition n Number of congressional districts Di The ith district (1 ≤ i ≤ n) ∆ Set of all districts in a State, a partition m Number of census tracts Tl The lth tractfin (1 ≤ l ≤ m) Γ Set of all tracts in a State d(Tl) District to which tract Tl belongs Tp ∼ Tq Tracts Tp and Tq are adjacent aT (Tl) Set of tracts adjacent to tract Tl aD(Tl) Set of districts containing tracts neighboring Tl ∂Di Border of Di , tracts that neighbor another district Ii Interior of Di , tracts with do not neighbor another district mi Number of tracts in Di bi Number of tracts in ∂Di Fi Set of all tracts outside of Di that border ∂Di Cj The jth county c(Tl) The county to which tract Tl belongs c(Di) The set of counties containing district Di P Total population of the State p¯ Average population of a district p(·) Population of an arbitrary object pi Shorthand for p(Di), population of district Di
Control 1421 Page 6 out of 35 2. 4 Past models Prior to explaining our modeling approach we would like discuss some previous work in the literature on congressional districting and gerrymandering. We used these papers as guides as we thought about and further refined our algorithm and implementation Cirincione et al.(2000)judge the quality of a districting plan based on equal population preservation of county integrity, and district area compactness. They require that district populations differ by no more than 1% from exact equality in the number of constituents and require point contiguity of the districts. The algorithm constructs districts by picking a random block group(their unit size), then adding additional block groups to the new district until the district population reaches p. At this point they repeat the process starting with a new random block group. Compactness of districts is based on their minimum bounding rectangles and county integrety is encouraged by "randomly"selecting new block groups with a preference for block groups in already inhabited counties Mehrotra et al.(1998) and Garfinkel and Nemhauser(1970) implement a"branch-and- price"method in the optimization step. They first obtain a districting, and optimize over their constraints such that population values are allowed to vary in the final solution of the optimization step. In a final step they split up population units to ensure population equality. They define compactness in a graph-theoretical manner where connected nodes are adjacent tracts. They define the "center"of a district to be the node(tract)with the lowest maximum distance to another other tract. They consider a graph(district )more compact when sum of distances from each node to the center is small We do not use this measure, as it does not uniquely define the center of a graph, and contrary to their claims, does allow for oddly-shaped districts, such as a district whose graph is a star-shaped tree with one tract in the center and many non-contiguous paths emanating from it. In our case study simulations, prior to the incorporation of a compactness factor in the objective function, we often obtain such a tree structure, which is one of the salient features of gerrymandering We also do not use a"branch-and-price"method of optimization. Following suggestions of Nagel(1965) and Kaiser(1966), we employ a local search algorithm in which tracts are swapped between existing districts to maximize some objective function. We describe this process in detail in Section 4 2.5 Measuring compactness The notion of compactness of a planar region has no uniformly accepted definition and research done by Young(1988) suggests that any reasonable measure of compactness fails to work well for certain geographic configurations. He further suggests that any good measure of compactness in such problems should consider the population units(census tracts in our case study) as indivisible units, and therefore that the measure of compactness should be made independently of the predetermined shapes of the population units. We follow this directive in our definition of compactness In fact, the compactness measures given in Young(1998)are not reasonable in the first instance, and do not include any notion of the area of a district, or comparing it to the perimeter. The measures include the maximum total perimeter of a district in a districting, determining the relative height and width of the district, and finding the moment of inertia
Control # 1421 Page 6 out of 35 2.4 Past Models Prior to explaining our modeling approach we would like discuss some previous work in the literature on congressional districting and gerrymandering. We used these papers as guides as we thought about and further refined our algorithm and implementation. Cirincione et al. (2000) judge the quality of a districting plan based on equal population, preservation of county integrity, and district area compactness. They require that district populations differ by no more than 1% from exact equality in the number of constituents, and require point contiguity of the districts. The algorithm constructs districts by picking a random block group (their unit size), then adding additional block groups to the new district until the district population reaches ¯p. At this point they repeat the process starting with a new random block group. Compactness of districts is based on their minimum bounding rectangles and county integrety is encouraged by “randomly” selecting new block groups with a preference for block groups in already inhabited counties. Mehrotra et al. (1998) and Garfinkel and Nemhauser (1970) implement a “branch-andprice” method in the optimization step. They first obtain a districting, and optimize over their constraints such that population values are allowed to vary in the final solution of the optimization step. In a final step they split up population units to ensure population equality. They define compactness in a graph-theoretical manner where connected nodes are adjacent tracts. They define the “center” of a district to be the node (tract) with the lowest maximum distance to another other tract. They consider a graph (district) more compact when sum of distances from each node to the center is small. We do not use this measure, as it does not uniquely define the center of a graph, and, contrary to their claims, does allow for oddly-shaped districts, such as a district whose graph is a star-shaped tree with one tract in the center and many non-contiguous paths emanating from it. In our case study simulations, prior to the incorporation of a compactness factor in the objective function, we often obtain such a tree structure, which is one of the salient features of gerrymandering. We also do not use a “branch-and-price” method of optimization. Following suggestions of Nagel (1965) and Kaiser (1966), we employ a local search algorithm in which tracts are swapped between existing districts to maximize some objective function. We describe this process in detail in Section 4. 2.5 Measuring compactness The notion of compactness of a planar region has no uniformly accepted definition and research done by Young (1988) suggests that any reasonable measure of compactness fails to work well for certain geographic configurations. He further suggests that any good measure of compactness in such problems should consider the population units (census tracts in our case study) as indivisible units, and therefore that the measure of compactness should be made independently of the predetermined shapes of the population units. We follow this directive in our definition of compactness. In fact, the compactness measures given in Young (1998) are not reasonable in the first instance, and do not include any notion of the area of a district, or comparing it to the perimeter. The measures include the maximum total perimeter of a district in a districting, determining the relative height and width of the district, and finding the moment of inertia
Control 1421 Page 7 out of 35 of the district. All of these measures fail to consider both perimeter and area simultaneously, which seems to be a reasonable requirement of a good compactness measure The Isoperimetric Theorem, first proved (non-rigorously) by J. Steiner in 1838, states that the quantity A/P2, given by the ratio of the area A of a planar region(not necessarily continuous) to the square of its perimeter is maximized when the region is circular. The maximum achievable compactness, that of a circle with radius r, is given by 472r2 =4 so we define compactness of a region as the ratio(4T A)/P2. This ratio is bounded within(0, 1] where higher values indicate greater compactness We believe this serves as a good measure of the broadly defined "regularity "of a region which is so important to the study of Congressional districting and gerrymandering. Specifi cally, any shear of factor s applied to a circle decreases the compactness by a factor of s, and any concave region has lower compactness than does its convex hull. It is easy to see that we can make an even stronger statement: the convex hull of a concave region has greater area and smaller perimeter Observe that a square gets close to the optimum, with a compactness of tr N 0.785. This implies that the set of possible compactness values for rectangles is(0, 0.785) since a square is the most compact rectangle 3 The Multi-Seeded growth Model We take a two-stage approach to finding the best districts for a given State. In the multi- Seeded growth Model. referred to as MSgm hereafter. we find an initial allocation of n districts so that the partition has modest levels of population equality and county preser- vation. Our more precise Partition Optimization Model, or POM, edits and improves the rough sketch from MSGM into until it becomes, hopefully, a work of art The reason that our model runs in two phases is simple: speed. Our knee-jerk reaction to the problem was to randomly allocate tracts to the n districts and then optimize by swapping tracts trying to improve some objective function. However, a random initial configuration is so far from the global maximum that the search might take millions of years The MSGM generates a very crude coloring of a State that ensures district contiguity and tries, but does not guarantee, to achieve population equality and county preservation The districts created by MSGM are completely unacceptable for an actual plan but save enormous amounts of computing time for our solution 3. 1 How it works At first, our task seems daunting. How do we allocate n districts equally, even to a rough approximation? Our solution is to grow the n districts simultaneously until they cover the State We start by allocating the entire State to a blank, dummy district Do, and then al- locating n tracts that serve as the initial "seeds" for the final districts. such that each Di, iEl,.,n begins as only a single tract. Now while Dol>0, we take the set S of all
Control # 1421 Page 7 out of 35 of the district. All of these measures fail to consider both perimeter and area simultaneously, which seems to be a reasonable requirement of a good compactness measure. The Isoperimetric Theorem, first proved (non-rigorously) by J. Steiner in 1838, states that the quantity A/P2 , given by the ratio of the area A of a planar region (not necessarily continuous) to the square of its perimeter is maximized when the region is circular. The maximum achievable compactness, that of a circle with radius r, is given by πr2 4π2r 2 = 1 4π so we define compactness of a region as the ratio (4πA)/P2 . This ratio is bounded within (0, 1] where higher values indicate greater compactness. We believe this serves as a good measure of the broadly defined “regularity” of a region which is so important to the study of Congressional districting and gerrymandering. Specifi- cally, any shear of factor s applied to a circle decreases the compactness by a factor of s, and any concave region has lower compactness than does its convex hull. It is easy to see that we can make an even stronger statement: the convex hull of a concave region has greater area and smaller perimeter. Observe that a square gets close to the optimum, with a compactness of 4π 16 ≈ 0.785. This implies that the set of possible compactness values for rectangles is (0, 0.785) since a square is the most compact rectangle. 3 The Multi-Seeded Growth Model We take a two-stage approach to finding the best districts for a given State. In the MultiSeeded Growth Model, referred to as MSGM hereafter, we find an initial allocation of n districts so that the partition has modest levels of population equality and county preservation. Our more precise Partition Optimization Model, or POM, edits and improves the rough sketch from MSGM into until it becomes, hopefully, a work of art. The reason that our model runs in two phases is simple: speed. Our knee-jerk reaction to the problem was to randomly allocate tracts to the n districts and then optimize by swapping tracts trying to improve some objective function. However, a random initial configuration is so far from the global maximum that the search might take millions of years. The MSGM generates a very crude coloring of a State that ensures district contiguity and tries, but does not guarantee, to achieve population equality and county preservation. The districts created by MSGM are completely unacceptable for an actual plan but save enormous amounts of computing time for our solution. 3.1 How it works At first, our task seems daunting. How do we allocate n districts equally, even to a rough approximation? Our solution is to grow the n districts simultaneously until they cover the State. We start by allocating the entire State to a blank, dummy district D0, and then allocating n tracts that serve as the initial “seeds” for the final districts, such that each Di , i ∈ {1, . . . , n} begins as only a single tract. Now while |D0| > 0, we take the set S of all
Control 1421 Page 8 out of 35 possible moves which involve taking a district from Do while preserving contiguity. That is Where M(T, Di, Di)represents a move of tract Ti from D; to Di, corresponding to the exit of Ti from Di and the entrance of Ti into D;. We then sort the moves in S by our heuristic function y (D1,., D,)-R, a function increasing in the desirability of our prospectiv partition. Each move is scored by the heuristic value that would result if we were to accept only that move. We then conclude by performing the moves corresponding to the top 3% of the scored moves in S. Note that this method preserves contiguity, because by definition any Ti E Fi must be contiguous with Di, and thus the Di are contiguous at each step Had we but world enough, and time, we would only perform the best possible move found in S before recalculating the frontier. Even though in the msgm we do not consider moves between two "true"districts (rather, we consider only moves between a true district and the dummy district), the value of a move does not exist in isolation. Consider two distinct districts D; and D;, and two tracts Ti E FinFi and Tk E FinF. The acceptance of M(Tk, Do, Di)alters the heuristic value of every move associated with Fi, which could poten- tially affect the optimality of further moves with Di, such as the acceptance of M(Ti, Do, Di rather than M(Ti, Do, Di). Furthermore, the acceptance of M(Ti, Do, Di) likely expands the size of Fi. Perhaps there is an optimal move opened up in this new frontier that we do not even consider, because we have not even calculated its value It would be better to only perform the best move, but such a strategy was found to be too computationally intensive. We compromise by taking only a small, elite fraction of the moves in each step before recalculating S and the values of its associated moves. In this respect, our approach is analogous to the strategy of modified policy iteration for solving a Markov decision problem. And just as modified policy iteration excels in practice, we found that the tradeoff of possible inefficiency is more than compensated for by the speed gains of the algorithm, especially considering that the solution obtained by MSGM will be further refined by POM In true modified policy iteration, k rounds of value iteration are made in-between policy iterations such that k is fixed. Our MSGM scheme uses a variable number of moves in- between recalculating the value of the frontier. We selected our scheme because it causes us to be delicate in our selections of tract allocations, making moves virtually one-at-a- time, at the beginning and end of the MsGm. By focusing on the beginning and end of the problem, we attempt to avoid having a single district grow too large through possible inefficient allocation Unlike Cirincione(2000)we use random initial seeds weighted by population rather than seeds that are equally spaced around the State. The process works as follows: while there are still random seeds to be selected, we find a candidate initial seed tract Ti in Do. Letting the largest tract in our State have population p, we accept Ti as an initial seed with probability P(Ti/p. We thus select tracts in linear proportion to their population. We found that the MSGm algorithm produces the best initial results when all the districts have the same amount of population, rather than the same number of tracts around which to grow. The geographically optimal placement of five, or fewer, starting seeds in the NYC Metropolitan
Control # 1421 Page 8 out of 35 possible moves which involve taking a district from D0 while preserving contiguity. That is: S = [n i=1 [ Tl∈Fi M(Tl , D0, Di) Where M(Tl , Di , Dj ) represents a move of tract Tl from Di to Dj , corresponding to the exit of Tl from Di and the entrance of Tl into Dj . We then sort the moves in S by our heuristic function Ψ(D1, . . . , Dn) → R, a function increasing in the desirability of our prospective partition. Each move is scored by the heuristic value that would result if we were to accept only that move. We then conclude by performing the moves corresponding to the top 3% of the scored moves in S. Note that this method preserves contiguity, because by definition any Tl ∈ Fi must be contiguous with Di , and thus the Di are contiguous at each step. Had we but world enough, and time, we would only perform the best possible move found in S before recalculating the frontier. Even though in the MSGM we do not consider moves between two “true” districts (rather, we consider only moves between a true district and the dummy district), the value of a move does not exist in isolation. Consider two distinct districts Di and Dj , and two tracts Tl ∈ Fi ∩Fj and Tk ∈ Fi ∩F c j . The acceptance of M(Tk, D0, Di) alters the heuristic value of every move associated with Fi , which could potentially affect the optimality of further moves with Di , such as the acceptance of M(Tl , D0, Di) rather than M(Tl , D0, Dj ). Furthermore, the acceptance of M(Tl , D0, Di) likely expands the size of Fi . Perhaps there is an optimal move opened up in this new frontier that we do not even consider, because we have not even calculated its value. It would be better to only perform the best move, but such a strategy was found to be too computationally intensive. We compromise by taking only a small, elite fraction of the moves in each step before recalculating S and the values of its associated moves. In this respect, our approach is analogous to the strategy of modified policy iteration for solving a Markov decision problem. And just as modified policy iteration excels in practice, we found that the tradeoff of possible inefficiency is more than compensated for by the speed gains of the algorithm, especially considering that the solution obtained by MSGM will be further refined by POM. In true modified policy iteration, k rounds of value iteration are made in-between policy iterations, such that k is fixed. Our MSGM scheme uses a variable number of moves inbetween recalculating the value of the frontier. We selected our scheme because it causes us to be delicate in our selections of tract allocations, making moves virtually one-at-atime, at the beginning and end of the MSGM. By focusing on the beginning and end of the problem, we attempt to avoid having a single district grow too large through possible inefficient allocation. Unlike Cirincione (2000) we use random initial seeds weighted by population rather than seeds that are equally spaced around the State. The process works as follows: while there are still random seeds to be selected, we find a candidate initial seed tract Tl in D0. Letting the largest tract in our State have population ˆp, we accept Tl as an initial seed with probability p(Tl)/pˆ. We thus select tracts in linear proportion to their population. We found that the MSGM algorithm produces the best initial results when all the districts have the same amount of population, rather than the same number of tracts around which to grow. The geographically optimal placement of five, or fewer, starting seeds in the NYC Metropolitan