Why Weight A Cluster-Theoretic Approach to Political Districting Abstract Political districting has been one of the most contentious issues within American politics over the last two centuries. Since the landmark case of Baker u. Carr, in which the United States Supreme Court ruled that the constitutionality of a state's legislated districting is within the jurisdiction of a federal court, many within academia have attempted to produce a rigorous system for determining a set of districts for a given state. In this paper, we attempt to improve upon these past efforts. We propose both a modified form of classical K-means clustering and an interesting algorithm called the shortest-splitline algorithm to accomplish impartial redistricting. As an example, we apply our methods to redistricting the state of New York, and, as further examples, to Texas and Colorado. Both methods use only population density data and state boundaries as inputs and run in a feasible amount of time. Our criteria for successful redistrict- ing include contiguity, compactness, and sufficiently uniform population The K-means method produces districts similar to convex polygons and the splitline method guarantees that the resulting districts have piecewise linear boundaries. The K-means method has the advantage of allowing eeding of the district centers. The centers of the generated districts then roughly correlate to the existing districts, by proper seeding, but the re-
Why Weight? A Cluster-Theoretic Approach to Political Districting February 17, 2007 Abstract Political districting has been one of the most contentious issues within American politics over the last two centuries. Since the landmark case of Baker v. Carr, in which the United States Supreme Court ruled that the constitutionality of a state’s legislated districting is within the jurisdiction of a federal court, many within academia have attempted to produce a rigorous system for determining a set of districts for a given state. In this paper, we attempt to improve upon these past efforts. We propose both a modified form of classical K-means clustering and an interesting algorithm called the shortest-splitline algorithm to accomplish impartial redistricting. As an example, we apply our methods to redistricting the state of New York, and, as further examples, to Texas and Colorado. Both methods use only population density data and state boundaries as inputs and run in a feasible amount of time. Our criteria for successful redistricting include contiguity, compactness, and sufficiently uniform population. The K-means method produces districts similar to convex polygons and the splitline method guarantees that the resulting districts have piecewise linear boundaries. The K-means method has the advantage of allowing seeding of the district centers. The centers of the generated districts then roughly correlate to the existing districts, by proper seeding, but the resulting boundaries are vastly simpler. 1
Control No. 1036 2of18 Contents 1 Introduction 11 Plan of atta 1.2 Defining Simpleness 1.3 Towards a Suitable Conception of Compactness 1.4 Defining Fairness 2 Applying the Theory of Data Clustering 3 The K-means Algorithm 3.1 Standard Algorithm 3.2 Weighted Algorithm 33445677899 4 Splitline algorithm 4.2 Demonstration 5 An Application: Considering the Congressional Districting of New York State 11 5.1K- Algorithm 11 5.2 Splitline Algorithm 6 Conclusions 6.1 Towards a Suitable Definition of Compactnes 13 a Various Definitions of Compactness 15 B Numerical Results 16 1 Compactness quotient results B 2 Population distribution error
Control No. 1036 2 of 18 Contents 1 Introduction 3 1.1 Plan of Attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Defining Simpleness . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Towards a Suitable Conception of Compactness . . . . . . . . . . 4 1.4 Defining Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2 Applying the Theory of Data Clustering 6 3 The K-means Algorithm 7 3.1 Standard Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 7 3.2 Weighted Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 8 4 Splitline Algorithm 9 4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 4.2 Demonstration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 5 An Application: Considering the Congressional Districting of New York State 11 5.1 K-means Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 11 5.2 Splitline Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 12 6 Conclusions 12 6.1 Towards a Suitable Definition of Compactness . . . . . . . . . . . 13 A Various Definitions of Compactness 15 B Numerical Results 16 B.1 Compactness quotient results . . . . . . . . . . . . . . . . . . . . 16 B.2 Population distribution error . . . . . . . . . . . . . . . . . . . . 17
Control No. 1036 f18 1 Introduction In the words of Ted Harrington, chair of political science at the University of North Carolina There is no issue that is more sensitive to politicians of all colors and ideological perst than redistricting. It will determine who wi and loses for eight years [You88 The writers of the constitution created the House of Representatives with the intention that it would be the branch of government most responsive to the people. The reality is just the opposite. Though representatives are elected every 2 years, instead of every 4 or 6 years, almost 400 of the 435 seats of the House are not contested as a result of the extraordinary power of gerryman- dering. With the immensely detailed amount of data and unlimited computing power available to politicians today, gerrymandering has been elevated to an art. With only the requirements that districts be connected and all have equal population, it is possible to pinpoint candidates and place them in a different district than their neighbors Too03 Though undemocratic, gerrymandering is nearly always legal(see, for in stance,Bac86) and has been used to obtain striking results. In 2002 only 4 incumbent representatives lost their bid for reelection -the lowest total ever Too03. We will argue that it is certainly true that any attempt to fairly restruc ture legislative districts needs to ignore the human factors that overwhelmingly determine the current redistricting process. Defining some measure of compact ness is essential to ensure fair districts. Both methods we describe produce districts that at first glance are clearly simpler than the existing ones Restructuring the districts with no regard to the current layout would be ore difficult to implement. We will use the centers of the existing districts seeds for our clustering algorithm. Thus, the new districts have some correla- tion to the existing districts, but their boundaries will be determined in a fair nanner. The core of many districts will be roughly the same, while the bound- aries will be dramatically simpler. This will effectively counteract the effects of gerrymandering, without being overly difficult to put into use immediately. 1.1 Plan of Attack Our goal is to develop an algorithmic process for dividing an arbitrary region into k legislative districts, which satisfy some heuristic definition of fairness. In order to do so we must do the following Define terms. Crucial to creating a model is defining the somewhat ambiguous terms fairness and simpleness Define metrics for comparing algorithms
Control No. 1036 3 of 18 1 Introduction In the words of Ted Harrington, chair of political science at the University of North Carolina, There is no issue that is more sensitive to politicians of all colors and ideological persuasions than redistricting. It will determine who wins and loses for eight years [You88]. The writers of the constitution created the House of Representatives with the intention that it would be the branch of government most responsive to the people. The reality is just the opposite. Though representatives are elected every 2 years, instead of every 4 or 6 years, almost 400 of the 435 seats of the House are not contested as a result of the extraordinary power of gerrymandering. With the immensely detailed amount of data and unlimited computing power available to politicians today, gerrymandering has been elevated to an art. With only the requirements that districts be connected and all have equal population, it is possible to pinpoint candidates and place them in a different district than their neighbors [Too03]. Though undemocratic, gerrymandering is nearly always legal (see, for instance, [Bac86]) and has been used to obtain striking results. In 2002 only 4 incumbent representatives lost their bid for reelection — the lowest total ever [Too03]. We will argue that it is certainly true that any attempt to fairly restructure legislative districts needs to ignore the human factors that overwhelmingly determine the current redistricting process. Defining some measure of compactness is essential to ensure fair districts. Both methods we describe produce districts that at first glance are clearly simpler than the existing ones. Restructuring the districts with no regard to the current layout would be more difficult to implement. We will use the centers of the existing districts as seeds for our clustering algorithm. Thus, the new districts have some correlation to the existing districts, but their boundaries will be determined in a fair manner. The core of many districts will be roughly the same, while the boundaries will be dramatically simpler. This will effectively counteract the effects of gerrymandering, without being overly difficult to put into use immediately. 1.1 Plan of Attack Our goal is to develop an algorithmic process for dividing an arbitrary region into k legislative districts, which satisfy some heuristic definition of fairness. In order to do so, we must do the following: • Define terms. Crucial to creating a model is defining the somewhat ambiguous terms fairness and simpleness. • Define metrics for comparing algorithms
Control No 4of18 Figure 1: The compactness quotients of the circle, square, and gerrymander are 1,丌/4≈0.79,and23x/576≈0.13, respectively 1.2 Defining Simpleness Ve say that district A is more simple than district B if district A is contiguous and district A is more compact than district B Contiguity. We say that a district is contiguous if it is arcwise-connected hat is, if one can travel from any point a to any other point b in district A hile remaining entirely within district A. If A contains regions separated by bodies of water, A is contiguous if all regions are connected by water and each region is arcwise-connected Compactness Intuitively, we say that a district is compact if it does not meander exces- sively. This is a hard concept to formalize; many authors give only a hasty definition of compactness, and some have even argued that compactne is ambiguous to the point of being irrelevant in a serious treatment of dis- tricting. Nonetheless, we will now attempt to develop a suitable definition of compactness 1.3 Towards a Suitable Conception of Compactness In [You88, Young gives compelling reasons for abandoning all of the definitions of compactness mentioned in the second appendix. Interestingly enough, Young does not consider the following adjusted version of the Schwartzberg Test, which is alluded to in GN7o
Control No. 1036 4 of 18 Figure 1: The compactness quotients of the circle, square, and gerrymander are 1, π/4 ≈ 0.79, and 23π/576 ≈ 0.13, respectively. 1.2 Defining Simpleness We say that district A is more simple than district B if district A is contiguous, and district A is more compact than district B. • Contiguity. We say that a district is contiguous if it is arcwise-connected; that is, if one can travel from any point a to any other point b in district A while remaining entirely within district A. If A contains regions separated by bodies of water, A is contiguous if all regions are connected by water and each region is arcwise-connected. • Compactness. Intuitively, we say that a district is compact if it does not meander excessively. This is a hard concept to formalize; many authors give only a hasty definition of compactness, and some have even argued that compactness is ambiguous to the point of being irrelevant in a serious treatment of districting. Nonetheless, we will now attempt to develop a suitable definition of compactness. 1.3 Towards a Suitable Conception of Compactness In [You88], Young gives compelling reasons for abandoning all of the definitions of compactness mentioned in the second appendix.. Interestingly enough, Young does not consider the following adjusted version of the Schwartzberg Test, which is alluded to in [GN70]:
Control No. 1036 f18 Definition 1. We say that district A is more compact than district B if (Perimeter A)2(PerimeterS)2 Call the quantity 4 Area/ Perimeter the compactness quotient For a circle of radius r, this ratio is equal t 4丌 It is well-known that the shape with the largest ratio of area to squared perimeter is the circle( see, for instance, Fol02 ) Because of this, the quantity Area 4丌 Perimeter is restricted to the interval [0, 1] As seen in figure 1, a compactness quotient of 0. 13 is visually quite bad Using the fact given in Bou88 that the area of a non-self-intersecting closed N-gon(with the k-th vertex taken in counterclockwise order equal to(ak, yk)) we have calculated the compactness quotients of several actual districts by ap proximating their boundaries by piecewise linear segments. The results illustrate the inappropriate nature of the districts currently in place. Two of New Yorks nore sprawling districts, the &th and 28th, produced compactness quotients of 0.097 and 0.101, respectively -even worse then the gerrymander shown in figure 1! The two most compact districts in New York, the 26th and 21st had compactness quotients of 0.406 and 0.498, respectively. We decided that the mean for any state should be at least. 6. With this condition the average district in every state would be better than the best districts currently in New York. Furthermore we insist that 25 should be more than 2 standard deviations from the mean. It is not possible to require that all districts be greater than. 25 as several districts will inevitably end up having most of their border coincide with the border of the state 1. 4 Defining fairness Almost all unfairness occurs when political and social measures factor into redis- tricting decisions. Practices such as concentrating supporting voters in a single district, diluting opposing voters over several districts, placing two incumbents n the same district and forcing them to run against each other, and isolating minorities have been seen many times before(see Too03 and Hay 96 ) and are all the result of districing being controlled by those who attempt to skew voting patterns. In general, one can summarize past districting patterns in the ollowing way:
Control No. 1036 5 of 18 Definition 1. We say that district A is more compact than district B if 4π AreaA (PerimeterA) 2 > 4π AreaB (PerimeterB) 2 . Call the quantity 4π Area /Perimeter2 the compactness quotient. For a circle of radius r, this ratio is equal to 4π · πr2 (2πr) 2 = 1. It is well-known that the shape with the largest ratio of area to squared perimeter is the circle (see, for instance, [Fol02]). Because of this, the quantity 4π · Area Perimeter2 is restricted to the interval [0, 1]. As seen in figure 1, a compactness quotient of 0.13 is visually quite bad. Using the fact given in [Bou88] that the area of a non-self-intersecting closed N-gon (with the k-th vertex taken in counterclockwise order equal to (xk, yk)) is equal to 1 2 N X−1 i=1 (xiyi+1 − xi+1yi), we have calculated the compactness quotients of several actual districts by approximating their boundaries by piecewise linear segments. The results illustrate the inappropriate nature of the districts currently in place. Two of New York’s more sprawling districts, the 8th and 28th, produced compactness quotients of 0.097 and 0.101, respectively — even worse then the gerrymander shown in figure 1! The two most compact districts in New York, the 26th and 21st, had compactness quotients of 0.406 and 0.498, respectively. We decided that the mean for any state should be at least .6. With this condition the average district in every state would be better than the best districts currently in New York. Furthermore we insist that .25 should be more than 2 standard deviations from the mean. It is not possible to require that all districts be greater than .25 as several districts will inevitably end up having most of their border coincide with the border of the state. 1.4 Defining Fairness Almost all unfairness occurs when political and social measures factor into redistricting decisions. Practices such as concentrating supporting voters in a single district, diluting opposing voters over several districts, placing two incumbents in the same district and forcing them to run against each other, and isolating minorities have been seen many times before (see [Too03] and [Hay96]), and are all the result of districing being controlled by those who attempt to skew voting patterns. In general, one can summarize past districting patterns in the following way:
Control No. 1036 6of18 Luke onward Lockport igure 2: Current districts 8, 28, 26, and 21, from left to right and top to bottom, with compactness quotients of 0.097, 0.101, 0.406, and 0.498, respectively Unfair districting stems from either human biases or poorly de- igned algorithms Our computer simulations do not use any of this extraneous data. The only data that we have used is population density and the boundary of the state Therefore, the determination of districts is completely unbiased. While it may hold a district may be unfair on a local scale, in that it divides up a community with a common interest-for instance, a community of apple-growers may be split between two districts -on the national scale, such imbalances will even out. Because of this, there will be no pathological examples of disproportionate representation. 2 Applying the Theory of Data Clustering The theory of data clustering is the theory of classifying n observations(or objects) into m groups -for instance, placing two carts'full of groceries into into paper sacks. There are two main benefits of applying a cluster-theoretic
Control No. 1036 6 of 18 Figure 2: Current districts 8, 28, 26, and 21, from left to right and top to bottom, with compactness quotients of 0.097, 0.101, 0.406, and 0.498, respectively. • Unfair districting stems from either human biases or poorly designed algorithms. Our computer simulations do not use any of this extraneous data. The only data that we have used is population density and the boundary of the state. Therefore, the determination of districts is completely unbiased. While it may hold a district may be unfair on a local scale, in that it divides up a community with a common interest — for instance, a community of apple-growers may be split between two districts — on the national scale, such imbalances will even out. Because of this, there will be no pathological examples of disproportionate representation. 2 Applying the Theory of Data Clustering The theory of data clustering is the theory of classifying n observations (or objects) into m groups — for instance, placing two carts’ full of groceries into into paper sacks. There are two main benefits of applying a cluster-theoretic
Control No. 1036 7of18 algorithm to a given data set Data clustering often reveals an internal structure that may not have been initially apparent It is often much easier to work with a small number of clusters than with a large number of raw data The philosophy of data clustering is that we should be able to divide our data into a(not necessarily fixed)number of clusters, and that the elements of a given clusters should be somehow similar. In general, data clustering is applied to problems that deal with a large number of variables. For instance, when data clustering is used to create an animal taxonomy, there are a myriad of variables mode of reproduction, mode of transportation, presence and type of spine ideal diet, preferred habitat, and so forth [And73! Because of this, it is usuall very difficult to determine the "proper"way to cluster data [AC84 In the case of attempting to draw up simple and fair congressional districts we can apply data clustering in the following w Split the state into small, discrete units. Our units correspond to geographic locations of census population measurements fIESIN Determine some partition of these units, such that the subsets of this partition can be viewed as clusters. Note that the only variables resent are the location and population of each unit After defining a method for ordering the preference of cluster partitions possible cluster partitions and choose the best one! However, this turns out to be not feasible. In [AS68, Abramowitz and Stegun give a proof of the fact that the number of ways of sorting n observations into m groups is a Stirling number of the second kind: 1 For instance, there are more than 10- ways to sort 25 objects into 5 groups. It clear that we need some sort of algorithmic process in order to determine an appropriate partition of clusters 3 The K-means algorithm 3.1 Standard Algorithm The K-means algorithm is an iterative method for data clustering. Let D [ C r be the data to be clustered, and let S=(siis be a set of seeds Suppose we desire D to be partitioned into K clusters; let the i-th cluster be
Control No. 1036 7 of 18 algorithm to a given data set: • Data clustering often reveals an internal structure that may not have been initially apparent. • It is often much easier to work with a small number of clusters than with a large number of raw data. The philosophy of data clustering is that we should be able to divide our data into a (not necessarily fixed) number of clusters, and that the elements of a given clusters should be somehow similar. In general, data clustering is applied to problems that deal with a large number of variables. For instance, when data clustering is used to create an animal taxonomy, there are a myriad of variables — mode of reproduction, mode of transportation, presence and type of spine, ideal diet, preferred habitat, and so forth [And73]! Because of this, it is usually very difficult to determine the “proper” way to cluster data [AC84]. In the case of attempting to draw up simple and fair congressional districts, we can apply data clustering in the following way: • Split the state into small, discrete units. Our units correspond to geographic locations of census population measurements [fIESIN]. • Determine some partition of these units, such that the subsets of this partition can be viewed as clusters. Note that the only variables present are the location and population of each unit. After defining a method for ordering the preference of cluster partitions, we may suppose we are done with the problem: all that is left is to look at all possible cluster partitions and choose the best one! However, this turns out to be not feasible. In [AS68], Abramowitz and Stegun give a proof of the fact that the number of ways of sorting n observations into m groups is a Stirling number of the second kind: S (n) m = 1 m! Xm k=0 (−1)m−k m k k n . For instance, there are more than 1015 ways to sort 25 objects into 5 groups. It is clear that we need some sort of algorithmic process in order to determine an appropriate partition of clusters. 3 The K-means Algorithm 3.1 Standard Algorithm The K-means algorithm is an iterative method for data clustering. Let D = {xj} N j=1 ⊂ R n be the data to be clustered, and let S = {sj} K j=1 be a set of seeds. Suppose we desire D to be partitioned into K clusters; let the i-th cluster be
Control No. 1036 8of18 denoted by Ci. Associate to the i-th cluster a geographical center, denoted by ci Given an distance function f: RXR-R, the K-means algorithm proceeds foll . Initialization: for all Ci let ci=s . Iteration Assign points to clusters: For all x E D, associate x to the center Update cluster centers: Redefine Repetition: If updating cluster centers changes at least one cluster cen- ter, repeat the iteration step. Otherwise, stop 3.2 Weighted Algorithm To generate districts of appropriate population, we have added a weighting system to the standard algorithm. Let each cluster correspond to a legislative district. Let D=(xia CR2 be the set of census coordinates. Thus, XE D corresponds to the position of a population measurement. Define a population function p: D-R such that pi is the population at the coordinates specific by xi. A cluster C, is defined by its points x C R, its center xE R and some weight aj. Define f to be the Euclidean distance function in R2.Our weighted K-means algorithm proceeds as follows: Initialization: Using the standard K-means algorithm, assign points t clusters and centers to appropriate positions . Iteration Assign points to clusters: For all x E D, associate x to the center ci such that a;f(x, ci) is minimized Update cluster centers: Redefin f(r,ci) Update cluster weights Redefin Qi=9 defined be
Control No. 1036 8 of 18 denoted by Ci . Associate to the i-th cluster a geographical center, denoted by ci . Given an distance function f : R n × R n → R, the K-means algorithm proceeds as follows. • Initialization: for all Ci , let ci = si . • Iteration: – Assign points to clusters: For all x ∈ D, associate x to the center ci such that f (x, ci) is minimized. – Update cluster centers: Redefine ci = P x∈Ci f (x, ci) P x∈Ci . • Repetition: If updating cluster centers changes at least one cluster center, repeat the iteration step. Otherwise, stop. 3.2 Weighted Algorithm To generate districts of appropriate population, we have added a weighting system to the standard algorithm. Let each cluster correspond to a legislative district. Let D = {xj} N j=1 ⊂ R 2 be the set of census coordinates. Thus, x ∈ D corresponds to the position of a population measurement. Define a population function p : D → R such that pi is the population at the coordinates specified by xi . A cluster Cj is defined by its points x ⊂ R 2 , its center xj ∈ R 2 , and some weight αj . Define f to be the Euclidean distance function in R 2 . Our weighted K-means algorithm proceeds as follows: • Initialization: Using the standard K-means algorithm, assign points to clusters and centers to appropriate positions. • Iteration: – Assign points to clusters: For all x ∈ D, associate x to the center ci such that αif (x, ci) is minimized. – Update cluster centers: Redefine ci = P x∈Ci pif (x, ci) P x∈Ci pi . – Update cluster weights: Redefine αi = g X x∈Ci pi ! , where g is defined below
Control No. 1036 f18 Repetition: If the properties of the clusters are within our tolerance levels we stop. Otherwise, repeat the iteration step. By adjusting the weights, we are able to control the growth or decay of the ters. If the weight of a cluster increases, data points are more likely to be grouped in other clusters. Similarly, decreasing the weight helps to increase the opulation of a cluster. Thus the weight function g: RXR-IR is crucial in the performance of the algorithm. We define: where i is the current iteration, io is the maximum number of iterations, and Po is the desired population for each cluster. Towards the beginning of the algorithm,i/io is low causing the term w* p/po*VI-ilio to dominate the weight function. As the i increases, the weight fluctuates less because w*sarti/io begins to dominate w*p/po*VI-i/io. This enables the weights to change rapidly at the beginning of the iterative process causing the clusters to vary greatly between iterations. However, by the end of the algorithm, the weights do not change as readily, allowing stabilization over a optimal clustering. This is somewhat similar to the process of simulated annealing where initial negative actions allow the algorithm to escape local optimums and the probability a negative action is taken decreases over time 4 Splitline algorithm Recently, a very elegant algorithm for districting has been proposed by applied mathematician Warren B Smith Smil 4.1 Method The idea behind the splitline algorithm is quite simple e Start with the number of districts for the state. Divide that number in two as evenly as possible, using integers(for instance, 18=9+9 and Find the shortest line that divides the state into two parts such the ratio of their populations is the same as the ratio determined in the previous Repeat this process recursively on the subdivided parts until the number of parts is the same as the number of districts. At every step, the division is just a line, and so the resulting districts have piecewise linear bound aries. Using the shortest line ensures that the districts will have a good
Control No. 1036 9 of 18 • Repetition: If the properties of the clusters are within our tolerance levels we stop. Otherwise, repeat the iteration step. By adjusting the weights, we are able to control the growth or decay of the clusters. If the weight of a cluster increases, data points are more likely to be grouped in other clusters. Similarly, decreasing the weight helps to increase the population of a cluster. Thus the weight function g : R × R → R is crucial in the performance of the algorithm. We define: g (p, w) = w r i i0 + w · p p0 · r 1 − i i0 , where i is the current iteration, i0 is the maximum number of iterations, and p0 is the desired population for each cluster. Towards the beginning of the algorithm, i/i0 is low causing the term w ∗ p/p0 ∗ p 1 − i/i0 to dominate the weight function. As the i increases, the weight fluctuates less because w∗sqrti/i0 begins to dominate w ∗ p/p0 ∗ p 1 − i/i0. This enables the weights to change rapidly at the beginning of the iterative process causing the clusters to vary greatly between iterations. However, by the end of the algorithm, the weights do not change as readily, allowing stabilization over a optimal clustering. This is somewhat similar to the process of simulated annealing where initial negative actions allow the algorithm to escape local optimums and the probability a negative action is taken decreases over time. 4 Splitline Algorithm Recently, a very elegant algorithm for districting has been proposed by applied mathematician Warren B. Smith [Smi]. 4.1 Method The idea behind the splitline algorithm is quite simple: • Start with the number of districts for the state. Divide that number in two as evenly as possible, using integers (for instance, 18 = 9 + 9 and 35 = 17 + 18). • Find the shortest line that divides the state into two parts such the ratio of their populations is the same as the ratio determined in the previous step. • Repeat this process recursively on the subdivided parts until the number of parts is the same as the number of districts. At every step, the division is just a line, and so the resulting districts have piecewise linear boundaries. Using the shortest line ensures that the districts will have a good compactness quotient
Control No. 1036 10of18 Step 3 Step I Step 2 ● Data points Figure 3: An illustration of the splitline method 4.2 Demonstration Figure 3 is a demonstration of the splitline algorithm creating 5 districts from a simple data set of 15 points. With 15 points and 5 districts there need to be 3 points in each district. 3: 2 is the most balanced integer ratio that 5 can be divided into. At step l the algorithm divides the state into two regions with 9 and 6 people respectively, the correct ratios for 3 and 2 districts. At step 2, it acts recursively on the 2 subdivisions. Thus the region that had 6 people is divided into regions that have 3 people each, with no more ubdivision needed. The other region is divided into regions with 6 and 3 people he appropriate numbers for 2 and 1 districts respectively. At the third and final step the last region is split in two and the process is complete. By using the shortest line at each step, none of the shapes end up with an un
Control No. 1036 10 of 18 Figure 3: An illustration of the splitline method. 4.2 Demonstration Figure 3 is a demonstration of the splitline algorithm creating 5 districts from a simple data set of 15 points. With 15 points and 5 districts there need to be 3 points in each district. 3:2 is the most balanced integer ratio that 5 can be divided into. At step 1 the algorithm divides the state into two regions with 9 and 6 people respectively, the correct ratios for 3 and 2 districts. At step 2, it acts recursively on the 2 subdivisions. Thus the region that had 6 people is divided into regions that have 3 people each, with no more subdivision needed. The other region is divided into regions with 6 and 3 people, the appropriate numbers for 2 and 1 districts respectively. At the third and final step, the last region is split in two and the process is complete. By using the shortest line at each step, none of the shapes end up with an unsatisfactory compactness quotient