A Cluster-Theoretic Approach to Political Districting (Mathematical Modeling: MCM Outstanding Paper, 2007 Problem A)

Control No. 1036

1 Introduction

In the words of Ted Harrington, chair of political science at the University of North Carolina, "There is no issue that is more sensitive to politicians of all colors and ideological persuasions than redistricting. It will determine who wins and loses for eight years" [You88].

The writers of the Constitution created the House of Representatives with the intention that it would be the branch of government most responsive to the people. The reality is just the opposite. Though representatives are elected every 2 years, instead of every 4 or 6 years, almost 400 of the 435 seats of the House are not contested as a result of the extraordinary power of gerrymandering. With the immensely detailed data and vast computing power available to politicians today, gerrymandering has been elevated to an art. With only the requirements that districts be connected and all have equal population, it is possible to pinpoint candidates and place them in a different district than their neighbors [Too03].

Though undemocratic, gerrymandering is nearly always legal (see, for instance, [Bac86]) and has been used to obtain striking results. In 2002 only 4 incumbent representatives lost their bid for reelection, the lowest total ever [Too03].

We will argue that any attempt to fairly restructure legislative districts needs to ignore the human factors that overwhelmingly determine the current redistricting process. Defining some measure of compactness is essential to ensure fair districts. Both methods we describe produce districts that at first glance are clearly simpler than the existing ones.

Restructuring the districts with no regard to the current layout would be more difficult to implement. We will therefore use the centers of the existing districts as seeds for our clustering algorithm. Thus, the new districts have some correlation to the existing districts, but their boundaries will be determined in a fair manner. The core of many districts will be roughly the same, while the boundaries will be dramatically simpler. This will effectively counteract the effects of gerrymandering, without being overly difficult to put into use immediately.

1.1 Plan of Attack

Our goal is to develop an algorithmic process for dividing an arbitrary region into k legislative districts that satisfy some heuristic definition of fairness. In order to do so, we must do the following:

• Define terms. Crucial to creating a model is defining the somewhat ambiguous terms fairness and simpleness.
• Define metrics for comparing algorithms

Definition 1. We say that district A is more compact than district B if

\[ \frac{4\pi\,\mathrm{Area}_A}{(\mathrm{Perimeter}_A)^2} > \frac{4\pi\,\mathrm{Area}_B}{(\mathrm{Perimeter}_B)^2}. \]

Call the quantity $4\pi\,\mathrm{Area}/\mathrm{Perimeter}^2$ the compactness quotient. For a circle of radius r, this ratio is equal to

\[ \frac{4\pi \cdot \pi r^2}{(2\pi r)^2} = 1. \]

It is well known that the shape with the largest ratio of area to squared perimeter is the circle (see, for instance, [Fol02]). Because of this, the quantity

\[ 4\pi \cdot \frac{\mathrm{Area}}{\mathrm{Perimeter}^2} \]

is restricted to the interval [0, 1]. As seen in figure 1, a compactness quotient of 0.13 is visually quite bad.

Using the fact given in [Bou88] that the area of a non-self-intersecting closed N-gon (with the k-th vertex, taken in counterclockwise order, equal to $(x_k, y_k)$) is equal to

\[ \frac{1}{2} \sum_{i=1}^{N-1} (x_i y_{i+1} - x_{i+1} y_i), \]

we have calculated the compactness quotients of several actual districts by approximating their boundaries by piecewise linear segments. The results illustrate the inappropriate nature of the districts currently in place. Two of New York's more sprawling districts, the 8th and 28th, produced compactness quotients of 0.097 and 0.101, respectively, which is even worse than the gerrymander shown in figure 1! The two most compact districts in New York, the 26th and 21st, had compactness quotients of 0.406 and 0.498, respectively.

We decided that the mean compactness quotient for any state should be at least 0.6. With this condition the average district in every state would be better than the best districts currently in New York. Furthermore, we insist that 0.25 be more than 2 standard deviations below the mean. It is not possible to require that every district have a compactness quotient greater than 0.25, as several districts will inevitably end up having most of their border coincide with the border of the state.

1.4 Defining Fairness

Almost all unfairness occurs when political and social measures factor into redistricting decisions. Practices such as concentrating supporting voters in a single district, diluting opposing voters over several districts, placing two incumbents in the same district and forcing them to run against each other, and isolating minorities have been seen many times before (see [Too03] and [Hay96]), and are all the result of districting being controlled by those who attempt to skew voting patterns. In general, one can summarize past districting patterns in the following way:
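As an aside on Definition 1, the compactness quotient of a piecewise linear boundary can be computed in a few lines. The following is a minimal sketch, not the authors' code: the function names and the two test polygons are hypothetical, and the area sum here explicitly includes the wraparound edge from the last vertex back to the first.

```python
import math

def polygon_area(vertices):
    """Signed shoelace area; positive for counterclockwise vertex order."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x0 * y1 - x1 * y0
    return total / 2.0

def polygon_perimeter(vertices):
    """Sum of edge lengths, including the closing edge."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def compactness_quotient(vertices):
    """4 * pi * Area / Perimeter^2, which is 1 for a circle."""
    area = abs(polygon_area(vertices))
    perimeter = polygon_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

# Hypothetical shapes for illustration only (not real districts).
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
thin_strip = [(0, 0), (20, 0), (20, 1), (0, 1)]

print(round(compactness_quotient(unit_square), 3))  # pi/4, about 0.785
print(round(compactness_quotient(thin_strip), 3))   # about 0.142
```

A square scores about 0.785, while a 20-by-1 strip scores about 0.142, which is close to the 0.13 that the text describes as visually quite bad.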

algorithm to a given data set:

• Data clustering often reveals an internal structure that may not have been initially apparent.
• It is often much easier to work with a small number of clusters than with a large number of raw data points.

The philosophy of data clustering is that we should be able to divide our data into a (not necessarily fixed) number of clusters, and that the elements of a given cluster should be somehow similar. In general, data clustering is applied to problems that deal with a large number of variables. For instance, when data clustering is used to create an animal taxonomy, there are a myriad of variables: mode of reproduction, mode of transportation, presence and type of spine, ideal diet, preferred habitat, and so forth [And73]. Because of this, it is usually very difficult to determine the "proper" way to cluster data [AC84].

In the case of attempting to draw up simple and fair congressional districts, we can apply data clustering in the following way:

• Split the state into small, discrete units. Our units correspond to geographic locations of census population measurements [fIESIN].
• Determine some partition of these units, such that the subsets of this partition can be viewed as clusters. Note that the only variables present are the location and population of each unit.

After defining a method for ordering the preference of cluster partitions, we may suppose we are done with the problem: all that is left is to look at all possible cluster partitions and choose the best one! However, this turns out to be infeasible. In [AS68], Abramowitz and Stegun give a proof of the fact that the number of ways of sorting n observations into m groups is a Stirling number of the second kind:

\[ S^{(n)}_{m} = \frac{1}{m!} \sum_{k=0}^{m} (-1)^{m-k} \binom{m}{k} k^{n}. \]

For instance, there are more than $10^{15}$ ways to sort 25 objects into 5 groups. It is clear that we need some sort of algorithmic process in order to determine an appropriate partition of clusters.
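The count quoted above can be checked directly. The sketch below evaluates the closed-form sum for n = 25 and m = 5 and cross-checks it against the standard recurrence S(n, m) = m·S(n−1, m) + S(n−1, m−1); the function names are hypothetical and chosen only for this illustration.

```python
from math import comb, factorial

def stirling2_closed_form(n, m):
    """Stirling number of the second kind from the alternating-sum formula."""
    total = sum((-1) ** (m - k) * comb(m, k) * k ** n for k in range(m + 1))
    return total // factorial(m)  # the sum is always divisible by m!

def stirling2_recurrence(n, m):
    """Same number via S(n, m) = m * S(n-1, m) + S(n-1, m-1)."""
    table = [[0] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, m) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][m]

ways = stirling2_closed_form(25, 5)
print(ways)              # 2436684974110751
print(ways > 10 ** 15)   # True
```

The result is roughly 2.4 × 10^15, so even this toy instance is far beyond exhaustive search, and a state has vastly more than 25 census units.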
3 The K-means Algorithm

3.1 Standard Algorithm

The K-means algorithm is an iterative method for data clustering. Let $D = \{x_j\}_{j=1}^{N} \subset \mathbb{R}^n$ be the data to be clustered, and let $S = \{s_j\}_{j=1}^{K}$ be a set of seeds. Suppose we desire D to be partitioned into K clusters; let the i-th cluster be

• Repetition: If the properties of the clusters are within our tolerance levels, we stop. Otherwise, repeat the iteration step.

By adjusting the weights, we are able to control the growth or decay of the clusters. If the weight of a cluster increases, data points are more likely to be grouped in other clusters. Similarly, decreasing the weight helps to increase the population of a cluster. Thus the weight function $g : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is crucial in the performance of the algorithm. We define:

\[ g(p, w) = w \sqrt{\frac{i}{i_0}} + w \cdot \frac{p}{p_0} \cdot \sqrt{1 - \frac{i}{i_0}}, \]

where i is the current iteration, $i_0$ is the maximum number of iterations, and $p_0$ is the desired population for each cluster. Towards the beginning of the algorithm, $i/i_0$ is low, causing the term $w \cdot (p/p_0) \cdot \sqrt{1 - i/i_0}$ to dominate the weight function. As i increases, the weight fluctuates less because $w \sqrt{i/i_0}$ begins to dominate $w \cdot (p/p_0) \cdot \sqrt{1 - i/i_0}$. This enables the weights to change rapidly at the beginning of the iterative process, causing the clusters to vary greatly between iterations. However, by the end of the algorithm, the weights do not change as readily, allowing stabilization around an optimal clustering. This is somewhat similar to the process of simulated annealing, where initial negative actions allow the algorithm to escape local optima and the probability that a negative action is taken decreases over time.

4 Splitline Algorithm

Recently, a very elegant algorithm for districting has been proposed by applied mathematician Warren D. Smith [Smi].

4.1 Method

The idea behind the splitline algorithm is quite simple:

• Start with the number of districts for the state. Divide that number in two as evenly as possible, using integers (for instance, 18 = 9 + 9 and 35 = 17 + 18).
• Find the shortest line that divides the state into two parts such that the ratio of their populations is the same as the ratio determined in the previous step.
• Repeat this process recursively on the subdivided parts until the number of parts is the same as the number of districts.

At every step, the division is just a line, and so the resulting districts have piecewise linear boundaries. Using the shortest line ensures that the districts will have a good compactness quotient
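The integer bookkeeping in the splitline steps can be sketched as follows. This is only an illustration under stated assumptions: the names `split_counts` and `splitline_plan` are hypothetical, and the geometric search for the shortest dividing line is deliberately left out, so the code only lists which population ratio each cut would have to achieve.

```python
def split_counts(districts):
    """Split a district count into two integers that are as equal as possible."""
    low = districts // 2
    return low, districts - low

def splitline_plan(districts, depth=0, plan=None):
    """Record every cut the recursion makes as (depth, total, low, high).

    Each entry means: a region holding `total` districts is cut so that the
    populations of the two parts are in the ratio low : high.
    """
    if plan is None:
        plan = []
    if districts <= 1:
        return plan  # a single district needs no further cut
    low, high = split_counts(districts)
    plan.append((depth, districts, low, high))
    splitline_plan(low, depth + 1, plan)
    splitline_plan(high, depth + 1, plan)
    return plan

print(split_counts(18))  # (9, 9)
print(split_counts(35))  # (17, 18)
for depth, total, low, high in splitline_plan(5):
    print("  " * depth + f"{total} -> {low} + {high}")
```

Dividing a state into n districts this way always takes n − 1 cuts, one for each entry in the returned plan.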