Peter Selb and Simon Munzert

field experiments (Gerber and Green 2000), and natural experiments (Huber and Arceneaux 2007).
While the historical perspective of our research precludes attractive design options that employ randomization or survey data,³ a number of recent studies try to gauge the impact of campaign stops on voting behavior by using widely available information about the candidates’ campaign itineraries and local-level election results (Campbell 2008; Herr 2002; Hill, Rodriquez, and Wooden 2010; Holbrook 2002; Jones 1998; King and Morehouse 2004; Sellers and Denton 2006; Vavreck, Spiliotes, and Fowler 2002). Such studies typically struggle with identification issues that challenge causal claims. Like any observational study, they are subject to potential confounding (see Goldstein and Holleque 2010). Confounding would occur, for instance, if candidates and their staff deliberately selected locations for their appearances where they expected a large pool of easy-to-mobilize supporters or anticipated a close race. If a researcher failed to properly take into account such confounders (e.g., latent support, marginality), she would probably overestimate the effect of appearances on the candidate vote share and voter turnout.⁴ We use a semi-parametric difference-in-differences estimation strategy to account for potential confounding due to observed and unobserved variables (see Abadie 2005; Heckman, Ichimura, and Todd 1997). In doing so, we specify a parametric model to predict the probability of Hitler appearances in a geographic unit and match visited units with control units that feature a similar predicted probability before the difference-in-differences analysis.

Beyond potential confounding, such studies of candidate appearances may be considered what epidemiologists call spatial ecological studies (Wakefield 2004).
Spatial ecological studies use geographic proximity to a presumed cause (in our case: campaign appearances) as a surrogate for individual exposure to the cause (attendance at the campaign event) and measure the response (voting behavior) at the level of geographic units (communities or counties).⁵ A number of additional biases may arise in such a design. It is well known that group-level observations can be highly misleading when aggregate data are used to make inferences about individuals. Even under unconfoundedness (e.g., by random assignment of campaign appearances), we cannot unambiguously attribute higher turnout or voter support in visited localities to increased propensities among those who attended the campaign events to turn out and vote for the candidate. Such aggregate effects may also come about, for instance, in the fashion of an indirect two-step flow of communication (Katz and Lazarsfeld 1955), in which opinion leaders who would turn out and support their candidate anyway (i.e., for whom the individual effect of attendance is essentially zero) attend the event and are then motivated to mobilize and persuade others within their personal networks (Rosenstone and Hansen 1993). In our particular case, positive effects of Hitler’s campaign visits on local Nazi vote shares might also have occurred indirectly through intimidation. As Childers and Weiss (1990) document, violence was an integral part of Nazi mobilization strategy at the end of the Weimar Republic. If Hitler’s appearances were regularly accompanied by assaults on political opponents, increases in Nazi vote shares at the following election could have been the result of selective abstention by supporters of opposing parties. Either way, campaign effects on local-level election outcomes are, like other neighborhood effects, emergent properties of the social interaction of the residents (Oakes 2004). Therefore, one has to be cautious not to interpret even internally valid aggregate estimates of the effect of campaign appearances on election results in terms of the impact of individual attendance on voting behavior.

³ See Collier (1944) for an early (non-randomized) experiment on the attitudinal effects of Nazi propaganda materials on a sample of U.S. college students in 1941-1942. Also see Reuband (2006), who uses a retrospective survey conducted in 1949 to assess mass support during the Nazi regime.

⁴ In their original study, Shaw and Gimpel (2012) randomize a candidate’s travel schedule during the 2006 Texas gubernatorial race to make campaign appearances statistically independent of other factors related to the outcome of interest. While such a randomized field experiment is a powerful design for valid causal inference, even the authors seem surprised that the candidate’s staff actually agreed to let scholars interfere in their strategic planning (Shaw and Gimpel 2012, 140). Moreover, this is an apparently infeasible approach for a retrospective study like ours.

⁵ Shaw and Gimpel (2012) field a large-scale survey of registered voters that includes items on both exposure to the campaign events and candidate support. Such data would have allowed them to estimate causal effects of individual exposure by using an instrumental variable approach (see Angrist, Imbens, and Rubin 1996). However, Shaw and Gimpel (2012) limit their empirical analysis to before–after comparisons within and between geographic units.
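The matching and difference-in-differences logic described earlier in this section can be illustrated with a minimal sketch. It is not the estimator used in the study: one-to-one nearest-neighbor matching with replacement stands in here for the semi-parametric approach of Abadie (2005) and Heckman, Ichimura, and Todd (1997), and the data, the single covariate `x` (standing for something like latent prior support), the trend formula, and all function names are hypothetical. In the toy data, visited units have steeper outcome trends, so an unmatched comparison overstates the effect.

```python
import math


def make_units(effect=1.0):
    """Build hypothetical toy units (invented numbers, not data from the study)."""
    design = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (0.5, 0), (0.6, 0),
              (0.7, 0), (0.5, 1), (0.6, 1), (0.7, 1)]
    units = []
    for x, visited in design:
        pre = 20 + 30 * x                           # outcome at the earlier election
        post = pre + 2 + 10 * x + effect * visited  # trend rises with x, plus visit effect
        units.append({"x": x, "visited": visited, "pre": pre, "post": post})
    return units


def propensity(b0, b1, x):
    """Predicted probability of a visit under a one-covariate logit model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logit(xs, ds, steps=5000, lr=0.5):
    """Fit the logit model by plain gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, d in zip(xs, ds):
            resid = d - propensity(b0, b1, x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def naive_did(units):
    """Difference in mean over-time change, visited versus all unvisited units."""
    def mean_change(flag):
        changes = [u["post"] - u["pre"] for u in units if u["visited"] == flag]
        return sum(changes) / len(changes)
    return mean_change(1) - mean_change(0)


def matched_did(units, scores):
    """Difference-in-differences after nearest-neighbor matching on the score."""
    treated = [i for i, u in enumerate(units) if u["visited"]]
    controls = [i for i, u in enumerate(units) if not u["visited"]]
    gaps = []
    for t in treated:
        # Pick the unvisited unit whose predicted visit probability is closest.
        c = min(controls, key=lambda j: abs(scores[j] - scores[t]))
        change_t = units[t]["post"] - units[t]["pre"]
        change_c = units[c]["post"] - units[c]["pre"]
        gaps.append(change_t - change_c)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    units = make_units(effect=1.0)
    b0, b1 = fit_logit([u["x"] for u in units], [u["visited"] for u in units])
    scores = [propensity(b0, b1, u["x"]) for u in units]
    print(round(naive_did(units), 6))            # 3.0
    print(round(matched_did(units, scores), 6))  # 1.0
```

With a built-in effect of 1.0, the unmatched comparison returns 3.0 because the visited units would have gained more even without a visit, whereas the matched comparison recovers 1.0. The recovery is exact only because the toy data contain an unvisited unit with the same `x` for every visited unit; with real data the matches, and hence the estimate, would be approximate.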
Finally, spatial ecological studies potentially suffer from ambiguities in separating exposure from nonexposure units. Effects of campaign events need not be restricted to the areal units for which we observe the outcomes of interest. For instance, voters and opinion leaders from neighboring units may also attend the events and thereby carry individual and network effects back home. Likewise, the geographic range of news media that cover the events may well exceed the borders of the units of analysis. Such spatial spillovers would violate the non-interference assumption underlying most methods for causal inference. Non-interference is an essential aspect of the stable unit-treatment value assumption (SUTVA), which implies that a treatment applied to one unit does not affect the outcome of other units. This allows researchers to employ multiple units for estimating causal effects (Rubin 1980). To illustrate the implications, imagine that Hitler’s appearances actually had their intended effect on Nazi support in the visited county, but that this effect carried over to neighboring counties through travel activity, personal networks, or media coverage. If these neighboring counties served as controls when assessing the effect of Hitler’s appearance on the NSDAP vote in the exposure county, the effect estimate would obviously be biased downward because the average over-time difference in outcomes among control units would not properly reflect the expected developments in the absence of the appearance. We, therefore, exclude

1052 Downloaded from https://www.cambridge.org/core. Shanghai JiaoTong University, on 26 Oct 2018 at 03:56:49, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0003055418000424