On Political Methodology - Gary King - Harvard University

Gary King. "On Political Methodology," Political Analysis, Vol. 2 (1991): Pp. 1-30.

On Political Methodology Gary King

1. Introduction

"Politimetrics" (Gurr 1972), "polimetrics," (Alker 1975), "politometrics" (Hilton 1976), "political arithmetic" (Petty [1672] 1971), "quantitative Political Science (QPS)," "governmetrics," "posopolitics" (Papayanopoulos 1973), "political science statistics" (Rai and Blydenburgh 1973), "political statistics" (Rice 1926). These are some of the names that scholars have used to describe the field we now call "political methodology."1 The history of political methodology has been quite fragmented until recently, as reflected by this patchwork of names. The field has begun to coalesce during the past decade; we are developing persistent organizations, a growing body of scholarly literature, and an emerging consensus about important problems that need to be solved. I make one main point in this article: If political methodology is to play an important role in the future of political science, scholars will need to find ways of representing more interesting political contexts in quantitative analyses. This does not mean that scholars should just build more and more complicated statistical models. Instead, we need to represent more of the essence of political phenomena in our models. The advantage of formal and quantitative approaches is that they are abstract representations of the political world and are, thus, much clearer. We need methods that enable us to abstract the right parts of the phenomenon we are studying and exclude everything superfluous.

This paper was presented at the annual meeting of the American Political Science Association, Atlanta, Georgia. My thanks go to the section head, John Freeman, who convinced me to write this paper, and the discussants on the panel, George Downs, John Jackson, Phil Schrodt, and Jim Stimson, for many helpful comments. Thanks also to the National Science Foundation for grant SES-89-09201, to Nancy Burns for research assistance, and to Neal Beck, Nancy Burns, and Andrew Gelman for helpful discussions.

1. Johnson and Schrodt's 1989 paper gives an excellent sense of the breadth of formal and quantitative political methods, a broad focus but still much narrower than the diverse collection of methods routinely used in the discipline. For this paper, I narrow my definition of political methodology even further to include only statistical methods.


Political Analysis

Despite the fragmented history of quantitative political analysis, a version of this goal has been voiced frequently by both quantitative researchers and their critics (see sec. 2). However, while recognizing this shortcoming, earlier scholars were not in the position to rectify it, lacking the mathematical and statistical tools and, early on, the data. Since political methodologists have made great progress in these and other areas in recent years, I argue that we are now capable of realizing this goal. In section 3, I suggest specific approaches to this problem. Finally, in section 4, I provide two modern examples to illustrate these points.

2. A Brief History of Political Methodology

In this section, I describe five distinct stages in the history of political methodology.2 Each stage has contributed, and continues to contribute, to the evolution of the subfield but has ultimately failed to bring sufficient political detail into quantitative analyses. For the purpose of delineating these five stages, I have collected data on every article published in the American Political Science Review (APSR) from 1906 to 1988.3 The APSR was neither the first political science journal nor the first to publish an article using quantitative methods, and it does not always contain the highest quality articles.4 Nevertheless, the APSR has consistently reflected the broadest cross-section of the discipline and has usually been among the most visible political science journals. Of the 2,529 articles published through 1988, 619, or 24.5 percent, used quantitative data and methods in some way. At least four phases can be directly discerned from these data. I begin by briefly describing these four stages and a fifth stage currently in progress. In these accounts, I focus on the ways in which methodologists have attempted

2. Although these stages in the history of political methodology seem to emerge naturally from the data I describe below, this punctuation of historical time is primarily useful for expository purposes.

3. Not much of methodological note happened prior to 1906, even though the history of quantitative analysis in the discipline dates at least to the origins of American political science a century ago: "The Establishment of the Columbia School not only marked the beginnings of political science in the U.S., but also the beginnings of statistics as an academic course, for it was at that same time and place, Columbia University in 1880, that the first course in statistics was offered in an American university. The course instructor was Richmond Mayo-Smith (1854-1901) who, despite the lack of disciplinary boundaries at the time, can quite properly be called a political scientist" (Gow 1985, 2). In fact, the history of quantitative analysis of political data dates back at least two centuries earlier, right to the beginnings of the history of statistics (see Stigler 1986; Petty [1672] 1971).

4. Although quantitative articles on politics were published in other disciplines, the first quantitative article on politics published in a political science journal was Ogburn and Goltra 1919.


Xf) (another parameterization of the unconditional expected value) and variance


Alternatively, we can write a different conditional model with explanatory variables and spatially lagged random shocks as follows:

$$E(Y_i \mid y_j, \forall j \neq i) = X_i\beta + \rho W_i \varepsilon \tag{10}$$

Note how this model compares to the conditional model in equation 8. Both are linear functions of a vector of explanatory variables. In addition, instead of neighboring values of the dependent variable affecting the current dependent variable, this model assumes that random shocks (unexpected values of the dependent variable) in neighboring regions affect a region. For example, a reasonable hypothesis is that, after taking into account the explanatory variables, only unexpected levels of international conflict in neighboring countries will produce conflict in one's own country (see Doreian 1980). The unconditional version of this model takes a surprisingly simple form:

$$E\bigl(E(Y_i \mid y_j, \forall j \neq i)\bigr) = X_i\beta + \rho W_i E(\varepsilon) \tag{11}$$
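A small numerical sketch may make equation 11 concrete. It assumes that the shocks have mean zero, so the term involving W drops out, and it contrasts this with the spatial-lag model of equations 8 and 9, whose unconditional mean is taken here to have the standard reduced form (I - rho W)^{-1} X beta (an assumption, since equation 9 is not reproduced in this passage). The four-region map, the row-normalized weights, and the values of rho and beta are all hypothetical.

```python
# Hypothetical 4-region map arranged in a line: 0 - 1 - 2 - 3.
# W is a row-normalized contiguity matrix (an illustrative choice only).
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
RHO = 0.5   # hypothetical spatial parameter
BETA = 2.0  # hypothetical coefficient on a single explanatory variable


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mean_lagged_shocks(x):
    """Unconditional mean under equation 11, assuming E(shock) = 0."""
    xb = [BETA * xi for xi in x]
    expected_shock = [0.0] * len(x)       # assumption: zero-mean shocks
    lagged = matvec(W, expected_shock)    # W E(shock) is therefore zero
    return [a + RHO * b for a, b in zip(xb, lagged)]


def mean_lagged_outcome(x, terms=200):
    """Unconditional mean of the spatial-lag model (assumed form of eq. 9),
    approximated by the series sum_k rho^k W^k X beta."""
    term = [BETA * xi for xi in x]
    total = term[:]
    for _ in range(terms):
        term = [RHO * t for t in matvec(W, term)]
        total = [a + b for a, b in zip(total, term)]
    return total


def effect_of_shift(mean_fn, region=0):
    """Change in each region's expected outcome when x rises by 1 in one region."""
    base = [1.0, 1.0, 1.0, 1.0]
    shifted = base[:]
    shifted[region] += 1.0
    return [b - a for a, b in zip(mean_fn(base), mean_fn(shifted))]


shock_effects = effect_of_shift(mean_lagged_shocks)
lag_effects = effect_of_shift(mean_lagged_outcome)
print("lagged shocks :", shock_effects)
print("lagged outcome:", lag_effects)
```

In the lagged-shock model the change in x moves the expected outcome only in region 0, whereas in the lagged-outcome model the effect reaches every region and shrinks with distance.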

What this means is that values of the explanatory variables have effects only in the region for which they are measured. Unlike the first model in equations 8 (the conditional version) and 9 (the unconditional version), the explanatory variables do not have effects that spill over into contiguous regions in this second model in equations 10 (the conditional version) and 11 (the unconditional version). Only unexpected, or random, shocks affect the neighboring region. Once these shocks affect the neighboring region, however, they disappear; no "second-order" effects occur where something happens in one county, which affects the next county, which affects the next, etc. A more sophisticated model includes all three spatial fundamentals in the same model (see Brandsma and Ketellapper 1978, Doreian 1982, and Dow 1984 on the "biparametric" approach):

$$E(Y_i \mid y_j, \forall j \neq i) = X_i\beta + \rho_1 W_{1i} y + \rho_2 W_{2i} \varepsilon \tag{12}$$

This model incorporates many interesting special cases, including the previous models, but it is still wholly inadequate to represent the enormous variety of conceivable spatial processes. For example, I have never seen a single model estimated with social science data with more than these two spatial parameters ($\rho_1$ and $\rho_2$) or a model with more than one conditional spatial lag.

Another difficulty is the very definition of the spatial lag operator. How does one define the "distance" between irregularly shaped spatial units? If "distance" is to mean actual mileage between pairs of U.S. states, should the measure be calculated between capital cities, largest cities, closest borders, or just 0/1 variables indicating neighborhoods (Cressie and Chan 1989)? More generally, we can also use more substantively meaningful definitions of distance, such as the proportion of shared common borders, numbers of commuters traveling daily (or migrating permanently) between pairs of states, or combinations of these or other measures (see Cliff and Ord 1973 and 1981). Choosing the appropriate representation is obviously difficult, and the choice makes an important difference in practice (Stetzer 1982). These concerns are also important because unmodeled spatial variation will incorrectly appear to the analyst as spatial autocorrelation.

However, a much more serious problem is that the W matrix is not a unique representation of the spatial processes it models. This is not the usual problem of fitting a model to empirical data. It is the additional problem of fitting the model to the theoretical spatial process. For example, begin with a spatial process, and represent it with a matrix W. The problem is that one cannot reconstruct the identical map from this matrix. Since the W matrix (and X) is the only way spatially distributed political variables are represented in these models, this nonuniqueness is a fundamental problem. In order to get more politics into this class of models, we need to develop better, more sophisticated models and probably some other way to represent spatial information.17 Through all of these models, the same problem remains: statistical models of spatial data do not represent enough of the political substance existing in the data.
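The nonuniqueness of W described above can be illustrated with a minimal sketch. It assumes one particular definition of neighborhood, a 0/1 indicator that is 1 when two regions share an edge on a grid, and both maps are hypothetical. Two maps with very different shapes and sizes yield the identical matrix, so the map cannot be recovered from W alone.

```python
def contiguity_matrix(cells_by_region):
    """Binary (0/1) contiguity matrix from regions drawn on a grid.

    cells_by_region maps a region label to the set of (row, col) grid cells
    it occupies. Two regions count as neighbors if any of their cells share
    an edge. This is only one of the possible definitions of "distance".
    """
    labels = sorted(cells_by_region)
    owner = {cell: lab for lab, cells in cells_by_region.items() for cell in cells}
    w = {a: {b: 0 for b in labels} for a in labels}
    for (r, c), lab in owner.items():
        for neighbor in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            other = owner.get(neighbor)
            if other is not None and other != lab:
                w[lab][other] = 1
                w[other][lab] = 1
    return [[w[a][b] for b in labels] for a in labels]


# Hypothetical map 1: three equal-sized regions side by side.
map_one = {
    "A": {(0, 0), (1, 0)},
    "B": {(0, 1), (1, 1)},
    "C": {(0, 2), (1, 2)},
}

# Hypothetical map 2: the same neighbor relations, but one large,
# irregular region B wrapped around two single-cell regions.
map_two = {
    "A": {(0, 0)},
    "B": {(0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2)},
    "C": {(1, 2)},
}

w_one = contiguity_matrix(map_one)
w_two = contiguity_matrix(map_two)
print(w_one)
print("same W, different maps:", w_one == w_two and map_one != map_two)
```

Replacing the 0/1 indicator with, say, the share of common border would separate these two maps, but that is a different W and therefore a different model, which is the practical force of the choice among definitions.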
I do not have a solution to this problem, but one possibility may lie in a literature now forming on the statistical theory of shape (see Kendall 1989 for a review). The motivation behind this literature is often archaeological or biological; for example, scholars sometimes want to know, apart from random error, if two skulls are from the same species. In this form, the literature has little to contribute to our endeavors (although political scientists are sometimes interested in shape alone; see Niemi et al. 1989). However, these scholars are working on ways of representing shapes in statistical models, and geographic shapes are just two-dimensional special cases of their models. Eventually, some kind of spatially continuous model that includes the shape of geographic areas along with information about continuous population densities across these areas may help to represent more politics in these statistical models. Until then, graphical approaches may be the only reasonable option.18

17. Many other approaches have been suggested for these models. For example, Arora and Brown (1977) suggest, but do not estimate, a variety of more traditional approaches. Burridge (1981) demonstrates how to test for a common factor in spatial models; the purpose of this is to reduce the parametrization (just as Hendry and Mizon [1978] do in time-series models). For a comprehensive review of many models, see Anselin 1988 from a linear econometric viewpoint and Besag 1974 from a statistical perspective.

5. Concluding Remarks

Although the quantitative analysis of political data is probably older than the discipline of political science, the systematic and self-conscious study of political methodology began much more recently. In this article, I have argued for a theme that has pervaded the history of quantitative political science and the critics of this movement: In a word, the future of political methodology is in taking our critics seriously and finding ways to bring more politics into our quantitative analyses. My suggestions for including more political context include using more sophisticated stochastic modeling; understanding and developing our own approach to, and perspectives on, theories of inference; and developing and using graphic analysis more often. I believe these are most important, but other approaches may turn out to be critical as well. For example, the probabilistic models I favor usually begin with assumptions about individual behavior, and this is precisely the area where formal modelers have the most experience. If our stochastic models are to be related in meaningful ways to political science theory, formal theory will need to make more progress and the two areas of research must also be more fully integrated.

Finally, as the field of political methodology develops, we will continue to influence the numerous applied quantitative researchers in political science. Our biggest influence should probably always be in emphasizing to our colleagues (and ourselves) the limitations of all kinds of scientific analysis. Most of the rigorous statistical tools we use were developed to keep us from fooling ourselves into seeing patterns or relationships where none exist. This is one area where quantitative analysis most excels over other approaches, but, just like those other approaches, we still need to be cautious. Anyone can provide some evidence that he or she is right; a better approach is to try hard to show that you are wrong and to publish only if you fail to do so. Eventually we may have more of the latter than the former.

18. Another way geographic information has been included in statistical models is through "hierarchical" or "multilevel" models. This is different from time-series-cross-sectional models (see Stimson 1985; Dielman 1989). Instead, the idea is to use a cross-section or time-series within each geographic unit to estimate a separate parameter. One then posits a second model with these parameter estimates as the dependent variable varying spatially (the standard errors on each of these coefficients are usually used as weights in the second stage). These models have been developed most in education (see Raudenbush 1988; Raudenbush and Bryk 1986; Bryk and Thum 1988). The same problems of representing political information exist as in the previous section, but these models have an additional problem that has not even been noticed, much less been solved. The problem is selection bias (see Achen 1986b), and it is probably clearest in education, where hierarchical models are in the widest use. For example, if schools are the aggregates, the problem is that they often choose students on the basis of expected quality, which is obviously correlated with the dependent variable at the first stage. The result is that the coefficients on the within-school regressions are differentially afflicted by selection bias. Much of the aggregate level regression, then, may just explain where selection bias is worse rather than the true effects of social class on achievement. This problem is less severe in political data based on fixed geographic units like states (King and Browning 1987; King 1991), but one should check for problems that could be caused by intentional or unintentional gerrymandering.
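The two-stage "hierarchical" procedure described in note 18 can be sketched as follows. This is only an illustration under stated assumptions: a bivariate regression within each unit, a single unit-level covariate at the second stage, and inverse-variance weights (1 / se^2) as one common way of using the first-stage standard errors. The note itself does not specify the weighting scheme, and all numbers below are made up.

```python
def ols_slope(x, y):
    """Slope and its standard error from a bivariate least-squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, se


def weighted_line(z, b, weights):
    """Weighted least-squares fit of b on z; returns (intercept, slope)."""
    sw = sum(weights)
    mz = sum(w * zi for w, zi in zip(weights, z)) / sw
    mb = sum(w * bi for w, bi in zip(weights, b)) / sw
    szz = sum(w * (zi - mz) ** 2 for w, zi in zip(weights, z))
    slope = sum(w * (zi - mz) * (bi - mb) for w, zi, bi in zip(weights, z, b)) / szz
    return mb - slope * mz, slope


# Hypothetical data: four units, each with five within-unit observations.
# The within-unit slope is constructed to equal 1 + 0.5 * z, where z is a
# unit-level covariate, and the noise grows with the unit index.
X_WITHIN = [0.0, 1.0, 2.0, 3.0, 4.0]
NOISE = [0.1, -0.2, 0.15, -0.1, 0.05]
unit_covariate = [0.0, 1.0, 2.0, 3.0]

units = []
for k, z in enumerate(unit_covariate):
    true_slope = 1.0 + 0.5 * z
    y = [true_slope * x + (k + 1) * e for x, e in zip(X_WITHIN, NOISE)]
    units.append((X_WITHIN, y))

# Stage 1: a separate regression within each unit.
first_stage = [ols_slope(x, y) for x, y in units]
slopes = [b for b, _ in first_stage]

# Stage 2: regress the estimated slopes on the unit-level covariate,
# weighting each unit by the precision of its first-stage estimate.
weights = [1.0 / se ** 2 for _, se in first_stage]
intercept2, slope2 = weighted_line(unit_covariate, slopes, weights)
print("second-stage intercept and slope:", intercept2, slope2)
```

The sketch does nothing about the selection bias the note warns of; if units select their members on expected outcomes, the first-stage slopes fed into the second stage are themselves biased, and no weighting scheme repairs that.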

REFERENCES

Achen, Christopher H. 1977. "Measuring Representation: Perils of the Correlation Coefficient." American Journal of Political Science 21 (4): 805-15.
Achen, Christopher H. 1981. "Towards Theories of Data." In Political Science: The State of the Discipline, ed. Ada Finifter. Washington, D.C.: American Political Science Association.
Achen, Christopher H. 1983. "If Party ID Influences the Vote, Goodman's Ecological Regression is Biased (But Factor Analysis is Consistent)." Photocopy.
Achen, Christopher H. 1986a. "Necessary and Sufficient Conditions for Unbiased Aggregation of Cross-Sectional Regressions." Presented at the 1986 meeting of the Political Methodology Group, Cambridge, Mass.
Achen, Christopher H. 1986b. The Statistical Analysis of Quasi-Experiments. Berkeley: University of California Press.
Alker, Hayward R., Jr. 1975. "Polimetrics: Its Descriptive Foundations." In Handbook of Political Science, ed. Fred Greenstein and Nelson Polsby. Reading: Addison-Wesley.
Alt, James, and Gary King. N.d. "A Consistent Model for Ecological Inference." In progress.
Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Boston: Kluwer Academic Publishers.
Arora, Swarnjit S., and Murray Brown. 1977. "Alternative Approaches to Spatial Correlation: An Improvement Over Current Practice." International Regional Science Review 2 (1): 67-78.
Bartels, Larry. 1989. "Misspecification in Instrumental Variables Estimators." Presented at the annual meeting of the Political Science Methodology Group, Minneapolis.
Beck, Nathaniel. 1983. "Time-varying Parameter Regression Models." American Journal of Political Science 27:557-600.


Beck, Nathaniel. 1986. "Estimating Dynamic Models is Not Merely a Matter of Technique." Political Methodology 11 (1-2): 71-90.
Beck, Nathaniel. 1987. "Alternative Dynamic Specifications of Popularity Functions." Presented at the annual meeting of the Political Methodology Group, Durham, N.C.
Besag, Julian. 1974. "Spatial Interaction and the Statistical Analysis of Lattice Systems" [with comments]. Journal of the Royal Statistical Society, ser. B, 36:192-236.
Brandsma, A. S., and R. H. Ketellapper. 1978. "A Biparametric Approach to Spatial Correlation." Environment and Planning 11:51-58.
Brown, Lawrence A., and John Paul Jones III. 1985. "Spatial Variation in Migration Processes and Development: A Costa Rican Example of Conventional Modeling Augmented by the Expansion Method." Demography 22 (3): 327-52.
Brown, Philip J., and Clive D. Payne. 1986. "Aggregate Data, Ecological Regression, and Voting Transitions." Journal of the American Statistical Association 81 (394): 452-60.
Brueckner, Jan K. 1986. "A Switching Regression Analysis of Urban Population Densities." Journal of Urban Economics 19:174-89.
Bryk, Anthony S., and Yeow Meng Thum. 1988. "The Effects of High School Organization on Dropping Out: An Exploratory Investigation." University of Chicago. Mimeo.
Burridge, P. 1981. "Testing for a Common Factor in a Spatial Correlation Model." Environment and Planning A 13:795-800.
Cleveland, William S. 1985. The Elements of Graphing Data. Monterey, Calif.: Wadsworth.
Cliff, Andrew D., and J. Keith Ord. 1973. Spatial Correlation. London: Pion.
Cliff, Andrew D., and J. Keith Ord. 1981. Spatial Processes: Models and Applications. London: Pion.
Cressie, Noel, and Ngai H. Chan. 1989. "Spatial Modeling of Regional Variables." Journal of the American Statistical Association 84 (406): 393-401.
Dahl, Robert. 1961. Who Governs? Democracy and Power in an American City. New Haven: Yale University Press.
Dielman, Terry E. 1989. Pooled Cross-Sectional and Time Series Data Analysis. New York: M. Dekker.
Doreian, Patrick. 1980. "Linear Models with Spatially Distributed Data: Spatial Disturbances or Spatial Effects?" Sociological Methods and Research 9 (1): 29-60.
Doreian, Patrick. 1982. "Maximum Likelihood Methods for Linear Models." Sociological Methods and Research 10 (3): 243-69.
Dow, Malcolm M. 1984. "A Biparametric Approach to Network Correlation." Sociological Methods and Research 13 (2): 210-17.
Eisner, Robert. 1989. "Divergences of Measurement and Theory and Some Implications for Economic Policy." American Economic Review 79 (1): 1-13.
Franklin, Charles. 1989. "Estimation across Data Sets: Two-Stage Auxiliary Instrumental Variables Estimation (2SAIV)." Political Analysis 1:1-24.
Freeman, John. 1983. "Granger Causality and Time Series Analysis of Political Relationships." American Journal of Political Science 27:327-58.


Freeman, John. 1989. "Systematic Sampling, Temporal Aggregation, and the Study of Political Relationships." Political Analysis 1:61-98.
Goodman, Leo. 1953. "Ecological Regression and the Behavior of Individuals." American Journal of Sociology 64:610-25.
Gosnell, Harold F. 1933. "Statisticians and Political Scientists." American Political Science Review 27 (3): 392-403.
Gosnell, Harold F., and Norman N. Gill. 1935. "An Analysis of the 1932 Presidential Vote in Chicago." American Political Science Review 29:967-84.
Gow, David John. 1985. "Quantification and Statistics in the Early Years of American Political Science, 1880-1922." Political Methodology 11 (1-2): 1-18.
Gurr, Ted Robert. 1972. Politimetrics: An Introduction to Quantitative Macropolitics. Englewood Cliffs, N.J.: Prentice-Hall.
Hanushek, Eric, and John Jackson. 1977. Statistical Methods for Social Scientists. New York: Academic Press.
Harvey, A. C. 1981. The Econometric Analysis of Time Series. Oxford: Philip Allan.
Hendry, David, and G. Mizon. 1978. "Serial Correlation as a Convenient Simplification, Not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England." Economic Journal 88:549-63.
Hibbs, Douglas. 1974. "Problems of Statistical Estimation and Causal Inference in Time-Series Regression Models." Sociological Methodology 1974, 252-308.
Hilton, Gordon. 1976. Intermediate Politometrics. New York: Columbia University Press.
Jackson, John E. 1975. "Issues, Party Choices, and Presidential Voting." American Journal of Political Science 19:161-86.
Jackson, John E. 1989. "An Errors-in-Variables Approach to Estimating Models with Small Area Data." Political Analysis 1:157-80.
Jackson, John E. 1990. "Estimation of Variable Coefficient Models." Presented at the annual meeting of the American Political Science Association, Chicago.
Johnson, Paul E., and Phillip A. Schrodt. 1989. "Theme Paper. Analytic Theory and Methodology." Presented at the annual meeting of the Midwest Political Science Association, Chicago.
Kendall, David G. 1989. "A Survey of the Statistical Theory of Shape." Statistical Science 4 (2): 87-120.
Key, V. O. 1949. Southern Politics in State and Nation. New York: Knopf.
King, Gary. 1986. "How Not to Lie with Statistics: Avoiding Common Mistakes in Quantitative Political Science." American Journal of Political Science 30 (3): 666-87.
King, Gary. 1988. "Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for The Exponential Poisson Regression Model." American Journal of Political Science 32 (3): 838-63.
King, Gary. 1989a. "Representation Through Legislative Redistricting: A Stochastic Model." American Journal of Political Science 33 (4): 787-824.
King, Gary. 1989b. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.
King, Gary. 1991. "Constituency Service and Incumbency Advantage." British Journal of Political Science. Forthcoming.


King, Gary, and Robert X. Browning. 1987. "Democratic Representation and Partisan Bias in Congressional Elections." American Political Science Review 81:1251-73.
King, Gary, and Andrew Gelman. 1991. "Estimating Incumbency Advantage Without Bias." American Journal of Political Science. Forthcoming.
Kritzer, Herbert M. 1978a. "Analyzing Contingency Tables by Weighted Least Squares: An Alternative to the Goodman Approach." Political Methodology 5 (4): 277-326.
Kritzer, Herbert M. 1978b. "An Introduction to Multivariate Contingency Table Analysis." American Journal of Political Science 21:187-226.
Lowell, A. Lawrence. 1910. "The Physiology of Politics." American Political Science Review 4 (1): 1-15.
Merriam, Charles. 1921. "The Present State of the Study of Politics." American Political Science Review 15 (2): 173-85.
Niemi, Richard G., Bernard Grofman, Carl Carlucci, and Thomas Hofeller. 1989. "Measuring Compactness and the Role of a Compactness Standard in a Test for Partisan Gerrymandering." Photocopy.
Ogburn, William, and Inez Goltra. 1919. "How Women Vote: A Study of an Election in Portland, Oregon." Political Science Quarterly 34:413-33.
Papayanopoulos, L., ed. 1973. "Democratic Representation and Apportionment: Quantitative Methods, Measures, and Criteria." Annals of the New York Academy of Sciences 219:3-4.
Petty, William. [1672] 1971. Political Anatomy of Ireland. Reprint. Totowa, N.J.: Rowman and Littlefield.
Rai, Kul B., and John C. Blydenburgh. 1973. Political Science Statistics. Boston: Holbrook Press.
Raudenbush, Stephen W. 1988. "Educational Applications of Hierarchical Linear Models: A Review." Journal of Educational Statistics 13 (2): 85-116.
Raudenbush, Stephen W., and Anthony S. Bryk. 1986. "A Hierarchical Model for Studying School Effects." Sociology of Education 59:1-17.
"Reports of the National Conference on the Science of Politics." 1924. American Political Science Review 18 (1): 119-48.
"Report of the Committee on Political Research." 1924. American Political Science Review 18 (3): 574-600.
"Reports of the Second National Conference on the Science of Politics." 1925. American Political Science Review 19 (1): 104-10.
"Report of the Third National Conference on the Science of Politics." 1926. American Political Science Review 20 (1): 124-39.
Rice, Stuart A. 1926. "Some Applications of Statistical Method to Political Research." American Political Science Review 20 (2): 313-29.
Rivers, Douglas. 1988. "Heterogeneity in Models of Electoral Choice." American Journal of Political Science 32 (3): 737-57.
Shively, Philip. 1969. "Ecological Inference: The Use of Aggregate Data to Study Individuals." American Political Science Review 63:1183-96.
Stetzer, F. 1982. "Specifying Weights in Spatial Forecasting Models: The Results of Some Experiments." Environment and Planning 14:571-84.


Stigler, Stephen. 1986. The History of Statistics: The Measurement of Uncertainty before 1900. Cambridge, Mass.: Harvard University Press.
Stimson, James A. 1985. "Regression in Time and Space." American Journal of Political Science 29 (4): 914-47.
Tobler, W. 1979. "Cellular Geography." In Philosophy in Geography, ed. S. Gale and G. Olsson. Dordrecht: Reidel.
Tufte, Edward R. 1969. "Improving Data Analysis in Political Science." World Politics 21 (4): 641-54.
Tufte, Edward R. 1983. The Visual Display of Quantitative Information. New Haven: Graphics Press.
Weisberg, Herbert. 1974. "Dimensionland: An Excursion into Spaces." American Journal of Political Science 18:743-76.