Speaker Series: Past Years

Spring 2020 Schedule

The PML speaker series for Spring 2020 will be in the Millikan Room (E53-482) on Mondays at 12-1:30pm.  Lunch will be provided.


Monday May 11, 2020: 12-1:30 p.m.

Kosuke Imai (Harvard)

Title: TBD 


Monday April 13, 2020: 12-1:30 p.m.

Ludovic Rheault (University of Toronto)

Title: TBD 


Monday March 30, 2020: 12-1:30 p.m.

Kirk Bansak (UCSD)

Title: TBD 


Monday February 10, 2020: 12-1:30 p.m.

Zachary Steinert-Threlkeld (UCLA)

Title: "Image as Data: Automated Content Analysis for Political Images"

Paper draft here

Presentation slides here 


Fall 2019 Schedule

The PML speaker series for Fall 2019 will be in the Millikan Room (E53-482) on Tuesdays at 12-1:30pm.  Lunch will be provided.


Tuesday October 29, 2019: 12-1:30 p.m.

Xun Pang

Title: "A Bayesian Group-Multifactor Spatio-Temporal Model for Identifying and Explaining Social Effects With Longitudinal Network Data"

Presentation slides here


Spring 2019 Schedule

The PML speaker series for AY 2018-19 will be in the Millikan Room (E53-482) on Mondays at 12-1:30pm.  Lunch will be provided.

Friday, March 15, 2019: 12-1:30 p.m.

Jacob Montgomery

Title: "Ends Against the Middle: Introducing the Generalized Graded Unfolding Model for Non-monotonic Item Response Functions"

Abstract: Standard methods for measuring ideology from voting records assume strict monotonicity of responses in individuals’ latent traits. If this assumption holds, we should not observe instances where individuals at the extremes act together in opposition to moderates. In practice, however, there are many times when individuals from both extremes may behave identically but for opposing reasons. For example, both liberal and conservative justices may dissent from the same Supreme Court decision but provide ideologically contradictory reasons. In this paper, we introduce to the political science literature the generalized graded unfolding model (GGUM), first proposed by Roberts, Donoghue, and Laughlin (2000), which accommodates non-monotonic response functions consistent with single-peaked preferences. In addition to explaining the method, we provide a novel estimation method and software that outperforms existing routines. We then apply this method to voting data from the U.S. Supreme Court and Congress and show that the GGUM outperforms standard methods in terms of both predictive accuracy and substantive insights.


Wednesday, April 17, 2019: 12-1:30 p.m.

Chad Hazlett 

Title: "Credible or Confounded? Applying Sensitivity Analyses to Improve Research and its Evaluation under Imperfect Identification (with Francesca Parente)"

Abstract: Social scientists pose important questions about the effects of potential causes, but often cannot eliminate all possible confounders in defense of causal claims. Sensitivity analyses can be useful in these circumstances, providing a route to rigorously investigate causal questions despite imperfect identification. Further, if more widely adopted, these tools have the potential to improve upon standard practice for communicating the robustness of causal claims, while suggesting new ways for readers and reviewers to judge research. We illustrate these uses of sensitivity analysis in an application that examines two potential causes of support for the 2016 Colombian referendum for peace with the FARC. Conventional regression analyses find "statistically and substantively significant" estimated effects for both causes. Yet, sensitivity analyses reveal that very weak confounders could overturn one cause (exposure to violence), but extremely powerful confounders are needed to overturn the other (political affiliation with the deal's champion).

Working paper: Here


2017 - 2018 Schedule 

The PML speaker series for AY 2017-18 will be in the Millikan Room on Mondays at 12-1:30pm.  Lunch will be provided.

Monday, November 20, 2017: 12-1:30 p.m.

Colin Fogarty, Massachusetts Institute of Technology

Title: Studentized sensitivity analysis in paired observational studies

Abstract: A fundamental limitation of causal inference in observational studies is that perceived evidence for an effect might instead be explained by factors not accounted for in the primary analysis. Methods for assessing the sensitivity of a study's conclusions to unmeasured confounding have been established under the assumption that the treatment effect is constant across all individuals. In the potential presence of unmeasured confounding, it has been argued that certain patterns of effect heterogeneity may conspire with unobserved covariates to render the performed sensitivity analysis inadequate. We present a new method for conducting a sensitivity analysis for the sample average treatment effect in the presence of effect heterogeneity in paired observational studies. Our recommended procedure, called the studentized sensitivity analysis, represents an extension of recent work on studentized permutation tests to the case of observational studies, where randomizations are no longer drawn uniformly. The method naturally extends conventional tests for the sample average treatment effect in paired experiments to the case of unknown, but bounded, probabilities of assignment to treatment. In so doing, we illustrate that concerns about certain sensitivity analyses operating under the presumption of constant effects are largely unwarranted.

The presentation material is available here.


Monday, December 4, 2017: 12-1:30 p.m.

Francisco Cantu, University of Houston

Title: The Fingerprints of Fraud: Evidence from Mexico's 1988 Presidential Election

Abstract: This paper unpacks the formal and informal opportunities for fraud during the 1988 presidential election in Mexico. In particular, I study how the alteration of vote returns came after an electoral reform that centralized the vote-counting process. Using an original image database of the vote-tally sheets for that election, and applying Convolutional Neural Networks (CNN) to analyze the sheets, I find evidence of blatant alterations in about a third of the tallies in the country. The empirical analysis shows that altered tallies were more prevalent in polling stations where the opposition was not present and in states controlled by governors with grassroots experience of managing the electoral operation. This research has implications for understanding the ways in which autocrats control elections as well as introducing a new methodology to audit the integrity of vote tallies.

The paper is available here


Monday, February 26, 2018: 12-1:30 p.m.

Alberto Abadie, Massachusetts Institute of Technology

Title: The Risk of Machine Learning 

Abstract: Many applied settings in empirical economics involve simultaneous estimation of a large number of parameters. In particular, applied economists are often interested in estimating the effects of many-valued treatments (like teacher effects or location effects), treatment effects for many groups, and prediction models with many regressors. In these settings, machine learning methods that combine regularized estimation and data-driven choices of regularization parameters are useful to avoid overfitting. In this article, we analyze the performance of a class of machine learning estimators that includes ridge, lasso and pretest in contexts that require simultaneous estimation of many parameters. Our analysis aims to provide guidance to applied researchers on (i) the choice between regularized estimators in practice and (ii) data-driven selection of regularization parameters. To address (i), we characterize the risk (mean squared error) of regularized estimators and derive their relative performance as a function of simple features of the data generating process. To address (ii), we show that data-driven choices of regularization parameters, based on Stein's unbiased risk estimate or on cross-validation, yield estimators with risk uniformly close to the risk attained under the optimal (unfeasible) choice of regularization parameters. We use data from recent examples in the empirical economics literature to illustrate the practical applicability of our results.

The presentation material is available here.


Monday, March 5, 2018: 12-1:30 p.m.

Fredrik Sävje, Yale University

Title: Average treatment effects in the presence of unknown interference

Abstract: “We investigate large-sample properties of treatment effect estimators under unknown interference in randomized experiments. The inferential target is a generalization of the average treatment effect estimand that marginalizes over potential spillover effects. We show that estimators commonly used to estimate treatment effects under no-interference are consistent for the generalized estimand for most experimental designs under limited but otherwise arbitrary and unknown interference. The rates of convergence depend on the rate at which the amount of interference grows and the degree to which it aligns with dependencies in treatment assignment. Importantly for practitioners, the results imply that if one erroneously assumes that units do not interfere in a setting with limited, or even moderate, interference, standard estimators are nevertheless likely to be close to an average treatment effect if the sample is sufficiently large.”

Paper: https://arxiv.org/abs/1711.06399


Monday, March 19, 2018: 12-1:30 p.m.

Margaret (Molly) E. Roberts, University of California, San Diego

Title: How to Make Causal Inferences Using Texts (with Naoki Egami, Christian Fong, Justin Grimmer and Brandon Stewart) 

Abstract: New text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic response. Our work provides a rigorous foundation for text-based causal inferences.


Wednesday, April 18, 2018: 11:30 a.m.-1:00 p.m.  [PLEASE NOTE NEW DAY/NEW TIME]

Jeff Gill, Editor in Chief, Political Analysis; Distinguished Professor, Department of Government; Professor, Department of Mathematics & Statistics; Member, Center for Behavioral Neuroscience, American University; Visiting Professor, Harvard University

Title:  Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data

Abstract: Unseen grouping, often called latent clustering, is a common feature in social science data.  Subjects may intentionally or unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships. This work introduces a new model-based clustering design which incorporates two sources of heterogeneity.  The first source is a random effect that introduces substantively unimportant grouping but must be accounted for. The second source is more important and more difficult to handle since it is directly related to the relationships of interest in the data.  We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are notoriously hard to model with conventional tools.


Monday, April 23, 2018: 12-1:30 p.m.

Xiang Zhou, Harvard University

Title: Two residual-based methods to adjust for treatment-induced confounding in causal inference

Abstract: Treatment-induced confounding arises in both causal inference of time-varying treatments and causal mediation analysis where post-treatment variables affect both the mediator and outcome. Existing methods to adjust for treatment-induced confounding include, among others, Robins's structural nested mean model (SNMM) with its g-estimation and marginal structural models (MSM) with inverse probability weighting (IPW). In this talk, I describe two alternative methods, one called "regression-with-residuals" (RWR) and the other called "residual balancing," for estimating the marginal means of potential outcomes. The RWR method is a simple extension of Almirall et al.'s (2010) two-stage estimator for studying effect moderation to the estimation of marginal effects. In special cases, it is equivalent to Vansteelandt's (2009) sequential g-estimator for estimating controlled direct effects. The residual balancing method, on the other hand, can be considered a generalization of Hainmueller's (2012) entropy balancing method to time-varying settings. Numeric simulations show that the residual balancing method tends to be more efficient and more robust than IPW in a variety of settings.

Paper:  here


Monday, May 7, 2018: 12-1:30 p.m.

Teppei Yamamoto

Title:  Item Response Theory for Conjoint Survey Experiments  (Joint work with Devin Caughey and Hiroto Katsumata)

Abstract: In recent years, there has been an increasing use of conjoint survey experiments in political science to analyze preferences about objects that vary in multiple attributes. The dominant approach in these studies has been to apply the regression-based estimator for the Average Marginal Component Effects (AMCE) proposed by Hainmueller, Hopkins and Yamamoto (2014). While the standard approach enables model-free inference about preferences underlying conjoint survey data, it has important limitations for analyzing heterogeneity in respondents' preferences about attributes and investigating how attributes are related to each other in the formation of preference about profiles as a whole. In this paper, we propose an item response theory (IRT) model for conjoint survey data to analyze respondents' heterogeneous preferences about attributes, building upon a canonical spatial theory of voting to model preferences as a function of respondents' ideal points on a latent space capturing taste variation. The model also incorporates a set of valence parameters to identify the dimension of preference about attributes that is common to all respondents. We discuss identification conditions, inference via a Bayesian algorithm, and how to map model parameters to substantive quantities of interest. We illustrate the utility of the proposed approach through Monte Carlo simulations as well as a validation analysis of an original online conjoint experiment on presidential candidate choice.



Chad Hazlett, University of California, Los Angeles

Title: TBA


For any questions or suggestions, please contact Teppei Yamamoto or pmlab-contact@mit.edu.