Speaker Series: Past Years

2017 - 2018 Schedule

The PML speaker series for AY 2017-18 will be held in the Millikan Room on Mondays at 12-1:30 p.m. Lunch will be provided.

Monday, November 20, 2017: 12-1:30 p.m.

Colin Fogarty, Massachusetts Institute of Technology

Title: Studentized sensitivity analysis in paired observational studies

Abstract: A fundamental limitation of causal inference in observational studies is that perceived evidence for an effect might instead be explained by factors not accounted for in the primary analysis. Methods for assessing the sensitivity of a study's conclusions to unmeasured confounding have been established under the assumption that the treatment effect is constant across all individuals. In the potential presence of unmeasured confounding, it has been argued that certain patterns of effect heterogeneity may conspire with unobserved covariates to render the performed sensitivity analysis inadequate. We present a new method for conducting a sensitivity analysis for the sample average treatment effect in the presence of effect heterogeneity in paired observational studies. Our recommended procedure, called the studentized sensitivity analysis, represents an extension of recent work on studentized permutation tests to the case of observational studies, where randomizations are no longer drawn uniformly. The method naturally extends conventional tests for the sample average treatment effect in paired experiments to the case of unknown, but bounded, probabilities of assignment to treatment. In so doing, we illustrate that concerns about certain sensitivity analyses operating under the presumption of constant effects are largely unwarranted.

The presentation material is available here.


Monday, December 4, 2017: 12-1:30 p.m.

Francisco Cantu, University of Houston

Title: The Fingerprints of Fraud: Evidence from Mexico's 1988 Presidential Election

Abstract: This paper unpacks the formal and informal opportunities for fraud during the 1988 presidential election in Mexico. In particular, I study how the alteration of vote returns came after an electoral reform that centralized the vote-counting process. Using an original image database of the vote-tally sheets for that election, and applying Convolutional Neural Networks (CNN) to analyze the sheets, I find evidence of blatant alterations in about a third of the tallies in the country. The empirical analysis shows that altered tallies were more prevalent in polling stations where the opposition was not present and in states controlled by governors with grassroots experience of managing the electoral operation. This research has implications for understanding the ways in which autocrats control elections as well as introducing a new methodology to audit the integrity of vote tallies.

The paper is available here.


Monday, February 26, 2018: 12-1:30 p.m.

Alberto Abadie, Massachusetts Institute of Technology

Title: The Risk of Machine Learning 

Abstract: Many applied settings in empirical economics involve simultaneous estimation of a large number of parameters. In particular, applied economists are often interested in estimating the effects of many-valued treatments (like teacher effects or location effects), treatment effects for many groups, and prediction models with many regressors. In these settings, machine learning methods that combine regularized estimation and data-driven choices of regularization parameters are useful to avoid overfitting. In this article, we analyze the performance of a class of machine learning estimators that includes ridge, lasso and pretest in contexts that require simultaneous estimation of many parameters. Our analysis aims to provide guidance to applied researchers on (i) the choice between regularized estimators in practice and (ii) data-driven selection of regularization parameters. To address (i), we characterize the risk (mean squared error) of regularized estimators and derive their relative performance as a function of simple features of the data generating process. To address (ii), we show that data-driven choices of regularization parameters, based on Stein's unbiased risk estimate or on cross-validation, yield estimators with risk uniformly close to the risk attained under the optimal (unfeasible) choice of regularization parameters. We use data from recent examples in the empirical economics literature to illustrate the practical applicability of our results.

The presentation material is available here.


Monday, March 5, 2018: 12-1:30 p.m.

Fredrik Sävje, Yale University

Title: Average treatment effects in the presence of unknown interference

Abstract: “We investigate large-sample properties of treatment effect estimators under unknown interference in randomized experiments. The inferential target is a generalization of the average treatment effect estimand that marginalizes over potential spillover effects. We show that estimators commonly used to estimate treatment effects under no-interference are consistent for the generalized estimand for most experimental designs under limited but otherwise arbitrary and unknown interference. The rates of convergence depend on the rate at which the amount of interference grows and the degree to which it aligns with dependencies in treatment assignment. Importantly for practitioners, the results imply that if one erroneously assumes that units do not interfere in a setting with limited, or even moderate, interference, standard estimators are nevertheless likely to be close to an average treatment effect if the sample is sufficiently large.”

Paper: https://arxiv.org/abs/1711.06399


Monday, March 19, 2018: 12-1:30 p.m.

Margaret (Molly) E. Roberts, University of California, San Diego

Title: How to Make Causal Inferences Using Texts (with Naoki Egami, Christian Fong, Justin Grimmer and Brandon Stewart) 

Abstract: New text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories of interest from large collections of text. We introduce a conceptual framework for making causal inferences with discovered measures as a treatment or outcome. Our framework enables researchers to discover high-dimensional textual interventions and estimate the ways that observed treatments affect text-based outcomes. We argue that nearly all text-based causal inferences depend upon a latent representation of the text and we provide a framework to learn the latent representation. But estimating this latent representation, we show, creates new risks: we may introduce an identification problem or overfit. To address these risks we describe a split-sample framework and apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic response. Our work provides a rigorous foundation for text-based causal inferences.


Wednesday, April 18, 2018: 11:30 a.m. - 1:00 p.m.  [PLEASE NOTE NEW DAY/NEW TIME]

Jeff Gill, Editor in Chief, Political Analysis; Distinguished Professor, Department of Government; Professor, Department of Mathematics & Statistics; Member, Center for Behavioral Neuroscience, American University; Visiting Professor, Harvard University

Title:  Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data

Abstract: Unseen grouping, often called latent clustering, is a common feature in social science data. Subjects may intentionally or unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships. This work introduces a new model-based clustering design which incorporates two sources of heterogeneity. The first source is a random effect that introduces substantively unimportant grouping but must be accounted for. The second source is more important and more difficult to handle since it is directly related to the relationships of interest in the data. We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are notoriously hard to model with conventional tools.


Monday, April 23, 2018: 12-1:30 p.m.

Xiang Zhou, Harvard University

Title: Two residual-based methods to adjust for treatment-induced confounding in causal inference

Abstract: Treatment-induced confounding arises in both causal inference of time-varying treatments and causal mediation analysis where post-treatment variables affect both the mediator and outcome. Existing methods to adjust for treatment-induced confounding include, among others, Robins's structural nested mean model (SNMM) with its g-estimation and marginal structural models (MSM) with inverse probability weighting (IPW). In this talk, I describe two alternative methods, one called "regression-with-residuals" (RWR) and the other called "residual balancing," for estimating the marginal means of potential outcomes. The RWR method is a simple extension of Almirall et al.'s (2010) two-stage estimator for studying effect moderation to the estimation of marginal effects. In special cases, it is equivalent to Vansteelandt's (2009) sequential g-estimator for estimating controlled direct effects. The residual balancing method, on the other hand, can be considered a generalization of Hainmueller's (2012) entropy balancing method to time-varying settings. Numeric simulations show that the residual balancing method tends to be more efficient and more robust than IPW in a variety of settings.

Paper:  here


Monday, May 7, 2018: 12-1:30 p.m.

Teppei Yamamoto

Title: Item Response Theory for Conjoint Survey Experiments (joint work with Devin Caughey and Hiroto Katsumata)

Abstract: In recent years, there has been an increasing use of conjoint survey experiments in political science to analyze preferences about objects that vary in multiple attributes. The dominant approach in these studies has been to apply the regression-based estimator for the Average Marginal Component Effects (AMCE) proposed by Hainmueller, Hopkins and Yamamoto (2014). While the standard approach enables model-free inference about preferences underlying conjoint survey data, it has important limitations for analyzing heterogeneity in respondents' preferences about attributes and investigating how attributes are related to each other in the formation of preference about profiles as a whole. In this paper, we propose an item response theory (IRT) model for conjoint survey data to analyze respondents' heterogeneous preferences about attributes, building upon a canonical spatial theory of voting to model preferences as a function of respondents' ideal points on a latent space capturing taste variation. The model also incorporates a set of valence parameters to identify the dimension of preference about attributes that is common to all respondents. We discuss identification conditions, inference via a Bayesian algorithm, and how to map model parameters to substantive quantities of interest. We illustrate the utility of the proposed approach through Monte Carlo simulations as well as a validation analysis of an original online conjoint experiment on presidential candidate choice.



Chad Hazlett, University of California, Los Angeles

Title: TBA


For any questions or suggestions, please contact Teppei Yamamoto or pmlab-contact@mit.edu.