[Download] New book for 2008: Mostly Harmless Econometrics: An Empiricist’s Companion

rawstone

2008-10-20

258007.pdf
Size: 1.89 MB

Authors: Joshua D. Angrist, Massachusetts Institute of Technology

Jörn-Steffen Pischke, The London School of Economics

Preface
The universe of econometrics is constantly expanding. Econometric methods and practice have advanced
greatly as a result, but the modern menu of econometric methods can seem confusing, even to an experienced
number-cruncher. Luckily, not everything on the menu is equally valuable or important. Some of the more
exotic items are needlessly complex and may even be harmful. On the plus side, the core methods of applied
econometrics remain largely unchanged, while the interpretation of basic tools has become more nuanced and
sophisticated. Our Companion is an empiricist’s guide to the econometric essentials . . . Mostly Harmless
Econometrics.
The most important items in an applied econometrician’s toolkit are:
1. Regression models designed to control for variables that may mask the causal effects of interest;
2. Instrumental variables methods for the analysis of real and natural experiments;
3. Differences-in-differences-type strategies that use repeated observations to control for unobserved
omitted factors.
The productive use of these basic techniques requires a solid conceptual foundation and a good understanding
of the machinery of statistical inference. Both aspects of applied econometrics are covered here.
Our view of what’s important has been shaped by our experience as empirical researchers, and especially
by our work teaching and advising Economics Ph.D. students. This book was written with these students
in mind. At the same time, we hope the book will find an audience among other groups of researchers who
have an urgent need for practical answers regarding choice of technique and the interpretation of research
findings. The concerns of applied econometrics are not fundamentally different from those in other social
sciences or epidemiology. Anyone interested in using data to shape public policy or to promote public health
must digest and use statistical results. Anyone interested in drawing useful inferences from data on people
can be said to be an applied econometrician.
Many textbooks provide a guide to research methods and there is some overlap between this book and
others in wide use. But our Companion differs from econometrics texts in a number of important ways. First,
we believe that empirical research is most valuable when it uses data to answer specific causal questions, as
if in a randomized clinical trial. This view shapes our approach to all research questions. In the absence of
a real experiment, we look for well-controlled comparisons and/or natural “quasi-experiments”. Of course,
some quasi-experimental research designs are more convincing than others, but the econometric methods
used in these studies are almost always fairly simple. Consequently, our book is shorter and more focused
than textbook treatments of econometric methods. We emphasize the conceptual issues and simple statistical
techniques that turn up in the applied research we read and do, and illustrate these ideas and techniques
with many empirical examples. Although our views of what’s important are not universally shared among
applied economists, there is no arguing with the fact that experimental and quasi-experimental research
designs are increasingly at the heart of the most influential empirical studies in applied economics.
A second distinction we claim is a certain lack of seriousness. Most econometrics texts appear to take
econometric models very seriously. Typically these books pay a lot of attention to the putative failures
of classical modelling assumptions such as linearity and homoskedasticity. Warnings are sometimes issued.
We take a more forgiving and less literal-minded approach. A principle that guides our discussion is that the
estimators in common use almost always have a simple interpretation that is not heavily model-dependent.
If the estimates you get are not the estimates you want, the fault lies in the econometrician and not the
econometrics! A leading example is linear regression, which provides useful information about the conditional
mean function regardless of the shape of this function. Likewise, instrumental variables methods estimate
an average causal effect for a well-defined population even if the instrument does not affect everyone. The
conceptual robustness of basic econometric tools is grasped intuitively by many applied researchers, but
the theory behind this robustness does not feature in most texts. Our Companion also differs from most
econometrics texts in that, on the inference side, we are not much concerned with asymptotic efficiency.
Rather, our discussion of inference is devoted mostly to the finite-sample bugaboos that should bother
practitioners.
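To make the point about linear regression and the conditional mean function a little more concrete, here is a minimal sketch. It is not taken from the book: the data are simulated, and the names (`ols`, `a_micro`, `b_cef`, and so on) are made up for illustration. It assumes a discrete regressor and a deliberately nonlinear conditional mean, then checks that a regression on the individual observations gives the same line as a regression of the cell means on the regressor, weighted by cell size. In other words, the fitted line is a weighted linear approximation to the conditional mean function, whatever its shape.

```python
import random
from collections import defaultdict


def ols(x, y, w=None):
    """Weighted least squares of y on a constant and x; returns (intercept, slope)."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


random.seed(0)

# Hypothetical data: x takes integer values, and E[y | x] = 0.02 * x**2 is nonlinear.
x = [random.randint(8, 20) for _ in range(5000)]
y = [0.02 * xi ** 2 + random.gauss(0, 1) for xi in x]

# (a) Regression on the individual observations.
a_micro, b_micro = ols(x, y)

# (b) Regression of the sample conditional means on x, weighted by cell size.
cells = defaultdict(list)
for xi, yi in zip(x, y):
    cells[xi].append(yi)
values = sorted(cells)
means = [sum(cells[v]) / len(cells[v]) for v in values]
counts = [float(len(cells[v])) for v in values]
a_cef, b_cef = ols(values, means, counts)

print("micro data:    ", a_micro, b_micro)
print("weighted means:", a_cef, b_cef)
```

The two printed pairs should agree up to floating-point rounding. The agreement is an algebraic identity in the sample, so it does not depend on the particular functional form or seed chosen here.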
The main prerequisites for the material here are basic training in probability and statistics. We
especially hope that readers are comfortable with the elementary tools of statistical inference, such as t-statistics
and standard errors. Familiarity with fundamental probability concepts like mathematical expectation is
also helpful, but extraordinary mathematical sophistication is not required. Although important proofs are
presented, the technical arguments are not very long or complicated. Unlike many upper-level econometrics
texts, we go easy on the linear algebra. For this reason and others, our Companion should be an easier read
than competing books. Finally, in the spirit of Douglas Adams’ lighthearted serial (The Hitchhiker’s
Guide to the Galaxy and Mostly Harmless, among others) from which we draw continued inspiration, our
Companion may have occasional inaccuracies, but it is quite a bit cheaper than the many versions of the
Encyclopedia Galactica Econometrica that dominate today’s market. Grateful thanks to Princeton University
Press for agreeing to distribute our Companion on these terms.

[This post was edited by the author on 2008-12-12 8:42:14]

All replies

rawstone

2008-10-20 23:52:00

Contents
Preface
Acknowledgments
Organization of this Book
I Introduction
1 Questions about Questions
2 The Experimental Ideal
2.1 The Selection Problem
2.2 Random Assignment Solves the Selection Problem
2.3 Regression Analysis of Experiments
II The Core
3 Making Regression Make Sense
3.1 Regression Fundamentals
3.1.1 Economic Relationships and the Conditional Expectation Function
3.1.2 Linear Regression and the CEF
3.1.3 Asymptotic OLS Inference
3.1.4 Saturated Models, Main Effects, and Other Regression Talk
3.2 Regression and Causality
3.2.1 The Conditional Independence Assumption
3.2.2 The Omitted Variables Bias Formula
3.2.3 Bad Control
3.3 Heterogeneity and Nonlinearity
3.3.1 Regression Meets Matching
3.3.2 Control for Covariates Using the Propensity Score
3.3.3 Propensity-Score Methods vs. Regression
3.4 Regression Details
3.4.1 Weighting Regression
3.4.2 Limited Dependent Variables and Marginal Effects
3.4.3 Why is Regression Called Regression and What Does Regression-to-the-mean Mean?
3.5 Appendix: Derivation of the average derivative formula
4 Instrumental Variables in Action: Sometimes You Get What You Need
4.1 IV and causality
4.1.1 Two-Stage Least Squares
4.1.2 The Wald Estimator
4.1.3 Grouped Data and 2SLS
4.2 Asymptotic 2SLS Inference
4.2.1 The Limiting Distribution of the 2SLS Coefficient Vector
4.2.2 Over-identification and the 2SLS Minimand*
4.3 Two-Sample IV and Split-Sample IV*
4.4 IV with Heterogeneous Potential Outcomes
4.4.1 Local Average Treatment Effects
4.4.2 The Compliant Subpopulation
4.4.3 IV in Randomized Trials
4.4.4 Counting and Characterizing Compliers
4.5 Generalizing LATE
4.5.1 LATE with Multiple Instruments
4.5.2 Covariates in the Heterogeneous-effects Model
4.5.3 Average Causal Response with Variable Treatment Intensity*
4.6 IV Details
4.6.1 2SLS Mistakes
4.6.2 Peer Effects
4.6.3 Limited Dependent Variables Reprise
4.6.4 The Bias of 2SLS*
4.7 Appendix
5 Parallel Worlds: Fixed Effects, Differences-in-differences, and Panel Data
5.1 Individual Fixed Effects
5.2 Differences-in-differences
5.2.1 Regression DD
5.3 Fixed Effects versus Lagged Dependent Variables
5.4 Appendix: More on fixed effects and lagged dependent variables
III Extensions
6 Getting a Little Jumpy: Regression Discontinuity Designs
6.1 Sharp RD
6.2 Fuzzy RD is IV
7 Quantile Regression
7.1 The Quantile Regression Model
7.1.1 Censored Quantile Regression
7.1.2 The Quantile Regression Approximation Property*
7.1.3 Tricky Points
7.2 Quantile Treatment Effects
7.2.1 The QTE Estimator
8 Nonstandard Standard Error Issues
8.1 The Bias of Robust Standard Errors*
8.2 Clustering and Serial Correlation in Panels
8.2.1 Clustering and the Moulton Factor
8.2.2 Serial Correlation in Panels and Difference-in-Difference Models
8.2.3 Fewer than 42 clusters
8.3 Appendix: Derivation of the simple Moulton factor
Last words
Acronyms
Empirical Studies Index
Notation