Data Analysis Using Regression and Multilevel/Hierarchical Models
Data Analysis Using Regression and Multilevel/Hierarchical Models is a comprehensive
manual for the applied researcher who wants to perform data analysis using linear and
nonlinear regression and multilevel models. The book introduces and demonstrates a wide
variety of models, at the same time instructing the reader in how to fit these models using
freely available software packages. The book illustrates the concepts by working through
scores of real data examples that have arisen in the authors’ own applied research, with programming
code provided for each one. Topics covered include causal inference (using
regression, poststratification, matching, regression discontinuity, and instrumental variables),
as well as multilevel logistic regression and missing-data imputation. Practical tips
regarding building, fitting, and understanding models are provided throughout.
Andrew Gelman is Professor of Statistics and Professor of Political Science at Columbia
University. He has published more than 150 articles in statistical theory, methods, and
computation and in application areas including decision analysis, survey sampling, political
science, public health, and policy. His other books are Bayesian Data Analysis (1995,
second edition 2003) and Teaching Statistics: A Bag of Tricks (2002).
Jennifer Hill is Assistant Professor of Public Affairs in the Department of International
and Public Affairs at Columbia University. She has coauthored articles that have appeared
in the Journal of the American Statistical Association, American Political Science Review,
American Journal of Public Health, Developmental Psychology, the Economic Journal, and
the Journal of Policy Analysis and Management, among others.
Contents
List of examples
Preface
1 Why?
2 Concepts and methods from basic probability and statistics
Part 1A: Single-level regression
3 Linear regression: the basics
4 Linear regression: before and after fitting the model
5 Logistic regression
6 Generalized linear models
Part 1B: Working with regression inferences
7 Simulation of probability models and statistical inferences
8 Simulation for checking statistical procedures and model fits
9 Causal inference using regression on the treatment variable
10 Causal inference using more advanced models
Part 2A: Multilevel regression
11 Multilevel structures
12 Multilevel linear models: the basics
13 Multilevel linear models: varying slopes, non-nested models, and other complexities
14 Multilevel logistic regression
15 Multilevel generalized linear models
Part 2B: Fitting multilevel models
16 Multilevel modeling in Bugs and R: the basics
17 Fitting multilevel linear and generalized linear models in Bugs and R
18 Likelihood and Bayesian inference and computation
19 Debugging and speeding convergence
Part 3: From data collection to model understanding to model checking
20 Sample size and power calculations
21 Understanding and summarizing the fitted models
22 Analysis of variance
23 Causal inference using multilevel models
24 Model checking and comparison
25 Missing-data imputation
Appendixes
References
Author index
Subject index