Using R for Data Analysis in Social Sciences: A Research Project-Oriented Approach
by Quan Li (Author)
About the Author
Dr. Quan Li is Professor of Political Science at Texas A&M University. His research has appeared in more than thirty journal articles and in two coauthored books, "Democracy and Economic Openness in an Interconnected System: Complex Transformations" and "Politics and Foreign Direct Investment." He has served on the editorial boards of American Journal of Political Science, Journal of Politics, International Studies Quarterly, and International Interactions.
About this book
Statistical analysis is common in the social sciences, and R is among the more popular programs for it. This book provides a foundation for undergraduate and graduate students in the social sciences on how to use R to manage, visualize, and analyze data. The focus is on how to address substantive questions with data analysis and replicate published findings.
Using R for Data Analysis in Social Sciences adopts a minimalist approach and covers only the most important functions and skills in R needed to conduct reproducible research. It emphasizes the practical needs of students using R by showing how to import, inspect, and manage data; understand the logic of statistical inference; visualize data and findings via histograms, boxplots, scatterplots, and diagnostic plots; and analyze data using the one-sample t-test, the difference-of-means test, covariance, correlation, ordinary least squares (OLS) regression, and model assumption diagnostics. It also demonstrates how to replicate the findings in published journal articles and diagnose model assumption violations. Because the book integrates R programming, the logic and steps of statistical inference, and the process of empirical social scientific research in a highly accessible and structured fashion, it is appropriate for any introductory course on R, data analysis, and empirical social scientific research.
Brief contents
1. Learn about R and Write First Toy Programs 1
WHEN TO USE R IN A RESEARCH PROJECT 2
ESSENTIALS ABOUT R 3
HOW TO START A PROJECT FOLDER AND WRITE OUR FIRST R PROGRAM 4
CREATE, DESCRIBE, AND GRAPH A VECTOR: A SIMPLE TOY EXAMPLE 7
SIMPLE REAL-WORLD EXAMPLE: DATA FROM IVERSEN AND SOSKICE (2006) 23
CHAPTER 1: R PROGRAM CODE 28
TROUBLESHOOT AND GET HELP 32
IMPORTANT REFERENCE INFORMATION: SYMBOLS, OPERATORS, AND FUNCTIONS 34
SUMMARY 35
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 36
EXERCISES 42
2. Get Data Ready: Import, Inspect, and Prepare Data 43
PREPARATION 43
IMPORT PENN WORLD TABLE 7.0 DATASET 45
INSPECT IMPORTED DATA 49
PREPARE DATA I: VARIABLE TYPES AND INDEXING 55
PREPARE DATA II: MANAGE DATASETS 59
PREPARE DATA III: MANAGE OBSERVATIONS 65
PREPARE DATA IV: MANAGE VARIABLES 68
CHAPTER 2 PROGRAM CODE 78
SUMMARY 85
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 86
EXERCISES 93
3. One-Sample and Difference-of-Means Tests 94
CONCEPTUAL PREPARATION 95
DATA PREPARATION 101
WHAT IS THE AVERAGE ECONOMIC GROWTH RATE IN THE WORLD ECONOMY? 104
DID THE WORLD ECONOMY GROW MORE QUICKLY IN 1990 THAN IN 1960? 115
CHAPTER 3 PROGRAM CODE 128
SUMMARY 133
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 133
EXERCISES 142
4. Covariance and Correlation 143
DATA AND SOFTWARE PREPARATIONS 143
VISUALIZE THE RELATIONSHIP BETWEEN TRADE AND GROWTH USING SCATTER PLOT 146
ARE TRADE OPENNESS AND ECONOMIC GROWTH CORRELATED? 149
DOES THE CORRELATION BETWEEN TRADE AND GROWTH CHANGE OVER TIME? 154
CHAPTER 4 PROGRAM CODE 160
SUMMARY 163
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 164
EXERCISES 168
5. Regression Analysis 170
CONCEPTUAL PREPARATION: HOW TO UNDERSTAND REGRESSION ANALYSIS 171
DATA PREPARATION 175
VISUALIZE AND INSPECT DATA 182
HOW TO ESTIMATE AND INTERPRET OLS MODEL COEFFICIENTS 185
HOW TO ESTIMATE STANDARD ERROR OF COEFFICIENT 187
HOW TO MAKE AN INFERENCE ABOUT THE POPULATION PARAMETER OF INTEREST 188
HOW TO INTERPRET OVERALL MODEL FIT 190
HOW TO PRESENT STATISTICAL RESULTS 193
CHAPTER 5 PROGRAM CODE 194
SUMMARY 198
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 199
EXERCISES 204
6. Regression Diagnostics and Sensitivity Analysis 206
WHY ARE OLS ASSUMPTIONS AND DIAGNOSTICS IMPORTANT? 206
DATA PREPARATION 211
LINEARITY AND MODEL SPECIFICATION 215
PERFECT AND HIGH MULTICOLLINEARITY 221
CONSTANT ERROR VARIANCE 223
INDEPENDENCE OF ERROR TERM OBSERVATIONS 227
INFLUENTIAL OBSERVATIONS 240
NORMALITY TEST 245
REPORT FINDINGS 247
CHAPTER 6 PROGRAM CODE 251
SUMMARY 259
MISCELLANEOUS Q&AS FOR AMBITIOUS READERS 259
EXERCISES 262
7. Replication of Findings in Published Analyses 263
WHAT EXPLAINS THE GEOGRAPHIC SPREAD OF MILITARIZED INTERSTATE DISPUTES? REPLICATION AND DIAGNOSTICS OF BRAITHWAITE (2006) 264
DOES RELIGIOSITY INFLUENCE INDIVIDUAL ATTITUDES TOWARD INNOVATION? REPLICATION OF BÉNABOU ET AL. (2015) 284
CHAPTER 7 PROGRAM CODE 295
SUMMARY 301
8. Appendix: A Brief Introduction to Analyzing Categorical Data and Finding More Data 302
OBJECTIVE 302
GETTING DATA READY 303
DO MEN AND WOMEN DIFFER IN SELF-REPORTED HAPPINESS? 304
DO BELIEVERS IN GOD AND NON-BELIEVERS DIFFER IN SELF-REPORTED HAPPINESS? 310
SOURCES OF SELF-REPORTED HAPPINESS: LOGISTIC REGRESSION 313
WHERE TO FIND MORE DATA 323
References and Readings 327
Index 331
Pages: 368
Publisher: Oxford University Press (June 6, 2018)
Language: English
ISBN-10: 0190656220
ISBN-13: 978-0190656225