Statistics and hypothesis testing are routinely used in areas (such as linguistics) that are traditionally not mathematically intensive. In such fields, when faced with experimental data, many students and researchers tend to rely on commercial packages to carry out statistical data analysis, often without understanding the logic of the statistical tests they rely on. As a consequence, results are often misinterpreted, and users have difficulty in flexibly applying techniques relevant to their own research — they use whatever they happen to have learned. A simple solution is to teach the fundamental ideas of statistical hypothesis testing without using too much mathematics.
This book provides a non-mathematical, simulation-based introduction to basic statistical concepts and encourages readers to try out the simulations themselves using the source code and data provided (the freely available programming language R is used throughout). Since the code presented in the text almost always requires the use of previously introduced programming constructs, diligent students also acquire basic programming abilities in R.
The book is intended for advanced undergraduate and graduate students in any discipline, although the focus is on linguistics, psychology, and cognitive science. It is designed for self-instruction, but it can also be used as a textbook for a first course on statistics. Earlier versions of the book have been used in undergraduate and graduate courses in Europe and the US.
1 Getting Started
1.1 Installation: R, LaTeX, and Emacs
1.2 How to read this book
1.3 Some Simple Commands in R
1.4 Graphical Summaries
2 Randomness and Probability
2.1 Elementary Probability Theory
2.1.1 The Sum and Product Rules
2.1.2 Stones and Rain: A Variant on the Coin-toss Problem
2.2 The Binomial Distribution
2.3 Balls in a Box
2.4 Standard Deviation and Sample Size
2.4.1 Another Insight: Mean Minimizes Variance
2.5 The Binomial versus the Normal Distribution
Problems
3 The Sampling Distribution of the Sample Mean
3.1 The Central Limit Theorem
3.2 σ and σ_x̄
3.3 The 95% Confidence Interval for the Sample Mean
3.4 Realistic Statistical Inference
3.5 s is an Unbiased Estimator of σ
3.6 The t-distribution
3.7 The One-sample t-test
3.8 Some Observations on Confidence Intervals
3.9 Sample SD, Degrees of Freedom, Unbiased Estimators
3.10 Summary of the Sampling Process
3.11 Significance Tests
3.12 The Null Hypothesis
3.13 z-scores
3.14 P-values
3.15 Hypothesis Testing: A More Realistic Scenario
3.16 Comparing Two Samples
3.16.1 H0 in Two-sample Problems
Problems
4 Power
4.1 Hypothesis Testing Revisited
4.2 Type I and Type II Errors
4.3 Equivalence Testing
4.3.1 Equivalence Testing Example
4.3.2 TOST Approach to the Stegner et al. Example
4.3.3 Equivalence Testing Example: CIs Approach
4.4 Observed Power and Null Results
Problems
5 Analysis of Variance (ANOVA)
5.1 Comparing Three Populations
5.2 ANOVA
5.2.1 Statistical Models
5.2.2 Variance of Sample Means as a Possible Statistic
5.2.3 Analyzing the Variance
5.3 Hypothesis Testing
5.3.1 MS-within, MS-between as Statistics
5.3.2 The F-distribution
5.3.3 ANOVA in R
5.3.4 MS-within, Three Non-identical Populations
5.3.5 The F-distribution with Unequal Variances
5.4 ANOVA as a Linear Model
Problems
6 Bivariate Statistics and Linear Models
6.1 Variance and Hypothesis Testing for Regression
6.1.1 Sum of Squares and Correlation
Problems
7 An Introduction to Linear Mixed Models
7.1 Introduction
7.2 Simple Linear Model
7.3 The Levels of the Complex Linear Model
7.4 Further Reading
A Random Variables
A.1 The Probability Distribution in Statistical Inference
A.2 Expectation
A.3 Properties of Expectation
A.4 Variance
A.5 Important Properties of Variance
A.6 Mean and SD of the Binomial Distribution
A.7 Sample versus Population Means and Variances
A.8 Summing up
Problems
B Basic R Commands and Data Structures
References
Index
Four new R books of 2011, part 1: The Foundations of Statistics: A Simulation-based Approach (S-Plus & R section)