This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and classification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required.

Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was conducted in statistics departments while data mining and machine learning research was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn’t apply to their problems.
Contents
I Probability
1 Probability
2 Random Variables
3 Expectation
4 Inequalities
5 Convergence of Random Variables
II Statistical Inference
6 Models, Statistical Inference and Learning
7 Estimating the cdf and Statistical Functionals
8 The Bootstrap
9 Parametric Inference
10 Hypothesis Testing and p-values
11 Bayesian Inference
12 Statistical Decision Theory
III Statistical Models and Methods
13 Linear and Logistic Regression
14 Multivariate Models
15 Inference About Independence
16 Causal Inference
17 Directed Graphs and Conditional Independence
18 Undirected Graphs