Introduction to Nonparametric Estimation
Series: Springer Series in Statistics
Authors: Tsybakov, Alexandre B.
About this book
Concise and self-contained treatment of the theory
Thorough analysis of optimality and adaptivity issues
Detailed account of minimax lower bounds
Methods of nonparametric estimation lie at the core of modern statistical science. The aim of this book is to give a short but mathematically self-contained introduction to the theory of nonparametric estimation. The emphasis is on the construction of optimal estimators; therefore the concepts of minimax optimality and adaptivity, as well as the oracle approach, occupy the central place in the book.
This is a concise text developed from lecture notes and ready to be used for a course at the graduate level. The main idea is to introduce the fundamental concepts of the theory while keeping the exposition suitable for a first approach to the field. Therefore, the results are not always given in the most general form but rather under assumptions that lead to shorter or more elegant proofs.
The book has three chapters. Chapter 1 presents basic nonparametric regression and density estimators and analyzes their properties. Chapter 2 is devoted to a detailed treatment of minimax lower bounds. Chapter 3 develops more advanced topics: Pinsker's theorem, oracle inequalities, Stein shrinkage, and sharp minimax adaptivity.
Contents
1 Nonparametric estimators
1.1 Examples of nonparametric models and problems
1.2 Kernel density estimators
1.2.1 Mean squared error of kernel estimators
1.2.2 Construction of a kernel of order ℓ
1.2.3 Integrated squared risk of kernel estimators
1.2.4 Lack of asymptotic optimality for fixed density
1.3 Fourier analysis of kernel density estimators
1.4 Unbiased risk estimation. Cross-validation density estimators
1.5 Nonparametric regression. The Nadaraya–Watson estimator
1.6 Local polynomial estimators
1.6.1 Pointwise and integrated risk of local polynomial estimators
1.6.2 Convergence in the sup-norm
1.7 Projection estimators
1.7.1 Sobolev classes and ellipsoids
1.7.2 Integrated squared risk of projection estimators
1.7.3 Generalizations
1.8 Oracles
1.9 Unbiased risk estimation for regression
1.10 Three Gaussian models
1.11 Notes
1.12 Exercises
2 Lower bounds on the minimax risk
2.1 Introduction
2.2 A general reduction scheme
2.3 Lower bounds based on two hypotheses
2.4 Distances between probability measures
2.4.1 Inequalities for distances
2.4.2 Bounds based on distances
2.5 Lower bounds on the risk of regression estimators at a point
2.6 Lower bounds based on many hypotheses
2.6.1 Lower bounds in L2
2.6.2 Lower bounds in the sup-norm
2.7 Other tools for minimax lower bounds
2.7.1 Fano’s lemma
2.7.2 Assouad’s lemma
2.7.3 The van Trees inequality
2.7.4 The method of two fuzzy hypotheses
2.7.5 Lower bounds for estimators of a quadratic functional
2.8 Notes
2.9 Exercises
3 Asymptotic efficiency and adaptation
3.1 Pinsker’s theorem
3.2 Linear minimax lemma
3.3 Proof of Pinsker’s theorem
3.3.1 Upper bound on the risk
3.3.2 Lower bound on the minimax risk
3.4 Stein’s phenomenon
3.4.1 Stein’s shrinkage and the James–Stein estimator
3.4.2 Other shrinkage estimators
3.4.3 Superefficiency
3.5 Unbiased estimation of the risk
3.6 Oracle inequalities
3.7 Minimax adaptivity
3.8 Inadmissibility of the Pinsker estimator
3.9 Notes
3.10 Exercises
Appendix
Bibliography
Index