Posting this book's TOC for the OP:
Contents
--------------------------------------------------------------------------------
foreword
preface
acknowledgments
about this book
about the cover illustration
Part 1 Introduction to data science
Chapter 1 The data science process
The roles in a data science project
Stages of a data science project
Setting expectations
Summary
Chapter 2 Loading data into R
Working with data from files
Working with relational databases
Summary
Chapter 3 Exploring data
Using summary statistics to spot problems
Spotting problems using graphics and visualization
Summary
Chapter 4 Managing data
Cleaning data
Sampling for modeling and validation
Summary
Part 2 Modeling methods
Chapter 5 Choosing and evaluating models
Mapping problems to machine learning tasks
Evaluating models
Validating models
Summary
Chapter 6 Memorization methods
KDD and KDD Cup 2009
Building single-variable models
Building models using many variables
Summary
Chapter 7 Linear and logistic regression
Using linear regression
Using logistic regression
Summary
Chapter 8 Unsupervised methods
Cluster analysis
Association rules
Summary
Chapter 9 Exploring advanced methods
Using bagging and random forests to reduce training variance
Using generalized additive models (GAMs) to learn non-monotone relationships
Using kernel methods to increase data separation
Using SVMs to model complicated decision boundaries
Summary
Part 3 Delivering results
Chapter 10 Documentation and deployment
The buzz dataset
Using knitr to produce milestone documentation
Using comments and version control for running documentation
Deploying models
Summary
Chapter 11 Producing effective presentations
Presenting your results to the project sponsor
Presenting your model to end users
Presenting your work to other data scientists
Summary
appendix A Working with R and other tools
appendix B Important statistical concepts
appendix C More tools and ideas worth exploring
bibliography
index