Download link for the analysis data: http://appliedpredictivemodeling.com/data/
Table of Contents
Chapter 1 Introduction
Prediction Versus Interpretation; Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures)
Part I: General Strategies
Chapter 2 A Short Tour of the Predictive Modeling Process
Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used)
This chapter is included in the sample pages on Springer's website.
Chapter 3 Data Pre-Processing
Case Study: Cell Segmentation in High-Content Screening; Data Transformations for Individual Predictors; Data Transformations for Multiple Predictors; Dealing with Missing Values; Removing Variables; Adding Variables; Binning Variables; Computing; Exercises (32 pages, 11 figures, R packages used)
Chapter 4 Over-Fitting and Model Tuning
The Problem of Over-Fitting; Model Tuning; Data Splitting; Resampling Techniques; Case Study: Credit Scoring; Choosing Final Tuning Parameters; Data Splitting Recommendations; Choosing Between Models; Computing; Exercises (29 pages, 13 figures, R packages used)
Part II: Regression Models
Chapter 5 Measuring Performance in Regression Models
Quantitative Measures of Performance; The Variance-Bias Tradeoff; Computing (4 pages, 3 figures)
Chapter 6 Linear Regression and Its Cousins
Case Study: Quantitative Structure-Activity Relationship Modeling; Linear Regression; Partial Least Squares; Penalized Models; Computing; Exercises (37 pages, 20 figures, R packages used)
Chapter 7 Non-Linear Regression Models
Neural Networks; Multivariate Adaptive Regression Splines; Support Vector Machines; K-Nearest Neighbors; Computing; Exercises (28 pages, 10 figures, R packages used)
Chapter 8 Regression Trees and Rule-Based Models
Basic Regression Trees; Regression Model Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; Cubist; Computing; Exercises (46 pages, 24 figures, R packages used)
Chapter 9 A Summary of Solubility Models (3 pages, 3 figures)
Chapter 10 Case Study: Compressive Strength of Concrete Mixtures
Model Building Strategy; Model Performance; Optimizing Compressive Strength; Computing (12 pages, 5 figures, R packages used)
Part III: Classification Models
Chapter 11 Measuring Performance in Classification Models
Class Predictions; Evaluating Predicted Classes; Evaluating Class Probabilities; Computing (20 pages, 9 figures, R packages used)
Chapter 12 Discriminant Analysis and Other Linear Classification Models
Case Study; Logistic Regression; Linear Discriminant Analysis; Partial Least Squares Discriminant Analysis; Penalized Models; Nearest Shrunken Centroids; Computing; Exercises (52 pages, 20 figures, R packages used)
Chapter 13 Non-Linear Classification Models
Nonlinear Discriminant Analysis; Neural Networks; Flexible Discriminant Analysis; Support Vector Machines; K-Nearest Neighbors; Naive Bayes; Computing; Exercises (38 pages, 16 figures, R packages used)
Chapter 14 Classification Trees and Rule-Based Models
Basic Classification Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; C5.0; Wrap-Up; Computing (46 pages, 15 figures, R packages used)
Chapter 15 A Summary of Grant Application Models (3 pages, 2 figures)
Chapter 16 Remedies for Severe Class Imbalance
Case Study: Predicting Caravan Policy Ownership; The Effect of Class Imbalance; Model Tuning; Alternate Cutoffs; Adjusting Prior Probabilities; Unequal Case Weights; Sampling Methods; Cost-Sensitive Training; Computing; Exercises (24 pages, 7 figures, R packages used)
Chapter 17 Case Study: Job Scheduling
Data Splitting and Model Strategy; Results; Computing (13 pages, 6 figures, R packages used)
Part IV: Other Considerations
Chapter 18 Measuring Predictor Importance
Numeric Outcomes; Categorical Outcomes; Other Approaches; Computing; Exercises (24 pages, 10 figures, R packages used)
Chapter 19 An Introduction to Feature Selection
Consequences of Using Non-Informative Predictors; Approaches for Reducing the Number of Predictors; Wrapper Methods; Filter Methods; Selection Bias; Misuse of Feature Selection; Case Study: Predicting Cognitive Impairment; Computing; Exercises (34 pages, 7 figures, R packages used)
Chapter 20 Factors That Can Affect Model Performance
Type III Errors; Measurement Error in the Outcome; Measurement Error in the Predictors; Discretizing Continuous Outcomes; When Should You Trust Your Model’s Prediction?; The Impact of a Large Sample; Computing; Exercises (26 pages, 12 figures, R packages used)
Appendix
These are included in the sample pages on Springer's website.
Appendix A A Summary of Various Models
Appendix B An Introduction to R
Startup and Getting Help; Packages; Creating Objects; Data Types and Basic Structures; Working with Rectangular Data Sets; Objects and Classes; R Functions; The Three Faces of =; The AppliedPredictiveModeling Package; The caret Package; Software Used in This Text (16 pages, 1 figure, R packages used)
Appendix C Interesting Websites