2013-07-01
2013_AppliedPredictiveModeling_Springer.zip
Size: (10.03 MB)



This text is intended for a broad audience, both as an introduction to predictive models and as a guide to applying them. It provides intuitive explanations of the techniques, while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis.

Table of Contents

Preface

Chapter 1 Introduction

Prediction Versus Interpretation; Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures)


Part I: General Strategies

Chapter 2 A Short Tour of the Predictive Modeling Process

Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used)

Chapter 3 Data Pre-Processing

Case Study: Cell Segmentation in High-Content Screening; Data Transformations for Individual Predictors; Data Transformations for Multiple Predictors; Dealing with Missing Values; Removing Variables; Adding Variables; Binning Variables; Computing; Exercises (32 pages, 11 figures, R packages used)

Chapter 4 Over-Fitting and Model Tuning

The Problem of Over-Fitting; Model Tuning; Data Splitting; Resampling Techniques; Case Study: Credit Scoring; Choosing Final Tuning Parameters; Data Splitting Recommendations; Choosing Between Models; Computing; Exercises (29 pages, 13 figures, R packages used)


Part II: Regression Models

Chapter 5 Measuring Performance in Regression Models

Quantitative Measures of Performance; The Variance-Bias Tradeoff; Computing (4 pages, 3 figures)

Chapter 6 Linear Regression and Its Cousins

Case Study: Quantitative Structure-Activity Relationship Modeling; Linear Regression; Partial Least Squares; Penalized Models; Computing; Exercises (37 pages, 20 figures, R packages used)

Chapter 7 Non-Linear Regression Models

Neural Networks; Multivariate Adaptive Regression Splines; Support Vector Machines; K-Nearest Neighbors; Computing; Exercises (28 pages, 10 figures, R packages used)

Chapter 8 Regression Trees and Rule-Based Models

Basic Regression Trees; Regression Model Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; Cubist; Computing; Exercises (46 pages, 24 figures, R packages used)

Chapter 9 A Summary of Solubility Models

(3 pages, 3 figures)

Chapter 10 Case Study: Compressive Strength of Concrete Mixtures

Model Building Strategy; Model Performance; Optimizing Compressive Strength; Computing (12 pages, 5 figures, R packages used)


Part III: Classification Models

Chapter 11 Measuring Performance in Classification Models

Class Predictions; Evaluating Predicted Classes; Evaluating Class Probabilities; Computing (20 pages, 9 figures, R packages used)

Chapter 12 Discriminant Analysis and Other Linear Classification Models

Case Study; Logistic Regression; Linear Discriminant Analysis; Partial Least Squares Discriminant Analysis; Penalized Models; Nearest Shrunken Centroids; Computing; Exercises (52 pages, 20 figures, R packages used)

Chapter 13 Non-Linear Classification Models

Nonlinear Discriminant Analysis; Neural Networks; Flexible Discriminant Analysis; Support Vector Machines; K-Nearest Neighbors; Naive Bayes; Computing; Exercises (38 pages, 16 figures, R packages used)

Chapter 14 Classification Trees and Rule-Based Models

Basic Classification Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; C5.0; Wrap-Up; Computing (46 pages, 15 figures, R packages used)

Chapter 15 A Summary of Grant Application Models

(3 pages, 2 figures)

Chapter 16 Remedies for Severe Class Imbalance

Case Study: Predicting Caravan Policy Ownership; The Effect of Class Imbalance; Model Tuning; Alternate Cutoffs; Adjusting Prior Probabilities; Unequal Case Weights; Sampling Methods; Cost-Sensitive Training; Computing; Exercises (24 pages, 7 figures, R packages used)

Chapter 17 Case Study: Job Scheduling

Data Splitting and Model Strategy; Results; Computing (13 pages, 6 figures, R packages used)


Part IV: Other Considerations

Chapter 18 Measuring Predictor Importance

Numeric Outcomes; Categorical Outcomes; Other Approaches; Computing; Exercises (24 pages, 10 figures, R packages used)

Chapter 19 An Introduction to Feature Selection

Consequences of Using Non-Informative Predictors; Approaches for Reducing the Number of Predictors; Wrapper Methods; Filter Methods; Selection Bias; Misuse of Feature Selection; Case Study: Predicting Cognitive Impairment; Computing; Exercises (34 pages, 7 figures, R packages used)

Chapter 20 Factors That Can Affect Model Performance

Type III Errors; Measurement Error in the Outcome; Measurement Error in the Predictors; Discretizing Continuous Outcomes; When Should You Trust Your Model's Prediction?; The Impact of a Large Sample; Computing; Exercises (26 pages, 12 figures, R packages used)


Appendix

These are included in the sample pages on Springer's website.

Appendix A A Summary of Various Models

Appendix B An Introduction to R

Startup and Getting Help; Packages; Creating Objects; Data Types and Basic Structures; Working with Rectangular Data Sets; Objects and Classes; R Functions; The Three Faces of =; The AppliedPredictiveModeling Package; The caret Package; Software Used in This Text (16 pages, 1 figure, R packages used)

Appendix C Interesting Websites


References

Index


All replies
2013-7-4 08:32:41
Chapter 1 is missing.

2013-7-4 14:23:32
leonkd posted on 2013-7-4 08:32
Chapter 1 is missing.
ch01.pdf
Size: (242.87 KB)


  I have uploaded Chapter 1 here. Thanks for pointing that out.

2013-8-9 02:30:55
This is what I was looking for. Thanks for adding Chapter 1.

2013-9-19 16:55:16
Thanks for sharing!

2013-10-12 15:02:54
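This is a good book. A year ago I started studying the R package developed by the author and the related material, and I now need to combine this many learners. The book's shortcoming is that it does not cover Ensemble Methods.
For example, here is the performance of the best model for each of the ten learners below, after parameter tuning with 5*3 CV: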


Call:
summary.resamples(object = resamps)

Models: gbmM, svmRadialM, svmPolyM, rfM, cforestM, blackboostM, gamboostM, glmboostM, glmnetM, hddaM
Number of resamples: 15

ROC
              Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
gbmM        0.8118  0.8915 0.9241 0.9131  0.9471 0.9647    0
svmRadialM  0.9160  0.9353 0.9643 0.9582  0.9756 1.0000    0
svmPolyM    0.8319  0.8965 0.9451 0.9308  0.9793 1.0000    0
rfM         0.7922  0.9085 0.9294 0.9222  0.9569 0.9922    0
cforestM    0.7412  0.8487 0.8950 0.8861  0.9196 0.9804    0
blackboostM 0.6529  0.7864 0.8176 0.8219  0.8710 0.9196    0
gamboostM   0.7843  0.8675 0.9018 0.8963  0.9289 0.9922    0
glmboostM   0.7689  0.8113 0.8549 0.8512  0.8948 0.9216    0
glmnetM     0.8235  0.8657 0.8863 0.8896  0.9196 0.9554    0
hddaM       0.8137  0.8560 0.8863 0.8864  0.9216 0.9688    0
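
For readers wondering where "Number of resamples: 15" comes from: the sketch below assumes that "5*3 CV" means 5-fold cross-validation repeated 3 times, which matches the 15 resamples reported above but is not stated explicitly in the post. It is written in plain Python rather than the poster's R setup, and the function name `repeated_kfold` and the sample size of 100 are hypothetical, chosen only for illustration.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration only; it is not the poster's code.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # draw a fresh random partition for each repeat
        for f in range(k):
            # every k-th shuffled index, starting at offset f, forms fold f
            test_idx = sorted(indices[f::k])
            held_out = set(test_idx)
            train_idx = [i for i in range(n_samples) if i not in held_out]
            yield r, f, train_idx, test_idx


# Assumed sample size; the thread does not say how many rows the data had.
splits = list(repeated_kfold(n_samples=100, k=5, repeats=3))
print(len(splits))  # 5 folds x 3 repeats = 15 resamples
```

Each model would be fitted on `train_idx` and scored (for example by ROC AUC) on `test_idx`, giving 15 values per model; the Min., Median, Mean and Max. columns above summarize such a set of per-resample scores.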