An Introduction to Statistical Learning with Applications in R, Lesson 01

LouisHenry

2016-04-09

Statistical Learning

A. Tools for understanding data

Supervised

Building a statistical model for predicting, or estimating, an output based on one or more inputs
Applications: business, medicine, astrophysics, and public policy
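
As a minimal sketch of the supervised setting, the Python below fits a straight line by least squares to a few made-up (input, output) pairs and then predicts the output for a new input. The data and the helper names `fit_line` and `predict` are illustrative assumptions, not taken from the book.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Predict the output for input x from a fitted (intercept, slope) pair."""
    intercept, slope = model
    return intercept + slope * x


# Made-up training data: one input (say, years of experience) and one output (say, wage).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
wage = [32.0, 34.0, 36.0, 38.0, 40.0]  # lies exactly on wage = 30 + 2 * years

model = fit_line(years, wage)
prediction = predict(model, 6.0)  # estimate the output for an unseen input
```

The observed outputs "supervise" the fit: the line is chosen to match them as closely as possible, and only then is it used on new inputs.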

Unsupervised

There are inputs but no supervising output; the goal is to learn relationships and structure from the data

B. Three data sets

Wage Data: to understand how an employee’s age and education, as well as the calendar year, are associated with his wage
Stock Market Data: to predict whether the index will increase or decrease on a given day using the past 5 days’ percentage changes in the index
Gene Expression Data: to determine whether there are groups, or clusters, among cancer cell lines based on their gene expression measurements
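
To make the Stock Market setup concrete, here is a rough sketch of how a series of index levels could be turned into rows of "past 5 days' percentage changes" plus an up/down label. The closing levels are made up, and `make_lag_table` is a hypothetical helper written for this note, not a function from the book or from any library.

```python
def make_lag_table(index_levels, n_lags=5):
    """Build rows of (lagged percentage changes, direction) from index levels."""
    # Daily percentage change between consecutive closing levels.
    changes = [
        100.0 * (today - yesterday) / yesterday
        for yesterday, today in zip(index_levels, index_levels[1:])
    ]
    rows = []
    for t in range(n_lags, len(changes)):
        # lags[0] is the previous day's change, lags[-1] the change n_lags days back.
        lags = [changes[t - k] for k in range(1, n_lags + 1)]
        # A flat day is counted as "Down" here; that is an arbitrary choice.
        direction = "Up" if changes[t] > 0 else "Down"
        rows.append((lags, direction))
    return rows


# Made-up closing levels for eight consecutive trading days.
levels = [100.0, 101.0, 100.0, 102.0, 103.0, 102.0, 104.0, 103.0]
table = make_lag_table(levels)
```

Each row pairs five inputs with one qualitative output, which is the shape of data a classifier would be trained on.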

C. Three problem types

A regression problem: predicting a continuous or quantitative output value
A classification problem: predicting a non-numerical value, that is, a categorical or qualitative output
A clustering problem: grouping observations by their observed characteristics; there is no output variable to predict
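
The clustering case differs from the other two in that no output is supplied at all. A minimal sketch, assuming one-dimensional made-up measurements and a basic k-means loop with two groups (one common clustering method, chosen here only for illustration; `two_means_1d` is a hypothetical name):

```python
def two_means_1d(values, n_iter=10):
    """Split 1-D values into two groups with a basic k-means loop (k = 2)."""
    # Start the two centres at the smallest and largest value.
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centre.
        labels = [
            0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            for v in values
        ]
        # Update step: each centre moves to the mean of its group.
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centres[k] = sum(members) / len(members)
    return labels, centres


# Made-up measurements; note that no output column is given.
measurements = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centres = two_means_1d(measurements)
```

The group labels are produced by the procedure itself rather than provided in advance, which is what separates clustering from regression and classification.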

D. Brief history

linear regression: early 19th century, Legendre and Gauss (method of least squares)
linear discriminant analysis: 1936, Fisher
logistic regression: 1940s
generalized linear models: early 1970s, Nelder and Wedderburn
classification and regression trees: mid 1980s, Breiman, Friedman, Olshen and Stone
generalized additive models: 1986, Hastie and Tibshirani
machine learning: its advent has since helped statistical learning emerge as a new subfield of statistics
