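jgchen1966 posted on 2018-12-2 20:51
Isn't all of the above rather basic? It is so basic that I may not have gotten everything right. Basic things are easy to get wrong, which is why I did not want to go into them.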
Leo Brei ...
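One more thing: I would like to ask you, as someone more experienced, about the rules produced by a classification and regression tree.
I have seen rule output in case studies from other data mining software, but there the rules were output based on response rate and cumulative response rate.
What metric does R use for the rules it outputs from a tree object? The output below lists rules for MAILSHOT=YES and for MAILSHOT=NO, and the counts differ: 5 rules for one and 6 for the other. Do these represent the rules with the highest prediction accuracy for each class? And why are the rule numbers non-consecutive?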
library(rattle)
> asRules(TreeFit)  # print the rules
Rule number: 7 [MAILSHOT=YES cover=26 (9%) prob=0.85]
INCOME>=3.009e+04
INCOME>=4.932e+04
Rule number: 39 [MAILSHOT=YES cover=11 (4%) prob=0.73]
INCOME< 3.009e+04
GENDER=FEMALE
CAR=NO
AGE>=29.5
INCOME< 2.041e+04
Rule number: 13 [MAILSHOT=YES cover=52 (17%) prob=0.67]
INCOME>=3.009e+04
INCOME< 4.932e+04
INCOME< 3.961e+04
Rule number: 43 [MAILSHOT=YES cover=22 (7%) prob=0.59]
INCOME< 3.009e+04
GENDER=MALE
INCOME< 2.34e+04
REGION=INNER_CITY
INCOME>=1.434e+04
Rule number: 11 [MAILSHOT=YES cover=31 (10%) prob=0.58]
INCOME< 3.009e+04
GENDER=MALE
INCOME>=2.34e+04
Rule number: 38 [MAILSHOT=NO cover=22 (7%) prob=0.36]
INCOME< 3.009e+04
GENDER=FEMALE
CAR=NO
AGE>=29.5
INCOME>=2.041e+04
Rule number: 12 [MAILSHOT=NO cover=25 (8%) prob=0.36]
INCOME>=3.009e+04
INCOME< 4.932e+04
INCOME>=3.961e+04
Rule number: 42 [MAILSHOT=NO cover=18 (6%) prob=0.33]
INCOME< 3.009e+04
GENDER=MALE
INCOME< 2.34e+04
REGION=INNER_CITY
INCOME< 1.434e+04
Rule number: 20 [MAILSHOT=NO cover=32 (11%) prob=0.22]
INCOME< 3.009e+04
GENDER=MALE
INCOME< 2.34e+04
REGION=RURAL,SUBURBAN,TOWN
Rule number: 18 [MAILSHOT=NO cover=22 (7%) prob=0.18]
INCOME< 3.009e+04
GENDER=FEMALE
CAR=NO
AGE< 29.5
Rule number: 8 [MAILSHOT=NO cover=39 (13%) prob=0.13]
INCOME< 3.009e+04
GENDER=FEMALE
CAR=YES
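
A rough sanity check on the listing above, in base R, with the leaf summaries copied by hand from the output. It rests on one assumption the output does not state: that the rule numbers are rpart node numbers, where node n has children 2n and 2n + 1. The `rules` data frame and the `node_path` helper are names made up for this sketch and are not part of rattle.

```r
# Leaf summaries copied by hand from the asRules() listing above.
rules <- data.frame(
  rule  = c(7, 39, 13, 43, 11, 38, 12, 42, 20, 18, 8),
  class = c(rep("YES", 5), rep("NO", 6)),
  cover = c(26, 11, 52, 22, 31, 22, 25, 18, 32, 22, 39),
  prob  = c(0.85, 0.73, 0.67, 0.59, 0.58, 0.36, 0.36, 0.33, 0.22, 0.18, 0.13),
  depth = c(2, 5, 3, 5, 3, 5, 3, 5, 4, 4, 3)  # conditions printed under each rule
)

# ASSUMPTION: rule numbers are rpart node numbers (root = 1, children of n = 2n, 2n + 1).
# Under that numbering, the parent of a node is node %/% 2.
node_path <- function(node) {
  path <- node
  while (node > 1) {
    node <- node %/% 2
    path <- c(node, path)
  }
  path
}

# Total number of observations covered by all listed rules.
total <- sum(rules$cover)

# Recompute the coverage percentages shown in the listing.
rules$pct <- round(100 * rules$cover / total)

# Depth implied by the node number, to compare with the number of conditions printed.
rules$path_depth <- sapply(rules$rule, function(n) length(node_path(n)) - 1)

print(total)
print(rules)
print(all(rules$path_depth == rules$depth))                  # numbering matches rule length?
print(!is.unsorted(rev(rules$prob)))                         # listed in descending prob?
print(all((rules$prob > 0.5) == (rules$class == "YES")))     # class follows prob > 0.5?
```

If these checks hold, the covers add up to the whole data set and each rule's number of conditions equals its depth under that numbering. That would suggest the listing contains every leaf of the tree rather than a top-ranked subset, sorted by prob, and that the gaps in the rule numbers simply come from the node numbering. The prob values below 0.5 on the MAILSHOT=NO rules also suggest that prob is the share of YES cases in the leaf, though that is an inference from the printout and not something confirmed here from the rattle documentation.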