2009-09-12
Bounty: 100 forum coins (solved)
In Clementine 12, every model node has a variable importance option. How is this importance calculated, and do different model types, say decision trees and neural networks, use the same calculation?

Best answer

freedj (full reply below)

All replies
2009-9-12 23:53:14
http://clementine-blog.beauregar ... ariable-importance/
It's discussed here, though the calculation isn't fully explained. It should be the same across models, or at the very least the results are comparable. The page is blocked in mainland China, so you'll need a proxy to open it; I won't translate it.
How Clementine 12 calculates variable importance (13 Nov 08)
A long-awaited feature in Clementine 12 is that all, or almost all, modelling algorithms generate a summary listing the relative importance of the variables.  In version 11, a handful of algorithms ranked variables in order of importance, each using a different technique.  For instance, you could work out which variables in a regression were important by browsing the coefficients.  Neural networks generated a chart by a means that now escapes me.

Version 12 standardises how variable importance is calculated, so the importance charts of different models can be compared, and models that did not previously generate “native” variable importance can be evaluated with the new technique.  According to information received, the following algorithms all use the same  calculation:

C5.0
C&RT
QUEST
CHAID
Regression
Logistic
Discriminant
GenLin
SVM
Bayesian Networks
How does it work?  It uses factor prioritisation: that is, which factor (input variable) leads to the greatest reduction in the variance of the output, when the value of that input variable is known?  Which leads to the second-greatest?
The maths behind the calculation is quite involved.  For me the most useful piece of knowledge is that all of the algorithms above use an identical means of determining variable importance, so the results can be directly compared.  No word yet on whether neural networks use the new calculation.
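To make the factor-prioritisation idea concrete, here is a minimal sketch in Python of a first-order, variance-based importance score: for each input X_i, estimate Var(E[Y | X_i]) / Var(Y), the share of output variance removed once X_i is known, here via simple equal-frequency binning. This is my own illustration, not Clementine's actual (undocumented) calculation; the function name, the binning scheme, and applying the score to a raw target rather than to a fitted model's predictions are all assumptions.

```python
import numpy as np

def factor_prioritisation(X, y, n_bins=10):
    """Rank inputs by a first-order, variance-based importance score.

    For each column X[:, i], estimate Var(E[y | X_i]) / Var(y): the
    fraction of output variance that disappears once X_i is known.
    Illustrative stand-in only; Clementine's exact maths is not
    documented in the post above.
    """
    total_var = np.var(y)
    grand_mean = y.mean()
    scores = []
    for i in range(X.shape[1]):
        # Equal-frequency bin edges for the i-th input variable.
        edges = np.quantile(X[:, i], np.linspace(0, 1, n_bins + 1))
        bins = np.digitize(X[:, i], edges[1:-1])  # labels 0 .. n_bins-1
        occupied = [b for b in range(n_bins) if np.any(bins == b)]
        # Weighted variance of the conditional means E[y | bin].
        weights = np.array([np.mean(bins == b) for b in occupied])
        cond_means = np.array([y[bins == b].mean() for b in occupied])
        scores.append(np.sum(weights * (cond_means - grand_mean) ** 2) / total_var)
    return np.array(scores)

# Toy check: y depends strongly on x0, weakly on x1, not at all on x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 3))
y = 3 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=5000)
print(factor_prioritisation(X, y))  # x0 ranks first, x2 near zero
```

Sorting the resulting scores gives the "greatest reduction, second-greatest, ..." ordering the post describes, and because the score only depends on inputs and outputs, the same procedure can be applied to any of the model types listed above, which is what makes the importance charts comparable.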

On a practical note, for some algorithms generation of variable importance is disabled by default because it can take a long time to calculate.  If you want it for SVM, logistic regression, or the binary classifier, you need to turn it on before building the model.  You might want to use feature selection prior to modelling in these cases, to reduce the number of low-impact variables being entered into the models.

2009-9-13 22:51:46
Great, I'll take a look.

2009-9-15 15:01:26
Why won't the site above open?

2009-9-15 15:50:25
Right, it can't be opened from mainland China; it's blocked. You need a proxy.

2009-9-15 15:53:22
Poster above, how do I get around the block?