Clementine software usage question

420948492

2009-09-12

Bounty: 100 forum coins (resolved)

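In Clementine 12, every model node has a variable importance option. What is the principle behind the calculation? Is it the same for different models, for example decision trees and neural networks?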

Best answer

freedj

All replies

freedj

2009-9-12 23:53:14

http://clementine-blog.beauregar ... ariable-importance/
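The post at this link discusses it. It does not spell out the calculation in detail, but the method should be the same across models, or at least the results are comparable. The page can only be opened through a proxy from mainland China.
I won't translate it; the original text follows.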
How Clementine 12 calculates variable importance
13 Nov 08
A long-awaited feature in Clementine 12 is that all, or almost all, modelling algorithms generate a summary listing the relative importance of the variables.  In version 11, a handful of algorithms ranked variables in order of importance, each using a different technique.  For instance, you could work out which variables in a regression were important by browsing the coefficients.  Neural networks generated a chart by a means that now escapes me.

Version 12 standardises how variable importance is calculated, so the importance charts of different models can be compared, and models that did not previously generate “native” variable importance can be evaluated with the new technique.  According to information received, the following algorithms all use the same calculation:

C5.0
C&RT
QUEST
CHAID
Regression
Logistic
Discriminant
GenLin
SVM
Bayesian Networks
How does it work?  It uses factor prioritisation: that is, which factor (input variable) leads to the greatest reduction in the variance of the output, when the value of that input variable is known?  Which leads to the second-greatest?
The maths behind the calculation is quite involved.  For me the most useful piece of knowledge is that all of the algorithms above use an identical means of determining variable importance, so the results can be directly compared.  No word yet on whether neural networks use the new calculation.
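
To make the "reduction in variance" idea concrete, here is a minimal Python sketch of a first-order, variance-based importance score: for each input it estimates Var(E[Y | X]) / Var(Y), the share of output variance that disappears on average once that input is known. This is not Clementine's actual implementation, which the post does not describe. The function names, the equal-count binning estimator, and the rescaling of scores to sum to 1 are all assumptions made for illustration.

```python
import random
from statistics import fmean, pvariance


def first_order_importance(xs, ys, bins=10):
    """Estimate Var(E[Y | X]) / Var(Y) by cutting X into equal-count bins."""
    total_var = pvariance(ys)
    if total_var == 0:
        return 0.0  # a constant output has no variance to explain
    pairs = sorted(zip(xs, ys))  # order the rows by the input value
    n = len(pairs)
    overall_mean = fmean(ys)
    between = 0.0
    for b in range(bins):
        chunk = pairs[b * n // bins:(b + 1) * n // bins]
        if not chunk:
            continue
        bin_mean = fmean([y for _, y in chunk])
        # Weight each bin's squared deviation by its share of the rows.
        between += len(chunk) / n * (bin_mean - overall_mean) ** 2
    return between / total_var


def rank_inputs(columns, ys, bins=10):
    """Return (name, score) pairs, highest first, rescaled to sum to 1."""
    raw = {name: first_order_importance(xs, ys, bins)
           for name, xs in columns.items()}
    total = sum(raw.values())
    scaled = {name: (v / total if total else 0.0) for name, v in raw.items()}
    return sorted(scaled.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: y depends strongly on x1, weakly on x2, not on x3.
    rng = random.Random(0)
    cols = {k: [rng.uniform(-1, 1) for _ in range(2000)]
            for k in ("x1", "x2", "x3")}
    y = [3 * a + b + rng.gauss(0, 0.1) for a, b in zip(cols["x1"], cols["x2"])]
    for name, score in rank_inputs(cols, y):
        print(f"{name}: {score:.3f}")
```

Because the score depends only on the inputs and the output values, the same calculation could in principle be applied to the predictions of any model type, which fits the post's point that results from different algorithms can be compared.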

On a practical note, for some algorithms generation of variable importance is disabled by default because it can take a long time to calculate.  If you want it for SVM, logistic regression, or the binary classifier, you need to turn it on before building the model.  You might want to use feature selection prior to modelling in these cases, to reduce the number of low-impact variables being entered into the models.
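
As a rough illustration of screening inputs before modelling, the Python sketch below keeps only those columns whose absolute Pearson correlation with the target reaches a threshold. It is a simple stand-in filter, not the method used by Clementine's feature selection; the function names and the 0.1 cut-off are assumptions chosen for the example.

```python
from math import sqrt
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    return sxy / sqrt(sxx * syy)


def screen_inputs(columns, ys, min_abs_corr=0.1):
    """Keep the columns whose |correlation| with ys meets the threshold."""
    return {name: xs for name, xs in columns.items()
            if abs(pearson(xs, ys)) >= min_abs_corr}
```

A correlation filter only detects linear relationships, so an input that matters through an interaction or a non-linear effect could be dropped; treat the threshold as something to tune rather than a rule.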

420948492

2009-9-13 22:51:46

OK, I'll take a look.

ycl0536

2009-9-15 15:01:26

Why can't I open the site linked above?

freedj

2009-9-15 15:50:25

Right, it can't be opened from within mainland China because it is blocked. You need a proxy.

ycl0536

2009-9-15 15:53:22

To the poster above: how do I get around the block?
