2015-10-16

Hi,

I have fitted a logistic model to a dataset with a binary response variable.

The result is not good, I think.

The prediction error is 10%. However, all the predicted responses are the same (all are "YES"). The predicted probabilities differ, but they are all larger than 0.5, and I use 0.5 as the cutoff to decide the result, so every predicted response is "YES".

I also calculated the R², which is 0.05; that is very small. And the Hosmer-Lemeshow test gives a p-value < 0.0001, which is bad enough.

I think the logistic model is not a good choice here. I want to know what I should do next.

Should I try some other models? Could you give me some options?

Or do I need to do something based on the logistic model?

Could anyone give me any suggestions about my problem?

Thanks.



All replies
2015-10-17 07:42:21
You will have to be way more specific than that. What is it that you are trying to study? What is the dependent variable? What are the independent variables?

There are a lot of things that are not clear from your statements. For example, you said that all the predictions are 1 but there is only a 10% "prediction error". Does this mean that your dependent variable is 1 in most of your observations? That may create a problem.
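To make the point about a lopsided dependent variable concrete, here is a minimal sketch. The labels are made up (90 PASS, 10 FAIL) and are not the poster's data; they only mirror the 10% error mentioned in the thread. A rule that ignores every predictor and always predicts 1 already reaches that error rate, so the error rate alone says little about whether the model has learned anything.

```python
# Hypothetical labels: 90 observations with outcome 1 and 10 with outcome 0.
# These are illustrative only and do not come from the thread's dataset.
actual = [1] * 90 + [0] * 10

# A "model" that ignores all inputs and always predicts 1.
predicted = [1] * len(actual)


def error_rate(actual, predicted):
    """Fraction of observations where the prediction differs from the label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


baseline_error = error_rate(actual, predicted)
print(f"always-1 error rate: {baseline_error:.2f}")
```

Any fitted model should be compared against this kind of baseline before its error rate is read as good or bad.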

Do not worry about R-squared values. They don't usually mean much in such models. Look at, say, a two-way table of predicted vs. actual outcomes. Or use an ROC curve.
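As a rough sketch of those two checks, the snippet below builds a two-way table of actual vs. predicted outcomes at a chosen cutoff and computes the area under the ROC curve as the share of (positive, negative) pairs that the probabilities rank correctly. The function names `confusion_table` and `roc_auc` and the ten data points are assumptions made for illustration; in practice a statistics package would normally supply both.

```python
from collections import Counter


def confusion_table(actual, probs, cutoff=0.5):
    """Count (actual, predicted) pairs after thresholding probs at cutoff."""
    counts = Counter((a, int(p >= cutoff)) for a, p in zip(actual, probs))
    return {(a, p): counts.get((a, p), 0) for a in (0, 1) for p in (0, 1)}


def roc_auc(actual, probs):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [p for a, p in zip(actual, probs) if a == 1]
    neg = [p for a, p in zip(actual, probs) if a == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    score = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                score += 1.0
            elif p_pos == p_neg:
                score += 0.5
    return score / (len(pos) * len(neg))


# Made-up labels and fitted probabilities; every probability exceeds 0.5.
actual = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
probs = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.72, 0.55]

table_050 = confusion_table(actual, probs, cutoff=0.5)
table_070 = confusion_table(actual, probs, cutoff=0.7)
auc = roc_auc(actual, probs)

# Keys are (actual, predicted) pairs.
print("cutoff 0.5:", table_050)
print("cutoff 0.7:", table_070)
print("AUC:", auc)
```

In this toy data the 0.5 cutoff predicts 1 for every row, while a higher cutoff separates some of the 0 cases. The AUC does not depend on the cutoff at all, which is why it is a useful complement to the table.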

Try a probit model too.

2015-10-20 01:25:54
夏目贵志 posted on 2015-10-17 07:42
You will have to be way more specific than that. What is it that you are trying to study? What is th ...
I cannot paste the data.
The factors are engineering factors such as alloy composition and other process variables like water pressure and temperature. They are numerical, and the dependent variable is binary, with 1 = "PASS" and 0 = "FAIL". And actually most of the observations are PASS. That's why the prediction error is only 10%.
The problem here is that I am not sure whether the model really works here.
I have tried the probit model, which is similar to the logit model. The CV-MSE is the same.
Do I need to try some penalty in the logistic model?
I cannot find what the problem is, so I cannot find a way to solve it.
Could you give me some suggestions?
Thanks

2015-10-20 08:19:37
hunxuexiaomeinv posted on 2015-10-20 01:25
I cannot paste the data.
The factors are some engineering factors like alloy composition and othe ...
I'd say if it is experimental data, you really need to think hard about the underlying data generating process before imposing the assumptions of any model. It could be that linearity is just a bad assumption. If so, it does not matter if you are using logit or probit. I think it is time to talk to your advisor about it. I'd say most people here work with economics/business, not some natural science.
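One simple way to probe the linearity concern before choosing a model is to bin a predictor and look at the PASS rate in each bin. The sketch below uses invented temperature readings and outcomes, and the helper `pass_rate_by_bin` is hypothetical. If the rate rises and then falls, as it does in this toy data, a logit or probit that is linear in that predictor would likely struggle to capture the pattern.

```python
def pass_rate_by_bin(x, y, edges):
    """Return the share of y == 1 within each [edges[i], edges[i+1]) bin of x.

    Empty bins yield None.
    """
    rates = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [yi for xi, yi in zip(x, y) if lo <= xi < hi]
        rates.append(sum(in_bin) / len(in_bin) if in_bin else None)
    return rates


# Invented data: failures cluster at both low and high temperatures.
temperature = [10, 12, 15, 30, 32, 35, 38, 50, 52, 55, 58, 80, 82, 85]
outcome = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
edges = [0, 20, 40, 60, 100]

rates = pass_rate_by_bin(temperature, outcome, edges)
for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
    print(f"[{lo}, {hi}): PASS rate = {rate}")
```

The bin edges here are arbitrary. With real process data they would more sensibly follow engineering thresholds or quantiles of the predictor.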

2015-10-20 15:00:41
夏目贵志 posted on 2015-10-20 08:19
I'd say if it is experimental data, you really need to think hard about the underlying data genera ...
Thanks, anyway.
I want to try machine learning methods to do the classification.
Thanks for your answer.
