On nonlinearity in a regression model

jose.liupei

2014-09-23

Bounty: 10 forum coins (resolved)

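Question: in a regression model (say, y=a+bx+e) I want to allow for nonlinearity, so I add the square of x and the regression equation becomes y=a+bx+cx^2+e.
Without the nonlinear term, regressing y=a+bx+e gives a coefficient b that is significant and negative;
with the nonlinear term, regressing y=a+bx+cx^2+e gives a coefficient b that is insignificant but positive, while the coefficient c on the squared term is significant and negative.

How should this be interpreted? Is there a nonlinear effect or not? Do I need to run any other tests?
Many thanks.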

Best answer: colinwang


All replies

colinwang

2014-9-23 11:50:14

jose.liupei wrote on 2014-9-28 00:03
Thanks for ur detailed answers.

But I do not quite understand what u mentioned about "As you in ...

Hi Jo,

Sorry for the confusion. I did NOT mean that you NEED to increase the sample size. What I was trying to say is this: hypothetically, if you increased the sample size to a very large number, for instance 10000, you would see essentially every coefficient come out with a significant p-value, provided its true value is not exactly zero (see sample size calculation).
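The point about sample size can be illustrated with a small simulation. This is only a sketch on made-up data: the true slope of 0.05, the standard normal noise, and the helper name `slope_t_stat` are all assumptions chosen for illustration, not values from the thread.

```python
import math
import random

def slope_t_stat(n, true_slope=0.05, seed=0):
    """Simulate y = true_slope * x + noise and return the t-statistic
    of the OLS slope in a simple regression of y on x."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_slope * xi + rng.gauss(0, 1) for xi in x]

    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom, then the slope's standard error.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err

# The same weak effect, estimated at increasing sample sizes.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7d}   t-statistic of slope = {slope_t_stat(n):6.2f}")
```

The effect is identical in every run; only the sample size changes. The t-statistic grows roughly with the square root of n, which is why a p-value alone says little about how strong the association is.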

So, in your case, you cannot base your decision on the p-value alone. The p-value depends not only on the strength of the association between Y and X, but also on whether the sample is large enough to reveal that association.

With your 300 observations, the sample is large enough to show Y~X and Y~X2 separately. However, it may not be enough to support Y~(X, X2). The underlying cause is probably collinearity: X2 is derived from X, so once X2 is used to explain Y, X can effectively be crowded out, because the estimation (MLE here) is very sensitive to correlation among the regressors. That is why I suggested other approaches to model selection.
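To see how strongly X and X2 can be correlated, here is a minimal sketch on simulated data. The uniform range 0 to 10, the 300 draws, and the helper `pearson` are assumptions for illustration only. It also shows one common remedy that is not mentioned in the thread: centering x before squaring.

```python
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = sum((a - mean_u) ** 2 for a in u) ** 0.5
    sd_v = sum((b - mean_v) ** 2 for b in v) ** 0.5
    return cov / (sd_u * sd_v)

rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(300)]  # hypothetical non-negative regressor
x_sq = [v * v for v in x]

# Centering: subtract the sample mean before squaring.
mean_x = sum(x) / len(x)
x_centered = [v - mean_x for v in x]
x_centered_sq = [v * v for v in x_centered]

raw_corr = pearson(x, x_sq)
centered_corr = pearson(x_centered, x_centered_sq)

print(f"corr(x, x^2)             = {raw_corr:.3f}")
print(f"corr(x - mean, (x-mean)^2) = {centered_corr:.3f}")
```

When x is non-negative, x and x^2 move almost in lockstep, so their individual coefficients are hard to separate. Centering changes the meaning of the linear coefficient (it becomes the slope at the mean of x) but leaves the fitted curve unchanged.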

Suppose you settle on Y~X2. If x is greater than or equal to 0, you do not have a full U-shape but only the right half of one, which is very easy to interpret. On that range X2 is a monotone transformation of X, so you can fit the linear association between Y and X2 and then translate it back into the relation between Y and X.
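The shape argument can be written out as marginal effects. The following is a sketch using the thread's own notation for the two models; the turning point symbol x^* is introduced here for illustration.

```latex
\begin{aligned}
y &= a + b\,x^2 + e, &\qquad \frac{\partial y}{\partial x} &= 2bx \\
y &= a + b\,x + c\,x^2 + e, &\qquad \frac{\partial y}{\partial x} &= b + 2cx,
\qquad x^{*} = -\frac{b}{2c}
\end{aligned}
```

In the first model, for x >= 0 the slope 2bx has the same sign as b throughout, so the relation is monotone. In the second model, a positive b with a negative c gives a peak at x^* > 0; if x^* lies near the lower end of the observed x values, the curve is decreasing over most of the sample, which would be consistent with a negative slope in the purely linear fit. Whether that holds for the data in question cannot be told from the thread.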

If you still have questions, feel free to email me at colinwang@hotmail.co.uk. Also, thanks to statax for the backup!


jose.liupei

2014-9-23 21:12:17

Looking for an answer, thanks!


xuruilong100

2014-9-23 22:37:48

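Look at the results for y=a+bx^2+e. If b is strongly significant, you could consider dropping the linear term in x and examining only the nonlinear x^2 term.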


jose.liupei

2014-9-23 22:59:50

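xuruilong100 wrote on 2014-9-23 22:37
Look at the results for y=a+bx^2+e. If b is strongly significant, you could consider dropping the linear term in x and examining only the nonlinear x^2 term.

Then how should this result be interpreted? Or what does it mean in economic terms? Thanks.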


colinwang

2014-9-26 00:15:30

First, let's agree that this is still a linear model: it is linear in the parameters, with a non-linear relation between Y and X.
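To make "still a linear model" concrete, the quadratic regression can be rewritten with two ordinary regressors. The symbols z_1 and z_2 are introduced here for illustration and do not appear in the thread.

```latex
y = a + b\,z_1 + c\,z_2 + e, \qquad z_1 = x, \quad z_2 = x^2
```

The equation is linear in the unknowns a, b and c, so ordinary least squares applies unchanged; only the map from x to y is curved.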

Now to your question. Linear regression is a parametric method and, regardless of which algorithm is used for the maximisation, it is very sensitive to its assumptions, especially regarding scale, distribution, outliers, etc.

When you fit y=a+bx+e, the result suggests a significant linear relation between y and x at your sample size (I assume the study is not overpowered).

For y=a+bx+cx^2+e, you also find an association between y and x^2. Note that even though the coefficient on x is not statistically significant here, you CANNOT conclude that x is irrelevant to y. As you increase the sample size, I can assure you that the coefficient will at some point become significant again (unless its true value is exactly zero).

So now your question becomes clear: which model is better?
y=a+bx+e
y=a+bx+cx^2+e
y=a+bx^2+e

There are several options for model comparison. R-squared is a simple one, but remember to penalise for the degrees of freedom you use (that is, use adjusted R-squared). A log-likelihood ratio test can be very informative and quantitative (note that when the models differ in their regressors, as here, it should be based on ordinary rather than restricted maximum likelihood). You can also check the residuals, use stepwise selection, rely on professional prior knowledge, etc. I could keep going for a day. So focus on your underlying hypothesis and choose the approach that best fits it.
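As one example of the comparison above, the three candidate models can be ranked by adjusted R-squared. This is a minimal sketch on simulated data: the data-generating process (y depends on x^2 only), the noise level, and the helper names `solve` and `adjusted_r2` are assumptions for illustration. In practice a statistics package would do the fitting.

```python
import random

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def adjusted_r2(columns, y):
    """OLS of y on an intercept plus the given columns; returns adjusted R^2."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])  # number of estimated parameters, intercept included

    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)

    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    # The penalty for extra parameters enters through n - k.
    return 1 - (rss / (n - k)) / (tss / (n - 1))

# Simulated data (assumption): 300 observations, y driven by x^2 only.
rng = random.Random(42)
n = 300
x = [rng.uniform(0, 10) for _ in range(n)]
x2 = [v * v for v in x]
y = [5.0 - 0.3 * v * v + rng.gauss(0, 2) for v in x]

scores = {
    "y ~ x": adjusted_r2([x], y),
    "y ~ x + x^2": adjusted_r2([x, x2], y),
    "y ~ x^2": adjusted_r2([x2], y),
}
for name, value in scores.items():
    print(f"{name:12s} adjusted R^2 = {value:.3f}")
```

Adjusted R-squared only ranks the models; it does not test whether a difference is meaningful. For the nested pairs, an F-test or likelihood ratio test would give a formal comparison.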
