2014-09-23
Bounty: 10 forum coins (resolved)
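Question: in a regression model (say y=a+bx+e) I want to allow for non-linearity, so I add the square of x and the equation becomes y=a+bx+cx^2+e.
Without the non-linear term, regressing y=a+bx+e gives a coefficient b that is significant and negative.
With the non-linear term, regressing y=a+bx+cx^2+e gives a coefficient b that is insignificant but positive, while the coefficient c on the squared term is significant and negative.

How should I interpret this? Is there a non-linear effect or not? Do I need to run any other tests?
Many thanks.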

Best answer: colinwang (full reply below)


All replies
2014-9-23 11:50:14
jose.liupei wrote on 2014-9-28 00:03:
Thanks for ur detailed answers.

But I do not quite understand what u mentioned about "As you in ...
Hi Jo,

Sorry for the confusion. I did NOT mean that you NEED to increase the sample size. What I was trying to say is this: hypothetically, if you increased the sample size to a very large number, for instance 10000, you would see a significant p-value for every coefficient whose true value is not exactly zero (see sample size calculation).

So, in your case, you cannot base your decision on the p-value alone, since the p-value depends not only on the strength of the association between Y and X but also on whether the sample is large enough to detect that association.
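The sample-size point can be checked with a small simulation. The sketch below is illustrative only: the true slope of 0.1, the unit noise, and the helper name `slope_p_value` are assumptions, not values from this thread, and the p-value uses a normal approximation to the t distribution.

```python
import math
import random

def slope_p_value(x, y):
    """OLS fit of y = a + b*x; return (b, two-sided p-value for b = 0).

    The p-value uses a normal approximation to the t distribution,
    which is rough for small n and accurate for large n.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    t_stat = b / se_b
    return b, math.erfc(abs(t_stat) / math.sqrt(2))

random.seed(1)
TRUE_SLOPE = 0.1  # hypothetical weak effect, identical at every sample size

p_values = {}
for n in (30, 300, 10000):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + TRUE_SLOPE * xi + random.gauss(0.0, 1.0) for xi in x]
    b_hat, p = slope_p_value(x, y)
    p_values[n] = p
    print(f"n={n:>6}  b_hat={b_hat:+.3f}  p={p:.4g}")
```

The effect is the same in all three runs; only the amount of data changes, so a large p-value at small n says little about whether the effect exists.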

With your 300 observations, the sample is large enough to show Y~X and Y~X2 separately. However, it might not be large enough to show Y~(X, X2) jointly. The underlying cause might be collinearity: X2 is derived from X, so when X2 is used to explain Y, X can be crowded out, because MLE is very sensitive to correlation between the regressors. That's why I suggested other approaches to model selection.
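How strongly X and X2 move together is easy to check. The sketch below uses made-up data (300 uniform draws on [0, 10], matching only the thread's sample size) and a hypothetical helper `pearson`. Centring x before squaring is a common remedy shown here for comparison; it is not something prescribed in this thread.

```python
import math
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

random.seed(0)
# Hypothetical non-negative regressor with 300 observations.
x = [random.uniform(0.0, 10.0) for _ in range(300)]
x_sq = [v ** 2 for v in x]

# Centre x at its sample mean, then square the centred values.
x_mean = sum(x) / len(x)
xc = [v - x_mean for v in x]
xc_sq = [v ** 2 for v in xc]

corr_raw = pearson(x, x_sq)
corr_centered = pearson(xc, xc_sq)
print(f"corr(x, x^2)                 = {corr_raw:.3f}")
print(f"corr(x - mean, (x - mean)^2) = {corr_centered:.3f}")
```

Centring leaves c and the fitted values unchanged but turns b into the slope at the mean of x rather than at x=0, which is often the more useful quantity to test.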

Suppose you settle on Y~X2. If x is greater than or equal to 0, you don't have a full U-shape but only the right half of one. This is very easy to interpret: for x >= 0, X2 is a monotone transformation of X, so you can plot the linear association between Y and X2 and then map it back to Y against X.

If you still have issues, feel free to email me at colinwang@hotmail.co.uk. Also, thanks to statax for the backup!

2014-9-23 21:12:17
Looking for an answer, thanks!

2014-9-23 22:37:48
Look at the results of y=a+bx^2+e. If b is strongly significant, consider dropping the linear x term and examining only the non-linear x^2 term.

2014-9-23 22:59:50
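xuruilong100 wrote on 2014-9-23 22:37:
Look at the results of y=a+bx^2+e. If b is strongly significant, consider dropping the linear x term and examining only the non-linear x^2 term.
Then how should this result be interpreted? What would it mean in economic terms? Thanks.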

2014-9-26 00:15:30
First, let's agree that this is still a linear model (linear in the parameters), just with a non-linear relation between Y and X.

Now to your case. Linear regression, as a parametric method, is very sensitive to its assumptions regardless of which algorithm is used for maximisation, especially regarding scale, distribution, outliers, etc.

When you fit y=a+bx+e, the result suggests a significant linear relation between y and x at your sample size (I assume the study is not overpowered).

For y=a+bx+cx^2+e, you also find an association between y and x^2. Note that even though the coefficient on x is not statistically significant here, you CANNOT conclude that x is irrelevant to y. As you increase the sample size, I assure you the coefficient will become significant again at some point (unless its true value is exactly zero).
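One way to see why an insignificant b does not make x irrelevant is to write out the marginal effect of x in the quadratic model. This is standard calculus applied to the equation in the question; the symbol x* for the turning point is introduced here for illustration only.

```latex
\frac{\partial y}{\partial x} = b + 2cx,
\qquad
x^{*} = -\frac{b}{2c} \quad (c \neq 0)
```

So b on its own is only the slope at x=0, and x still affects y through c whenever c is non-zero. With b positive and c negative, as in the question, the fitted curve is an inverted U that peaks at x* > 0; whether the data actually cover both sides of x* is worth checking before calling the relation non-monotonic.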

So now your question becomes clear: which of these models is better?
y=a+bx+e
y=a+bx+cx^2+e
y=a+bx^2+e

There are a couple of options for model comparison. R-squared is a simple one, but remember to penalise for the degrees of freedom you use (i.e. look at adjusted R-squared). A likelihood ratio test between the restricted and unrestricted models, each fitted by maximum likelihood, can be very informative and quantitative. You can also check the residuals, use stepwise selection, rely on prior subject-matter knowledge, etc. I could go on all day. So focus on your underlying hypothesis and choose the approach that suits it best.
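The comparison of the three candidate models can be sketched as follows. Everything here is an assumption made for illustration: the data are simulated from a purely quadratic process, the helper `ols` solves the normal equations directly, and the likelihood ratio statistic uses the Gaussian-error form n*ln(RSS_restricted/RSS_full) with an asymptotic chi-square(1) p-value.

```python
import math
import random

def ols(columns, y):
    """OLS of y on an intercept plus `columns`; return (coefficients, RSS)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, rss

random.seed(2)
n = 300
x = [random.uniform(0.0, 10.0) for _ in range(n)]
x2 = [v ** 2 for v in x]
# Hypothetical data-generating process: pure quadratic effect plus noise.
y = [2.0 - 0.05 * v ** 2 + random.gauss(0.0, 1.0) for v in x]

mean_y = sum(y) / n
tss = sum((v - mean_y) ** 2 for v in y)

models = {"x only": [x], "x and x^2": [x, x2], "x^2 only": [x2]}
results = {}
for name, cols in models.items():
    beta, rss = ols(cols, y)
    k = len(cols) + 1  # number of parameters, including the intercept
    adj_r2 = 1.0 - (rss / (n - k)) / (tss / (n - 1))
    results[name] = {"beta": beta, "rss": rss, "adj_r2": adj_r2}
    print(f"{name:<10} adj R2 = {adj_r2:.3f}  RSS = {rss:.1f}")

# Likelihood ratio test of each restricted model against the full model.
# One term is dropped each time, so the reference distribution is chi2(1).
full_rss = results["x and x^2"]["rss"]
lr_p = {}
for name in ("x only", "x^2 only"):
    lr = n * math.log(results[name]["rss"] / full_rss)
    lr_p[name] = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))  # chi2(1) upper tail
    print(f"restrict to '{name}': LR = {lr:.2f}, p = {lr_p[name]:.4g}")
```

The likelihood ratio test only applies to nested pairs, so it can compare each single-term model with the full one but not the two single-term models with each other; for that pair, adjusted R-squared or an information criterion is the more natural yardstick.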
