Dear all,
Here are the first five rows of my raw data:
| NO | IND | YEAR | Y | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 000892 | 0002 | 2000 | -0.220610872 | 0.000000001253039639329720000 | 0.231898552 | 0.331953596 | 0.088139 | 20.87047503 | 1.301571724 | 0.174032405 | 0.169023479 | 0.695188711 | 0.268306 |
| 600834 | 0002 | 2000 | -0.152963157 | 0.000000002059844887153580000 | -0.0245712 | 0.42852705 | 0.100523 | 20.20161689 | 0.24863549 | 0.043036409 | 0.09321455 | 1.359202284 | 0.200086 |
| 000421 | 0002 | 2000 | -0.140540342 | 0.000000001436834614561080000 | 0.112320096 | 0.404368262 | 0.035705 | 20.53049399 | 1.067407663 | 0.180926955 | 0.021184526 | 0.421314133 | 0.452166 |
| 000544 | 0002 | 2000 | -0.13827197 | 0.000000000779274879521076000 | 0.272756108 | 0.27706574 | -0.000384 | 20.96651651 | 1.928423226 | 0.41110824 | -0.005036504 | -0.012728888 | 0.488551 |
| 600662 | 0002 | 2000 | -0.135518148 | 0.000000000825818041234081000 | 0.083971604 | 0.35501825 | 0.075335 | 21.0482187 | 0.721423604 | 0.183097691 | 0.202827047 | 0.099186507 | 0.465287 |
I then used SAS to run weighted least squares regressions separately for each industry (IND) and year (YEAR). The code is as follows:
** run the original regression to get the residuals**;
proc reg data=data noprint;
model y=x1-x10;
by IND YEAR;
output out=WORK.PRED r=residual;
run;
** compute the absolute residuals**;
data work.resid;
set work.pred;
absresid=abs(residual);
run;
** run a regression with the absolute residuals vs. X to get the estimated standard deviation**;
proc reg data=work.resid noprint;
model absresid=x1-x10;
by IND YEAR;
output out=WORK.s_weights;
run;
** compute the weights using the estimated standard deviations**;
data work.s_weights;
set work.s_weights;
s_weight=1/(abs(residual));
label s_weight = "weights using absolute residuals";
run;
** Do the weighted least squares using the weights from the estimated standard deviation**;
proc reg data=work.s_weights outest=coef;
weight s_weight;
model y=x1-x10;
by IND YEAR;
run;
quit;
data coef;
set coef;
rename x1=a x2=b x3=c x4=d x5=e x6=f x7=g x8=h x9=i x10=j;
run;
data reg;
merge data coef;
by IND YEAR;
wr=y-Intercept-a*x1-b*x2-c*x3-d*x4-e*x5-f*x6-g*x7-h*x8-i*x9-j*x10;
run;
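A side note on the weights: the comments above describe weights built from estimated standard deviations, but the OUTPUT statement in the second PROC REG requests no statistics, so `s_weight` ends up as 1/abs(residual) from the first regression. Below is a minimal sketch of the variant the comments seem to describe. It assumes the fitted values of the absolute-residual regression are saved under the name `s_hat` (a name introduced here for illustration only) and relies on the WEIGHT statement in PROC REG expecting weights proportional to the reciprocal of the variance.

```sas
/* Sketch only: s_hat is a hypothetical variable name, not in the original code */
proc reg data=work.resid noprint;
  model absresid = x1-x10;
  by IND YEAR;
  /* p= saves the fitted absolute residual, used here as the estimated std. dev. */
  output out=work.s_weights p=s_hat;
run;
quit;

data work.s_weights;
  set work.s_weights;
  /* weight = 1 / estimated variance; non-positive fitted values get a missing weight */
  if s_hat > 0 then s_weight = 1 / (s_hat**2);
  label s_weight = "weights using estimated standard deviations";
run;
```

Observations left with a missing weight are not used in the weighted fit, so it is worth checking how many there are in each IND-YEAR group before relying on the results.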
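In the final residual output table I noticed something odd: for the first observation of every industry-year group, the value of Y, whatever it was in the raw data, has been changed to -1 by SAS. As a result, the residual (wr) output for that observation is wrong.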
| NO | IND | YEAR | Y | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | wr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 892 | 2 | 2000 | -1 | 1.25E-09 | 0.231899 | 0.331954 | 0.088139 | 20.87048 | 1.301572 | 0.174032 | 0.169023 | 0.695189 | 0.268306 | -0.93322 |
| 600834 | 2 | 2000 | -0.15296 | 2.06E-09 | -0.02457 | 0.428527 | 0.100523 | 20.20162 | 0.248635 | 0.043036 | 0.093215 | 1.359202 | 0.200086 | -0.11876 |
| 421 | 2 | 2000 | -0.14054 | 1.44E-09 | 0.11232 | 0.404368 | 0.035705 | 20.53049 | 1.067408 | 0.180927 | 0.021185 | 0.421314 | 0.452166 | -0.09756 |
| 544 | 2 | 2000 | -0.13827 | 7.79E-10 | 0.272756 | 0.277066 | -0.00038 | 20.96652 | 1.928423 | 0.411108 | -0.00504 | -0.01273 | 0.488551 | -0.08169 |
| 600662 | 2 | 2000 | -0.13552 | 8.26E-10 | 0.083972 | 0.355018 | 0.075335 | 21.04822 | 0.721424 | 0.183098 | 0.202827 | 0.099187 | 0.465287 | -0.11534 |

| NO | IND | YEAR | Y | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | wr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 892 | 2 | 2000 | -1 | 1.25E-09 | 0.231899 | 0.331954 | 0.088139 | 20.87048 | 1.301572 | 0.174032 | 0.169023 | 0.695189 | 0.268306 | -0.93322 |
| 600880 | 2 | 2001 | -1 | 2.51E-09 | 0.424173 | 0.178917 | 0.079324 | 19.74005 | 0.610613 | 0.236453 | 0.087196 | -0.26891 | 0.72113 | -0.95491 |
| 600646 | 2 | 2002 | -1 | 1.19E-09 | 0.78086 | 0.012734 | -1.41708 | 18.3805 | 2.474379 | 0.075539 | -5.95578 | -0.17564 | 0.01696 | -0.72805 |
| 600899 | 2 | 2003 | -1 | 1.34E-09 | 0.176978 | 0.037479 | -0.52829 | 19.91676 | 1.164224 | -0.84587 | -0.92142 | 0.204259 | 0.013242 | -0.75888 |
| 600897 | 2 | 2004 | -1 | 7.93E-10 | 0.216053 | 0.717263 | 0.063238 | 21.04809 | 0.061314 | 0.375615 | -0.00278 | 0.865155 | 0.19876 | -0.93913 |
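
A likely explanation, offered as a hedged sketch rather than a confirmed diagnosis: the OUTEST= data set written by PROC REG holds, besides the coefficient columns, a column named after the dependent variable whose value is -1. Since `coef` then carries its own `Y`, the one-to-many `merge data coef; by IND YEAR;` would overwrite `Y` with -1 on the first observation of each BY group, where the `coef` row is read. Later observations in the group are read only from `data`, so their `Y` stays intact. Assuming that is what happens here, keeping only the BY variables and the coefficients from `coef` in the merge should avoid it. The sketch assumes `coef` has already been renamed as in the code above and that `data` is sorted by IND and YEAR.

```sas
/* Inspect coef: if this is the cause, a Y column equal to -1 shows up */
proc print data=coef(obs=5);
run;

/* Merge only the BY variables and the renamed coefficients, so that */
/* the Y column stored in coef cannot overwrite the Y in the raw data */
data reg;
  merge data
        coef(keep=IND YEAR Intercept a b c d e f g h i j);
  by IND YEAR;
  wr = y - Intercept - a*x1 - b*x2 - c*x3 - d*x4 - e*x5
         - f*x6 - g*x7 - h*x8 - i*x9 - j*x10;
run;
```

An alternative that skips the merge altogether would be to add an OUTPUT statement with `r=` to the weighted PROC REG step, as the first regression already does, and read the residuals from that output data set.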
Could the experts here please help me diagnose this? Thank you so much!