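zhangli106601 posted on 2010-5-15 14:14
When I run a dummy-variable regression with an intercept, the goodness of fit is fairly low (R-square a little over 0.7), and the intercept and the other regression coefficients are all significant. If I drop the intercept, however, R-square rises to 0.998, and the regression coefficients are again all significant. I would like to ask the experts here: should I keep the intercept or not?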
You need to post how you did it. Based on the limited information above, it is impossible.
Here is a simulation of your problem. The coefficient for x and the R-square are the same in both models. The only difference is the interpretation of the dummy (c = 0, 1, 2) estimates plus the intercept; the two fits are equivalent.
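/* Simulate 100 observations: a 3-level class variable c and a covariate x */
data t1;
  do i = 1 to 100;
    c = mod(i, 3);          /* class levels 0, 1, 2 */
    x = rannor(123);
    error = rannor(123);
    y = c + 1*x + error;    /* group shift c plus slope 1 on x, plus noise */
    output;
  end;
run;

/* Model 1: with intercept */
proc glm data=t1;
  class c;
  model y = c x / solution;
run;
quit;

/* Model 2: same terms, intercept suppressed */
proc glm data=t1;
  class c;
  model y = c x / solution noint;
run;
quit;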
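The equivalence of the two MODEL statements above can be written out explicitly. This is only a sketch: it assumes PROC GLM treats the last class level (c = 2) as the reference, which is what the zero estimate for that level in the SOLUTION output suggests. The symbols \mu, \alpha_j, \gamma_j and \beta are notation introduced here for illustration and do not appear in the original post.

```latex
\begin{align*}
\text{with intercept:} \quad
  & y_i = \mu + \alpha_{c_i} + \beta x_i + \varepsilon_i, \qquad \alpha_2 = 0 \\
\text{without intercept:} \quad
  & y_i = \gamma_{c_i} + \beta x_i + \varepsilon_i \\
\text{link between the two:} \quad
  & \gamma_j = \mu + \alpha_j \qquad (j = 0, 1, 2)
\end{align*}
```

Both parameterizations span the same column space, so the fitted values, the residuals, the error sum of squares and the estimate of \beta are identical. Only the meaning of the class estimates changes: differences from the reference level in the first form, level-specific intercepts in the second.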
***********************
The SAS System 14:05 Saturday, May 15, 2010 30
The GLM Procedure
Class Level Information
Class    Levels    Values
c             3    0 1 2
Number of Observations Read 100
Number of Observations Used 100
Dependent Variable: y

                                   Sum of
Source                  DF        Squares    Mean Square   F Value   Pr > F
Model                    3    154.4179842     51.4726614     60.01   <.0001
Error                   96     82.3390611      0.8576986
Corrected Total         99    236.7570452

R-Square     Coeff Var      Root MSE        y Mean
0.652221      111.3991      0.926120      0.831354

Source                  DF      Type I SS    Mean Square   F Value   Pr > F
c                        2    66.93327984    33.46663992     39.02   <.0001
x                        1    87.48470433    87.48470433    102.00   <.0001

Source                  DF    Type III SS    Mean Square   F Value   Pr > F
c                        2    65.47431949    32.73715975     38.17   <.0001
x                        1    87.48470433    87.48470433    102.00   <.0001

                                     Standard
Parameter          Estimate             Error    t Value    Pr > |t|
Intercept       1.962378170 B      0.16123655      12.17      <.0001
c 0            -1.955754798 B      0.22800059      -8.58      <.0001
c 1            -1.302729216 B      0.22633900      -5.76      <.0001
c 2             0.000000000 B               .          .           .
x               1.069740088        0.10592038      10.10      <.0001
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the
normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely
estimable.
The GLM Procedure
Class Level Information
Class    Levels    Values
c             3    0 1 2
Number of Observations Read 100
Number of Observations Used 100
Dependent Variable: y

                                   Sum of
Source                  DF        Squares    Mean Square   F Value   Pr > F
Model                    4    223.5328621     55.8832155     65.15   <.0001
Error                   96     82.3390611      0.8576986
Uncorrected Total      100    305.8719232

R-Square     Coeff Var      Root MSE        y Mean
0.652221      111.3991      0.926120      0.831354

Source                  DF      Type I SS    Mean Square   F Value   Pr > F
c                        3    136.0481578     45.3493859     52.87   <.0001
x                        1     87.4847043     87.4847043    102.00   <.0001

Source                  DF    Type III SS    Mean Square   F Value   Pr > F
c                        3    141.7730055     47.2576685     55.10   <.0001
x                        1     87.4847043     87.4847043    102.00   <.0001

                                     Standard
Parameter          Estimate             Error    t Value    Pr > |t|
c 0             0.006623372        0.16126934       0.04      0.9673
c 1             0.659648954        0.15894131       4.15      <.0001
c 2             1.962378170        0.16123655      12.17      <.0001
x               1.069740088        0.10592038      10.10      <.0001
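
As a check on the two parameter tables above, each no-intercept class estimate should equal the intercept plus the matching dummy estimate from the first fit. The arithmetic below uses only numbers printed in the output. The hatted symbols are notation introduced here for illustration, and the "uncorrected" R-square on the last line is a hypothetical alternative definition that the listing itself does not report.

```latex
\begin{align*}
\hat\gamma_0 &= \hat\mu + \hat\alpha_0 = 1.962378170 - 1.955754798 = 0.006623372 \\
\hat\gamma_1 &= \hat\mu + \hat\alpha_1 = 1.962378170 - 1.302729216 = 0.659648954 \\
\hat\gamma_2 &= \hat\mu + \hat\alpha_2 = 1.962378170 + 0 = 1.962378170 \\[6pt]
R^2_{\text{corrected}}   &= 1 - \frac{82.3390611}{236.7570452} \approx 0.6522 \\
R^2_{\text{uncorrected}} &= 1 - \frac{82.3390611}{305.8719232} \approx 0.7308
\end{align*}
```

The error sum of squares is the same in both fits, so a different R-square can only come from the total it is compared against. Software that switches to the uncorrected total when the intercept is dropped will report a larger R-square for an identical fit. That may be what happened in the original question, but without the poster's code this remains a guess.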