How do I test for endogeneity? How do I perform a Durbin–Wu–Hausman test?
Consider a regression y = b0 + b1*z + b2*x3 + e
where z is endogenous.
Suppose that x1 and x2 are instrumental variables for z.
One should decide whether it is necessary to use an instrumental variable, i.e., whether a set of estimates obtained by least squares is consistent or not.
An augmented regression test can easily be formed by adding to the original model the residuals from a regression of each endogenous right-hand-side variable on all exogenous variables.
We would first perform a regression
z = c0 + c1*x1 + c2*x2 + c3*x3 + u
to get residuals z_res, then perform an augmented regression:
y = d0 + d1*z + d2*x3 + d3*z_res + e
If d3 is significantly different from zero, then OLS is not consistent.
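The generic steps above can be sketched in Stata as follows. The variable names y, z, x1, x2, and x3 are the placeholders from the equations above, not variables in any particular dataset, and z_res is only an illustrative name for the saved residuals.

```stata
* Sketch only: y, z, x1, x2, x3 are the placeholder names from the equations above

* Step 1: regress the endogenous variable on all exogenous variables
regress z x1 x2 x3

* Save the residuals from that regression (z_res is an illustrative name)
predict z_res, residuals

* Step 2: estimate the original model by OLS with the residuals added
regress y z x3 z_res

* Test H0: d3 = 0; a rejection suggests that OLS is not consistent
test z_res
```

With several endogenous regressors, the same idea would apply with one residual series per endogenous variable and a joint test on all of the added residuals.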
For example, let us assume that you wish to estimate
rent = b0 + b1*hsngval + b2*pcturban + e
where hsngval is endogenous and pcturban is exogenous.
Instrumental variables for hsngval are: faminc, reg2, reg3 and reg4. To test the endogeneity of hsngval,
(i) we first run a reduced form model, using all exogenous variables:
. regress hsngval faminc reg2-reg4 pcturban
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  5,    44) =   19.66
  (output omitted)
       _cons |  -18671.87   11995.48    -1.56   0.127    -42847.17    5503.438
------------------------------------------------------------------------------
(ii) Then, we save the residuals from the above regression as hsng_res, include hsng_res in the main equation, and estimate the main equation by OLS.
. predict hsng_res, res
. regress rent hsngval pcturban hsng_res
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  3,    46) =   47.05
  (output omitted)
       _cons |   120.7065   12.42856     9.71   0.000     95.68912    145.7239
------------------------------------------------------------------------------
Then, we test the significance of the coefficient of the added residual.
. test hsng_res
 ( 1)  hsng_res = 0.0
       F(  1,    46) =   15.91
            Prob > F =    0.0002
The small p-value indicates that OLS is not consistent.
To perform an IV regression, run ivreg:
. ivreg rent pcturban (hsngval = faminc reg2-reg4)
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  2,    47) =   42.66
  (output omitted)
       _cons |   120.7065   15.70688     7.68   0.000     89.10834    152.3047
------------------------------------------------------------------------------
Instrumented:  hsngval
Instruments:   pcturban faminc reg2 reg3 reg4
------------------------------------------------------------------------------
Note that the coefficient estimates from the last two regressions (the augmented OLS regression and the IV regression) are the same; however, the standard errors are different.
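In more recent versions of Stata, ivreg has been superseded by ivregress, and the postestimation command estat endogenous reports endogeneity tests directly. The lines below are a minimal sketch of that route, assuming the same data are in memory and a Stata version that includes ivregress; they are an alternative to the manual steps above, not a replacement for them.

```stata
* Assumes the same dataset used above is already in memory

* 2SLS estimation with the newer ivregress syntax
ivregress 2sls rent pcturban (hsngval = faminc reg2-reg4)

* Tests of H0: hsngval is exogenous (Durbin and Wu-Hausman statistics
* are reported after 2SLS with the default VCE)
estat endogenous
```

The Wu-Hausman F statistic reported by estat endogenous can be compared with the F test on hsng_res from the augmented regression.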