' Heckman selection model with three regressors z1, z2, z3
' (common to both selection and regression equations)
' selection equation: lfp=1 if selected; otherwise 0
' regression model: lw (dependent) c z1 z2 z3 (regressors)
'
' for a discussion, see Section 20.4 of Greene, William H. (1997)
' Econometric Analysis, 3rd edition, Prentice-Hall
' change path to program path
%path = @runpath
cd %path
' load artificial data
load mlogit
' get starting values from 2-step Heckman
' first step: estimate probit of the selection equation
smpl @all
equation eq1.binary(d=n) lfp c z1 z2 z3
' copy starting values to selection equation params
coef(4) b = eq1.@coefs
' true values: b(1)=1, b(2)=0.2, b(3)=0.5, b(4)=-0.3
' compute inverse mills ratio
eq1.fit(i) xbhat
series imills = @dnorm(xbhat)/@cnorm(xbhat)
' compute delta to estimate sig2 (variance)
series delta = imills*(imills+xbhat)
' run second step OLS, including the inverse mills ratio
smpl @all if lfp=1
equation eq2.ls lw c z1 z2 z3 imills
' true values: c(1)=5, c(2)=0.8, c(3)=0.1, c(4)=-1
' save the second-step residuals, used below to estimate sig2 and rho2
' (note: uses only the estimation sample of the 2nd step)
eq2.makeresid resid2
' initial estimate of the regression model variance
coef(1) sig2 = @sumsq(resid2)/eq2.@regobs+@mean(delta)*eq2.c(5)^2
' true value 4
' initial estimate of the squared correlation between the two equations
' rho2 should be constrained between 0 and 1
coef(1) rho2 = eq2.c(5)^2/sig2(1)
' true value 0.64
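' optional sanity check (a sketch added for illustration, not part of the
' original two-step procedure): in finite samples the two-step estimate of
' rho2 can reach or exceed 1, which violates the constraint noted above.
' The cap of 0.99 and the control variable !r2 are arbitrary assumptions;
' adjust or drop this block as needed.
!r2 = rho2(1)
if !r2 >= 1 then
  rho2(1) = 0.99
endif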
' end of 2 step Heckman