* Home PC
cd "C:\Users\dell\Desktop\STATA20150120\20150303BOB"
capture log close
log using Econ4G03-assign2.log, text replace
capture drop _all
pause on
capture clear all
capture program drop hetero
program hetero, rclass
clear
args obs c  // obs = sample size; c = scales the heteroskedastic error component
set obs `obs'
scalar df = `obs' - 2
g x1 = rnormal(0,15)
* error is heteroskedastic: its variance grows with x1, scaled by `c'
g error = rnormal(0,1) + rnormal(0,1)*x1*`c'
egen double MeanError = mean(error)
egen double SDError = sd(error)
* standardize, then rescale so the error has mean 0 and SD 50
* (this linear transformation preserves the heteroskedasticity pattern)
replace error = ((error-MeanError)/SDError)*50
g y = 1 + 2*x1 + error
reg y x1
return scalar b_ols = _b[x1]
return scalar se_ols = _se[x1]
* p-value for H0: slope = 2, using the conventional OLS standard error
return scalar p_ols1 = 2 * ttail(df, (abs(_b[x1]-2))/_se[x1])
test x1=2
return scalar p_ols2 = r(p)
* re-estimate with heteroskedasticity-robust (White/Huber) standard errors
reg y x1, vce(robust)
return scalar b_r = _b[x1]
return scalar se_r = _se[x1]
return scalar p_r1 = 2 * ttail(df, (abs(_b[x1]-2))/_se[x1])
test x1=2
return scalar p_r2 = r(p)
end
set seed 2222
simulate b_ols=r(b_ols) se_ols=r(se_ols) p_ols1=r(p_ols1) p_ols2=r(p_ols2) ///
b_r=r(b_r) se_r=r(se_r) p_r1=r(p_r1) p_r2=r(p_r2), ///
saving(hetero,replace) reps(500) dots : hetero 15000 15
* indicators for rejecting H0: slope = 2 at the 5% level
g StatSig_ols = (p_ols1<0.05)
g StatSig_r = (p_r1<0.05)
tabstat b_ols se_ols se_r p_ols1 p_ols2 p_r1 p_r2 StatSig_ols StatSig_r, stat(mean q iqr)
pause
sum Stat*, d
pause
kdensity p_ols1, addplot(kdensity p_r1) legend(label(1 "P-values from OLS VCE") label(2 "P-values from Hetero VCE")) lpattern(dash)
pause
histogram p_ols1, bin(22)
pause
histogram p_r1, bin(22)
pause
Assignment 2 -- Questions
1) Run the Monte Carlo simulation above as it is currently structured
and discuss the file and the output.
There are two intermingled aspects to this:
i) understanding and describing the Stata code; and
ii) understanding and describing the substantive issue being addressed.
(i.e., What is this simulation trying to illustrate?)
2) Play with the structure of the Monte Carlo (focus on adjusting the two
arguments, but you may also want to make other adjustments).
What patterns can you see? One simple approach is to see how the
two Var-Cov estimators perform as the sample size and/or
the "amount" of hetero changes. You could graph the results.
More sophisticated questions could also be explored. Thinking about how
to present the results is also an interesting exercise.
In undertaking these
simple experiments, think of the econometric theory you are learning
(and maybe read the text to learn more).
Warning: Clearly this is a question without an endpoint,
so you need to judge how much effort to allocate to it
-- do not go to an extreme in doing this.
Think about identifying a specific research question, and then approach it
in a structured manner.
(In part, this sub-question might get you started in
thinking about research in econometric theory.)
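One way to get started on (2) is to rerun the Monte Carlo over a small grid of sample sizes and hetero strengths and collect summary results as you go. The sketch below assumes the hetero program defined above; the grid values, the postfile handle, and the dataset name "grid.dta" are all illustrative choices, not part of the assignment:

* Sketch: loop the simulation over a grid of (obs, c) values, recording
* mean standard errors and 5% rejection rates for each cell.
tempname memhold
postfile `memhold' obs c mean_se_ols mean_se_r rej_ols rej_r using grid, replace
foreach n in 50 200 1000 {
    foreach cval in 0 5 15 {
        quietly simulate p_ols1=r(p_ols1) p_r1=r(p_r1) ///
            se_ols=r(se_ols) se_r=r(se_r), reps(500) : hetero `n' `cval'
        quietly g rej_ols = (p_ols1<0.05)
        quietly g rej_r   = (p_r1<0.05)
        quietly sum se_ols
        local mso = r(mean)
        quietly sum se_r
        local msr = r(mean)
        quietly sum rej_ols
        local ro = r(mean)
        quietly sum rej_r
        local rr = r(mean)
        post `memhold' (`n') (`cval') (`mso') (`msr') (`ro') (`rr')
    }
}
postclose `memhold'
use grid, clear
list

The postfile survives each simulate's clearing of the data in memory, so the grid of results can be listed or graphed (e.g., rejection rate against sample size, one line per value of c) at the end.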
3) Hard part of the assignment. In their provocative textbook, Mostly Harmless
Econometrics, Josh Angrist and Jorn-Steffen Pischke argue (albeit in a
slightly different context) that a better approach to dealing with
heteroskedasticity than using heteroskedasticity-consistent standard
errors is to estimate BOTH the traditional OLS std error and the
heteroskedasticity-consistent one and to use the larger of the two for
inference. Modify the code above to explore this contention in a
limited -- i.e., case-specific -- Monte Carlo consistent
with the approach above.
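One possible reading of the "use the larger standard error" rule, written as a modified program -- a sketch only, with the program name heteromax and the scalar names se1/se2/semax being my own illustrative labels. For brevity it omits the error-rescaling step from the original program; you may want to keep that step to stay closer to the setup above:

* Sketch: return a p-value based on the larger of the conventional
* and robust standard errors (the "max rule").
capture program drop heteromax
program heteromax, rclass
    clear
    args obs c
    set obs `obs'
    scalar df = `obs' - 2
    g x1 = rnormal(0,15)
    g error = rnormal(0,1) + rnormal(0,1)*x1*`c'
    g y = 1 + 2*x1 + error
    reg y x1
    scalar b1  = _b[x1]
    scalar se1 = _se[x1]
    reg y x1, vce(robust)
    scalar se2 = _se[x1]
    * inference uses whichever standard error is larger
    scalar semax = max(se1, se2)
    return scalar p_max = 2 * ttail(df, abs(b1-2)/semax)
end

One could then run, say, simulate p_max=r(p_max), reps(500) : heteromax 15000 15 and compare the rejection rate under the max rule with the OLS and robust rates from the original simulation.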
Warning: Relevant for (2) and (3). As currently written the Monte Carlo above
does not track many
features of the regressions being performed. For example, it does not
report the R2. If your Monte Carlo tests an extreme
case (say R2 = 0.001 or 0.98 -- that is, one extreme or the other)
this might be more or less interesting than a more "typical" case.
Extreme cases are OK, but you should be aware of whether the
case you are looking at is "typical" or not. For example,
most simple cross-sectional regressions in many areas of
economics have R2's of between 0.10 and 0.40; few would have R2's
of, say, 0.98 (although an R2 of 0.98 might be common in time series).
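Tracking the R2 (or similar features) requires only a small change to the program and the simulate call -- a sketch, where the return name r2 is my own label:

* Inside the hetero program, after a regression, add:
*     return scalar r2 = e(r2)
* and in the simulate command line, add the corresponding item:
*     r2 = r(r2)
* so that each replication's R-squared is saved in the simulated dataset
* and can be summarized alongside the standard errors and p-values.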