Home PC
cd "C:\Users\dell\Desktop\STATA20150120\20150303BOB" 
capture log close 
log using Econ4G03-assign2.log, text replace 
capture drop _all
pause on
capture clear all
capture program drop hetero 
program hetero, rclass
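        clear
        args obs c                      // obs = sample size, c = strength of the heteroskedasticity
        set obs `obs'
        scalar df = `obs' - 2           // residual degrees of freedom (intercept plus one slope)
        g x1 = rnormal(0,15)
        * The error variance depends on x1 whenever c is nonzero
        g error = rnormal(0,1) + rnormal(0,1)*x1*`c'
        * Rescale the error to mean 0 and standard deviation 50
        egen double MeanError = mean(error)
        egen double SDError   = sd(error)
        replace error = ((error-MeanError)/SDError)*50
        g y  = 1 + 2*x1 + error         // the true slope on x1 is 2
        reg  y x1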
        
        return scalar b_ols                 = _b[x1]
        return scalar se_ols                 = _se[x1]
        
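        * Two-sided p-value for H0: slope = 2, computed by hand from the t distribution
        return scalar p_ols1         = 2 * ttail(df, (abs(_b[x1]-2))/_se[x1])
        * The same hypothesis tested with Stata's built-in Wald test
        test x1=2
        return scalar p_ols2                = r(p)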
        
        reg  y x1, vce(robust) 
        
        return scalar b_r                 = _b[x1]
        return scalar se_r                 = _se[x1]
        * Same two p-values as above, now based on the robust standard error
        return scalar p_r1       = 2 * ttail(df, (abs(_b[x1]-2))/_se[x1])
        test x1=2
        return scalar p_r2                = r(p)
end
set seed 2222
simulate b_ols=r(b_ols) se_ols=r(se_ols) p_ols1=r(p_ols1) p_ols2=r(p_ols2) ///
  b_r=r(b_r) se_r=r(se_r) p_r1=r(p_r1) p_r2=r(p_r2), ///
  saving(hetero,replace) reps(500) dots : hetero 15000 15 
g StatSig_ols = (p_ols1<0.05)
g StatSig_r   = (p_r1<0.05)
tabstat b_ols se_ols se_r p_ols1 p_ols2 p_r1 p_r2 StatSig_ols StatSig_r, stat(mean q iqr) 
pause 
sum Stat*, d
pause
kdensity p_ols1, addplot(kdensity p_r1) legend(label(1 "P-values from OLS VCE") label(2 "P-values from Hetero VCE")) lpattern(dash)
pause
histogram p_ols1, bin(22)
pause
histogram p_r1, bin(22)
pause 
Assignment 2 -- Questions 
1) Run the Monte Carlo simulation above as it is currently structured
   and discuss the file and the output. 
   There are two intermingled aspects to this: 
   i) understanding and describing the Stata code; and 
   ii) understanding and describing the substantive issue being addressed. 
   (i.e., What is this simulation trying to illustrate?)
2) Play with the structure of the Monte Carlo (focus on adjusting the two 
   arguments, but you may also want to make other adjustments). 
   What patterns can you see? One simple approach is to see how the 
   two Var-Cov estimators perform as the sample size and/or 
   the "amount" of hetero changes. You could graph the results. 
   More sophisticated questions could also be explored. Thinking about how 
   to present the results is also an interesting exercise. 
   In undertaking these simple experiments, think of the econometric 
   theory you are learning (and maybe read the text to learn more). 
   Warning: Clearly this is a question without an endpoint, 
   so you need to judge how much effort to allocate to it 
   -- do not go to an extreme in doing this.
   Think about identifying a specific research question, and then approach it 
   in a structured manner. 
   (In part, this sub-question might get you started in 
   thinking about research in econometric theory.)
3) Hard part of the assignment. In their provocative textbook, Mostly Harmless
   Econometrics, Josh Angrist and Jorn-Steffen Pischke argue (albeit in a 
   slightly different context) that a better approach to dealing with 
   heteroskedasticity than using heteroskedasticity-consistent standard 
   errors is to estimate BOTH the traditional OLS std error and the 
   heteroskedasticity-consistent one and to use the larger of the two for 
   inference. Modify the code above to explore this contention in a 
   limited Monte Carlo (i.e., one for a specific case of your choosing) 
   that is consistent with the approach above. 
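
One way to set up the comparison in question 3 is sketched below. It is only a starting point, not a full answer: the names hetero_max, se_max, p_max and StatSig_max are made up for this illustration, and the data-generating process is copied from the program above so that the results are comparable.

```stata
* Sketch only: hetero_max, se_max, p_max and StatSig_max are illustrative names.
capture program drop hetero_max
program hetero_max, rclass
        clear
        args obs c
        set obs `obs'
        g x1 = rnormal(0,15)
        g error = rnormal(0,1) + rnormal(0,1)*x1*`c'
        egen double MeanError = mean(error)
        egen double SDError   = sd(error)
        replace error = ((error-MeanError)/SDError)*50
        g y = 1 + 2*x1 + error

        tempname se_ols se_max
        quietly reg y x1
        scalar `se_ols' = _se[x1]
        quietly reg y x1, vce(robust)
        * The slope estimate is the same in both regressions; only the std error differs
        scalar `se_max' = max(`se_ols', _se[x1])

        return scalar se_max = `se_max'
        * Two-sided p-value for H0: slope = 2, using the larger of the two std errors
        return scalar p_max  = 2 * ttail(`obs' - 2, abs(_b[x1] - 2)/`se_max')
end

set seed 2222
simulate se_max=r(se_max) p_max=r(p_max), reps(500) nodots : hetero_max 15000 15
g StatSig_max = (p_max < 0.05)
sum StatSig_max
```

Because the null hypothesis (slope = 2) is true by construction, the mean of StatSig_max is the empirical rejection rate, which can be compared with the means of StatSig_ols and StatSig_r from the original run.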
Warning: Relevant for (2) and (3). As currently written, the Monte Carlo above 
   does not track many features of the regressions being performed. For 
   example, it does not list the R2. If your Monte Carlo tests an extreme 
   case (say R2 = 0.001 or 0.98 -- that is, one extreme or the other) 
   this might be more or less interesting than a more "typical" case. 
   Extreme cases are OK, but you should be aware of whether the 
   case you are looking at is "typical" or not. For example, 
   most simple cross-sectional regressions in many areas of 
   economics have R2's of between 0.10 and 0.40; few would have R2's 
   of, say, 0.98 (although an R2 of 0.98 might be common in time series). 
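
To see where a given design sits on this scale, the R2 can be returned from the program and collected by simulate like any other result. The sketch below does this in a stripped-down copy of the program; hetero_r2 and r2 are illustrative names, and reps(100) is an arbitrary choice. As a rough check, assuming the error is uncorrelated with x1, the design above has Var(2*x1) = 4*15^2 = 900 and an error variance of 50^2 = 2500, so an R2 of roughly 900/3400, or about 0.26, would be expected.

```stata
* Sketch only: hetero_r2 and r2 are illustrative names.
capture program drop hetero_r2
program hetero_r2, rclass
        clear
        args obs c
        set obs `obs'
        g x1 = rnormal(0,15)
        g error = rnormal(0,1) + rnormal(0,1)*x1*`c'
        egen double MeanError = mean(error)
        egen double SDError   = sd(error)
        replace error = ((error-MeanError)/SDError)*50
        g y = 1 + 2*x1 + error
        quietly reg y x1
        * e(r2) holds the R-squared of the regression just estimated
        return scalar r2 = e(r2)
end

set seed 2222
simulate r2=r(r2), reps(100) nodots : hetero_r2 15000 15
sum r2, d
```

The same "return scalar r2 = e(r2)" line could instead be added to the original program, with r2=r(r2) added to its simulate call.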
These are the requirements and hints for the question.
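
For the experiments suggested in question 2, one possible layout is to loop over a small grid of sample sizes and values of c and store the rejection rate of each estimator in a results file. The sketch below assumes the program hetero from the top of the file is already defined in memory; the grid values, reps(200), and the file name rej_grid are arbitrary illustrative choices.

```stata
* Sketch only: assumes program hetero (defined above) is already in memory.
* Grid values, reps(200) and the file name rej_grid are illustrative choices.
set seed 2222
tempname results
postfile `results' obs c rej_ols rej_r using rej_grid, replace
foreach n of numlist 50 500 5000 {
    foreach c of numlist 0 1 15 {
        quietly simulate p_ols1=r(p_ols1) p_r1=r(p_r1), ///
            reps(200) nodots : hetero `n' `c'
        * Share of replications that reject the true null (slope = 2) at the 5% level
        quietly count if p_ols1 < 0.05
        local rej_ols = r(N)/_N
        quietly count if p_r1 < 0.05
        local rej_r = r(N)/_N
        post `results' (`n') (`c') (`rej_ols') (`rej_r')
    }
}
postclose `results'
use rej_grid, clear
list, sepby(obs)
```

Each row of the resulting dataset is one design, so the rejection rates can be tabulated or graphed against obs and c directly.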