Help needed: how exactly to use the teffects psmatch command in Stata

2016-05-03

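I read the help file in Stata, which says the basic syntax of teffects psmatch is:
teffects psmatch (ovar) (tvar tmvarlist [, tmodel]) [if] [in] [weight] [, stat options]

I don't really understand what this means, and it looks very different from the psmatch2 syntax. Could someone explain what goes into each part after teffects psmatch? Many thanks!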

All replies

caimiao0714

2016-5-3 12:05:54

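liuqianrui111 posted on 2016-5-3 11:39
I read the help file in Stata, which says the basic syntax of teffects psmatch is:
teffects psmatch (ovar) (tvar tmvarlist ...

The usual form is: teffects psmatch (y) (t x1 x2, probit), atet nn(#) caliper(#)

If you omit probit, the treatment model defaults to logit. atet reports the average treatment effect on the treated (ATET). In nn(#), # is the number of matches per observation (1-to-# matching). caliper(#) restricts matching to within a caliper, where # is the maximum allowed difference in propensity scores. The advantage of teffects psmatch over the older psmatch2 is that it provides the Abadie & Imbens (2012) robust standard errors; otherwise the two are much the same.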
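To make the roles of nn(#) and caliper(#) concrete, here is a minimal sketch in Python (not Stata, and not how teffects psmatch is implemented internally). It assumes the propensity scores are already estimated, matches with replacement, and simply skips treated units with no control inside the caliper, which is a simplification. The function names and the toy data are hypothetical.

```python
def match_neighbors(ps, treated, nn=1, caliper=None):
    """Map each treated index to its nn closest control indices by propensity score."""
    controls = [j for j, t in enumerate(treated) if t == 0]
    matches = {}
    for i, t in enumerate(treated):
        if t != 1:
            continue
        # Rank controls by absolute propensity-score distance to unit i.
        ranked = sorted(controls, key=lambda j: abs(ps[i] - ps[j]))
        if caliper is not None:
            # Keep only controls whose score lies within the caliper.
            ranked = [j for j in ranked if abs(ps[i] - ps[j]) <= caliper]
        matches[i] = ranked[:nn]
    return matches


def atet(y, matches):
    """Mean over matched treated units of y minus the mean y of their matches."""
    effects = [y[i] - sum(y[j] for j in js) / len(js)
               for i, js in matches.items() if js]
    return sum(effects) / len(effects)


# Hypothetical toy data: propensity scores, treatment flags, outcomes.
ps = [0.20, 0.25, 0.60, 0.62, 0.90, 0.58]
treated = [0, 0, 0, 1, 1, 1]
y = [1.0, 1.5, 2.0, 3.5, 5.0, 3.0]

one_to_one = match_neighbors(ps, treated, nn=1)
with_caliper = match_neighbors(ps, treated, nn=1, caliper=0.1)
print(one_to_one, atet(y, one_to_one))
print(with_caliper, atet(y, with_caliper))
```

In this toy data the treated unit with score 0.90 has no control within 0.1, so the caliper run estimates the effect from the other two treated units only.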

liuqianrui111

2016-5-3 16:04:24

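caimiao0714 posted on 2016-5-3 12:05
The usual form is: teffects psmatch (y) (t x1 x2, probit), atet nn(#) caliper(#)

If you omit probit, the treatment model defaults to logit. atet is ...

Thank you very much, that is very detailed! Is t simply a 0/1 dummy variable, with 1 for the treated group and 0 for the control group, the same as in psmatch2?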

caimiao0714

2016-5-3 22:15:38

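liuqianrui111 posted on 2016-5-3 16:04
Thank you very much, that is very detailed! Is t simply a 0/1 dummy variable, with 1 for the treated group and 0 for the control group, the same as in psmatch2?

You're welcome. Yes, exactly as you describe: t is the treatment variable, that is, the grouping you define yourself.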

arlionn

2016-5-5 09:55:06

http://www.statalist.org/forums/forum/general-stata-discussion/general/1145219-psmatch2-graph-for-propensity-score-matching/page2

http://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm

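This post explains how to use Stata's official command together with the user-written command psmatch2 when doing PSM.
Most importantly, it shows how to extract the matched sample, which makes a combined DID + PSM analysis feasible in practice.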

http://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm
Propensity Score Matching in Stata using teffects

For many years, the standard tool for propensity score matching in Stata has been the psmatch2 command, written by Edwin Leuven and Barbara Sianesi. However, Stata 13 introduced a new teffects command for estimating treatment effects in a variety of ways, including propensity score matching. The teffects psmatch command has one very important advantage over psmatch2: it takes into account the fact that propensity scores are estimated rather than known when calculating standard errors. This often turns out to make a significant difference, and sometimes in surprising ways. We thus strongly recommend switching from psmatch2 to teffects psmatch, and this article will help you make the transition.
An Example of Propensity Score Matching

Run the following command in Stata to load an example data set:
use http://ssc.wisc.edu/sscc/pubs/files/psm
It consists of four variables: a treatment indicator t, covariates x1 and x2, and an outcome y. This is constructed data, and the effect of the treatment is in fact a one unit increase in y. However, the probability of treatment is positively correlated with x1 and x2, and both x1 and x2 are positively correlated with y. Thus simply comparing the mean value of y for the treated and untreated groups badly overestimates the effect of treatment:
ttest y, by(t)
(Regressing y on t, x1, and x2 will give you a pretty good picture of the situation.)
The psmatch2 command will give you a much better estimate of the treatment effect:
psmatch2 t x1 x2, out(y)

You can carry out the same estimation with teffects. The basic syntax of the teffects command when used for propensity score matching is:
teffects psmatch (outcome) (treatment covariates)
In this case the basic command would be:
teffects psmatch (y) (t x1 x2)
However, the default behavior of teffects is not the same as psmatch2 so we'll need to use some options to get the same results. First, psmatch2 by default reports the average treatment effect on the treated (which it refers to as ATT). The teffects command by default reports the average treatment effect (ATE) but will calculate the average treatment effect on the treated (which it refers to as ATET) if given the atet option. Second, psmatch2 by default uses a probit model for the probability of treatment. The teffects command uses a logit model by default, but will use probit if the probit option is applied to the treatment equation. So to run the same model using teffects type:
teffects psmatch (y) (t x1 x2, probit), atet

The average treatment effect on the treated is identical, other than being rounded at a different place. But note that teffects reports a very different standard error (we'll discuss why that is shortly), plus a Z-statistic, p-value, and 95% confidence interval rather than just a T-statistic.
Running teffects with the default options gives the following:
teffects psmatch (y) (t x1 x2)

This is equivalent to:
psmatch2 t x1 x2, out(y) logit ate
----------------------------------------------------------------------------------------
        Variable     Sample |     Treated     Controls   Difference         S.E.  T-stat
----------------------------+-----------------------------------------------------------
               y  Unmatched |   1.8910736  -.423243358   2.31431696   .109094342   21.21
                        ATT |   1.8910736   .930722886   .960350715   .168252917    5.71
                        ATU | -.423243358   .625587554   1.04883091            .       .
                        ATE |                            1.01936701            .       .
----------------------------+-----------------------------------------------------------

The ATE from this model is very similar to the ATT/ATET from the previous model. But note that psmatch2 is reporting a somewhat different ATT in this model. The teffects command reports the same ATET if asked:
teffects psmatch (y) (t x1 x2), atet

Matching With Multiple Neighbors

By default teffects psmatch matches each observation with one other observation. You can change this with the nneighbor() (or just nn()) option. For example, you could match each observation with its three nearest neighbors with:
teffects psmatch (y) (t x1 x2), nn(3)

Start with a clean slate by typing:
use http://ssc.wisc.edu/sscc/pubs/files/psm, clear
The gen() option tells teffects psmatch to create a new variable (or variables). For each observation, this new variable will contain the number of the observation that observation was matched with. If there are ties or you told teffects psmatch to use multiple neighbors, then gen() will need to create multiple variables. Thus you supply the stem of the variable name, and teffects psmatch will add suffixes as needed.
teffects psmatch (y) (t x1 x2), gen(match)
In this case each observation is only matched with one other, so gen(match) only creates match1. Referring to the example output, the match of observation 1 is observation 467 (which is why those two are listed).
Note that these observation numbers are only valid in the current sort order, so make sure you can recreate that order if needed. If necessary, run:
gen ob=_n
and then:
sort ob
to restore the current sort order.
The predict command with the ps option creates two variables containing the propensity scores, or that observation's predicted probability of being in either the control group or the treated group:
predict ps0 ps1, ps
Here ps0 is the predicted probability of being in the control group (t=0) and ps1 is the predicted probability of being in the treated group (t=1). Observations 1 and 467 were matched because their propensity scores are very similar.
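As an illustration of what ps0 and ps1 contain under the default logit treatment model, the following Python sketch (not Stata) computes both probabilities from a linear index. The coefficient values are hypothetical placeholders, not estimates from the psm data set.

```python
import math


def propensity(x1, x2, b0, b1, b2):
    """Return (ps0, ps1) for a logit model with index b0 + b1*x1 + b2*x2."""
    xb = b0 + b1 * x1 + b2 * x2
    ps1 = 1.0 / (1.0 + math.exp(-xb))  # predicted P(t = 1 | x1, x2)
    ps0 = 1.0 - ps1                    # predicted P(t = 0 | x1, x2)
    return ps0, ps1


# Hypothetical coefficients, for illustration only.
ps0, ps1 = propensity(x1=0.5, x2=-0.2, b0=0.1, b1=0.8, b2=0.6)
print(ps0, ps1)
```

The two values always sum to one, so matching on ps1 and matching on ps0 order the observations in the same way.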
The po option creates variables containing the potential outcomes for each observation:
predict y0 y1, po
Because observation 1 is in the control group, y0 contains its observed value of y. y1 is the observed value of y for observation 1's match, observation 467. The propensity score matching estimator assumes that if observation 1 had been in the treated group its value of y would have been that of the observation in the treated group most similar to it (where "similarity" is measured by the difference in their propensity scores).
Observation 467 is in the treated group, so its value for y1 is its observed value of y while its value for y0 is the observed value of y for its match, observation 781.
Running the predict command with no options gives the treatment effect itself:
predict te
The treatment effect is simply the difference between y1 and y0. You could calculate the ATE yourself (but emphatically not its standard error) with:
sum te
and the ATET with:
sum te if t
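The logic behind y0, y1, and te can be sketched in a few lines of Python (not Stata). The sketch assumes one match per observation, stored as a 0-based index into the opposite group, whereas match1 in Stata holds 1-based observation numbers. The data are hypothetical.

```python
def potential_outcomes(y, t, match):
    """Observed y fills the unit's own arm; the match's y fills the other arm."""
    y0, y1 = [], []
    for i in range(len(y)):
        if t[i] == 1:
            y1.append(y[i])          # observed outcome under treatment
            y0.append(y[match[i]])   # imputed from the matched control
        else:
            y0.append(y[i])          # observed outcome under control
            y1.append(y[match[i]])   # imputed from the matched treated unit
    return y0, y1


# Hypothetical toy data: units 0-1 are controls, units 2-3 are treated.
y = [1.0, 2.0, 3.0, 4.5]
t = [0, 0, 1, 1]
match = [2, 2, 0, 1]

y0, y1 = potential_outcomes(y, t, match)
te = [a - b for a, b in zip(y1, y0)]
ate = sum(te) / len(te)
treated_te = [e for e, flag in zip(te, t) if flag == 1]
atet = sum(treated_te) / len(treated_te)
print(te, ate, atet)
```

As in the article, this reproduces only the point estimates; it says nothing about their standard errors.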

Regression on the "Matched Sample"
We will discuss how to run regressions on a matched sample because it remains a popular technique, but we cannot recommend it.
psmatch2 makes it easy by creating a _weight variable automatically. For observations in the treated group, _weight is 1. For observations in the control group it is the number of observations from the treated group for which the observation is a match. If the observation is not a match, _weight is missing. _weight thus acts as a frequency weight (fweight) and can be used with Stata's standard weighting syntax. For example (starting with a clean slate again):
use http://ssc.wisc.edu/sscc/pubs/files/psm, clear
psmatch2 t x1 x2, out(y) logit
reg y x1 x2 t [fweight=_weight]
Observations with a missing value for _weight are omitted from the regression, so it is automatically limited to the matched sample. Again, keep in mind that the standard errors given by the reg command are incorrect because they do not take into account the matching stage.
teffects psmatch does not create a _weight variable, but it is possible to create one based on the match1 variable. Here is example code, with comments:
gen ob=_n //store the observation numbers for future use
save fulldata,replace // save the complete data set

keep if t // keep just the treated group
keep match1 // keep just the match1 variable (the observation numbers of their matches)
bysort match1: gen weight=_N // count how many times each control observation is a match
by match1: keep if _n==1 // keep just one row per control observation
ren match1 ob //rename for merging purposes
merge 1:m ob using fulldata // merge back into the full data
replace weight=1 if t // set weight to 1 for treated observations

The resulting weight variable will be identical to the _weight variable created by psmatch2, as can be verified with:
assert weight==_weight
It is used in the same way and will give exactly the same results:
reg y x1 x2 t [fweight=weight]
Obviously this is a good bit more work than using psmatch2.
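The counting step in that weight construction can be sketched in Python (not Stata) as follows. It assumes one match per observation with 0-based indices, and uses None where Stata would store a missing value. The function name and data are hypothetical.

```python
from collections import Counter


def matching_weights(t, match1):
    """Weight 1 for treated units; for controls, the number of treated units matched to them."""
    # Count how often each control index appears as the match of a treated unit.
    counts = Counter(match1[i] for i in range(len(t)) if t[i] == 1)
    weights = []
    for i in range(len(t)):
        if t[i] == 1:
            weights.append(1)
        else:
            # None marks a control that no treated unit was matched to.
            weights.append(counts.get(i))
    return weights


# Hypothetical toy data: units 0-2 are controls, units 3-5 are treated.
t = [0, 0, 0, 1, 1, 1]
match1 = [3, 4, 5, 0, 0, 2]

weights = matching_weights(t, match1)
print(weights)
```

Only the matches of treated units are counted, which mirrors the `keep if t` step in the Stata code.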

Other Methods of Estimating Treatment Effects
While propensity score matching is the most common method of estimating treatment effects at the SSCC, teffects also implements Regression Adjustment (teffects ra), Inverse Probability Weighting (teffects ipw), Augmented Inverse Probability Weighting (teffects aipw), Inverse Probability Weighted Regression Adjustment (teffects ipwra), and Nearest Neighbor Matching (teffects nnmatch). The syntax is similar, though it varies whether you need to specify variables for the outcome model, the treatment model, or both:
teffects ra (y x1 x2) (t)
teffects ipw (y) (t x1 x2)
teffects aipw (y x1 x2) (t x1 x2)
teffects ipwra (y x1 x2) (t x1 x2)
teffects nnmatch (y x1 x2) (t)
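For comparison with matching, the idea behind inverse probability weighting can be sketched in Python (not Stata). This uses normalized weights and takes the propensity scores as given; whether it reproduces the internals of teffects ipw exactly is not established here. The function name and data are hypothetical.

```python
def ipw_ate(y, t, p):
    """Normalized inverse-probability-weighted difference in mean outcomes."""
    # Treated units get weight 1/p, controls get weight 1/(1 - p).
    w1 = [1.0 / p[i] if t[i] == 1 else 0.0 for i in range(len(y))]
    w0 = [1.0 / (1.0 - p[i]) if t[i] == 0 else 0.0 for i in range(len(y))]
    mean1 = sum(w * v for w, v in zip(w1, y)) / sum(w1)
    mean0 = sum(w * v for w, v in zip(w0, y)) / sum(w0)
    return mean1 - mean0


# Hypothetical toy data: outcomes, treatment flags, propensity scores.
y = [1.0, 2.0, 3.0, 5.0]
t = [0, 0, 1, 1]
p = [0.2, 0.5, 0.5, 0.8]

ate = ipw_ate(y, t, p)
print(ate)
```

When every propensity score equals 0.5, the weights are constant and the estimate reduces to the plain difference in group means.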
Complete Example Code

pany198634

2016-10-7 16:57:27

Excellent!

墨鱼卷卷

2016-10-14 11:22:40

arlionn posted on 2016-5-5 09:55
http://www.statalist.org/forums/forum/general-stata-discussion/general/1145219-psmatch2-graph-for-pr ...

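Hello Professor Lian,
   My thesis deals with exactly this situation, a treatment variable with multiple levels (underweight, normal weight, overweight, obese), and needs the same kind of analysis as the paper "Gender differences in the impact of weight status on academic performance: Evidence from adolescents in Taiwan".
   After studying teffects ra and teffects ipw I made some attempts. My current questions are:
(1) psgraph can only handle a treatment that takes the values 0 and 1;
(2) the teffects ra results are the same whether or not I first run the psmatch2 rbmi14... step. Are the two steps necessarily connected, or is running teffects ra alone enough?
(3) after psmatch2, _treated takes the values 1, 2, 3, whereas rbmi14 originally takes the values 1-4;
(4) I cannot draw the twoway kdensity plot.
   I would be grateful for any guidance, Professor Lian. Thank you!

The commands are as follows: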
   . qui mlogit rbmi14 i.FEMALE ES i.goal SES2 i.FGoal EduFee i.hukou2 i.SinP i.pre2 i.stay i.SchRank i.leader2 [pw=SWeight]

.
. margins [pw=SWeight], dydx(_all)

Average marginal effects                         Number of obs =    1060
Model VCE : Robust

Expression : Pr(rbmi14==1), predict()
dy/dx w.r.t. : 1.FEMALE ES 2.goal 3.goal 4.goal SES2 2.FGoal 3.FGoal 4.FGoal 5.FGoal EduFee 1.hukou2 2.SinP 1.pre2 2.stay 3.stay 4.stay
            5.stay 2.SchRank 3.SchRank 1.leader2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.FEMALE |   .0920773   .0301992     3.05   0.002     .0328879    .1512667
          ES |   .0006211   .0002575     2.41   0.016     .0001165    .0011258
             |
        goal |
          2  |    .102114   .0328386     3.11   0.002     .0377514    .1664765
          3  |   .0218842   .0553197     0.40   0.692    -.0865404    .1303089
          4  |   .0818112   .0692772     1.18   0.238    -.0539696     .217592
             |
        SES2 |  -.0334425   .0210781    -1.59   0.113    -.0747547    .0078697
             |
       FGoal |
          2  |   .0355978    .103015     0.35   0.730    -.1663079    .2375035
          3  |  -.0214249   .1003652    -0.21   0.831    -.2181371    .1752872
          4  |  -.0251212   .1109837    -0.23   0.821    -.2426451    .1924028
          5  |   .0719878   .1113616     0.65   0.518    -.1462769    .2902525
             |
      EduFee |   .0012621   .0049114     0.26   0.797    -.0083641    .0108884
    1.hukou2 |   .0407316   .0549977     0.74   0.459    -.0670619    .1485252
      2.SinP |  -.0182909   .0585819    -0.31   0.755    -.1331093    .0965276
      1.pre2 |  -.0434235   .0317282    -1.37   0.171    -.1056096    .0187625
             |
        stay |
          2  |  -.0170712   .0536095    -0.32   0.750     -.122144    .0880015
          3  |  -.0515278   .0920063    -0.56   0.575    -.2318569    .1288013
          4  |  -.0529491   .0614643    -0.86   0.389     -.173417    .0675187
          5  |  -.0099532   .0632434    -0.16   0.875     -.133908    .1140015
             |
     SchRank |
          2  |    .024761   .0414867     0.60   0.551    -.0565514    .1060734
          3  |  -.0622266   .0757181    -0.82   0.411    -.2106313    .0861782
             |
   1.leader2 |    .049643   .0292713     1.70   0.090    -.0077277    .1070136
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

.
. predict double score
(option pr assumed; predicted probabilities)
(841 missing values generated)

.
. psgraph, treated(rbmi14) pscore(score) bin(50) saving(psm2, replace)
Error: treatment indicator variable should take on values 0, 1 (and missing)
r(198);

. set seed 123456

.
. gen u=uniform()

.
. sort u

.
. psmatch2 rbmi14, outcome(FinalM) pscore(score) n(1) caliper(.01) common
----------------------------------------------------------------------------------------
        Variable     Sample |     Treated     Controls   Difference         S.E.  T-stat
----------------------------+-----------------------------------------------------------
          FinalM  Unmatched |  55.4276961            .   1.88976341   1.28688233    1.47
                        ATT |           .            .            .            .       .
----------------------------+-----------------------------------------------------------

 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
   Treated |       816          0 |       816
         2 |         0         32 |        32
         3 |         0        181 |       181
-----------+----------------------+----------
     Total |       816        213 |     1,029

. sum _treated

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    _treated |       1029    1.382896    .7672396          1          3

. teffects ra (FinalM  FEMALE ES goal SES2 FGoal EduFee hukou2 SinP pre2 stay SchType ) (rbmi14) , control(1)

Iteration 0: EE criterion =  1.537e-26
Iteration 1: EE criterion =  4.019e-28

Treatment-effects estimation                   Number of obs    =    1052
Estimator    : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
      FinalM |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
      rbmi14 |
   (2 vs 1)  |   13.37879   3.555003     3.76   0.000     6.411107    20.34646
   (3 vs 1)  |   3.194084   1.898452     1.68   0.092    -.5268132    6.914982
-------------+----------------------------------------------------------------
POmean       |
      rbmi14 |
          1  |   55.22743   1.022599    54.01   0.000     53.22317    57.23169
------------------------------------------------------------------------------