Help: how to export matched data from nearest neighbor matching

2013-09-12

Help needed:
When I run nearest neighbor matching in Stata 13 (or the user-written nnmatch package in Stata), the output reports the average treatment effect, the treatment effect on the treated, and the treatment effect on the controls. But how do I export the matched data?

Do I really have to re-program nnmatch myself, test it, and then export the data?

All replies

蓝色

2013-9-13 05:21:47

Nearest neighbor matching estimation for average treatment effects

      nnmatch depvar treatvar varlist_nnmatch [weight] [if exp] [in range] [, tc({ate | att | atc}) m(#) metric(maha
            | matname) exact(varlist_ex) biasadj(bias | varlist_adj) robusth(#) population level(#) keep(filename)
            replace]

pweights are allowed. See help weights for more information about weights.  See section 5.2 (Abadie et al. 2004)
for information about how nnmatch handles weights.

depvar, varlist_nnmatch, and elements of biasadj(varlist_adj) and exact(varlist_ex) must be numeric variables.
treatvar must be a {0,1} variable.

Description

nnmatch estimates the average treatment effect on depvar by comparing outcomes between treated and control
observations (as defined by treatvar), using nearest neighbor matching across the variables defined in
varlist_nnmatch.  nnmatch can estimate the treatment effect for the treated observations, the controls, or the
sample as a whole.  The program pairs observations to the closest m matches in the opposite treatment group to
provide an estimate of the counterfactual treatment outcome.  The program allows for matching over a
multi-dimensional set of variables (varlist_nnmatch), giving options for the weighting matrix to be used in
determining the optimal matches.  It also allows exact matching (or as close as possible) on a subset of
variables. In addition, the program allows for bias correction of the treatment effect and estimation of either
the sample or population variance, with or without assuming a constant treatment effect (homoskedasticity).
Finally, it allows observations to be used as a match more than once, thus making the order of matching irrelevant.
See Abadie et al. (2004) for further detail.

Options

tc({ate | att | atc}) specifies which treatment effect is to be estimated:

      ate: the average treatment effect,
      att: the average treatment effect for the treated, or
      atc: the average treatment effect for the controls.

      If no option is specified, the average treatment effect, ate, is assumed.  In this case, all observations are
      matched to their nearest m neighbors of the opposite treatment group.  In estimating the att or atc, only the
      treated or controls, respectively, are matched.

m(#) specifies the number of matches to be made per observation. If two observations of the opposite treatment
      group are equally close to that being matched, both will be used.  Thus, the number of matches per observation
      will be greater than or equal to m.  If the average treatment effect is selected, m must be less than or equal
      to the smaller of N0 and N1, where N0 is the number of control observations in the dataset, and N1 is the
      number of treatment observations.  If tc(att) is selected, m need only be less than or equal to N0; if tc(atc)
      is selected, m must be less than or equal to N1.  If m(#) is not specified, 1 is assumed.
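
The tie rule described for m(#) can be illustrated with a short Python sketch. This is not nnmatch's own code: the function name `nearest_matches`, the single covariate, and the sample values are all assumptions made for illustration. It only shows why the number of matches per observation can exceed m.

```python
def nearest_matches(x_i, candidates, m=1):
    """Return indices of the m closest candidates, keeping every tie at the cutoff."""
    if m > len(candidates):
        # Mirrors the constraint that m cannot exceed the size of the opposite group.
        raise ValueError("m must not exceed the number of candidates")
    dists = sorted((abs(x_i - x_j), j) for j, x_j in enumerate(candidates))
    cutoff = dists[m - 1][0]  # distance of the m-th closest candidate
    return [j for d, j in dists if d <= cutoff]

# Hypothetical covariate values for four control observations.
controls = [1.0, 2.0, 4.0, 7.0]

# The observation being matched sits at 3.0, so controls 1 and 2 tie at distance 1.0.
matches = nearest_matches(3.0, controls, m=1)
print(matches)  # both tied controls are kept even though m = 1
```

With m=1 the call returns two indices, because both candidates are equally close to the observation being matched.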

metric(maha | matname) specifies the weighting matrix to be used when k, the number of elements of
      varlist_nnmatch, is greater than 1.  The metric() option specifies the relative weight to be placed on each
      variable in varlist_nnmatch in defining nearest neighbor matches.  Two options are available:

      (1) metric(maha) specifies the Mahalanobis metric, the inverse of the sample variance-covariance matrix of the
      k variables in varlist_nnmatch.
      (2) metric(matname) allows for a user-defined weight matrix matname, where matname is an already-specified
      k-dimensional, symmetric, and positive semi-definite matrix.

      If no option is specified, the default is to use the k*k diagonal matrix of the inverse sample standard errors
      of the k variables in varlist_nnmatch.
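
To see why the weight matrix matters, the following Python sketch compares an unscaled (identity) weight matrix with a diagonal one. It is an illustration under stated assumptions, not nnmatch's implementation: the helper names and data are hypothetical, and the sketch puts the inverse sample variance of each covariate on the diagonal, which is only one reading of the default described above.

```python
import math
from statistics import variance

def weighted_distance(x, z, w):
    """Return sqrt((x - z)' W (x - z)) for a k x k weight matrix W (nested lists)."""
    d = [a - b for a, b in zip(x, z)]
    k = len(d)
    return math.sqrt(sum(d[r] * w[r][c] * d[c] for r in range(k) for c in range(k)))

def diagonal_weights(sample):
    """Diagonal W holding 1 / (sample variance) per covariate (assumption of this sketch)."""
    k = len(sample[0])
    inv_var = [1.0 / variance([row[c] for row in sample]) for c in range(k)]
    return [[inv_var[r] if r == c else 0.0 for c in range(k)] for r in range(k)]

# Hypothetical covariates: (age, income).
treated = (30.0, 50000.0)
controls = [(31.0, 52000.0), (45.0, 50100.0)]
sample = [treated] + controls + [(50.0, 40000.0)]

identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = diagonal_weights(sample)

def nearest(w):
    """Index of the control closest to the treated observation under weights w."""
    return min(range(len(controls)), key=lambda j: weighted_distance(treated, controls[j], w))

nearest_identity = nearest(identity)
nearest_scaled = nearest(scaled)
print(nearest_identity, nearest_scaled)
```

Without scaling, income dominates the distance and the control with a very different age is chosen; once each covariate is scaled by its spread, the control that is close on both covariates wins.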

exact(varlist_ex) allows you to specify exact matching (or as exact as possible) on one or more variables.  The
      exact-matching variables need not overlap with the elements of varlist_nnmatch.  In practice, however, the
      exact() option adds these variables to the original k*k varlist_nnmatch matrix, but in the weight matrix
      multiplies each exact element by 1,000 relative to the weights placed on the elements of varlist_nnmatch.
      (Regardless of the metric() option chosen for the varlist_nnmatch variables, the exact-match variables are
      normalized via the default option -- the inverse sample standard errors.) Because for each matched observation there
      may not exist a member of the opposite treatment group with equal value, matching may not be exact across the
      full dataset.  The output lists the percentage of matches (across the paired observations, greater than or
      equal to N*m in number) that match exactly.

biasadj(bias | varlist_adj) specifies that the bias-corrected matching estimator be used.  The simple matching
      estimator estimates the average treatment effect by calculating the average over the N observations being
      matched of the difference between the depvar outcome for observation i and the average outcomes for its m
      matches in the opposite treatment group. However, the simple matching estimator will be biased if matching is
      not exact.  This option regression-adjusts the results using the original matching variable(s),
      varlist_nnmatch (if biasadj(bias) is selected), or a newly-specified set of variables, varlist_adj (if
      biasadj(varlist_adj) is chosen).
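
The simple matching estimator described above can be sketched in Python for the att case. This covers only the simple (unadjusted) estimator, not the regression adjustment that biasadj() adds, and it is not nnmatch's own code: the function name, the single covariate, and the data are assumptions made for illustration.

```python
def simple_matching_att(y, t, x, m=1):
    """Average, over treated units, of y_i minus the mean outcome of its m nearest controls."""
    controls = [j for j in range(len(y)) if t[j] == 0]
    effects = []
    for i in range(len(y)):
        if t[i] != 1:
            continue  # for the att, only treated observations are matched
        ranked = sorted(abs(x[i] - x[j]) for j in controls)
        cutoff = ranked[m - 1]  # distance of the m-th closest control
        matched = [j for j in controls if abs(x[i] - x[j]) <= cutoff]  # ties are kept
        effects.append(y[i] - sum(y[j] for j in matched) / len(matched))
    return sum(effects) / len(effects)

# Hypothetical data: outcome, treatment indicator, one matching covariate.
y = [10.0, 12.0, 5.0, 7.0, 9.0]
t = [1, 1, 0, 0, 0]
x = [1.0, 3.0, 1.1, 2.0, 3.2]

att = simple_matching_att(y, t, x, m=1)
print(att)
```

Here the first treated unit is matched to the control at x = 1.1 and the second to the control at x = 3.2, so the estimate is the mean of the two outcome differences.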

robusth(#) specifies that nnmatch estimate heteroskedasticity-consistent standard errors using # matches in the
      second matching stage (across observations of the same treatment level).  The program does this by conducting
      a second matching process (again across the elements of varlist_nnmatch), this time matching observations in
      the same treatment group, to compare variability in outcomes (depvar) for observations with approximately the
      same varlist_nnmatch values.  robusth(#) allows the user to choose how many matches are used in this process.
      If robusth() is not selected or # equals zero, homoskedastic errors are estimated.

population specifies the calculation of the population variance rather than the sample variance.  If population is
      not selected, sample variance is assumed.

level(#) specifies the confidence level, as a percentage, for confidence intervals.  The default is level(95) or
      as set by set level.

keep(filename) saves the temporary matching dataset in the file filename.dta. In the estimation process, nnmatch
      creates a temporary dataset holding, for each observation i being matched, a new observation containing the
      values of i's outcome variable (depvar) and matching variable(s) (varlist_nnmatch), and the outcome and
      varlist_nnmatch values for its m closest matches.  Thus, the new dataset will hold at least N*m observations.
      If biasadj(varlist_adj) or exact(varlist_ex) are selected, the temporary dataset will also hold these values
      for each observation i and its match(es) j.  keep(filename) allows you to save this temporary dataset.

      If keep(filename) is selected, each observation of filename.dta will hold the following variables:

      t:       The treatment group indicator, treatvar, for the observation being matched, i.
      y:       The observed outcome variable, depvar(i).
      x:       The varlist_nnmatch values for observation i.
      id:      The identification code for the observation being matched, i.
               (When the command nnmatch is given, the program creates a temporary variable, id =
               {1,2,...N}, based on the original sort order.)
      index:   The identification code for j, the match for observation i.
      dist:    The estimated distance between observation i and its match j, based on the
               varlist_nnmatch values of each and the selected weight matrix.
      k_m:     The number of times observation i is itself used as a match for any observation l of the
               opposite treatment group, each time weighted by the total number of matches for the
               given observation l.
               (For example, if observation i is one of three matches for observation l, it receives a
               value of 1/3 for that match.  k_m(i) is the sum, across all observations l, of this
               value.  Thus the sum of k_m across all observations i will equal N (or N0 or N1, if the
               atc or att, respectively, are estimated).  Note that this value refers to i's use as a
               match, not to its matches j, so the value of k_m is equal across all observations in the
               temporary dataset that pertain to the matching of observation i.)
      w_id:    Weight for observation i, if weights are selected.
      w_index: Weight of observation j, the match for observation i, if weights are selected.
      `y'_0:   The inferred depvar value if observation i were in the control group.
               (If observation i is in fact a control observation, `y'_0 = `y'(i).  If i is a treated
               observation, `y'_0 = `y'(j).)
      `y'_1:   Inferred depvar value if i were in the treated group.
      `x'_0m:  Values of varlist_nnmatch for i's `control' observation.  Namely, if i is a control
               observation, `x'_0m = x_i for each element x of varlist_nnmatch.  If i is a treatment
               observation, `x'_0m will equal x_j.
      `x'_1m:  Values of varlist_nnmatch for i's `treatment' observation.
      `b'_0b:  Values of the bias-adjustment variables (if biasadj(varlist_adj) is selected) for i's
               `control' observation, where `b' represents each element of the bias-adjustment
               variables.
      `b'_1b:  Bias-adjustment variables for i's `treatment' observation.
      `e'_0e:  Values of the exact-matching variables (if exact(varlist_ex) is selected) for i's
               `control' observation, where `e' represents each element of the exact-matching
               variables.
      `e'_1e:  Exact-matching variables for i's `treatment' observation.

replace replaces the dataset specified in keep(filename) if it already exists.
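
The keep(filename) option is the part of this help text that bears on exporting matched data. To make the layout of the saved file concrete, here is a Python sketch that builds a comparable long-format table: one row per pair (i, j), with id, index, dist, and k_m as described above. It is a mock-up under stated assumptions (one covariate, the ate case, hypothetical data and function name), not a reproduction of nnmatch's output.

```python
def matched_rows(y, t, x, m=1):
    """Build one row per (observation i, match j) pair, plus k_m for every observation."""
    n = len(y)
    pairs = []  # (i, j, dist) with 0-based positions
    for i in range(n):
        opposite = [j for j in range(n) if t[j] != t[i]]
        cutoff = sorted(abs(x[i] - x[j]) for j in opposite)[m - 1]
        for j in opposite:
            d = abs(x[i] - x[j])
            if d <= cutoff:  # ties at the cutoff are all kept
                pairs.append((i, j, d))
    # Number of matches found for each observation being matched.
    n_matches = [sum(1 for p in pairs if p[0] == i) for i in range(n)]
    # k_m: each use of j as a match for i counts 1 / (number of matches of i).
    k_m = [0.0] * n
    for i, j, _ in pairs:
        k_m[j] += 1.0 / n_matches[i]
    rows = [
        {"id": i + 1, "index": j + 1, "t": t[i], "y": y[i], "x": x[i], "dist": d, "k_m": k_m[i]}
        for i, j, d in pairs
    ]
    return rows, k_m

# Hypothetical data; id runs 1..N in the original sort order.
y = [10.0, 12.0, 5.0, 7.0, 9.0]
t = [1, 1, 0, 0, 0]
x = [1.0, 3.0, 1.1, 2.0, 3.2]

rows, k_m = matched_rows(y, t, x, m=1)
for row in rows:
    print(row)
print(sum(k_m))
```

In this example the control at x = 2.0 is equally far from both treated units, so it contributes two rows and the table holds more than N*m rows, while k_m still sums to N across observations.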

gnuliutingting

2013-9-13 07:12:19

Thanks, moderator. I'll go try it right away!

donwayho

2016-7-4 15:44:08

蓝色 posted on 2013-9-13 05:21
Nearest neighbor matching estimation for average treatment effects

nnmatch depvar treatva ...

Thanks a lot
