[Help] How to use the hausman command

2008-06-18

    sort code year    (sort)
    tis year          (the time variable is year)
    iis code          (the unit identifier is code)
    xtreg y x x2, fe  (suppose x is the variable that needs to be instrumented)
    est store fixed   (here "fixed" is really just a variable name; anything will do)
    xtreg y x x2, re
    hausman fixed

1) What does the "est store fixed" statement actually do?

2) I tried it in Stata and found that after entering "est store fixed" a variable _est_fixed is generated; in the Data Editor this variable is a column of 1s. (How is the value 1 for _est_fixed computed?)

3) The hausman command takes two arguments. What does each of them mean, and when can it be given only one?

Many thanks! (I have only just started learning this, so my questions are rather basic.)

[This post was edited by the author on 2008-6-18 15:22:24]

All replies

zyj_azhu

2008-6-21 09:52:00

Could any of the more experienced members explain this? I really can't make sense of it!

蓝色

2008-6-21 16:47:00

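1. est store saves the estimation results. The "fixed" after it is a name you choose yourself; here it refers to the results of the fixed-effects model.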

2. Look up hausman and est in Stata's help and you will find an explanation of each command.
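
To make point 1 concrete, and to touch on the _est_fixed question: as far as I know, estimates store also adds a variable named _est_ followed by the chosen name, which marks the estimation sample (1 if the observation was used in the fit, 0 otherwise). A column of 1s would then simply mean that every observation entered the regression. Below is a minimal sketch. It assumes the nlswork example dataset loaded with webuse (a stand-in, not the original poster's data) and uses xtset, which newer Stata versions provide in place of iis/tis.

```stata
* Assumption: nlswork is a stand-in panel dataset, not the poster's data
webuse nlswork, clear
xtset idcode year

* Fit the fixed-effects model and store it under the name "fixed"
xtreg ln_wage age tenure, fe
estimates store fixed

* List what is stored, then inspect the sample marker that was created
estimates dir
tabulate _est_fixed, missing

* Count observations where the marker disagrees with e(sample)
count if _est_fixed != e(sample)
```

Observations with missing values in any model variable are dropped from the fit, so in this dataset the marker should contain some 0s as well as 1s.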

蓝色

2008-6-21 16:48:00

help estimates dialogs: store change restore replay
stats dir table drop/clear
----------------------------------------------------------------------------------------------------------------------

Title

[R] estimates -- Estimation results

Syntax

Store estimation results

estimates store name [, title(str) nocopy]

Replay estimation results

estimates replay [namelist] [, noheader]

Display table of estimation results

estimates table [namelist] [, table_options]

Evaluate postestimation command for stored estimation result sets

estimates for namelist [, noheader nostop] : any_cmd

List model statistics for stored estimation results

estimates stats [namelist]

Display name and title of active estimation results

estimates query

Describe stored estimation results

estimates dir [namelist]

Restore estimation results

estimates restore name [, drop]

Drop stored estimation results, but not results in memory

estimates drop namelist

Drop all stored estimation results

estimates clear

Change title of stored estimation results

estimates change name [, title(str)]

    table_options             description
    ----------------------------------------------------------------------------------------------------------------
    Main
      stats(scalarlist)       report scalarlist statistics in the table
      star[(#1 #2 #3)]        denote significance of coefficients with stars

    Options
      keep(keeplist)          report keeplist coefficients in order specified
      drop(droplist)          drop droplist coefficients from the table
      equations(matchlist)    match the equations of the models in namelist according to matchlist; see Options for
                                details

    Numerical formats
      b[(fmt)]                coefficients are always reported; use display format fmt
      se[(fmt)]               report standard errors; use display format fmt
      t[(fmt)]                report t- or z-values; use display format fmt
      p[(fmt)]                report p-values; use display format fmt
      stfmt(fmt)              use display format fmt for the scalar statistics

    General format
      varwidth(#)             use # characters to display variable names and statistics; default is varwidth(12)
      modelwidth(#)           use # characters to display model names; default is modelwidth(12)
      eform                   display coefficient table in exponentiated form
      label                   display variable labels instead of variable names
      newpanel                display statistics in separate table from the one with coefficients
      style(oneline)          display vertical line after variables; the default
      style(columns)          display vertical line after each column (variable names, models)
      style(noline)           suppress all vertical lines
      coded                   display a compact table

    + title(str)              display the title str for the table
    ----------------------------------------------------------------------------------------------------------------
    + title(str) does not appear in the dialog box.

where name is identifier | .
namelist is _all | * | name [name ...]

estimates may be abbreviated to est and to esti.

Description

    estimates provides the preferred method for storing and restoring sets of estimation results. When we say "set
    of estimation results", we mean the collection of scalars, macros, matrices, and functions saved in e() after
    any given Stata estimation (eclass) command. For brevity, in what follows we refer to a set of estimation
    results as an "estimation set" or just a "set". Estimation sets are identified by name. In a namelist, you may
    use the * and ? wildcards. _all or * refers to all estimation sets. A period (.) refers to the most recent
    ("active") estimation set, even if the set has not (yet) been stored.

estimates store stores the active estimation set under name. The results from this set remain active. A set
already stored under name is silently overwritten.

estimates replay replays results from stored estimation sets. If no namelist is specified, results from all
stored sets are replayed.

estimates table displays a table with coefficients and statistics for one or more estimation sets in parallel
columns. In addition, standard errors, t statistics, p-values, and scalar statistics may be listed.

estimates for evaluates a postestimation command for one or more stored estimation sets. The postestimation
command can access the names under which the sets were stored via e(_estimates_name).

estimates stats lists model statistics, including the AIC and BIC model selection indices, for the specified
estimation sets. If no namelist is specified, statistics for all stored sets are listed.

estimates query displays identifying information on the active set of estimation results.

estimates dir lists the names, commands, dependent variables, and descriptions of stored sets. If no namelist
is specified, all stored results are described.

estimates restore restores a stored estimation set, making it the active set so that all postestimation commands
will act on it.

estimates drop permanently drops stored estimation sets. Dropping the active estimation set clears the stored
information (if stored), not the results, from active memory.

estimates clear permanently drops all stored estimation sets.

estimates change sets or modifies the descriptive title of an already stored estimation set.

Typing estimates without a subcommand replays results from the active estimation set.

You may store up to 20 estimation sets. Sets with large numbers of parameters use a considerable amount of
memory. Thus you should drop sets when you no longer need them.

The following postestimation commands refer to estimation sets via the names under which they were stored via
estimates:

        hausman     Hausman specification test
        lrtest      Likelihood-ratio test
        suest       Testing cross-model hypotheses

With the obvious exception of estimates restore, all subcommands of estimates and the postestimation commands do
not change the active estimation results.

Options for estimates store

title(str) specifies a title documenting a stored set. The title is displayed by the subcommands dir, replay,
and for. You may also set or change the title later with the estimates change subcommand.

nocopy specifies that after the results for set name are stored, they no longer be available as the active
estimation results.

Option for estimates replay

noheader suppresses the display of a header describing the name and title of stored set.

Options for estimates table

    +------+
----+ Main +----------------------------------------------------------------------------------------------------

stats(scalarlist) specifies one or more scalars to be displayed in the table. scalarlist may contain e()
scalars and the following statistics:

            aic         Akaike's information criterion
            bic         Schwarz's information criterion
            rank        rank of e(V) - number of free parameters in model

        scalarlist may be separated by white space or commas. Analogous to coefficients, requested scalars not
        saved by a particular estimation command (i.e., not contained in e()) are displayed as blanks. If a period
        "." is displayed, it indicates that the e() scalar is stored with a missing value ".".

        Example: stats(N ll chi2 aic) specifies that the number of observations, N, the log likelihood, ll, the chi2
        test (test that the coefficients in the first equation of the model are 0), and the AIC information
        criterion be displayed.

star[(#1 #2 #3)] specifies that the significance of coefficients is denoted by stars: *: p < .05, **: p < .01,
 and ***: p < .001. The optional argument may override these thresholds (1 > #1 > #2 > #3 > 0). star may
 not be combined with se, t, or p.
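
As an illustration of stats() and star, here is a minimal sketch that reuses the two logit models from the Examples section of this help file. It assumes the auto dataset shipped with Stata; the display formats are arbitrary choices.

```stata
* Assumption: the auto example dataset shipped with Stata
sysuse auto, clear

logit foreign price trunk
estimates store A, title(simple model)
logit foreign price trunk displacement
estimates store B

* Coefficients with significance stars, plus N, log likelihood, chi2, and AIC
estimates table A B, stats(N ll chi2 aic) star

* star may not be combined with se, t, or p, so request standard errors separately
estimates table A B, stats(N ll chi2 aic) b(%9.3f) se(%9.3f)
```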

    +---------+
----+ Options +-------------------------------------------------------------------------------------------------

    keep(keeplist) specifies the coefficients (and their order) to be included in the table. A keeplist comprises
        one or more specifications, separated by white space: a variable name (e.g., price), an equation name (e.g.,
        mean:), or a full name (e.g., mean:price).

    drop(droplist) specifies the coefficients to be dropped from the table. A droplist comprises one or more
        specifications, separated by white space: a variable name (e.g., price), an equation name (e.g., mean:), or
        a full name (e.g., mean:price). All coefficients that match a specification are dropped. drop(_cons) drops
        _cons from all equations.

    equations(matchlist) specifies how the equations of the models in namelist are to be matched. The most common
        usage is equations(1), which indicates that all the first equations in the models are to be matched into one
        equation named #1, while the other equations are to be matched by name. If equations() is not specified,
        all equations are matched by name. Coefficients within equations are always matched by name, whether or not
        equations() is specified.

Generally, matchlist has the syntax

term [, term ...]

where term is

[eqname =] #|#1:#2:...:#m (m = number of models in namelist)

        and #j, j = 1,...,m, is a number or a period(.). A term specifies which equations from the models are
        matched. The numbers in a term refer to the position of equations in the different models (1 is the first
        equation, 2 the second, and so on). A period (.) indicates that no equation should be included for the
        associated model. A term that consists of a single number, #, is a convenient shorthand for #:#:...:#,
        specifying that the #th equation in all models should be matched. Equations not matched by position are
        matched by name.

To specify multiple terms, separate each term by commas (,). Within terms, separate numbers by colons (:)
or blanks.

        The matched equation for a term is named eqname in the output and in results returned in r(). eqname should
        not be used as an equation name in any of the models. If you do not specify eqname, est table uses the name
        #i for the ith term. If you specify keep() or drop() in combination with equations(), be sure to refer to
        the matched equation names, not to the equation names in the models.

Some examples may be instructive. Assume that namelist consists of three models.

equations(1:1:1), or equations(1), matches the first equation in the three models into a single equation
named #1; any other equations in the models will be matched by name.

equations(mean=1) matches the first equation in the models into a single equation named mean; other
equations of the models are matched by name.

equations(1:.:1) matches the first equation of models 1 and 3 into a single equation named #1 but does
not include any equations from model 2 in #1.

equations(1,2) matches the first equations in the three models into equation #1 and the second equation
in the three models into equation #2, matching any other equations in the models by name.

    +-------------------+
----+ Numerical formats +---------------------------------------------------------------------------------------

    b[(fmt)] specified without an argument is allowed only for consistency with the options se, t, and p, and has no
        effect. Coefficients are always displayed. However, the optional argument may be used to specify the
        display format for the coefficients (e.g., b(%9.3f)). It defaults to %10.0g.

    se[(fmt)] specifies that the standard errors of the coefficients be displayed below the coefficients. A display
        format may be specified as an optional argument (e.g., se(%9.2f)). By default, the display format of the
        coefficients is used.

t[(fmt)] specifies that the t- or z-values (coef/se(coef)) be displayed below the coefficients. A display
format may be specified as an optional argument (e.g., t(%9.2f)). It defaults to %7.2f.

    p[(fmt)] specifies that the (two-sided) p-values of the coefficients be displayed below the coefficients. As is
        standard in Stata, the reference distribution is the t if the estimation command saved the residual degrees
        of freedom in e(df_r), and the normal distribution otherwise. A display format may be specified as an
        optional argument (e.g., p(%7.4f)). It defaults to %7.4f.

stfmt(fmt) specifies the display format for the scalar statistics. It defaults to the display format of the
coefficients.

    +----------------+
----+ General format +------------------------------------------------------------------------------------------

varwidth(#) specifies the number of characters used to display the names of variables and statistics. It
defaults to 12.

modelwidth(#) specifies the number of characters used to display the names of models.

eform displays the coefficient table in exponentiated form: For each coefficient, exp(b) rather than b is
displayed, and standard errors are transformed. Display of the intercept, if any, is suppressed.

label specifies that variable labels are displayed instead of variable names.

newpanel specifies that the statistics be displayed in a table separated by a blank line from the table with
coefficients, rather than in the style of another equation in the table of coefficients.

style(style_spec) specifies the "style" of the coefficients table. The following values are allowed:

           style(oneline)     specifies that a vertical line be displayed after the variables, but not between the
                                models
           style(columns)     specifies that vertical lines be displayed after each column (variable names, models)
           style(noline)      suppresses all vertical lines

The default style is style(oneline).

    coded specifies that a compact table be displayed, in which a model occupies only a single column with cmd:*
        representing parameters and statistics computed for the model, and spaces otherwise. This format is
        especially useful for comparing variables that are included in a large collection of models.

The following option is available with estimates table but is not shown in the dialog box:

title(str) specifies the title str for the table.

Options for estimates for

noheader suppresses the display of a header describing the name and title of stored set.

nostop does not stop repeating the command for other stored sets if one of them results in an error.

Option for estimates restore

drop specifies that after the set name is restored, this set is no longer stored for later restoration.

Option for estimates change

title(str) sets or changes the title of a stored estimation set. The title is displayed by the subcommands dir,
replay, and for.

Examples

    . logit foreign price trunk
    . est store A, title(simple model)
    . logit foreign price trunk disp
    . est store B

    . est dir
    . est stats
    . est table A B, star

    . est restore A
    . est change A, ti("a less simple model")
    . est for A: test price trunk
    . est for A: predict p
    . est for .: lstat
    . est drop _all

Also see

Manual: [R] estimates

Online: ereturn, _estimates

蓝色

2008-6-21 16:48:00

help hausman dialog: hausman
----------------------------------------------------------------------------------------------------------------------

Title

[R] hausman -- Hausman specification test

Syntax

hausman name-consistent [name-efficient] [, options]

    options                   description
    ----------------------------------------------------------------------------------------------------------------
    Main
      constant                include estimated intercepts in comparison; default is to exclude
      alleqs                  use all equations to perform test; default is first equation only
      skipeqs(eqlist)         skip specified equations when performing test
      equations(matchlist)    associate/compare the specified (by number) pairs of equations
      force                   force performance of test, even though assumptions are not met
      df(#)                   use # degrees of freedom
      sigmamore               base both (co)variance matrices on disturbance variance estimate from efficient
                                estimator
      sigmaless               base both (co)variance matrices on disturbance variance estimate from consistent
                                estimator

    Advanced
      tconsistent(string)     consistent estimator column header
      tefficient(string)      efficient estimator column header
    ----------------------------------------------------------------------------------------------------------------

    where name-consistent and name-efficient are names under which estimation results were saved via estimates
        store.
    A period (.) may be used to refer to the last estimation results, even if these were not already stored.
    Not specifying name-efficient is equivalent to specifying the last estimation results as ".".

Description

hausman performs Hausman's specification test. To use hausman, one has to perform the following steps.

      (1) obtain an estimator that is consistent whether or not the hypothesis is true;
      (2) store the estimation results under a name-consistent using estimates store;
      (3) obtain an estimator that is efficient (and consistent) under the hypothesis that you are testing, but
          inconsistent otherwise;
      (4) store the estimation results under a name-efficient using estimates store;
      (5) use hausman to perform the test

hausman name-consistent name-efficient [, options]

    The order of computing the two estimators may be reversed. You have to be careful though to specify to hausman
    the models in the order "always consistent" first and "efficient under H0" second. It is possible to skip
    storing the second model and refer to the last estimation results by a period (.).

    hausman may be used in any context. The order in which you specify the regressors in each model does not
    matter, but it is your responsibility to assure that the estimators and models are comparable, and satisfy the
    theoretical conditions (see (1) and (3) above).
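
The five steps can be sketched as follows for the fixed-effects versus random-effects comparison asked about at the top of the thread. This is only an illustration: it assumes the nlswork example dataset loaded with webuse, and the regressors and the stored names fixed and random are arbitrary choices.

```stata
* Assumption: nlswork is a stand-in panel dataset; regressors are arbitrary
webuse nlswork, clear
xtset idcode year

* Steps (1)-(2): estimator that is consistent whether or not H0 holds (fixed effects)
xtreg ln_wage age tenure ttl_exp, fe
estimates store fixed

* Steps (3)-(4): estimator that is efficient under H0 (random effects)
xtreg ln_wage age tenure ttl_exp, re
estimates store random

* Step (5): always-consistent model first, efficient model second
hausman fixed random

* Equivalent here, because the random-effects fit is still the active set
hausman fixed .
hausman fixed
```

The one-argument form works only because the efficient model was the last one estimated. If another estimation command had been run in between, the period (and the omitted second name) would point at that model instead.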

Options

    +------+
----+ Main +----------------------------------------------------------------------------------------------------

    constant specifies that the estimated intercept(s) be included in the model comparison; by default, they are
        excluded. The default behavior is appropriate for models in which the constant does not have a common
        interpretation across the two models.

alleqs specifies that all the equations in the models be used to perform the Hausman test; by default, only the
first equation is used.

    skipeqs(eqlist) specifies in eqlist the names of equations to be excluded from the test. Equation numbers are
        not allowed in this context, as the equation names, along with the variable names, are used to identify
        common coefficients.

equations(matchlist) specifies, by number, the pairs of equations that are to be compared.

The matchlist in equations() should follow the syntax

#c:#e [,#c:#e[, ...]]

where #c (#e) is an equation number of the always-consistent (efficient under H0) estimator. For instance,
equations(1:1), equations(1:1, 2:2), or equations(1:2).

If equations() is not specified, then equations are matched on equation names.

        equations() handles the situation in which one estimator uses equation names and the other does not. For
        instance, equations(1:2) means that equation 1 of the always-consistent estimator is to be tested against
        equation 2 of the efficient estimator. equations(1:1, 2:2) means that equation 1 is to be tested against
        equation 1 and that equation 2 is to be tested against equation 2. If equations() is specified, options
        alleqs and skipeqs are ignored.

force specifies that the Hausman test be performed, even though the assumptions of the Hausman test seem not to
be met, for example, because the estimators are pweighted or the data are clustered.

df(#) specifies the degrees of freedom for the Hausman test. The default is the matrix rank of the variance of
the difference between the coefficients of the two estimators.

sigmamore and sigmaless specify that the two covariance matrices used in the test be based on a common estimate
of disturbance variance (sigma2).

        sigmamore specifies that the covariance matrices be based on the estimated disturbance variance from the
            efficient estimator. This option provides a proper estimate of the contrast variance for so-called
            tests of exogeneity and overidentification in instrumental variables regression.

sigmaless specifies that the covariance matrices be based on the estimated disturbance variance from the
consistent estimator.

        These options can only be specified when both estimators save e(sigma) or e(rmse), or with command xtreg.
        e(sigma_e) is saved after command xtreg with options fe or mle. e(rmse) is saved after command xtreg with
        option re.

        sigmamore or sigmaless are recommended when comparing fixed-effects and random-effects linear regression
        because they are much less likely to produce a nonpositive-definite differenced covariance matrix (although
        the tests are asymptotically equivalent whether or not one of the options is specified).
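
A minimal sketch of the recommended usage, again assuming the nlswork example dataset and arbitrary regressors rather than anything from the original question:

```stata
* Assumption: nlswork is a stand-in panel dataset; regressors are arbitrary
webuse nlswork, clear
xtset idcode year

xtreg ln_wage age tenure ttl_exp, fe
estimates store fixed
xtreg ln_wage age tenure ttl_exp, re
estimates store random

* Base both covariance matrices on the disturbance variance from the efficient (RE) fit
hausman fixed random, sigmamore

* Or base both on the disturbance variance from the consistent (FE) fit
hausman fixed random, sigmaless
```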

    +----------+
----+ Advanced +------------------------------------------------------------------------------------------------

    tconsistent(string) and tefficient(string) are formatting options. They allow you to specify the headers of the
        columns of coefficients that default to the names of the models. These options will be primarily of
        interest to programmers.

Remark: An alternative to hausman

    The assumption that one of the estimators is efficient (i.e., has minimal asymptotic variance) is a demanding
    one. It is violated, for instance, if your observations are clustered or pweighted, or if your model is somehow
    misspecified. Moreover, even if the assumption is satisfied, there may be a "small sample" problem with the
    Hausman test. Hausman's test is based on estimating the variance var(b-B) of the difference of the estimators
    by the difference var(b)-var(B) of the variances. Under the assumptions (1) and (3), var(b)-var(B) is a
    consistent estimator of var(b-B), but it is not necessarily positive definite "in finite samples", i.e., in your
    application. If this is the case, the Hausman test is undefined. Unfortunately, this is not a rare event.
    Stata supports a generalized Hausman test that overcomes both of these problems. See suest for details.
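
For reference, the statistic this remark describes can be written in its textbook form as below. This is a sketch in the help file's own notation (b for the always-consistent estimator, B for the estimator that is efficient under H0); the hats and the symbol H are notation added here and do not appear in the help text.

```latex
H = (b - B)' \left[ \widehat{\operatorname{var}}(b) - \widehat{\operatorname{var}}(B) \right]^{-1} (b - B)
```

Under H0, H is asymptotically chi-squared. Per the df(#) option above, the default degrees of freedom are the rank of the estimated variance of the difference. When the bracketed matrix is not positive definite, the quadratic form is no longer guaranteed to be positive, which is the finite-sample problem the remark warns about.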

Examples

Typing

        . xtreg lny educ age, fe
        . est store fixed
        . xtreg lny educ age sex, re
        . hausman fixed .

presents Hausman's specification test, which tests the appropriateness of the random-effects estimator (xtreg,
re).

Typing

        . mlogit travmode age gender income
        . est store all
        . mlogit travmode age gender income if travmode != 2
        . est store partial
        . hausman partial all, alleqs constant

will perform a Hausman test for independence of irrelevant alternatives (IIA).

    When one estimator uses equation names and the other does not, specify the equations() option to force the
    comparison. This is illustrated in the comparison of the OLS estimator and the estimator of the regress part of
    the heckman model:

        . regress mpg price
        . est store reg
        . heckman mpg price, sel(foreign=weight)
        . hausman reg ., eq(1:1)

Comparison of the probit model and the selection equation of the heckman model:

        . probit foreign weight
        . est store probit_for
        . heckman mpg price, sel(foreign=weight)
        . hausman probit_for ., eq(1:2)

Also see

Manual: [R] hausman

Online: lrtest, suest, test, xtreg, xtregar
