How do I use statsby?

qwer_1234

2008-04-10

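I want to obtain the coefficients from a regression. For example:

I want to run a simple regression of y on x and use statsby to collect the estimated coefficients. How do I do that?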

clear
* Generate the data: x and y
drawnorm x e, n(100)
gen y=1+1*x+e
* Run the regression
reg y x
* Collect the coefficients
statsby ....

[This post was edited by the author on 2008-4-11 7:12:23]

All replies

蓝色

2008-4-11 07:26:00

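I haven't used that command myself, but you can look it up in help.

It has examples; work through them yourself and you'll see how it works. Below is the content of the help file in Stata. See the examples at the end.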

help statsby dialog: statsby
--------------------------------------------------------------------------------------------------------------------

Title

[D] statsby -- Collect statistics for a command across a by list

Syntax

statsby [exp_list] [, options ]: command

    options                    description
    --------------------------------------------------------------------------------------------------------------
    Main
    * by(varlist [, missing]) equivalent to interactive use of by varlist:

    Options
      clear                    replace data in memory with results
      saving(filename, ...)    save results to filename; save statistics in double precision; save results to
                                 filename every # replications
      total                    include results for the entire dataset
      subsets                  include all combinations of subsets of groups

    Reporting
      nodots                   suppress the replication dots
      noisily                  display any output from command
      trace                    trace the command
      nolegend                 suppress table legend
      verbose                  display the full table legend

    Advanced
      basepop(exp)             restrict initializing sample to exp; seldom used
      force                    do not check for svy commands; seldom used
      forcedrop                retain only observations in by() groups when calling command; seldom used
    --------------------------------------------------------------------------------------------------------------
    * by() is required on the dialog box because statsby is useful to the interactive user only when using by().
    All weight types supported by command are allowed except pweights; see weight.

Description

statsby collects statistics from command across a by() list. Typing

. statsby exp_list , by(varname): command

    executes command for each group identified by varname, building a dataset of the associated values from the
    expressions in exp_list. The resulting dataset replaces the current dataset, unless the saving() option is
    supplied.

    command defines the statistical command to be executed. Most Stata commands and user-written programs can be
    used with statsby, as long as they follow standard Stata syntax and allow the if qualifier. The by prefix
    cannot be part of command.

    exp_list specifies the statistics to be collected from the execution of command. The expressions in exp_list
    follow the grammar given in exp_list. If no expressions are given, exp_list assumes a default depending upon
    whether command changes results in e() and r(). If command changes results in e(), the default is _b. If
    command changes results in r() (but not e()), the default is all the scalars posted to r(). It is an error
    not to specify an expression in exp_list otherwise.
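
As an illustration of the two cases described above (this is not part of the help file), the sketch below first omits exp_list, so statsby should fall back to _b because regress posts its results to e(), and then names the statistics explicitly. It assumes the auto dataset that ships with Stata; the names b_turn and se_turn are made up for the example.

```stata
* Case 1: no exp_list. regress is e-class, so the default _b is collected.
sysuse auto, clear
statsby, by(foreign) clear: regress mpg gear_ratio turn
* Expect one variable per coefficient, e.g. _b_gear_ratio, _b_turn, _b_cons
describe
list

* Case 2: explicit, named expressions (b_turn and se_turn are illustrative names)
sysuse auto, clear
statsby b_turn=_b[turn] se_turn=_se[turn], by(foreign) clear: regress mpg gear_ratio turn
list
```

Naming the expressions is optional, but it avoids generic variable names such as _stat_1 in the resulting dataset.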

Options

        +------+
    ----+ Main +--------------------------------------------------------------------------------------------------

    by(varlist [, missing]) specifies a list of existing variables that would normally appear in the by varlist:
        section of the command if you were to issue the command interactively. By default, statsby ignores groups
        in which one or more of the by() variables is missing. Alternatively, missing causes missing values to be
        treated like any other values in the by() groups, and results from the entire dataset are included with
        use of the subsets option. If by() is not specified, command will be run on the entire dataset. varlist
        can contain both numeric and string variables.
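
For the question in the first post (one regression on the whole sample, no groups), the sentence "If by() is not specified, command will be run on the entire dataset" suggests simply leaving by() out. A minimal sketch, assuming the simulated data from the first post; the seed and the names b_x and b_cons are my own additions, not something the help file prescribes.

```stata
* Simulate y = 1 + 1*x + e as in the original question (the seed is arbitrary)
clear
set seed 12345
drawnorm x e, n(100)
generate y = 1 + 1*x + e

* No by(): the regression runs once on the full sample.
* clear lets statsby replace the unsaved data in memory.
statsby b_x=_b[x] b_cons=_b[_cons], clear: regress y x

* The dataset now holds a single observation with the two coefficients
list
```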

        +---------+
    ----+ Options +-----------------------------------------------------------------------------------------------

    clear specifies that it is okay to replace the data in memory, even though the current data have not been
        saved to disk.

    saving(filename [, suboptions]) creates a Stata data file (.dta file) consisting of (for each statistic in
        exp_list) a variable containing the replicates.

        See help prefix_saving_option, for details about suboptions.

    total specifies that command be run on the entire dataset, in addition to the groups specified in the by()
        option.

    subsets specifies that command be run for each group defined by any combination of the variables in the by()
        option.
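
A short sketch of saving() and total used together, again on the auto dataset (not part of the help file). The filename statsby_results is hypothetical; with saving(), the results should go to that file while the original data stay in memory.

```stata
sysuse auto, clear

* total adds one extra row of results for the full sample.
* saving() writes the results to disk instead of replacing the data in memory
* (statsby_results is a hypothetical filename; replace overwrites an old copy).
statsby _b _se, by(foreign) total saving(statsby_results, replace): regress mpg gear_ratio turn

* The auto data should still be in memory at this point
count

* Load the collected results: one row per foreign group plus the total row
use statsby_results, clear
list
```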

        +-----------+
    ----+ Reporting +---------------------------------------------------------------------------------------------

    nodots suppresses display of the replication dots. By default, one dot character is printed for each by()
        group. A red `x' is printed if command returns with an error or one of the values in exp_list is missing.

    noisily causes the output of command to be displayed for each by() group. This option implies the nodots
        option.

    trace causes a trace of the execution of command to be displayed. This option implies the noisily option.

    nolegend suppresses the display of the table legend, which identifies the rows of the table with the
        expressions they represent.

    verbose requests that the full table legend be displayed. By default, coefficients and standard errors are
        not displayed.

        +----------+
    ----+ Advanced +----------------------------------------------------------------------------------------------

    basepop(exp) specifies a base population that statsby uses to evaluate the command and to set up for
        collecting statistics. The default base population is the entire dataset, or the dataset specified by any
        if or in conditions specified on the command.

        One situation where basepop() is useful is collecting statistics over the panels of a panel dataset by
        using an estimator that works for time series, but not panel data, e.g.,

            . statsby, by(mypanels) basepop(mypanels==2): arima ...

    force suppresses the restriction that command not be a svy command. statsby does not perform subpopulation
        estimation for survey data, so it should not be used with svy. statsby reports an error when it
        encounters svy in command if the force option is not specified. This option is seldom used, so use it
        only if you know what you are doing.

    forcedrop forces statsby to drop all observations except those in each by() group before calling command for
        the group. This allows statsby to work with user-written commands that completely ignore if and in but do
        not return an error when either is specified. forcedrop is seldom used.

Example: Collecting coefficients

. sysuse auto
. statsby, by(foreign): regress mpg gear turn

Example: Collecting both coefficients and standard errors using a time-series estimator with panel data

    . webuse grunfeld, clear
    . tsset company year
    . statsby _b _se, basepop(company==1) by(company): arima invest mvalue kstock, ar(1)

Example: Collecting results saved in r-class macros

. sysuse auto, clear
. statsby mean=r(mean) sd=r(sd) size=r(N), by(rep78): summarize mpg

Also see

Manual: [D] statsby

Online: [P] postfile, [D] collapse, [R] bootstrap, [R] jackknife, [R] permute, [D] by

qwer_1234

2008-4-11 07:54:00

Thanks, 蓝色. I did read the help, but I still don't understand it.

In particular:

Example: Collecting coefficients

. sysuse auto
. statsby, by(foreign): regress mpg gear turn

Is this sysuse auto optional?

Is foreign a variable?

I tried the following variants:

clear
drawnorm x e, n(100)
gen y=1+1*x+e
gen z=1
reg y x

statsby by(x) :reg y x

statsby by(y) :reg y x

statsby by(z) :reg y x

They all fail with the message:
no; data in memory would be lost

This is driving me crazy.

蓝色

2008-4-11 07:59:00

sysuse auto

opens the auto data file that is installed along with Stata.

How could you run a command without opening a data file first?

whgyu

2008-4-11 08:05:00

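sysuse auto loads auto.dta, a dataset that ships with Stata. Many of Stata's examples are built on it. foreign is one of its variables.

statsby is meant to be used with a categorical variable. For example, foreign takes two values, 0/1, so

statsby, by(foreign): regress mpg gear turn

is equivalent to two regressions: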

regress mpg gear turn if foreign==0
regress mpg gear turn if foreign==1

statsby then stores the corresponding coefficients, so you end up with a new two-row dataset:
foreign   _b[gear_ratio] ...
0         ...
1         ...
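
To check the equivalence described above, here is a rough sketch that runs the two if-regressions by hand and compares one coefficient with what statsby collects. The scalar names b0 and b1 and the variable name b_turn are made up for the example, and the tolerance is loose because statsby stores results as float unless told otherwise.

```stata
sysuse auto, clear

* The two regressions, run by hand
regress mpg gear_ratio turn if foreign == 0
scalar b0 = _b[turn]
regress mpg gear_ratio turn if foreign == 1
scalar b1 = _b[turn]

* The same thing through statsby (clear replaces the data in memory)
statsby b_turn=_b[turn], by(foreign) clear: regress mpg gear_ratio turn
sort foreign

* Row 1 is foreign==0, row 2 is foreign==1
assert reldif(b_turn[1], scalar(b0)) < 1e-6
assert reldif(b_turn[2], scalar(b1)) < 1e-6
list
```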

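The message you saw appears because the x, y, z data you generated have not been saved, so Stata refuses to replace them with the new dataset (the two rows above). Just use statsby ...., clear.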
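
A minimal sketch of that fix, assuming the simulated data from the question (the seed is my own addition). Note also that by() and clear are options, so by the syntax diagram in the help they go after a comma, which the attempts in the question left out.

```stata
clear
set seed 12345
drawnorm x e, n(100)
generate y = 1 + 1*x + e
generate z = 1

* The data above are unsaved, so statsby stops with
* "no; data in memory would be lost" unless clear is given.
* by() and clear are both options, hence the comma after statsby.
statsby, by(z) clear: regress y x

* One row (z has a single value) holding _b_x and _b_cons
list
```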

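I'm not quite sure what you are trying to do. In your own example, x and y are continuous and z is a constant, so grouping with by(x) is not meaningful.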

arlionn

2008-4-11 08:31:00

Quoting qwer_1234's post of 2008-4-11 7:54:00:

Thanks, 蓝色. I did read the help, but I still don't understand it.

In particular:

Example: Collecting coefficients

. sysuse auto
. statsby, by(foreign): regress mpg gear turn

Is this sysuse auto optional?

Is foreign a variable?

I tried the following variants:

clear
drawnorm x e, n(100)
gen y=1+1*x+e
gen z=1
reg y x

statsby by(x) :reg y x

statsby by(y) :reg y x

statsby by(z) :reg y x

They all fail with the message:
no; data in memory would be lost

This is driving me crazy.

clear
drawnorm x e, n(100)
gen y=1+1*x+e
gen z=x>0.5
reg y x
save data1.dta, replace

statsby _b[x], by(z) :reg y x

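Before running statsby, make sure the data have been saved (or add the clear option), because the command replaces the data in memory with its results.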

*------------Result----------------------------

. statsby _b[x], by(z) :reg y x
(running regress on estimation sample)

      command: regress y x
      _stat_1: _b[x]
           by: z

Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..

.
end of do-file

. list

     +--------------+
     | z    _stat_1 |
     |--------------|
  1. | 0   1.175667 |
  2. | 1    1.43738 |
     +--------------+
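
If you want to keep working with the original data afterwards, one possible variation (a sketch only, not the method used above) is to wrap statsby in preserve and restore instead of saving to disk first. The seed and the names b_x, se_x and n are my own additions.

```stata
clear
set seed 12345
drawnorm x e, n(100)
generate y = 1 + 1*x + e
generate z = x > 0.5

* Keep a copy of the data, let statsby replace it, then bring it back
preserve
statsby b_x=_b[x] se_x=_se[x] n=e(N), by(z) clear: regress y x
list
restore

* Back to the 100 simulated observations
count
```

Naming the expression (b_x instead of the default _stat_1) makes the resulting table easier to read.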

[This post was edited by the author on 2008-4-11 8:32:04]
