2015-10-20
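My advisor recommended that I first use Stata to sort the data into percentile groups from 1% to 100%, drop the values that fall in the 1% and 100% groups, and then run the regression. How should I write the code for this? I need the code for step one (grouping) and step two (dropping). Thanks, everyone! (My advisor does not recommend winsorizing, because smoothing out the data that way would affect the results.)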

All replies
2015-10-20 14:16:15
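Use the winsor command; it can only process one variable at a time. The more capable winsor2 handles several variables at once. If you're not sure how to write the command, run help winsor or help winsor2 and read through it.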

2015-10-20 14:18:27
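小斜 posted on 2015-10-20 14:16
Use the winsor command; it can only process one variable at a time. The more capable winsor2 handles several variables at once. If you're not sure how to write ...
I know about winsorizing, but my advisor doesn't recommend it: smoothing the data that way changes it too much. Thanks anyway, though!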

2015-10-20 15:49:25
Bump.

2015-10-20 20:17:23
mb34527 posted on 2015-10-20 14:18
I know about winsorizing, but my advisor doesn't recommend it: smoothing the data that way changes it too much. Thanks anyway, though!


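In the winsor2 command, just add the trim option to get what your advisor is asking for: dropping the outliers (trimming) rather than pulling them in to the cutoffs (winsorizing).

See help winsor2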

Syntax

    winsor2 varlist [if] [in], [ suffix(string) replace trim cuts(# #) by(groupvar) label ]

Description

    winsor2 winsorizes or trims (if the trim option is specified) the variables in varlist at the percentiles specified by option cuts(#1 #2).  By default, new variables are generated with the suffix "_w" or "_tr", which can be changed with the suffix() option.  The replace option replaces the variables with their winsorized or trimmed versions.

        +---------------------------------------------+
    ----+ Difference between winsorizing and trimming +----

    Winsorizing is not equivalent to simply excluding data, which is a simpler procedure, called trimming or truncation. In a trimmed estimator, the extreme values are discarded; in a Winsorized estimator, the extreme values are instead replaced by certain percentiles, specified by option cuts(# #).  For details, see winsor (if installed), trimmean (if installed).

    For example, type the following commands to get the 1st and 99th percentiles of variable wage, 1.930993 and 38.70926, respectively.

        . sysuse nlsw88, clear
        . sum wage, detail

    By default, winsor2 winsorizes wage at the 1st and 99th percentiles,

        . winsor2 wage, replace cuts(1 99)

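    which can be done by hand:

        . replace wage=1.930993 if wage<1.930993
        . replace wage=38.70926 if wage>38.70926 & !missing(wage)

    Note that values smaller than the 1st percentile are replaced by the 1st percentile, and values greater than the 99th percentile are likewise replaced by the 99th percentile.  (The !missing(wage) condition matters because Stata treats missing values as larger than any number.)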

    Things change when the -trim- option is specified:

        . winsor2 wage, replace cuts(1 99) trim

    which can also be done by hand:

        . replace wage=. if wage<1.930993
        . replace wage=. if wage>38.70926

    In this case, we discard values smaller than the 1st percentile or greater than the 99th percentile.  This is trimming.
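
    The by-hand trimming above hard-codes the two cutoffs. A minimal sketch of the same idea that reads the cutoffs from the results stored by summarize instead is given below. The names wage_tr, cut_lo and cut_hi are made up for illustration, and the result is not guaranteed to match winsor2 in every edge case.

```stata
* Sketch: trim wage at the 1st and 99th percentiles without typing the cutoffs
sysuse nlsw88, clear

* summarize, detail stores the percentiles in r(p1) and r(p99)
quietly summarize wage, detail
scalar cut_lo = r(p1)
scalar cut_hi = r(p99)

* wage_tr is a hypothetical name for the trimmed copy; wage itself is left intact
generate double wage_tr = wage
replace wage_tr = . if wage < cut_lo | wage > cut_hi

* Estimation commands skip observations with missing values,
* so the trimmed tails drop out of the regression sample
regress wage_tr age hours
```

    Keeping the trimmed values in a separate variable makes it easy to compare results with and without the tails.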



Options

    suffix(string) specifies the suffix of the new variables. The default is "_w", or "_tr" when trim is specified.

    replace replaces the variables with their winsorized or trimmed counterparts.  Cannot be combined with suffix(string).

    trim trims the variables.

    cuts(# #) specifies the percentiles at which the data are winsorized or trimmed.  cuts(1 99) (the default) means winsorizing (trimming) at the 1st and 99th percentiles. Specifying cuts(1 99) or cuts(99 1) makes no difference.

    by(groupvar) specifies that the winsorizing or trimming be done within each group defined by groupvar.
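
    To make the by() idea concrete, here is a rough by-hand sketch of group-wise trimming using official Stata's egen pctile(). It assumes by() means "compute the cutoffs separately within each group", as the option text says. The variable names p1_ind, p99_ind and wage_tr are made up, and it has not been checked that winsor2 computes its percentiles in exactly the same way.

```stata
sysuse nlsw88, clear

* Group-specific 1st and 99th percentiles of wage within each industry
* (observations with missing industry form their own group here)
bysort industry: egen double p1_ind  = pctile(wage), p(1)
bysort industry: egen double p99_ind = pctile(wage), p(99)

* Trim within group: set values outside the group's own cutoffs to missing
generate double wage_tr = wage
replace wage_tr = . if wage < p1_ind | wage > p99_ind
```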


Examples

        *- winsor at (p1 p99), get new variable "wage_w"
        .  sysuse nlsw88, clear
        .  winsor2 wage

        *- winsor 3 variables at 0.5th and 99.5th percentiles, and overwrite the old variables
        .  winsor2 wage age hours, cuts(0.5 99.5) replace

        *- winsor 3 variables at (p1 p99), gen new variables with suffix _win, and add variable labels
        .  winsor2 wage age hours, suffix(_win) label

        *- left-winsorizing only, at 1st percentile
        .  winsor2 wage, cuts(1 100)

        *- right-trimming only, at 99th percentile
        .  winsor2 wage, cuts(0 99) trim

        *- winsor variables at (p1 p99) by (industry), overwrite the old variables
        .  winsor2 wage hours, replace by(industry)
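
For the two steps literally asked about in the original question (group into 100 percentile bins, then delete bins 1 and 100), a rough sketch using only official Stata commands could look like this. The variable name wage_pct and the regressors are placeholders. Note that xtile puts values equal to the 1st-percentile cutoff into group 1, so the observations dropped may differ slightly from what the by-hand trimming shown in the help text removes.

```stata
sysuse nlsw88, clear

* Step 1: assign each observation to one of 100 percentile groups
* (wage_pct is a hypothetical variable name)
xtile wage_pct = wage, nquantiles(100)

* Step 2: drop the observations in the bottom and top groups
drop if inlist(wage_pct, 1, 100)

* Then run the regression on what remains
regress wage age hours
```

Unlike the trim option, which sets the extreme values of one variable to missing, this deletes the whole observation, so it would be worth checking which behavior your advisor intends when several variables are involved.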

2015-10-25 16:48:12
hiderm posted on 2015-10-20 20:17
In the winsor2 command, just add the trim option to get what your advisor is asking for: dropping the outliers (trimming) rather than winsorizing ...