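Prof. Lian, when I run the xtabond2 command from Lecture 7 (PANELDATA) of your course, my Stata does not display the Hansen test result, and I don't know why. Also, does this command have to be run with the nomata option in Stata 10.0? Will it fail to run otherwise?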
. use "D:\Stata10\abdata.dta", clear
. do "C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\STD03000000.tmp"
. xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, ///
> gmm(L.n) iv(L(0/1).(w k) yr1978-yr1984) ///
> robust small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate robust weighting matrix for Hansen test.
Difference-in-Sargan/Hansen statistics may be negative.
.
end of do-file
. do "C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\STD03000000.tmp"
. *== System GMM estimator
. xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, ///
> gmm(L.n) iv(L(0/1).(w k) yr1978-yr1984) ///
> robust small nomata
Building GMM instruments..
Estimating.
Performing specification tests.
Dynamic panel-data estimation, one-step system GMM
------------------------------------------------------------------------------
Group variable: id Number of obs = 891
Time variable : year Number of groups = 140
Number of instruments = 47 Obs per group: min = 6
F(12, 139) = 1944.57 avg = 6.36
Prob > F = 0.000 max = 8
------------------------------------------------------------------------------
| Robust
n | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
n |
L1. | .8271481 .0446552 18.52 0.000 .7388569 .9154393
w |
--. | -.4527507 .1645629 -2.75 0.007 -.7781208 -.1273807
L1. | .3433344 .1620574 2.12 0.036 .0229182 .6637506
k |
--. | .340388 .0577688 5.89 0.000 .2261687 .4546073
L1. | -.1986107 .0604901 -3.28 0.001 -.3182104 -.079011
yr1978 | .0020453 .01871 0.11 0.913 -.0349476 .0390383
yr1979 | .0063316 .0228688 0.28 0.782 -.038884 .0515472
yr1980 | -.0187114 .0237909 -0.79 0.433 -.0657502 .0283273
yr1981 | -.0613047 .0313112 -1.96 0.052 -.1232126 .0006031
yr1982 | -.0391016 .0336971 -1.16 0.248 -.1057269 .0275236
yr1983 | -.0216804 .0302384 -0.72 0.475 -.0814671 .0381063
yr1984 | -.0197512 .0394761 -0.50 0.618 -.0978025 .0583
_cons | .5796057 .1763682 3.29 0.001 .2308944 .928317
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -4.04 Pr > z = 0.000
Arellano-Bond test for AR(2) in first differences: z = -0.42 Pr > z = 0.673
Sargan test of overid. restrictions: chi2(34) = 97.86 Prob > chi2 = 0.000
(Not robust but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(.) = . Prob > chi2 = .
(Robust but weakened by many instruments.)
.
end of do-file
I suspect this is a version problem. Try installing the latest version of xtabond2 with:
ssc install xtabond2, replace
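As for the nomata option: it tells xtabond2 not to use Mata for the computations (Mata is faster, but requires Stata 9.0 or later). I added the option when recording the videos so that students whose Stata version is too old to run Mata could still use the command. If you are using Stata 10, you can drop it.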
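A minimal sketch of the update-and-rerun sequence, assuming an internet connection and that the `abdata` example dataset is fetched with `webuse` (the original post loads a local copy instead). The stored results `e(j)`, `e(hansen)` and `e(hansenp)` are the names I understand xtabond2 to use; `ereturn list` shows what is actually available.

```stata
* Show which copy of xtabond2 Stata finds on the adopath, then update it from SSC
which xtabond2
ssc install xtabond2, replace

* Arellano-Bond example data (assumption: fetched via webuse, not a local file)
webuse abdata, clear
xtset id year

* Same specification as in the question, without the nomata option
xtabond2 n L.n L(0/1).(w k) yr1978-yr1984, ///
    gmm(L.n) iv(L(0/1).(w k) yr1978-yr1984) ///
    robust small

* Inspect the stored results, including the Hansen test
ereturn list
display "instruments = " e(j) "   Hansen chi2 = " e(hansen) "   p = " e(hansenp)
```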
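Prof. Lian, I reinstalled xtabond2 and the problem is solved, thank you. However, I noticed that running the command with and without nomata gives different p-values for the Hansen test and different numbers of instruments. What causes this? I would also like to ask: when the Sargan and Hansen tests contradict each other, how should we judge and choose between them? Thank you!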
. do "C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\STD03000000.tmp"
. xtabond2 contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2,gmm(l
> contribution lcontribution2 ldiffcontribution1 ldiffcontribution2) nomata twostep robust small
Building GMM instruments.....
57 instrument(s) dropped because of collinearity.
Estimating.
Warning: Two-step estimated covariance matrix of moment conditions is singular.
Number of instruments may be large relative to number of groups.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
Computing Windmeijer finite-sample correction...................................................
> ..............................................................................................
> ....................
Performing specification tests.
Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: subject Number of obs = 1312
Time variable : period Number of groups = 164
Number of instruments = 108 Obs per group: min = 8
F(3, 163) = 72.64 avg = 8.00
Prob > F = 0.000 max = 8
------------------------------------------------------------------------------
| Corrected
contribution | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lcontribut~n | .8988869 .1153642 7.79 0.000 .6710859 1.126688
lcontribut~2 | .1973221 .0441006 4.47 0.000 .1102399 .2844043
ldiffcontr~1 | -.3376674 .1352491 -2.50 0.014 -.6047337 -.0706012
ldiffcontr~2 | -.3964245 .1431313 -2.77 0.006 -.6790551 -.1137939
_cons | -1.292562 .7001243 -1.85 0.067 -2.675045 .0899204
------------------------------------------------------------------------------
Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(1/.).(lcontribution lcontribution2 ldiffcontribution1
ldiffcontribution2)
Instruments for levels equation
Standard
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
D.(lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -5.68 Pr > z = 0.000
Arellano-Bond test for AR(2) in first differences: z = -1.48 Pr > z = 0.140
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(103) = 170.07 Prob > chi2 = 0.000
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(103) = 120.55 Prob > chi2 = 0.114
(Robust, but can be weakened by many instruments.)
.
end of do-file
. do "C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\STD03000000.tmp"
. xtabond2 contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2,gmm(l
> contribution lcontribution2 ldiffcontribution1 ldiffcontribution2) twostep robust small
Favoring space over speed. To switch, type or click on mata: mata set matafavor speed, perm.
Warning: Two-step estimated covariance matrix of moments is singular.
Using a generalized inverse to calculate optimal weighting matrix for two-step estimation.
Difference-in-Sargan/Hansen statistics may be negative.
Dynamic panel-data estimation, two-step system GMM
------------------------------------------------------------------------------
Group variable: subject Number of obs = 1312
Time variable : period Number of groups = 164
Number of instruments = 94 Obs per group: min = 8
F(4, 163) = 72.64 avg = 8.00
Prob > F = 0.000 max = 8
------------------------------------------------------------------------------
| Corrected
contribution | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lcontribut~n | .8988869 .1153642 7.79 0.000 .6710859 1.126688
lcontribut~2 | .1973221 .0441006 4.47 0.000 .1102399 .2844043
ldiffcontr~1 | -.3376674 .1352491 -2.50 0.014 -.6047337 -.0706012
ldiffcontr~2 | -.3964245 .1431313 -2.77 0.006 -.6790551 -.1137939
_cons | -1.292562 .7001243 -1.85 0.067 -2.675045 .0899204
------------------------------------------------------------------------------
Instruments for first differences equation
GMM-type (missing=0, separate instruments for each period unless collapsed)
L(1/.).(lcontribution lcontribution2 ldiffcontribution1
ldiffcontribution2)
Instruments for levels equation
Standard
_cons
GMM-type (missing=0, separate instruments for each period unless collapsed)
D.(lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2)
------------------------------------------------------------------------------
Arellano-Bond test for AR(1) in first differences: z = -5.68 Pr > z = 0.000
Arellano-Bond test for AR(2) in first differences: z = -1.48 Pr > z = 0.140
------------------------------------------------------------------------------
Sargan test of overid. restrictions: chi2(89) = 170.07 Prob > chi2 = 0.000
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(89) = 120.55 Prob > chi2 = 0.015
(Robust, but can be weakened by many instruments.)
Difference-in-Hansen tests of exogeneity of instrument subsets:
GMM instruments for levels
Hansen test excluding group: chi2(73) = 97.47 Prob > chi2 = 0.029
Difference (null H = exogenous): chi2(16) = 23.08 Prob > chi2 = 0.112
.
end of do-file
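I had not noticed before that the nomata option could lead to different results. You could look at the following article, in which Roodman describes the xtabond2 command in detail; it may discuss the nomata option.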
Roodman, D., 2009, How to do Xtabond2: An Introduction to Difference and System GMM in Stata, The Stata Journal, 9 (1): 86-136.
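One thing the two logs above do show: the coefficients and the Hansen statistic (120.55) are identical, but the run with `nomata` reports 108 instruments and the Mata run reports 94, so the degrees of freedom are 103 and 89. The different p-values follow from that alone. A quick check with Stata's `chi2tail()` function:

```stata
* Same Hansen statistic, different degrees of freedom
* (number of instruments minus the 5 estimated parameters)
display "df = 103 (nomata): p = " %5.3f chi2tail(103, 120.55)
display "df =  89 (Mata):   p = " %5.3f chi2tail(89, 120.55)
```

The results should agree with the 0.114 and 0.015 printed in the logs. Why the two implementations count instruments differently is not settled in this thread.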
The difference between the Sargan and Hansen tests is noted in the output that xtabond2 prints after estimation:
Sargan test of overid. restrictions: chi2(89) = 170.07 Prob > chi2 = 0.000
(Not robust, but not weakened by many instruments.)
Hansen test of overid. restrictions: chi2(89) = 120.55 Prob > chi2 = 0.015
(Robust, but can be weakened by many instruments.)
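Choosing between the two is somewhat subjective, so it is hard to give a firm rule. Judging from applications in the literature, though, most authors report the Sargan statistic. In your case the number of instruments is large, so I suggest using the Sargan test.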
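Since the concern here is the instrument count (94 to 108 instruments for 164 groups), one common way to see how sensitive both tests are is to re-estimate with fewer instruments. The sketch below uses xtabond2's `laglimits()` and `collapse` suboptions inside `gmm()`. The lag range 1 to 3 is an arbitrary choice for illustration, not a recommendation made in this thread, and the data are assumed to be already declared as a panel on `subject` and `period`.

```stata
* Restrict GMM-style instruments to lags 1-3 and collapse them
* into one column per lag instead of one per lag and period
xtabond2 contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2, ///
    gmm(lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2, ///
        laglimits(1 3) collapse) ///
    twostep robust small

* Compare the instrument count and both overidentification tests with the original run
display "instruments = " e(j)
display "Sargan p = " e(sarganp) "   Hansen p = " e(hansenp)
```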
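Prof. Lian, taking my example: if I report the Sargan statistic, does that mean my model is misspecified and I need to choose the instruments again? How should I choose them? Sorry for the trouble!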
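Yes. In your example the Sargan statistic rejects the null hypothesis, which indicates a problem with the model specification.
However, in analyses with xtabond2 the Sargan test rejects the null in most cases. This puzzles me too; my guess is that the test may over-reject in sys-GMM.
I suggest trying FD-GMM to see whether the Sargan test passes. Alternatively, respecify the model: add control variables, or add lags of some variables, to weaken the serial correlation problem.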
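A minimal sketch of the FD-GMM check, keeping the variable names from the logs above and staying within xtabond2 so that the Sargan and Hansen tests are printed in the same format. The `noleveleq` option drops the levels equation, which turns system GMM into difference GMM; the rest of the specification is simply copied from the earlier run and is an assumption about what to keep fixed.

```stata
* Difference (FD) GMM: noleveleq removes the levels equation used by system GMM
xtabond2 contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2, ///
    gmm(lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2) ///
    noleveleq twostep robust small

* Overidentification tests for the difference-GMM specification
display "instruments = " e(j)
display "Sargan p = " e(sarganp) "   Hansen p = " e(hansenp)
```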
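Prof. Lian, following your lecture notes I estimated the model by two-step first-difference GMM and then ran the Sargan test. The p-value is still highly significant, so the test still rejects, and I feel that system GMM must be somewhat better than FD-GMM anyway. So my model may be misspecified. One thing I am unsure about: when adding lags of variables, how do we usually decide at which lag order to stop? Also, what control variables are typically used in dynamic panel models? Could you give an example? I could not find anything on selecting control variables in your dynamic panel lecture notes. My micro data include each individual's gender, age, household income, parents' educational background, urban/rural residence, and so on. Should I add all of them as controls?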
. xtabond contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2
> ,twostep
note: lcontribution dropped because of collinearity
Arellano-Bond dynamic panel-data estimation Number of obs = 1148
Group variable: subject Number of groups = 164
Time variable: period
Obs per group: min = 7
avg = 7
max = 7
Number of instruments = 39 Wald chi2(4) = 261.30
Prob > chi2 = 0.0000
Two-step results
------------------------------------------------------------------------------
contribution | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
contribution |
L1. | 1.088704 .1975656 5.51 0.000 .7014823 1.475925
lcontribut~2 | -.2743769 .0214021 -12.82 0.000 -.3163243 -.2324295
ldiffcontr~1 | -1.389588 .2071323 -6.71 0.000 -1.79556 -.9836165
ldiffcontr~2 | -1.281343 .1921747 -6.67 0.000 -1.657999 -.9046877
_cons | 1.365419 1.149296 1.19 0.235 -.8871592 3.617997
------------------------------------------------------------------------------
Warning: gmm two-step standard errors are biased; robust standard
errors are recommended.
Instruments for differenced equation
GMM-type: L(2/.).contribution
Standard: D.lcontribution D.lcontribution2 D.ldiffcontribution1
D.ldiffcontribution2
Instruments for level equation
Standard: _cons
.
end of do-file
. estat sargan
Sargan test of overidentifying restrictions
H0: overidentifying restrictions are valid
chi2(34) = 78.13872
Prob > chi2 = 0.0000
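SYS-GMM is clearly better than FD-GMM only under particular conditions. Also note that sys-GMM uses a large number of instruments, which is not a good thing.
As for how many control variables to include, there is no uniform standard. This is a question of model specification and must be grounded in your theoretical analysis. For that same reason I did not spend much time on model specification in the videos.
I suggest considering the following controls: time dummies, plus some of the variables you listed above. (Since I do not know what your dependent variable means or the background of the problem, I cannot give specific advice, but you can follow the specifications used in earlier studies of the same kind.)
In addition, the coefficient on L.y in your estimates is greater than 1; consider whether this result can be explained theoretically. Going back to more basic issues, check carefully whether the main variables in the raw data contain severe outliers and whether the regressors are severely collinear.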
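A rough sketch of these basic checks, using only built-in commands and the variable names from the logs above. The stub `p_` for the period dummies is a hypothetical name. Note also that the xtabond log reports `lcontribution dropped because of collinearity`, presumably because xtabond already adds `L.contribution` by default and `lcontribution` appears to be the same lag created by hand.

```stata
* Look for extreme values: detail reports percentiles and the smallest/largest observations
summarize contribution lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2, detail

* Pairwise correlations among the regressors; values near +1 or -1 point to collinearity
correlate lcontribution lcontribution2 ldiffcontribution1 ldiffcontribution2

* Time dummies as candidate controls (p_1, p_2, ... ; hypothetical stub name)
tabulate period, generate(p_)
```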