2019-12-13
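This is a problem I run into in everyday work. I read through many replies on the forum without finding a clear solution, so after some digging I tracked down the cause and am sharing it here. A simple example illustrates it.
1. reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code). In this model I control for year and industry and cluster on the individual firm (code), so the standard errors are cluster-robust.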


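With cluster-robust standard errors, the F statistic and its p-value are both reported as missing. Clicking the blue link on the F value opens Stata's own explanation of the problem. As I understood it, one cause is a cluster (code) that contains only a single observation, in which case clustering cannot work. But when I counted the observations for each code, the minimum was 2; that is, every sample firm has at least two years of data, and no firm appears in the data only once. (Note that if your data do contain such firms, you need to drop them.)
Since no code has only a single observation, the first possibility is ruled out.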
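The per-code count described above could be done roughly like this. It is only a sketch: it reuses the variable names from the regression in step 1, and insample and n_per_code are helper names made up for illustration.

```stata
* Fit the model once so that e(sample) marks the estimation sample
quietly reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code)

* Hypothetical helper: 1 if the observation was used in the regression
gen byte insample = e(sample)

* Number of estimation-sample observations for each firm (code)
bysort code: egen n_per_code = total(insample)

* The minimum shows whether any firm contributes only one observation
summarize n_per_code if insample
count if insample & n_per_code == 1
```

If the count on the last line is 0, no cluster is a singleton and this cause can be set aside.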
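2. Among the year and industry controls, some industry may have only one observation in a given year. So, using the same approach, I first checked the data for such cases.
Sure enough, some industries do have only one observation in a particular year.
Next I dropped those single-observation cases and ran the regression again.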
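One way the industry-year check and the drop could look is sketched below. It assumes the same variable names as the regression in step 1; in_est and n_cell are hypothetical helper names. Note that drop removes observations from the data in memory, so it may be worth working on a copy.

```stata
* Mark the estimation sample of the original model
quietly reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code)
gen byte in_est = e(sample)

* Number of estimation-sample observations in each industry-year cell
bysort ind year: egen n_cell = total(in_est)

* See how many observations sit in cells of each size
tabulate n_cell if in_est

* Drop the single-observation cells, then re-estimate
drop if in_est & n_cell == 1
reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code)

* e(F) is missing (.) when Stata could not compute the model test
display e(F)
```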

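Now the results come out, and the F statistic is reported.
To summarize, there are two possible causes of the missing F statistic, each with its own check:
1. Check whether any cluster (code in the example above) has only one observation.
2. When year and industry controls are included, also check whether any industry has only one observation in a given year.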

All replies
2020-2-10 04:52:41
Are any standard errors missing?

    If any standard errors are reported as dots, something is wrong with your model:  one or more coefficients could not be estimated in the normal statistical sense.  You need to address that problem and ignore the rest of this discussion.


Are you using bootstrap or jackknife?

    The VCE you have just estimated is not of sufficient rank to perform the model test.  This is most likely due to not having enough replications.

    The bootstrap command has a reps(#) option, and if # is less than the number of coefficients in the model, the VCE will have insufficient rank.  The solution is to rerun bootstrap with a much larger number of replications.

    The jackknife command estimates the VCE by refitting the model for each observation in the dataset, leaving the associated observation out of the estimation sample each time.  As with the conventional variance estimator, the VCE will be singular if
    the number of observations is less than the number of parameters.  See the following discussion if you supplied the cluster() option to jackknife.


Are you using a svy estimator or did you specify the vce(cluster clustvar) option?

    The VCE you have just estimated is not of sufficient rank to perform the model test.  As discussed in [R] test, the model test with clustered or survey data is distributed as F(k,d-k+1) or chi2(k), where k is the number of constraints and d=number
    of clusters or d=number of PSUs minus the number of strata.  Because the rank of the VCE is at most d and the model test reserves 1 degree of freedom for the constant, at most d-1 constraints can be tested, so k must be less than d.  The model that
    you just fit does not meet this requirement.

    To simplify the remaining discussion, let's consider the case of clustered data.  This discussion applies to survey estimation in general by substituting, "PSUs - strata" for "clusters".

    There is no mechanical problem with your model, but you need to consider carefully whether any of the reported standard errors mean anything.  The theory that justifies the standard error calculation is asymptotic in the number of clusters, and we
    have just established that you are estimating at least as many parameters as you have clusters.

    That concern aside, the model test statistic issue is that you cannot simultaneously test that all coefficients are zero because there is not enough information.  You could test a subset, but not all, and so Stata refuses to report the overall model
    test statistic.

    Here note the degrees of freedom reported for the chi2 or F.  You might see chi2(6) or F(6, 5).  If you were to count the number of coefficients that would be constrained to 0 in a model test in this case, you would find that number to be greater
    than 6.  You could find out what that number is by reestimating the model parameters without the vce(robust) and vce(cluster clustvar) options (or, for the survey commands, using the corresponding non-svy estimator).  In any case, the 6 reported is
    the maximum number of coefficients that could be simultaneously tested.


Is there a regressor that is nonzero for only 1 observation or for one cluster?

    The VCE you have just estimated is not of sufficient rank to perform the model test.  This can happen if there is a variable in your model that is nonzero for only 1 observation in the estimation sample.  Likewise, it can happen if a variable is
    nonzero for only one cluster when using the cluster-robust VCE.  In such cases the derivative of the sum-of-squares or likelihood function with respect to that variable's parameter is zero for all observations.  That implies that the
    outer-product-of-gradients (OPG) variance matrix is singular.  Because the OPG variance matrix is used in computing the robust variance matrix, the latter is therefore singular as well.
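For the original poster's model, the "nonzero for only one cluster" case in the help text above could be checked for the industry dummies roughly as follows. This is a sketch that assumes the variable names from the first post; est, first_ic and n_code_in_ind are hypothetical helper names.

```stata
* Mark the estimation sample
quietly reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code)
gen byte est = e(sample)

* Tag one observation per industry-firm pair within the estimation sample
egen byte first_ic = tag(ind code) if est

* Number of distinct firms (clusters) in each industry
bysort ind: egen n_code_in_ind = total(first_ic)

* Industries whose dummy is nonzero for only one cluster
tabulate ind if est & n_code_in_ind == 1
```

Any industry listed by the last command has a dummy that is nonzero within a single cluster, which is the situation the help text describes.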

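One more point: the cause may also be insufficient degrees of freedom, that is, the number of clusters must be greater than the number of regressors in the equation.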
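A rough way to compare the two numbers, following the suggestion in the help text to refit without the cluster option. It is a sketch that assumes the variable names from the first post; e(N_clust), e(rank) and e(df_m) are results stored by regress.

```stata
* Clustered fit: number of clusters and rank of the estimated VCE
quietly reg roa sif lnage lncopen i.year i.ind if region == 2, vce(cluster code)
display "clusters:            " e(N_clust)
display "rank of e(V):        " e(rank)

* Same model without vce(cluster): model df is the number of
* coefficients the overall F test would constrain to zero
quietly reg roa sif lnage lncopen i.year i.ind if region == 2
display "coefficients tested: " e(df_m)
```

If the number of coefficients tested is not smaller than the number of clusters, the overall model test cannot be computed.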
2020-4-13 14:11:28
Thanks!
2020-5-2 14:09:43
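So can the regression results from a model with a missing F statistic still be used? And with two main test equations, is it acceptable to cluster one and not the other?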
2020-5-24 17:01:35
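梓煜 posted on 2020-5-2 14:09
So can the regression results from a model with a missing F statistic still be used? And with two main test equations, is it acceptable to cluster one and not the other?
I have the same question. Did you find out? When F is shown in blue as missing, can the regression results be used in a paper?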