About the Tobit model

2006-05-17

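The Tobit model has two requirements: (1) the dependent variable takes the value 0 with positive probability, and (2) it is continuously distributed over values to the right of 0 (Wooldridge). I have computed efficiency scores with DEA and now want to explain what drives them. All the scores are less than or equal to 1: because they are relative efficiencies, some equal 1 and the rest lie strictly between 0 and 1. None equals 0, since no unit can be completely inefficient, so the first requirement is not met. If I want to use a Tobit model, is there an extension that allows the left-hand limit to be something other than 0?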

All replies

ivannj

2006-5-18 16:47:00

Bump.

wy0827

2006-7-12 17:47:00

I have the same question. Could someone who knows this well point me in the right direction?

qiangli

2006-7-12 19:15:00

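First: a Tobit model can be censored on the left, on the right, or on both sides.

Second: the censoring point does not have to be 0. It can be any value, set according to the situation. In an income survey, for example, you may only observe incomes below a certain level.

Third: for the problem you describe, I think the method below is the one to use.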
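Before the FAQ text, the first two points can be made concrete with a small sketch. It evaluates the log-likelihood of a Tobit model whose limits are arbitrary and optional on each side, so DEA-style scores that pile up at 1 are treated as censored from above at 1 rather than from below at 0. The function name `tobit_loglik`, the single-regressor setup, the data, and the parameter values are all hypothetical and chosen only for illustration. It evaluates the likelihood at given parameters and does not estimate them. (In Stata, the `tobit` command's `ll()` and `ul()` options play the role of `lower` and `upper` here.)

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, x, b0, b1, sigma, lower=None, upper=None):
    """Log-likelihood of a Tobit model with optional lower/upper limits.

    Latent model: y* = b0 + b1 * x + e, with e ~ N(0, sigma^2).
    The observed y is y* clipped to [lower, upper]; a limit of None
    means there is no censoring on that side.
    """
    total = 0.0
    for yi, xi in zip(y, x):
        mu = b0 + b1 * xi
        if lower is not None and yi <= lower:
            # Censored from below: P(y* <= lower)
            total += math.log(norm_cdf((lower - mu) / sigma))
        elif upper is not None and yi >= upper:
            # Censored from above: P(y* >= upper) = Phi((mu - upper) / sigma)
            total += math.log(norm_cdf((mu - upper) / sigma))
        else:
            # Uncensored: normal density of the observed value
            total += math.log(norm_pdf((yi - mu) / sigma) / sigma)
    return total

# Hypothetical DEA-style scores in (0, 1], with some piled up at 1.
scores = [0.62, 0.80, 1.00, 0.45, 1.00, 0.91]
x = [1.0, 2.0, 4.0, 0.5, 3.5, 2.5]

# Upper limit at 1, no lower limit: none of the scores is treated as censored at 0.
print(tobit_loglik(scores, x, b0=0.4, b1=0.2, sigma=0.15, upper=1.0))
```

Passing both `lower` and `upper` gives the two-limit case; a real estimation would maximize this function over `b0`, `b1`, and `sigma`.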

http://www.stata.com/support/faqs/stat/logit.html

How do you estimate a model when the dependent variable is a proportion?

Title: Logit transformation
Author: Allen McDowell, StataCorp; Nicholas J. Cox, Durham University, UK
Date: August 2001; updated August 2004

A traditional solution to this problem is to perform a logit transformation on the data. Suppose that your dependent variable is called y and your independent variables are called X. Then, one assumes that the model that describes y is

 y = 1 / (1 + exp(-XB))

If one then performs the logit transformation, the result is

 ln( y / (1 - y) ) = XB

We have now mapped the original variable, which was bounded by 0 and 1, to the real line. One can now estimate this model using OLS or WLS, for example by using regress. Of course, one cannot perform the transformation on observations where the dependent variable is zero or one; the result will be a missing value, and that observation would subsequently be dropped from the estimation sample.
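As a rough illustration of this paragraph, the sketch below applies the logit transformation and then fits ordinary least squares with a single regressor, using only the Python standard library. The data and the helper names `logit` and `ols_simple` are hypothetical. Observations with y equal to 0 or 1 get no transformed value and drop out of the fit, which is exactly the limitation described above.

```python
import math

def logit(y):
    """Return ln(y / (1 - y)), or None when y is 0 or 1 (transformation undefined)."""
    if y <= 0.0 or y >= 1.0:
        return None
    return math.log(y / (1.0 - y))

def ols_simple(x, z):
    """Intercept and slope of a least-squares fit of z on a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope

# Hypothetical data: two of the six proportions sit exactly on a boundary.
y = [0.20, 0.35, 0.50, 1.00, 0.80, 0.00]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Keep only observations whose logit is defined.
kept = [(xi, logit(yi)) for xi, yi in zip(x, y) if logit(yi) is not None]
b0, b1 = ols_simple([xi for xi, _ in kept], [zi for _, zi in kept])

print(len(y) - len(kept), "observations dropped")
print("intercept:", b0, "slope:", b1)
```

For efficiency scores with many values exactly equal to 1, this route would discard every fully efficient unit.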

A better alternative is to estimate using glm with family(binomial), link(logit), and robust; this is the method proposed by Papke and Wooldridge (1996). At the time this article was published, Stata’s glm command could not fit such models, and this fact is noted in the article. glm has since been enhanced specifically to deal with fractional response data.
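To show what this estimator does, here is a minimal sketch that maximizes the Bernoulli quasi-log-likelihood with a logit link by Newton-Raphson, for an intercept and one regressor. It is not Stata's glm implementation; the function name `fit_fractional_logit`, the fixed iteration count, and the data are assumptions made for illustration. It returns point estimates only and leaves out the robust standard errors that the recommended approach also calls for.

```python
import math

def fit_fractional_logit(x, y, iterations=25):
    """Newton-Raphson for the Bernoulli quasi-likelihood with a logit link.

    Model for the conditional mean: E[y | x] = 1 / (1 + exp(-(b0 + b1 * x))).
    y may take any value in [0, 1], including exactly 0 or 1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Score (gradient) of the quasi-log-likelihood
            g0 += yi - p
            g1 += (yi - p) * xi
            # Negative Hessian (information matrix), 2 x 2 and symmetric
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical efficiency-style data: values of exactly 1 stay in the sample.
y = [0.20, 0.35, 0.50, 1.00, 0.80, 1.00]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

b0, b1 = fit_fractional_logit(x, y)
print("intercept:", b0, "slope:", b1)
```

Unlike the logit-transform-then-OLS route, no observation is dropped here, because the quasi-likelihood is defined for y equal to 0 or 1.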

In either case, there may well be a substantive issue of interpretation. Let us focus on interpreting zeros: the same kind of issue may well arise for ones. Suppose the y variable is proportion of days workers spend off sick. There are two extreme possibilities. The first extreme is that all observed zeros are in effect sampling zeros: each worker has some nonzero probability of being off sick, and it is merely that some workers were not, in fact, off sick in our sample period. Here, we would often want to include the observed zeros in our analysis and the glm route is attractive. The second extreme is that some or possibly all observed zeros must be considered as structural zeros: these workers will not ever report sick, because of robust health and exemplary dedication. These are extremes, and intermediate cases are also common. In practice, it is often helpful to look at the frequency distribution: a marked spike at zero or one may well raise doubt about a single model fitted to all data.
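The frequency check suggested at the end of this paragraph can be as simple as tabulating how much of the sample sits exactly on each boundary. The sketch below does that for a hypothetical sample; the function name `boundary_shares` and the data are illustrative only, and what counts as a "marked spike" remains a judgment call.

```python
from collections import Counter

def boundary_shares(y):
    """Share of observations exactly at 0, strictly inside (0, 1), and exactly at 1."""
    n = len(y)
    counts = Counter(
        "zero" if v == 0.0 else "one" if v == 1.0 else "interior" for v in y
    )
    return {key: counts[key] / n for key in ("zero", "interior", "one")}

# Hypothetical proportions of days off sick for ten workers.
y = [0.0, 0.0, 0.0, 0.12, 0.30, 0.05, 0.0, 0.22, 1.0, 0.0]
print(boundary_shares(y))
```

Here half the sample is exactly zero, the kind of spike that would raise doubt about fitting a single model to all the data.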

A second example might be data on trading links between countries. Suppose the y variable is proportion of imports from a certain country. Here a zero might be structural if two countries never trade, say on political or cultural grounds. A model that fits over both the zeros and the nonzeros might not be advisable, so that a different kind of model should be considered.

Reference

Papke, L. E. and J. Wooldridge. 1996. Econometric methods for fractional response variables with an application to 401(k) plan participation rates. Journal of Applied Econometrics 11: 619–632.