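Thanks to the original poster for the good question; it gave me a chance to review this.
The official Stata forum already has an answer to this question, so I'll just paste it here.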
Concerning the pseudo-R2, we use the formula
pseudo-R2 = 1 − L1/L0
where L0 and L1 are the constant-only and full model log-likelihoods, respectively.
For discrete distributions, the log likelihood is the log of a probability, so it is always negative (or zero). Thus 0 ≥ L1 ≥ L0, and so 0 ≤ L1/L0 ≤ 1, and so 0 ≤ pseudo-R2 ≤ 1 for DISCRETE distributions.
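To make the discrete case concrete, here is a minimal Python sketch (not Stata code). The eight observations are made up for illustration, and the "full model" is assumed to be a binary-outcome model with a single 0/1 regressor, whose maximum-likelihood fitted probabilities are simply the group means of y. The helper names `pseudo_r2` and `bernoulli_loglik` are hypothetical.

```python
import math

def pseudo_r2(l0, l1):
    """pseudo-R2 = 1 - L1/L0, with L0 constant-only and L1 full-model log likelihood."""
    return 1.0 - l1 / l0

def bernoulli_loglik(y, p):
    """Log likelihood of 0/1 outcomes y given fitted probabilities p."""
    return sum(math.log(pi if yi == 1 else 1.0 - pi) for yi, pi in zip(y, p))

# Made-up illustrative data: one binary regressor x, binary outcome y.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]

# Constant-only model: every observation gets the overall mean of y.
p_bar = sum(y) / len(y)
l0 = bernoulli_loglik(y, [p_bar] * len(y))

# Full model (assumption: single 0/1 regressor, so the ML fit equals the group means).
group_mean = {
    g: sum(yi for xi, yi in zip(x, y) if xi == g) / x.count(g) for g in (0, 1)
}
l1 = bernoulli_loglik(y, [group_mean[xi] for xi in x])

r2 = pseudo_r2(l0, l1)
print(f"L0 = {l0:.4f}, L1 = {l1:.4f}, pseudo-R2 = {r2:.4f}")
```

Both log likelihoods are negative and L1 is the larger of the two, so the ratio L1/L0 lands between 0 and 1, as the argument above says it must.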
For continuous distributions, the log likelihood is the log of a density. Since density functions can be greater than 1 (cf. the normal density at 0), the log likelihood can be positive or negative. Similarly, mixed continuous/discrete likelihoods like tobit can also have a positive log likelihood.
If L1 > 0 and L0 < 0, then L1/L0 < 0, and 1 − L1/L0 > 1.
If L1 > L0 > 0, then L1/L0 > 1, and 1 − L1/L0 < 0.
Hence, this formula for pseudo-R2 can give answers > 1 or < 0 for continuous or mixed continuous/discrete likelihoods like tobit. In those cases it cannot be read as a proportion between 0 and 1.
For many models, including tobit, the pseudo-R2 has no real meaning.
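Both out-of-range cases can be reproduced with a small Python sketch. It assumes a plain linear regression with normal errors fitted by maximum likelihood (not tobit), uses five made-up data points, and the helper names are hypothetical. The second data set is the first with y divided by 100, which is enough to move the pseudo-R2 from above 1 to below 0.

```python
import math

def gaussian_loglik_at_mle(residuals):
    """Normal log likelihood evaluated at the ML variance estimate sum(r^2)/n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def fit_loglikelihoods(x, y):
    """Return (L0, L1): constant-only and simple-regression log likelihoods."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    l0 = gaussian_loglik_at_mle([yi - y_bar for yi in y])
    # Least-squares slope and intercept (the ML estimates under normal errors).
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    l1 = gaussian_loglik_at_mle(
        [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    )
    return l0, l1

# Made-up illustrative data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.0, 3.1, 3.9]

# Case 1: L1 > 0 and L0 < 0, so pseudo-R2 > 1.
l0_a, l1_a = fit_loglikelihoods(x, y)
r2_a = 1.0 - l1_a / l0_a

# Case 2: same data with y rescaled by 1/100; now L1 > L0 > 0, so pseudo-R2 < 0.
y_small = [yi / 100.0 for yi in y]
l0_b, l1_b = fit_loglikelihoods(x, y_small)
r2_b = 1.0 - l1_b / l0_b

print(f"case 1: L0 = {l0_a:.3f}, L1 = {l1_a:.3f}, pseudo-R2 = {r2_a:.3f}")
print(f"case 2: L0 = {l0_b:.3f}, L1 = {l1_b:.3f}, pseudo-R2 = {r2_b:.3f}")
```

The fit is equally good in both cases; only the units of y changed. That is why the number carries no real meaning for a continuous likelihood.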
This formula for pseudo-R2 is nothing more than a reworking of the model chi-squared, which is 2(L1 − L0). Thus even for discrete distributions where 0 ≤ pseudo-R2 ≤ 1, it is still better to report the model chi-squared and its p-value, not the pseudo-R2.
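The "reworking" can be checked numerically: 1 − L1/L0 = (L0 − L1)/L0 = chi-squared/(−2·L0). The Python sketch below uses illustrative log likelihoods (assumed to come from a binary-outcome model with 8 observations and one regressor, so the test has 1 degree of freedom). For 1 degree of freedom the chi-squared upper-tail probability equals erfc(sqrt(chi2/2)), which avoids any non-standard library; with more regressors a general chi-squared distribution function would be needed instead.

```python
import math

# Illustrative log likelihoods (assumed: 8 binary observations, one regressor).
l0 = 8 * math.log(0.5)                           # constant-only model
l1 = 6 * math.log(0.75) + 2 * math.log(0.25)     # full model

# Model chi-squared (likelihood-ratio statistic).
chi2 = 2.0 * (l1 - l0)

# p-value for 1 degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

# The pseudo-R2 is the same quantity rescaled by -2*L0.
r2 = 1.0 - l1 / l0
r2_from_chi2 = chi2 / (-2.0 * l0)

print(f"chi2 = {chi2:.4f}, p-value = {p_value:.4f}")
print(f"pseudo-R2 = {r2:.4f}, chi2 / (-2*L0) = {r2_from_chi2:.4f}")
```

The pseudo-R2 adds no information beyond the chi-squared and L0, while the chi-squared comes with a p-value that can actually be interpreted.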