2010-03-10
Logit: how large does the sample need to be? How many observations are enough?
All replies
2010-3-11 19:56:15
Sample size for logistic regression is actually discussed in several texts. A fairly well-supported rule of thumb is: at least 10 events per study factor. Note that this counts events (outcomes), not the total sample. For example, if you are studying risk factors for gastric cancer, gastric cancer is the outcome, so it is the number of cancer cases, not your total number of observations, that must meet this threshold; the total sample is naturally larger still. Say I have 7 study factors: I then need at least 70 cases. In a 1:1 case-control design that means 140 observations in total, and with 1:2 matching or higher, even more.
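The arithmetic above is easy to script. A minimal Stata sketch (the values k = 7 and EPV = 10 simply restate the example in this post, not a universal standard):

```stata
* events-per-variable (EPV) rule of thumb: >= 10 events per predictor
local k   = 7                      // number of candidate predictors
local epv = 10                     // required events per predictor
local events = `k' * `epv'         // minimum number of outcome events
display "events (cases) needed:     " `events'
display "total N with 1:1 controls: " 2 * `events'
display "total N with 1:2 controls: " 3 * `events'
```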
2010-3-11 21:26:30
Logit is estimated by maximum likelihood (ML), and the desirable properties of ML estimators are large-sample (asymptotic) ones. J. Scott Long's book mentions that a sample size above 500 is safe.
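As a quick illustration (a sketch using Stata's bundled auto dataset; the covariates are arbitrary), you can fit a logit by ML and then check how large the estimation sample actually is:

```stata
sysuse auto, clear                    // built-in example dataset
logit foreign price mpg weight        // logit fit by maximum likelihood
display "estimation sample N = " e(N) // compare against the rules of thumb above
```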


See Allison, P.D. (1995), Survival Analysis Using the SAS System: A Practical Guide, p. 80:
All these approximations get better as the sample size gets larger.
The fact that these desirable properties have only been proven for large
samples does not mean that ML has bad properties for small samples. It simply
means that we usually don't know what the small-sample properties are. And
in the absence of attractive alternatives, researchers routinely use ML
estimation for both large and small samples. Although I won't argue against
that practice, I do urge caution in interpreting p-values and confidence
intervals when samples are small. Despite the temptation to accept larger
p-values as evidence against the null hypothesis in small samples, it is actually
more reasonable to demand smaller values to compensate for the fact that the
approximation to the normal or chi-square distributions may be poor.
The other reason for ML's popularity is that it is often
straightforward to derive ML estimators when there are no other obvious
possibilities. As we will see, one case that ML handles nicely is data with
censored observations. While you can use least squares with certain
adjustments for censoring (Lawless 1982, p. 328), such estimates often have
much larger standard errors, and there is little available theory to justify the
construction of hypothesis tests or confidence intervals.
The basic principle of ML is to choose as estimates those values
that will maximize the probability of observing what we have, in fact,
observed. There are two steps to this: (1) write down an expression for the
probability of the data as a function of the unknown parameters, and (2) find
the values of the unknown parameters that make the value of this expression
as large as possible.
The first step is known as constructing the likelihood function. To
accomplish this, you must specify a model, which amounts to choosing a
probability distribution for the dependent variable and choosing a functional
form that relates the parameters of this distribution to the values of the
covariates. We have already considered those two choices.
The second step—maximization—typically requires an iterative
numerical method, that is, one involving successive approximations. Such
methods are often computationally demanding, which explains why ML
estimation has become popular only in the last two decades.
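To make Allison's two steps concrete, here is a minimal Stata sketch (my own illustration, not from the book): step (1) writes the logit log likelihood, using the identity 1 - invlogit(xb) = invlogit(-xb), as an ml evaluator, and step (2) hands the iterative maximization to ml maximize. The program name mylogit_lf and the auto-dataset variables are arbitrary choices:

```stata
* step 1: express the log likelihood as a function of the linear index xb
program define mylogit_lf
    args lnf xb
    quietly replace `lnf' = ln(invlogit(`xb'))  if $ML_y1 == 1
    quietly replace `lnf' = ln(invlogit(-`xb')) if $ML_y1 == 0
end

* step 2: maximize it numerically by successive approximations
sysuse auto, clear
ml model lf mylogit_lf (foreign = price mpg weight)
ml maximize            // results match -logit foreign price mpg weight-
```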
2010-3-13 09:22:12
I really envy the moderator's breadth of knowledge! Thanks!
2020-2-15 14:51:52
Thanks for sharing.