2015-04-14

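  Does the Kruskal-Wallis (K-W) test require that all groups have the same distribution before it can be used?
Source of my question: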
The other assumption of one-way anova is that the variation within the groups is equal (homoscedasticity). While Kruskal-Wallis does not assume that the data are normal, it does assume that the different groups have the same distribution, and groups with different standard deviations have different distributions. If your data are heteroscedastic, Kruskal–Wallis is no better than one-way anova, and may be worse. Instead, you should use Welch's anova for heteroscedastic data.
(http://www.biostathandbook.com/kruskalwallis.html)


  Wikipedia: Since it is a non-parametric method, the Kruskal–Wallis test does not assume a normal distribution of the residuals, unlike the analogous one-way analysis of variance. If the researcher can make the more stringent assumptions of an identically shaped and scaled distribution for all groups, except for any difference in medians, then the null hypothesis is that the medians of all groups are equal, and the alternative hypothesis is that at least one population median of one group is different from the population median of at least one other group.
This only says what follows if that assumption can be met, so it does not read like a mandatory requirement.


  However, K-W is based on the Mann–Whitney U test, so continuing with Wikipedia:

Although Mann and Whitney[1] developed the MWW test under the assumption of continuous responses with the alternative hypothesis being that one distribution is stochastically greater than the other, there are many other ways to formulate the null and alternative hypotheses such that the MWW test will give a valid test.[2]

A very general formulation is to assume that:

  • All the observations from both groups are independent of each other,
  • The responses are ordinal (i.e. one can at least say, of any two observations, which is the greater),
  • The distributions of both groups are equal under the null hypothesis, so that the probability of an observation from one population (X) exceeding an observation from the second population (Y) equals the probability of an observation from Y exceeding an observation from X. That is, there is a symmetry between populations with respect to probability of random drawing of a larger observation.
  • Under the alternative hypothesis, the probability of an observation from one population (X) exceeding an observation from the second population (Y) (after exclusion of ties) is not equal to 0.5. The alternative may also be stated in terms of a one-sided test, for example: P(X > Y) + 0.5 P(X = Y)  > 0.5.
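To make the quantity in the last two bullets concrete, here is a minimal sketch that estimates P(X > Y) + 0.5 P(X = Y) by counting over all cross-sample pairs. The function name and the sample values are made up for illustration; they are not from the quoted article. The raw count before dividing is the Mann–Whitney U statistic for the first sample.

```python
def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y) from all cross-sample pairs."""
    wins = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                wins += 1.0   # x beats y
            elif xi == yj:
                wins += 0.5   # a tie counts as half a win
    # 'wins' is the U statistic for x; dividing by the number of pairs
    # turns it into a probability estimate.
    return wins / (len(x) * len(y))


# Illustrative values only.
x = [1, 2, 3]
y = [2, 3, 4]
print(prob_superiority(x, y))  # 0.5 would mean neither group tends to be larger
```

A value away from 0.5 is what the test reacts to, which is why unequal spreads alone can matter even when the medians agree.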

Under more strict assumptions than those above, e.g., if the responses are assumed to be continuous and the alternative is restricted to a shift in location (i.e. F1(x) = F2(x + δ)), we can interpret a significant MWW test as showing a difference in medians. Under this location shift assumption, we can also interpret the MWW as assessing whether the Hodges–Lehmann estimate of the difference in central tendency between the two populations differs from zero. The Hodges–Lehmann estimate for this two-sample problem is the median of all possible differences between an observation in the first sample and an observation in the second sample.
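The Hodges–Lehmann estimate described in the quoted paragraph can be sketched directly from its definition. This is only an illustration: the function name, the sign convention (first sample minus second) and the numbers are my own assumptions.

```python
from statistics import median


def hodges_lehmann_shift(x, y):
    """Median of all pairwise differences x_i - y_j (first minus second)."""
    return median(xi - yj for xi in x for yj in y)


# Illustrative values only.
x = [5, 7, 9]
y = [1, 2, 6]
print(hodges_lehmann_shift(x, y))  # median of the 9 pairwise differences
```

Reading this number as a difference in medians only makes sense under the location-shift assumption stated in the quote.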

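The third bullet does say that, under the null hypothesis, the two groups have the same distribution. But I have only been studying statistics for a short time, so I cannot be sure whether that sentence means M-W requires the two groups to have the same distribution. And although K-W is based on M-W, M-W compares two groups while K-W handles three or more, so I am also not sure whether K-W has to satisfy everything M-W has to satisfy.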
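For reference, this is roughly how the K-W statistic is built from pooled ranks, which shows why it carries over the rank-based logic of M-W to three or more groups. It is a minimal sketch with assumptions: the function names are made up, ties get midranks, and the usual tie correction and the p-value (normally taken from a chi-square distribution with k - 1 degrees of freedom) are left out.

```python
def midranks(values):
    """1-based ranks of the pooled values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = midranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Illustrative values only.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Only the ranks enter H, so anything that changes the rank pattern between groups, including a difference in spread, can affect the result.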

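  So I hope the experts here can help me answer or analyze this: my data are normal after a log transform, but the variances are not homogeneous. Should I choose the post hoc methods of one-way ANOVA for unequal variances, or the nonparametric K-W test? Many thanks!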
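For the situation described (normal after a log transform, unequal variances), the Welch's anova mentioned in the handbook quote can be sketched as below. This is a hedged illustration, not SPSS output: the function name and the data are invented, and only the F statistic and its degrees of freedom are computed. The p-value would come from an F distribution with (df1, df2) degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.

```python
import math
from statistics import mean, variance


def welch_anova(groups):
    """Welch's F statistic and degrees of freedom for groups with unequal variances."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Each group is weighted by n_i / s_i^2 (a zero sample variance would fail here).
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / w_sum
    between = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Made-up positive measurements; the log transform mirrors the post.
raw = [[12.0, 15.0, 11.0, 14.0], [20.0, 45.0, 9.0, 70.0], [30.0, 33.0, 29.0, 35.0]]
logged = [[math.log(v) for v in g] for g in raw]
f_stat, df1, df2 = welch_anova(logged)
print(f_stat, df1, df2)
```

Unlike classical one-way ANOVA, the denominator degrees of freedom are generally not an integer, because they are estimated from the group variances.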




All replies
2015-4-14 12:28:54
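ANOVA assumes normality and equal variances. If these two conditions are not met, you can consider nonparametric methods. When the groups are independent of one another, K-W is a fairly reliable method.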

2015-4-14 13:18:42
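jamescha2009 posted on 2015-4-14 12:28
ANOVA assumes normality and equal variances. If these two conditions are not met, you can consider nonparametric methods. When the groups are independent of one another, K ...
Yes, I know all of that, and my data do meet the independence requirement for K-W. I only want to know whether K-W additionally requires, as a necessary condition, that the groups have the same distribution.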

2015-4-14 18:16:32
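I often browse question threads myself and find that many of them never get a final, reliable answer. Others may run into a problem similar to mine, so here is the answer I worked out from the material I found.

I no longer plan to use the nonparametric method here. Before using it, its loose assumptions look very tempting, but nobody can be one hundred percent sure of its validity. Besides, K-W can only show whether a difference exists; where the difference lies and why is not as clear as with ANOVA. K-W in SPSS can show something similar to the S-N-K homogeneous subsets, but with many groups that colorful chart is a real headache to read.

Most importantly, when Kruskal and Wallis published the method they stated explicitly that it does not need the seemingly harsh assumptions of ANOVA: no normal distribution and no homogeneity of variance. Later papers gradually showed that this is not statistically correct, and refined versions have since been developed, for example that the null hypothesis requires the groups to have the same distribution and that the variances must also be homogeneous. One paper also said that the two authors admitted they had made a mistake on the variance issue, something like confusing means and medians with homogeneity. I have forgotten the details; I only skimmed a few papers, but that was roughly the idea.

None of that matters much, though. What matters is what I can use if I do not use this method, and that brings me back to the source I quoted in post #1. That handbook is a very valuable reference. It proposes the Games-Howell test in Welch's anova as the alternative to ANOVA. And then you notice that the Games-Howell test is already there among the post hoc tests of one-way ANOVA, listed under the options for when equal variances are not assumed. So I went all the way around and ended up back where I started, with the old method: log-transform the data so that they follow a normal distribution, then use Tukey for the data with homogeneous variances and the Games-Howell test for the data without.

The reason I agonized over parametric versus nonparametric is that in most of the papers I have read in my field I have not seen anyone use the Games-Howell test. They all say: if the data are normal, use such-and-such; if not, use such-and-such nonparametric test. Of course their data are somewhat different from mine. I was just afraid of being rejected over this, which would be absurd. So that is it: I am going with the Games-Howell test. If you reject me, I already have a pile of references ready to reject you back.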
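To show what the Games-Howell test actually computes for each pair of groups, here is a rough sketch of the pairwise statistic and its Welch-type degrees of freedom. It is not what SPSS runs internally; the function name and the data are invented, and the p-value step is left out. That step would compare q against a studentized range distribution with k groups and df degrees of freedom (recent SciPy versions expose one as `scipy.stats.studentized_range`).

```python
import math
from itertools import combinations
from statistics import mean, variance


def games_howell_stats(groups):
    """For every pair of groups, return (i, j, mean difference, q statistic, Welch df)."""
    results = []
    for i, j in combinations(range(len(groups)), 2):
        gi, gj = groups[i], groups[j]
        a = variance(gi) / len(gi)  # s_i^2 / n_i
        b = variance(gj) / len(gj)  # s_j^2 / n_j
        diff = mean(gi) - mean(gj)
        # Studentized-range form: q = |diff| / sqrt((a + b) / 2),
        # which equals sqrt(2) times the absolute Welch t statistic.
        q = abs(diff) / math.sqrt((a + b) / 2)
        # Welch-Satterthwaite degrees of freedom for this pair.
        df = (a + b) ** 2 / (a ** 2 / (len(gi) - 1) + b ** 2 / (len(gj) - 1))
        results.append((i, j, diff, q, df))
    return results


# Made-up log-scale values with visibly different spreads.
groups = [[2.4, 2.7, 2.5, 2.6], [3.0, 3.8, 2.2, 4.2], [3.4, 3.5, 3.4, 3.6]]
for i, j, diff, q, df in games_howell_stats(groups):
    print(i, j, round(diff, 3), round(q, 3), round(df, 2))
```

Each pair uses only its own two variances, which is why this procedure does not need the pooled, equal-variance error term that Tukey relies on.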

2015-4-14 18:19:00
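One more point: I also see many people answer other people's questions by saying that if such-and-such is not met, go nonparametric. Personally I still think that if a parametric method is available you should use it, and keep nonparametric methods for when you have no other choice.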

2021-7-30 16:43:58
I have not logged in for years, but I logged in specifically because of the original poster's rigor. A thumbs-up for the OP.