全部版块 我的主页
论坛 数据科学与人工智能 数据分析与数据科学 SPSS论坛
4791 3
2013-04-08
悬赏 100 个论坛币 未解决
求大侠帮忙!要研究一个总体指标满意度与各变量的关系,现打算用回归分析来做,但是根据前人的经验,满意度是存在一定的右偏的,非完全正态的。请问,有没有办法通过增加虚拟变量或者其他的什么方法来处理,使其能够满足回归分析的条件,还是说既然是右偏的就只能放弃用回归分析来做的思路?急,恳请大家知道的帮个忙。
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

全部回复
2013-4-8 21:35:40
1. Taking the square root helps normalize a distribution skewed right.

When one applies a square root transformation, the square root of every value is taken. However, as one cannot take the square root of a negative number, if there are negative values for a variable, a constant must be added to move the minimum value of the distribution above 0, preferably to 1.00 (the rationale for this assertion is explained below). Another important point is that numbers of 1.00 and above behave differently than numbers between 0.00 and 0.99. The square root of numbers above 1.00 always become smaller, 1.00 and 0.00 remain constant, and number between 0.00 and 1.00 become larger (the square root of 4 is 2, but the square root of 0.40 is 0.63). Thus, if you apply a square root to a continuous variable that contains values between 0 and 1 as well as above 1, you are treating some numbers differently than others, which is probably not desirable in most cases.

2. Taking the logarithm of X also helps normalize a right-skewed distribution.

Logarithmic transformations are actually a class of transformations, rather than a single transformation. In brief, a logarithm is the power (exponent) to which a base number must be raised in order to get the original number. Any given number can be expressed as y to the x power in an infinite number of ways. For example, if we were talking about base 10, 1 is 100, 100 is 102, 16 is 101.2, and so on. Thus, log10(100)=2 and log10(16)=1.2. However, base 10 is not the only option for log transformations. Another common option is the Natural Logarithm, where the constant e (2.7182818) is the base. In this case the natural log 100 is 4.605. As the logarithm of any negative number or number less than 1 is undefined, if a variable contains values less than 1.0, a constant must be added to move the minimum value of the distribution, preferably to 1.00.
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

2013-4-9 11:24:07
不好意思啊 本人英文真心差  看不懂啊 麻烦能用中文说明下吗,谢了
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

2013-4-9 12:56:54
猜测是提示你利用变换 sqrt(y)+平移     或者    log(y)搞搞,不一定非得线性回归啊。把满意度分级再利用Logistic回归不知道成不成啊
二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

相关推荐
栏目导航
热门文章
推荐文章

说点什么

分享

扫码加好友,拉您进群
各岗位、行业、专业交流群