johnyanlei posted on 2010-10-13 11:11
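I am working on a study in which the independent variable is a 0/1 dummy variable. The results look reasonable, but the problem is that observations with a value of 1 make up only 1/40 of the total sample.
That is, out of 4,000 firms only 100 have a dummy value of 1; the other 3,900 are all 0.
Would such a lopsided split make the regression less convincing?
Any advice would be appreciated. Thanks!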
It should not be a problem, as long as the dummy is motivated by a real question or a specific behavior and the result is supported by your data.
For example, the 100 firms might be the larger firms, and larger firms tend to spend more on advertising.
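
To put a rough number on the 100 vs. 3,900 split, here is a minimal Python sketch. It rests on assumptions that are not in the original post: a simple regression with the dummy as the only regressor, homoskedastic normal errors with standard deviation 1, and a hypothetical true effect of 0.5. Under those assumptions the standard error of the dummy coefficient is sigma * sqrt(1/n1 + 1/n0), so precision is driven mostly by the smaller group. The names `dummy_coef_se` and `simulate_once` are made up for illustration.

```python
import math
import random
import statistics


def dummy_coef_se(n1, n0, sigma=1.0):
    """Standard error of the OLS coefficient on a 0/1 dummy.

    Assumes a simple regression (dummy only) with homoskedastic errors
    of standard deviation sigma; n1 and n0 are the group sizes.
    """
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n0)


def simulate_once(n1, n0, effect, rng):
    """Draw one sample and return the estimated dummy coefficient.

    With a single dummy regressor, the OLS slope equals the difference
    between the two group means, so no matrix algebra is needed.
    """
    y1 = [effect + rng.gauss(0, 1) for _ in range(n1)]  # dummy = 1
    y0 = [rng.gauss(0, 1) for _ in range(n0)]           # dummy = 0
    return statistics.fmean(y1) - statistics.fmean(y0)


rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = [simulate_once(100, 3900, 0.5, rng) for _ in range(200)]

print("theoretical SE, 100 vs 3900:  ", round(dummy_coef_se(100, 3900), 4))
print("theoretical SE, 2000 vs 2000: ", round(dummy_coef_se(2000, 2000), 4))
print("simulated SD of the estimate: ", round(statistics.stdev(estimates), 4))
```

The unbalanced split gives a larger standard error than an even split of the same 4,000 firms would, but 100 observations in the small group is still enough to estimate the coefficient with usable precision in this idealized setup. Whether the estimate is credible then depends on the point made above: why those 100 firms differ, not on the ratio itself.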