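I have a dataset in which each ID has several observations (the number differs from ID to ID), and I have sorted each ID's observations by time (the times variable). From what I have read, observations like these are not independent, so ordinary logistic regression is not appropriate and generalized estimating equations (GEE) should be used instead. However, I still cannot make sense of the documentation for the geepack package in R, and I would appreciate guidance on how to write the code. The data contain 8 independent variables and 1 dependent variable, all of which I have defined as binary factors according to the characteristics of the data.
Here is my code: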
library(geepack)
# Sort so that each id's rows are contiguous and in time order
mydata <- mydata[order(mydata$id, mydata$times), ]
mf <- outcome ~ age + gender + nation + smoke + F1 + F2 + F3 + F4
gee1 <- geeglm(mf, data = mydata, id = id, family = binomial, corstr = "unstructured")
Error during wrapup: NA/NaN/Inf in 'y'
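None of the 8 independent variables or the dependent variable in my dataset has missing values. Could someone please point out how to get the GEE model to run correctly? My QQ is 170213693; I will be waiting online.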
| ID | age | nation | gender | smoke | F1 | F2 | F3 | F4 | times | outcome |
| 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 0 |
| 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 3 | 1 |
| 2 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 |
| 2 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 2 | 1 |
| 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 2 | 0 |
| 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 3 | 0 |
| 3 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 4 | 1 |
| 4 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| 5 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 2 | 1 |
| 5 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 3 | 0 |
| 5 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 4 | 0 |
| 6 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 |
| 6 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 2 | 0 |
| 6 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 3 | 1 |
| 6 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 4 | 0 |
| 6 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 5 | 1 |
| 6 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 6 | 0 |
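
One possible cause of the "NA/NaN/Inf in 'y'" message, offered as a sketch rather than a confirmed diagnosis: the response was defined as a factor, and geeglm() appears to need a numeric 0/1 response for family=binomial. The example below rebuilds the rows above with only three variables for brevity and converts the outcome before fitting. Several things here are assumptions and not part of the original post: the identifier column is taken to be named `id` as in the geeglm() call (the table header shows `ID`, and R is case-sensitive), the column `outcome_num` is a name invented for illustration, and corstr="exchangeable" is swapped in for "unstructured" only because the sample has few IDs with very unequal numbers of observations.

```r
# Rows posted above, reduced to three variables for brevity.
# Assumption: the identifier column is called `id`, as in the geeglm() call.
mydata <- data.frame(
  id      = c(1,1,1, 2,2, 3,3,3,3, 4, 5,5,5,5, 6,6,6,6,6,6),
  times   = c(1,2,3, 1,2, 1,2,3,4, 1, 1,2,3,4, 1,2,3,4,5,6),
  smoke   = factor(c(0,0,0, 1,1, 1,1,1,1, 0, 0,0,0,0, 1,1,1,1,1,1)),
  outcome = factor(c(1,0,1, 1,1, 0,0,0,1, 0, 1,1,0,0, 0,0,1,0,1,0))
)

# Keep each id's rows contiguous and in time order.
mydata <- mydata[order(mydata$id, mydata$times), ]

# Convert the factor response to numeric 0/1 (`outcome_num` is a hypothetical name).
# as.numeric() applied directly to a factor returns the level codes 1/2,
# so go through as.character() first.
mydata$outcome_num <- as.numeric(as.character(mydata$outcome))

# Fit only if geepack is installed; the predictors can stay as factors.
if (requireNamespace("geepack", quietly = TRUE)) {
  gee1 <- geepack::geeglm(outcome_num ~ smoke, data = mydata, id = id,
                          family = binomial, corstr = "exchangeable")
  print(summary(gee1))
}
```

If this runs on the real data, the remaining predictors can be added back to the formula one group at a time, which should also make it easier to see whether any particular variable is causing trouble.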