problem with zero-inflated negative binomial

2015-02-06

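Dear experts,
I asked this question on another forum but have not received a reply yet, so I am reposting it here in the hope that someone can help. I'm online and waiting for answers. Thanks, everyone.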

I'm hoping someone can help me with a problem I ran into when running a zero-inflated negative binomial model in Stata. With this model I'm trying to 1) predict the probability that time spent on care-giving is greater than zero, and 2) predict the total amount of care-giving time (hours/week) when it is greater than zero. The model ran fine with only one independent variable, but it takes forever once I add more variables. Below is my code for the survey data I'm using. Please see the results in the attachment.
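For reference, the model described above can be written out as follows. This is the standard textbook form of the zero-inflated negative binomial with the mean-dispersion (NB2) parameterization, not something taken from the attached output. The symbols $\pi_i$, $\mu_i$, $\alpha$, $\mathbf{x}_i$, $\mathbf{z}_i$, $\boldsymbol\beta$ and $\boldsymbol\gamma$ are notation introduced here for illustration. Note that under this formulation the inflate() equation models the probability of an excess (structural) zero, which is not quite the same as the probability that care-giving time is greater than zero.

```latex
\begin{aligned}
\Pr(Y_i = 0) &= \pi_i + (1 - \pi_i)\,(1 + \alpha\mu_i)^{-1/\alpha}, \\
\Pr(Y_i = y) &= (1 - \pi_i)\,
  \frac{\Gamma(y + 1/\alpha)}{\Gamma(y + 1)\,\Gamma(1/\alpha)}
  \left(\frac{1}{1 + \alpha\mu_i}\right)^{1/\alpha}
  \left(\frac{\alpha\mu_i}{1 + \alpha\mu_i}\right)^{y},
  \qquad y = 1, 2, \dots \\
\log \mu_i &= \mathbf{x}_i'\boldsymbol\beta, \qquad
\operatorname{logit}(\pi_i) = \mathbf{z}_i'\boldsymbol\gamma .
\end{aligned}
```

Here $\mathbf{x}_i$ stands for the covariates in the main equation and $\mathbf{z}_i$ for those listed in inflate(). In the adjusted model below both lists are identical, so every added covariate contributes parameters to both equations.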

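* Unadjusted model

capture program drop trysimple
program define trysimple
    args ylist subset

    * Declare the survey design
    svyset w1varunit [pweight=w1anfinwgt0], strata(w1varstrat) singleunit(centered)

    * ZINB with race as the only predictor in both the count and the inflation equation
    svy, subpop(`subset' if `subset' < 2): zinb `ylist' ib0.aaWhite, inflate(ib0.aaWhite)

    * Predictive margins at aaWhite = 0 and 1, then a test that they are equal
    margins, subpop(`subset' if `subset' < 2) at(aaWhite=(0 1)) vce(unconditional) post
    test 1._at == 2._at
end

trysimple time raceEthStrokeSubset
trysimple TOTnumhrswk1 raceEthStrokeSubset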

* Adjusted for sociodemographics, comorbidity and physical capacity

capture program drop tryadjusted
program define tryadjusted
    args ylist subset

    * Declare the survey design
    svyset w1varunit [pweight=w1anfinwgt0], strata(w1varstrat) singleunit(centered)

    * ZINB with the same covariate list in the count and the inflation equation
    svy, subpop(`subset' if `subset' < 2): zinb `ylist' ib0.aaWhite i.ageCat i.gender i.educ3 i.married i.meanIncome5 ///
        mi2 cad2 htn2 dm2 cancer2 dementia2 osteoporosis2 athritis2 i.phq2Positive i.gad2Positive c.capacityIndex, ///
        inflate(ib0.aaWhite i.ageCat i.gender i.educ3 i.married i.meanIncome5 ///
        mi2 cad2 htn2 dm2 cancer2 dementia2 osteoporosis2 athritis2 i.phq2Positive i.gad2Positive c.capacityIndex)

    * Predictive margins at aaWhite = 0 and 1, then a test that they are equal
    margins, subpop(`subset' if `subset' < 2) at(aaWhite=(0 1)) vce(unconditional) post
    test 1._at == 2._at
end

tryadjusted time raceEthStrokeSubset
tryadjusted TOTnumhrswk1 raceEthStrokeSubset

Here I use two different outcome variables with the model: TOTnumhrswk and time (= int(TOTnumhrswk)). Since it is a count model, I guess it is only suitable for counts? The unadjusted model worked, but the adjusted one did not: Stata kept running for a long time without giving any result. My questions are:
1. The maximum of the variable 'time' is over 500. Is that why Stata takes so long to produce a result? Do I need to consider another model? What would the alternatives be?
2. If not, do I need a different variable list for the inflate part, say with at least one x variable that differs from the main equation?
3. Variables such as mi2, cad2 and cancer2 are disease indicators with some missing data. They are supposed to be binary. Could that be a problem for the estimation?
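The three questions above can be partly checked in the data before refitting the full model. The following is a minimal diagnostic sketch, not a fix. It assumes the variable names used in the code above. The name nmiss is a hypothetical helper introduced here, and the reduced covariate lists and the iteration cap of 50 are arbitrary choices for illustration.

```stata
* Question 1: distribution of the outcome (share of zeros, upper tail)
summarize time, detail
count if time == 0

* Question 3: missingness and coding of the disease indicators
misstable summarize mi2 cad2 htn2 dm2 cancer2 dementia2 osteoporosis2 athritis2
tabulate mi2, missing

* Complete cases inside the subpopulation (nmiss is a hypothetical helper variable)
egen nmiss = rowmiss(time aaWhite ageCat gender educ3 married meanIncome5 ///
    mi2 cad2 htn2 dm2 cancer2 dementia2 osteoporosis2 athritis2 ///
    phq2Positive gad2Positive capacityIndex)
count if nmiss == 0 & raceEthStrokeSubset != 0 & raceEthStrokeSubset < 2

* Question 2: zinb accepts an inflate() list that differs from the main equation.
* Unweighted fit with a smaller inflate() list and capped iterations, used only
* to see whether the optimizer converges; it is not meant for inference.
zinb time ib0.aaWhite i.ageCat i.gender ///
    if raceEthStrokeSubset != 0 & raceEthStrokeSubset < 2, ///
    inflate(ib0.aaWhite) iterate(50)
display e(converged)
```

If the small model converges quickly, covariates can be added back a few at a time to see which addition makes the iteration log stall. A value of 0 from e(converged) means the optimizer stopped at the iteration cap rather than at a solution.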
