2014-03-28
RE: Very different model coefficients when analysing a three-wave longitudinal dataset using mixed-effects logistic regression with MQL versus MCMC estimation (see output below).

I'd greatly appreciate any advice about this issue. The dataset, N, and variable specifications are identical in both models; only the estimation procedures differ. I acknowledge that MCMC is the recommended estimation procedure for these types of data, and that MQL and MCMC estimates will differ, but I was surprised (and concerned) by the magnitude of the differences (e.g. β0j, U0j). Also, the MQL estimates, when expressed as predicted probabilities, closely reflect the data, whereas the predicted probabilities derived from MCMC look nothing like the data. Please note that I used the MQL estimates as starting values for the MCMC, and all of the MCMC accuracy diagnostics indicate convergence.

The data are in a three-wave (2007, 2009, & 2011) person-period format, and the outcome (cycling for transport) is binary: 1=yes, 0=no. The data are being analysed as a two-level logistic model (level 2=between subjects, level 1 within subjects) using MLwiN.

At each wave the number of people who report cycling for transport is small: 2007 (n=390/10,036), 2009 (n=270/7,043) and 2011 (n=245/6,185). The cross-wave correlations in cycling (here indicated using odds ratios) are strong: 2007-2009 OR=43.3; 2009-2011 OR=45.0; 2007-2011 OR=30.5.

Is the difference due to the combination of a small number of cases and strong cross-wave dependency, and a procedure (MCMC) that as a consequence is producing average estimates from a very wide simulation range?

Any advice greatly appreciated
Regards
Gavin



Parameter            MQL             MCMC
β0j                  -4.85 (0.14)    -8.72 (0.31)
Sex (male)            1.31 (0.09)     2.07 (0.16)
Education1 (High)     0.92 (0.11)     1.44 (0.17)
Education2            0.27 (0.16)     0.37 (0.25)
Education3            0.29 (0.14)     0.46 (0.21)
Education4 (Low)      Ref             Ref
Time                  0.14 (0.05)     0.21 (0.06)
U0j                   4.86 (0.24)     9.11 (0.77)
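
To make the comparison between the estimates and the data concrete, here is a minimal Python sketch that converts the two intercepts to probabilities and sets them beside the observed proportions quoted in the post. It assumes the intercept describes a reference person (female, Education4, Time = 0) whose random effect is exactly zero. The coding of Time is not stated in the post, so treating 0 as the baseline is an assumption, and the helper name `expit` is purely illustrative.

```python
import math

def expit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Cyclists / respondents per wave, as reported in the post.
observed = {2007: (390, 10036), 2009: (270, 7043), 2011: (245, 6185)}

# Intercepts copied from the table above.
# Assumption: they refer to a reference person with Time = 0.
intercepts = {"MQL": -4.85, "MCMC": -8.72}

for wave, (cases, n) in observed.items():
    p = cases / n
    logit = math.log(p / (1.0 - p))
    print(f"{wave}: observed proportion = {p:.4f} (logit = {logit:.2f})")

for method, b0 in intercepts.items():
    # Probability for a reference person whose random effect u is zero.
    print(f"{method}: P(cycling | u = 0) = {expit(b0):.5f}")
```

Both intercept-based probabilities sit well below the observed proportions of roughly 4%. These are probabilities for a person with a random effect of zero, not averages over the whole sample, so they are not directly comparable to the raw proportions.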

All replies
2014-3-28 10:14:57
A value of 5 from MQL for the between-cluster variance (U0j, I am guessing?) is massive for a logistic regression and would suggest huge between-cluster effects, almost as if the extra waves give little or no added value because people cycling in wave 1 continue to cycle, and vice versa. I'd suggest taking a look at my 2006 paper with David Draper in Bayesian Analysis (on method comparison for these models). We looked at a dataset used by Rodriguez and Goldman, who have also published method comparisons for the same scenario: large between-cluster variance and small numbers of level 1 observations per level 2 unit.
When working out predictive probabilities, one should be wary that multilevel models give subject-specific estimates as opposed to population-averaged ones, although one can convert between them (see the binary response chapter of the MLwiN manual).

Hope this helps,

William Browne
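
As a rough illustration of the two points in the reply above, the Python sketch below (1) expresses the between-person variance as a share of total variance on the latent-response scale, and (2) converts a subject-specific intercept into a population-averaged probability by averaging over the random-effect distribution. It assumes that U0j in the table is the variance of a normally distributed random intercept, and it uses plain numerical integration. It is not the procedure described in the MLwiN manual, and the function names are hypothetical.

```python
import math

def expit(x):
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def vpc_latent(variance):
    """Share of latent-scale variance at level 2, taking the level 1
    variance to be pi^2 / 3 (the standard logistic variance)."""
    return variance / (variance + math.pi ** 2 / 3.0)

def marginal_probability(linear_predictor, variance, half_width=8.0, steps=4000):
    """Average expit(linear_predictor + u) over u ~ N(0, variance),
    using a midpoint rule on [-half_width * sd, +half_width * sd]."""
    sd = math.sqrt(variance)
    lower = -half_width * sd
    du = 2.0 * half_width * sd / steps
    total = 0.0
    for i in range(steps):
        u = lower + (i + 0.5) * du
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += expit(linear_predictor + u) * density * du
    return total

# (intercept, assumed random-intercept variance) taken from the table in the post.
estimates = {"MQL": (-4.85, 4.86), "MCMC": (-8.72, 9.11)}

for method, (b0, var_u) in estimates.items():
    print(f"{method}: latent-scale VPC = {vpc_latent(var_u):.2f}")
    print(f"{method}: subject-specific P(u = 0) = {expit(b0):.5f}")
    print(f"{method}: population-averaged P = {marginal_probability(b0, var_u):.5f}")
```

With variances this large, most of the latent-scale variation lies between people, and the population-averaged probability is noticeably higher than the probability for a person with u = 0. The population-averaged figure is therefore the one to set against observed proportions.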