2014-04-13
Does anybody know the best way to match up what SAS and Stata do in terms of multilevel/mixed models? Lesa Hoffman of the University of Nebraska-Lincoln has found a matching pair of options (http://psych.unl.edu/hoffman/Sheets/Workshops/ICPSR4_Example10b_Generalized_Clustered_Models.pdf) for Gauss-Hermite quadrature with a fixed number of integration points:

SAS:

PROC GLIMMIX DATA=... METHOD=QUAD(QPOINTS=7);   /* GH quadrature, 7 integration points */
  CLASS ... ;
  MODEL response = predictors / SOLUTION LINK=LOGIT DIST=BIN DDFM=BW;
  RANDOM INTERCEPT / TYPE=UN SUBJECT=cluster;
RUN;

Stata:

* random-intercept logit via adaptive GH quadrature with 7 points
xtmelogit response predictors || cluster: , variance covariance(unstructured) intpoints(7)

Are there any other estimation methods for generalized linear mixed models (in particular, the mixed/multilevel logistic model) that these two packages have in common?

Stas Kolenikov, PhD, PStat (ASA, SSC), Principal Survey Scientist, Abt SRBI




All replies
2014-4-13 11:13:48
I do not know what methods may be used to fit GLMMs in Stata. As you may have discovered, you can fit GLMMs in GLIMMIX using GH quadrature, Laplace estimation, and first-order pseudo-likelihood (the latter is not recommended for binomial outcomes when cluster sizes are small). You can also fit GLMMs in NLMIXED using GH quadrature, and you can fit GLMMs via MCMC using PROC MCMC.
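For concreteness, here is a minimal sketch of the Laplace route in GLIMMIX and a quadrature fit in NLMIXED for a random-intercept logit. The dataset mydata, the single predictor x, and the starting values are hypothetical placeholders, not something from the thread:

PROC GLIMMIX DATA=mydata METHOD=LAPLACE;        /* Laplace approximation */
  CLASS cluster;
  MODEL response = x / SOLUTION LINK=LOGIT DIST=BINARY;
  RANDOM INTERCEPT / SUBJECT=cluster;
RUN;

PROC NLMIXED DATA=mydata QPOINTS=7;             /* adaptive GH quadrature, 7 points */
  PARMS b0=0 b1=0 s2u=1;                        /* starting values */
  eta = b0 + b1*x + u;                          /* linear predictor with random intercept u */
  p   = 1 / (1 + exp(-eta));                    /* inverse logit */
  MODEL response ~ BINARY(p);
  RANDOM u ~ NORMAL(0, s2u) SUBJECT=cluster;
RUN;

(Both quadrature implementations are adaptive by default; NLMIXED has a NOAD option for non-adaptive GH quadrature.)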

2014-4-13 11:16:51
Stas

You are correct that MCMC results are hard to replicate *exactly*, given their stochastic nature. But long enough runs should reach much the same stationary distribution. Likelihood-based estimation, on the other hand, is deterministic.

But differences in specific computational details within the programs can yield slight differences between fitted models in the third or fourth digit. For example, some programs use the change in the log-likelihood as a stopping rule, which is generally known to be bad when one encounters a ridge in the likelihood: one program might stop there, whereas another, based on a different stopping rule, will keep "climbing."

In the software I wrote for mixed models, I use matrix decompositions instead of "inverting" matrices. It is well known that solving triangular systems of equations is both more accurate and computationally cheaper than taking an explicit inverse of a matrix.
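As a toy illustration of the inverse-versus-triangular-solve point, here is a SAS/IML sketch on a simulated positive definite system (nothing here is from Harold's actual software):

PROC IML;
  call randseed(20140413);
  n = 5;
  z = j(n, n);   call randgen(z, "Normal");
  A = z` * z + n * I(n);            /* symmetric positive definite matrix */
  b = j(n, 1);   call randgen(b, "Normal");

  x_inv = inv(A) * b;               /* explicit inverse: more flops, more rounding */

  U = root(A);                      /* Cholesky root: A = U` * U, U upper triangular */
  y = trisolv(2, U, b);             /* forward substitution: solve U` * y = b */
  x_chol = trisolv(1, U, y);        /* back substitution:   solve U  * x = y */

  print (max(abs(x_inv - x_chol)))[label="max abs difference"];
QUIT;

On a tiny, well-conditioned system the two answers agree to many digits; the gap grows with ill-conditioning and, as described above, with repetition inside an iterative fit.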

Some software programs profile the likelihood whereas others work with the full likelihood, and this is another (though small) issue that can yield differences.
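For readers unfamiliar with profiling, here is the textbook linear-mixed-model version (standard material, not specific to any one package). Writing the covariance of y as \sigma^2 V(\theta), both \beta and \sigma^2 can be concentrated out, so the optimizer searches over \theta alone:

\hat\beta(\theta) = \bigl(X' V(\theta)^{-1} X\bigr)^{-1} X' V(\theta)^{-1} y,
\qquad
\hat\sigma^2(\theta) = \frac{r' V(\theta)^{-1} r}{n}, \quad r = y - X\hat\beta(\theta),

so that, up to an additive constant,

\ell_p(\theta) = -\tfrac{1}{2}\Bigl[\, n \log \hat\sigma^2(\theta) + \log\bigl|V(\theta)\bigr| + n \,\Bigr].

Maximizing \ell_p(\theta) and maximizing the full likelihood over (\beta, \sigma^2, \theta) jointly reach the same optimum, but the iteration paths, and hence the exact stopping points, differ.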

My point is this (without rambling on too much): authors of software intentionally use different computational methods, and this can yield real differences in final results. This is actually part of the art.

Whenever I unit-test my software, I run it against lmer() in R to see whether the results match. I am always close, and in some cases almost exact, but never perfectly exact, since my methods differ from those used by Bates et al.

Hope this is helpful

Doran Harold

2014-4-13 11:18:29
Thanks, Harold -- the reasons you gave are why I said I would understand differences out to three digits. Art is great, but I am here, as I said earlier, for factory production and standardization. If Stata produces an estimate of 0.25, GLIMMIX produces -0.13, and glmer produces -0.37, I would seriously doubt they are using the same estimation method (although playing with different estimation options within a given package can produce such differences; at least then I would be able to attribute them to, say, using 15 quadrature points vs. Laplace).

I would say, though, that the variety of art products has serious implications for research reproducibility: (i) software developers have to VERY EXPLICITLY list in their output all of the methods and options that were used (profiling or not; straight quadrature or adaptive; number of points; Cholesky or QR or something else; REML, ML, deviance, etc. objective functions; PQL vs. true likelihoods; you name it), ideally with clickable DOIs; and (ii) researchers have the responsibility to restate these options in their papers, so that somebody interested (e.g., Stas Kolenikov) could take their data and reproduce the results in a different package to three digits. Or write their own Fortran code and invert matrices in quad precision :).

2014-4-13 11:19:23
I think what you're saying is right; I, too, would be troubled, and highly suspicious, if I saw differences of the order of magnitude you note. I think your position that different software should produce replicable output is correct. My only concern is how tight the bar for "similar" should be.

One thing I have learned over time, as I have become more involved in numerical methods, is that small machine-precision differences in iterative computational methods (such as those for mixed models) can accumulate at each iteration and propagate into the final estimates.

So two statisticians might use the same computational framework for model estimation, say Henderson's methods. One might solve the system using decompositions while the other takes a matrix inverse. At each iteration, a small amount of information is lost, given how computers represent floating-point numbers, and in the end this may show up as small differences between the fitted models.

In contrast, if I use QR for least squares while someone else takes the inverse of (X'X), the differences may not be large, because they don't accumulate over multiple iterations.
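A quick check of that claim in SAS/IML, with simulated data and a one-shot least-squares solve (SOLVE's decomposition-based solve stands in for the QR route here; the point is the absence of iteration, not the particular decomposition):

PROC IML;
  call randseed(1);
  n = 200;
  X = j(n, 3);   call randgen(X, "Normal");
  X = j(n, 1, 1) || X;                   /* prepend an intercept column */
  beta = {1, 0.5, -0.25, 2};
  e = j(n, 1);   call randgen(e, "Normal");
  y = X * beta + e;

  b_inv   = inv(X` * X) * X` * y;        /* explicit inverse of the cross-products */
  b_solve = solve(X` * X, X` * y);       /* decomposition-based linear solve */

  print (max(abs(b_inv - b_solve)))[label="max abs difference"];
QUIT;

With no iterations to compound the rounding, the two coefficient vectors typically agree to near machine precision.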

2014-4-13 11:20:54
My personal take on this is that when results differ by this much, at least two of them are flawed. However, it would be interesting to know what kind of results you are using as an example -- fixed effects?

There is an exception to my "rule", however: if the estimate in question is highly insignificant, then different programs can produce different results. If a Hessian-gradient method is used to produce the estimates, a Hessian with a determinant very close to zero is unstable enough to produce disparate results. Also, one never knows what methods are being used to rein in negative-definite Hessians.
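A small SAS/IML illustration of that instability (the numbers are made up): with a nearly singular Hessian, a gradient perturbation near machine-noise level moves the Newton step by several orders of magnitude more:

PROC IML;
  H = {1        0.999999,
       0.999999 1       };              /* nearly singular: det(H) is about 2e-6 */
  g = {0.5, 0.3};
  eps = {1e-8, -1e-8};                  /* tiny perturbation along the near-null direction */
  d1 = -solve(H, g);                    /* Newton step: solve H * d = g, negated */
  d2 = -solve(H, g + eps);              /* the same step with a perturbed gradient */
  print (det(H))[label="det(H)"],
        (max(abs(d1 - d2)))[label="change in Newton step"];
QUIT;

Here a 1e-8 wobble in the gradient becomes roughly a 1e-2 change in the step, so two programs that accumulate the gradient slightly differently can land in visibly different places.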

Here's a question for the more knowledgeable: if an estimate is highly insignificant, does it matter what is reported?

---
Richard Congdon, Senior Programmer, NORC
