If a “not concave” message appears at the last step, there are two possibilities. One is that the result is valid, but there is collinearity in the model that the command did not otherwise catch. Stata checks for obvious collinearity among the independent variables before performing the maximization, but strange collinearities or near collinearities can sometimes arise between coefficients and ancillary parameters. The second, more likely cause for a “not concave” message at the final step is that the optimizer entered a flat region of the likelihood and prematurely declared convergence.
If a “backed up” message appears at the last step, there are also two possibilities. One is that Stata found a perfect maximum and could not step to a better point; if this is the case, all is fine, but this is a highly unlikely occurrence. The second is that the optimizer worked itself into a bad concave spot where the computed gradient and Hessian gave a bad direction for stepping.
difficult specifies that the likelihood function is likely to be difficult to maximize because of nonconcave regions. When the message “not concave” appears repeatedly, ml's standard stepping algorithm may not be working well. difficult specifies that a different stepping algorithm be used in nonconcave regions. There is no guarantee that difficult will work better than the default; sometimes it is better and sometimes it is worse. You should use the difficult option only when the default stepper declares convergence and the last iteration is “not concave” or when the default stepper is repeatedly issuing “not concave” messages and producing only tiny improvements in the log likelihood.
gradient adds to the iteration log a display of the current gradient vector.
technique(algorithm_spec) specifies how the likelihood function is to be maximized. The following algorithms are allowed. For details, see Gould, Pitblado, and Poi (2010). technique(nr) specifies Stata's modified Newton–Raphson (NR) algorithm. technique(bhhh) specifies the Berndt–Hall–Hall–Hausman (BHHH) algorithm. technique(dfp) specifies the Davidon–Fletcher–Powell (DFP) algorithm. technique(bfgs) specifies the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm.
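As a minimal, hypothetical sketch (probit and the variables y, x1-x3 stand in for whatever ml-based estimator and model is being fit), technique() is simply appended to the estimation command, and the algorithms can also be alternated:

    * request BFGS instead of the default modified Newton-Raphson
    probit y x1 x2 x3, technique(bfgs)

    * alternate algorithms: 10 iterations of BFGS, then 1000 of NR,
    * cycling until convergence is declared
    probit y x1 x2 x3, technique(bfgs 10 nr 1000)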
After adding the gradient option: if the gradient goes to zero, the final result is an acceptable optimum; if it is not zero, the result is not valid, and convergence needs to be defined more strictly, for example with ltol(0) tol(1e-7).
If the gradient goes to zero, the optimizer has found a maximum that may not be unique but is a maximum. From the standpoint of maximum likelihood estimation, this is a valid result. If the gradient is not zero, it is not a valid result, and you should try tightening up the convergence criterion, or try ltol(0) tol(1e-7) to see if the optimizer can work its way out of the bad region.
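As a concrete, hypothetical illustration with made-up variable names, the gradient check and the tightened criteria described above look like this:

    * add the gradient vector to the iteration log and tighten convergence:
    * ltol(0) stops convergence being declared on a small log-likelihood
    * change alone; tol(1e-7) tightens the coefficient-change criterion
    probit y x1 x2 x3, gradient ltol(0) tol(1e-7)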
Note that when using the difficult option you may also end up with a worse result.
If you get repeated “not concave” steps with little progress being made at each step, try specifying the difficult option. Sometimes difficult works wonderfully, reducing the number of iterations and producing convergence at a good (that is, concave) point. Other times, difficult works poorly, taking much longer to converge than the default stepper.
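For example (again a hypothetical probit fit), rerun the same model with and without the option and compare the iteration logs and the points at which each declares convergence:

    * default stepper
    probit y x1 x2 x3

    * alternate stepper in nonconcave regions; sometimes better, sometimes worse
    probit y x1 x2 x3, difficult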
Often and often. It is very difficult to tell from this kind of report whether you are trying to fit a model that is a bad idea for your data or the fitting process is just a bit tricky (or both). As you have several predictors, fitting an overcomplicated model really is a possibility, whatever the scientific (or non-scientific, e.g. economic) grounds for wanting to use them all. You could try tuning the -ml- engine by e.g. changing -technique()-. Simplifying the model first and then introducing complications gradually can sometimes isolate problematic predictors.
Nick
n.j.cox@durham.ac.uk
Anna-Leigh Stone
~~~~~~~~~~~~~~
I am using Stata 12.0 and I am attempting to run a fractional probit
regression with the command: glm dependent independent, fa(bin)
link(probit) cluster(gck). I have made sure that the dependent
variable values are not negative. I have 1500 dependent variable
observations out of 69,900 that fall at 0 and 1. Regardless of whether
I leave in the 0 and 1 values or take them out, I get the same log
likelihood at each iteration, followed by "backed up". It continues like this and
does not converge until I break it.
Has anyone else had this problem, and does anyone know a solution to it? I do have
several variables but they are all necessary to my regression. I have
also tried the difficult option but that does not work either.
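For reference, a sketch of the command from the thread together with the remedies discussed above; depvar and x1-x3 are placeholders for the actual variables, and none of these variants is guaranteed to converge for these data:

    * fractional probit via glm with cluster-robust standard errors
    glm depvar x1 x2 x3, family(binomial) link(probit) cluster(gck)

    * the same model with the alternate stepper, or with a different
    * optimization technique and tightened convergence criteria
    glm depvar x1 x2 x3, family(binomial) link(probit) cluster(gck) difficult
    glm depvar x1 x2 x3, family(binomial) link(probit) cluster(gck) ///
        technique(bfgs) ltol(0) tol(1e-7)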