书名:Springer Series in Statistics--Inference in Hidden Markov Models
作者:
Olivier Cappe
Eric Moulines
Tobias Ryden
目录:
Preface ........................................................ V
Contributors................................................... IX
1 Introduction ............................................... 1
1.1 WhatIsaHiddenMarkovModel?......................... 1
1.2 BeyondHiddenMarkovModels ........................... 4
1.3 Examples............................................... 6
1.3.1 FiniteHiddenMarkovModels....................... 6
1.3.2 NormalHiddenMarkovModels ..................... 13
1.3.3 Gaussian Linear State-Space Models ................. 15
1.3.4 Conditionally Gaussian Linear State-Space Models .... 17
1.3.5 General (Continuous) State-Space HMMs ............ 24
1.3.6 Switching Processes with Markov Regime............. 29
1.4 Left-to-RightandErgodicHiddenMarkovModels........... 33
2 Main Definitions and Notations............................ 35
2.1 MarkovChains.......................................... 35
2.1.1 TransitionKernels................................. 35
2.1.2 HomogeneousMarkovChains....................... 37
2.1.3 Non-homogeneousMarkovChains ................... 40
2.2 HiddenMarkovModels................................... 42
2.2.1 Definitions and Notations .......................... 42
2.2.2 Conditional Independence in Hidden Markov Models... 44
2.2.3 HierarchicalHiddenMarkovModels ................. 46
Part I State Inference
3 Filtering and Smoothing Recursions ....................... 51
3.1 Basic Notations and Definitions ........................... 53
3.1.1 Likelihood........................................ 53
3.1.2 Smoothing........................................ 54
3.1.3 TheForward-BackwardDecomposition............... 56
3.1.4 ImplicitConditioning(PleaseReadThisSection!) ..... 58
3.2 Forward-Backward....................................... 59
3.2.1 TheForward-BackwardRecursions .................. 59
3.2.2 Filtering and Normalized Recursion.................. 61
3.3 MarkovianDecompositions ............................... 66
3.3.1 ForwardDecomposition ............................ 66
3.3.2 BackwardDecomposition........................... 70
3.4 Complements ........................................... 74
4 Advanced Topics in Smoothing ............................ 77
4.1 Recursive Computation of Smoothed Functionals ............ 77
4.1.1 Fixed Point Smoothing............................. 78
4.1.2 RecursiveSmoothersforGeneralFunctionals ......... 79
4.1.3 ComparisonwithForward-BackwardSmoothing....... 82
4.2 Filtering and Smoothing in More General Models............ 85
4.2.1 SmoothinginMarkov-switchingModels .............. 86
4.2.2 SmoothinginPartiallyObservedMarkovChains ...... 86
4.2.3 MarginalSmoothinginHierarchicalHMMs........... 87
4.3 ForgettingoftheInitialCondition......................... 89
4.3.1 Total Variation.................................... 90
4.3.2 LipshitzContractionforTransitionKernels........... 95
4.3.3 TheDoeblinConditionandUniformErgodicity ....... 97
4.3.4 ForgettingProperties ..............................100
4.3.5 Uniform Forgetting Under Strong Mixing Conditions...105
4.3.6 ForgettingUnderAlternativeConditions .............110
5 Applications of Smoothing.................................121
5.1 Models with Finite State Space ...........................121
5.1.1 Smoothing........................................122
5.1.2 Maximum a Posteriori SequenceEstimation..........125
5.2 Gaussian Linear State-Space Models .......................127
5.2.1 Filtering and Backward Markovian Smoothing ........127
5.2.2 Linear Prediction Interpretation.....................131
5.2.3 The Prediction and Filtering Recursions Revisited .....137
5.2.4 DisturbanceSmoothing ............................143
5.2.5 The Backward Recursion and the Two-Filter Formula..148
5.2.6 Application to Marginal Filtering and Smoothing in
CGLSSMs........................................155
6 Monte Carlo Methods .....................................161
6.1 BasicMonteCarloMethods ..............................161
6.1.1 MonteCarloIntegration............................162
6.1.2 Monte Carlo Simulation for HMM State Inference .....163
6.2 AMarkovChainMonteCarloPrimer......................166
6.2.1 The Accept-Reject Algorithm .......................166
6.2.2 MarkovChainMonteCarlo.........................170
6.2.3 Metropolis-Hastings ...............................171
6.2.4 HybridAlgorithms ................................179
6.2.5 GibbsSampling...................................180
6.2.6 StoppinganMCMCAlgorithm......................185
6.3 ApplicationstoHiddenMarkovModels ....................186
6.3.1 GenericSamplingStrategies ........................186
6.3.2 GibbsSamplinginCGLSSMs.......................194
7 Sequential Monte Carlo Methods ..........................209
7.1 ImportanceSamplingandResampling......................210
7.1.1 ImportanceSampling ..............................210
7.1.2 SamplingImportanceResampling ...................211
7.2 SequentialImportanceSampling...........................214
7.2.1 Sequential Implementation for HMMs................214
7.2.2 ChoiceoftheInstrumentalKernel...................218
7.3 SequentialImportanceSamplingwithResampling ...........231
7.3.1 WeightDegeneracy................................231
7.3.2 Resampling.......................................236
7.4 Complements ...........................................242
7.4.1 Implementation of Multinomial Resampling...........242
7.4.2 AlternativestoMultinomialResampling..............244
8 Advanced Topics in Sequential Monte Carlo ...............251
8.1 AlternativestoSISR.....................................251
8.1.1 I.I.D.Sampling ...................................253
8.1.2 Two-Stage Sampling ...............................256
8.1.3 Interpretation with Auxiliary Variables...............260
8.1.4 Auxiliary Accept-Reject Sampling ...................261
8.1.5 Markov Chain Monte Carlo Auxiliary Sampling .......263
8.2 SequentialMonteCarloinHierarchicalHMMs ..............264
8.2.1 Sequential Importance Sampling and Global Sampling .265
8.2.2 OptimalSampling.................................267
8.2.3 ApplicationtoCGLSSMs...........................274
8.3 ParticleApproximationofSmoothingFunctionals ...........278
9 Analysis of Sequential Monte Carlo Methods ..............287
9.1 ImportanceSampling ....................................287
9.1.1 UnnormalizedImportanceSampling .................287
9.1.2 DeviationInequalities..............................291
9.1.3 Self-normalized Importance Sampling Estimator.......293
9.2 SamplingImportanceResampling .........................295
9.2.1 TheAlgorithm....................................295
9.2.2 Definitions and Notations ..........................297
9.2.3 WeightingandResampling .........................300
9.2.4 Application to the Single-Stage SIR Algorithm ........307
9.3 Single-StepAnalysisofSMCMethods......................311
9.3.1 Mutation Step ....................................311
9.3.2 Description of Algorithms ..........................315
9.3.3 Analysis of the Mutation/Selection Algorithm.........319
9.3.4 Analysis of the Selection/Mutation Algorithm.........320
9.4 SequentialMonteCarloMethods ..........................321
9.4.1 SISR ............................................321
9.4.2 I.I.D.Sampling ...................................324
9.5 Complements ...........................................333
9.5.1 WeakLimitsTheoremsforTriangularArray..........333
9.5.2 Bibliographic Notes................................342
Part II Parameter Inference
10 Maximum Likelihood Inference, Part I:
Optimization Through Exact Smoothing...................347
10.1 Likelihood Optimization in Incomplete Data Models .........347
10.1.1 Problem Statement and Notations ...................348
10.1.2 The Expectation-Maximization Algorithm ............349
10.1.3 Gradient-basedMethods ...........................353
10.1.4 ProsandConsofGradient-basedMethods............358
10.2 ApplicationtoHMMs....................................359
10.2.1 Hidden Markov Models as Missing Data Models .......359
10.2.2 EMinHMMs.....................................360
10.2.3 ComputingDerivatives.............................362
10.2.4 Connection with the Sensitivity Equation Approach ...364
10.3 TheExampleofNormalHiddenMarkovModels.............367
10.3.1 EMParameterUpdateFormulas ....................367
10.3.2 EstimationoftheInitialDistribution ................370
10.3.3 Recursive Implementation of E-Step .................371
10.3.4 Computation of the Score and Observed Information...374
10.4 The Example of Gaussian Linear State-Space Models ........384
10.4.1 TheIntermediateQuantityofEM...................385
10.4.2 Recursive Implementation ..........................387
10.5 Complements ...........................................389
10.5.1 GlobalConvergenceoftheEMAlgorithm ............389
10.5.2 Rate of Convergence of EM.........................392
10.5.3 GeneralizedEMAlgorithms ........................393
10.5.4 Bibliographic Notes................................394
11 Maximum Likelihood Inference, Part II:
Monte Carlo Optimization.................................397
11.1 MethodsandAlgorithms .................................398
11.1.1 MonteCarloEM..................................398
11.1.2 SimulationSchedules ..............................403
11.1.3 Gradient-basedAlgorithms .........................408
11.1.4 Interlude: Stochastic Approximation and the
Robbins-MonroApproach ..........................411
11.1.5 StochasticGradientAlgorithms .....................412
11.1.6 StochasticApproximationEM ......................414
11.1.7 StochasticEM ....................................416
11.2 AnalysisoftheMCEMAlgorithm .........................419
11.2.1 ConvergenceofPerturbedDynamicalSystems ........420
11.2.2 ConvergenceoftheMCEMAlgorithm ...............423
11.2.3 Rate of Convergence of MCEM .....................426
11.3 AnalysisofStochasticApproximationAlgorithms............429
11.3.1 Basic Results for Stochastic Approximation
Algorithms .......................................429
11.3.2 ConvergenceoftheStochasticGradientAlgorithm.....431
11.3.3 Rate of Convergence of the Stochastic Gradient
Algorithm........................................432
11.3.4 ConvergenceoftheSAEMAlgorithm ................433
11.4 Complements ...........................................435
12 Statistical Properties of the Maximum Likelihood
Estimator..................................................441
12.1 A Primer on MLE Asymptotics ...........................442
12.2 Stationary Approximations ...............................443
12.3 Consistency.............................................446
12.3.1 Construction of the Stationary Conditional
Log-likelihood.....................................446
12.3.2 TheContrastFunctionandItsProperties ............448
12.4 Identifiability ...........................................450
12.4.1 EquivalenceofParameters..........................451
12.4.2 Identifiability of Mixture Densities...................454
12.4.3 Application of Mixture Identifiability to Hidden
MarkovModels ...................................455
12.5 Asymptotic Normality of the Score and Convergence of the
ObservedInformation....................................457
12.5.1 The Score Function and Invoking the Fisher Identity...457
12.5.2 Construction of the Stationary Conditional Score......459
12.5.3 WeakConvergenceoftheNormalizedScore...........464
12.5.4 Convergence of the Normalized Observed Information ..465
12.5.5 Asymptotics of the Maximum Likelihood Estimator....465
12.6 ApplicationstoLikelihood-basedTests.....................466
12.7 Complements ...........................................468
13 Fully Bayesian Approaches ................................471
13.1 ParameterEstimation....................................471
13.1.1 BayesianInference.................................471
13.1.2 PriorDistributionsforHMMs.......................475
13.1.3 Non-identifiability and Label Switching ..............478
13.1.4 MCMCMethodsforBayesianInference ..............481
13.2 ReversibleJumpMethods ................................488
13.2.1 VariableDimensionModels.........................488
13.2.2 Green’sReversibleJumpAlgorithm..................490
13.2.3 AlternativeSamplerDesigns........................498
13.2.4 AlternativestoReversibleJumpMCMC .............500
13.3 Multiple Imputations Methods and Maximum a Posteriori ...501
13.3.1 SimulatedAnnealing...............................502
13.3.2 TheSAMEAlgorithm .............................503
Part III Background and Complements
14 Elements of Markov Chain Theory ........................513
14.1 Chains on Countable State Spaces .........................513
14.1.1 Irreducibility .....................................513
14.1.2 RecurrenceandTransience .........................514
14.1.3 Invariant Measures and Stationarity .................517
14.1.4 Ergodicity........................................519
14.2 Chains on General State Spaces ...........................520
14.2.1 Irreducibility .....................................521
14.2.2 RecurrenceandTransience .........................523
14.2.3 Invariant Measures and Stationarity .................534
14.2.4 Ergodicity........................................541
14.2.5 Geometric Ergodicity and Foster-Lyapunov Conditions .548
14.2.6 LimitTheorems...................................552
14.3 ApplicationstoHiddenMarkovModels ....................556
14.3.1 Phi-irreducibility ..................................557
14.3.2 AtomsandSmallSets..............................558
14.3.3 RecurrenceandPositiveRecurrence .................560
15 An Information-Theoretic Perspective on Order
Estimation.................................................565
15.1 ModelOrderIdentification:WhatIsItAbout?..............566
15.2 OrderEstimationinPerspective...........................567
15.3 OrderEstimationandCompositeHypothesisTesting ........569
15.4 Code-basedIdentification.................................571
15.4.1 Definitions .......................................571
15.4.2 Information Divergence Rates.......................574
15.5 MDL Order Estimators in Bayesian Settings ................576
15.6 Strongly Consistent Penalized Maximum Likelihood
Estimators for HMM Order Estimation.....................577
15.7 E?ciencyIssues.........................................580
15.7.1 VariationsonStein’sLemma .......................581
15.7.2 AchievingOptimalErrorExponents .................584
15.8 Consistency of the BIC Estimator in the Markov Order
EstimationProblem .....................................587
15.8.1 SomeMartingaleTools.............................589
15.8.2 TheMartingaleApproach ..........................591
15.8.3 TheUnionBoundMeetsMartingaleInequalities ......592
15.9 Complements ...........................................600
Part IV Appendices
A Conditioning ..............................................605
A.1 Probability and Topology Terminology and Notation.........605
A.2 Conditional Expectation..................................606
A.3 ConditionalDistribution .................................611
A.4 ConditionalIndependence ................................614
B Linear Prediction ..........................................617
B.1 HilbertSpaces ..........................................617
B.2 TheProjectionTheorem .................................619
C Notations..................................................621
C.1 Mathematical...........................................621
C.2 Probability .............................................622
C.3 HiddenMarkovModels...................................622
C.4 SequentialMonteCarlo ..................................624
References.....................................................625
Index ..........................................................645