
# Introduction to gmvarkit




---
title: "Introduction to gmvarkit"
author: "Savi Virolainen"
date: "`r Sys.Date()`"
header-includes: \usepackage{amsmath}
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to gmvarkit}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

The package gmvarkit contains tools to estimate and analyze the Gaussian Mixture Vector Autoregressive (GMVAR) model (see Kalliovirta et al. 2016). In this vignette, we briefly give the definition of the GMVAR model and very briefly cover some of its attractive theoretical and practical properties. For a more comprehensive discussion, we refer to the cited article by Kalliovirta, Meitz, and Saikkonen (2016).

See the readme file for a short introduction on how to use gmvarkit.

The GMVAR models in gmvarkit are defined as class gmvar S3 objects, which can be created with the estimation function fitGMVAR or with the constructor function GMVAR. The created gmvar objects can then be conveniently used as main arguments in many other functions that allow, for example, model diagnostics, simulation, and forecasting. Therefore, after estimating a GMVAR model, it's easy to use the other functions for further analysis. However, some tasks, such as imposing linear constraints, creating GMVAR models without estimation, or setting up an initial population for the genetic algorithm employed by the estimation function, require an accurate understanding of how the parameter vectors are constructed in this package.

The notations for the (unconstrained) parameter vector are in line with the cited article by Kalliovirta et al. (2016), but for clarity we repeat them in the second section of this vignette. In the third section, we show how to apply general linear constraints to the autoregressive parameters of a GMVAR model, with two examples demonstrating how to use the general formula. In the fourth section, we list some useful functions and methods found in gmvarkit and briefly explain how they work and how to use them.

## Definition of the GMVAR model

The Gaussian mixture vector autoregressive (GMVAR) model with $M$ mixture components and autoregressive order $p$, which we refer to as the GMVAR($p,M$) model, is a mixture of $M$ stationary Gaussian VAR($p$) models with time-varying mixing weights. At each point of time $t$, a GMVAR process generates an observation from one of its mixture components (also called regimes) according to the probabilities given by the mixing weights.

Let $y_t=(y_{1t},...,y_{dt})$, $t=1,2,...$ be the $d$-dimensional vector valued time series of interest, and let $\mathcal{F}_{t-1}$ be the $\sigma$-algebra generated by the random variables $\lbrace y_{t-j},j>0 \rbrace$ (containing the information on the past of $y_t$). Let $M$ be the number of mixture components, and $\boldsymbol{s}_t=(s_{t,1},...,s_{t,M})$ a sequence of (non-observable) random vectors such that at each $t$ exactly one of its components takes the value one and the others take the value zero. The component that takes the value one is selected according to the probabilities $\text{Pr}(s_{t,m}=1|\mathcal{F}_{t-1})\equiv\alpha_{m,t}$, $m=1,...,M$, with $\sum_{m=1}^M\alpha_{m,t}=1$ for all $t$.
Then, for a GMVAR($p,M$) model, we have
$$y_t = \sum_{m=1}^M s_{t,m}(\mu_{m,t}+\Omega_{m}^{1/2}\varepsilon_t), \quad \varepsilon_t\sim\text{NID}(0,I_d),$$
where $\Omega_{m}$ is a conditional positive definite covariance matrix, NID stands for "normally and independently distributed", and
$$\mu_{m,t} = \varphi_{m,0} + \sum_{i=1}^p A_{m,i}y_{t-i}$$
is interpreted as the conditional mean of the $m$th mixture component. The terms $\varphi_{m,0}$ $(d\times 1)$ are intercept parameters and the $A_{m,i}$ $(d\times d)$ are the autoregressive (AR) matrices. The random vectors $\varepsilon_t$ are assumed to be independent of $\mathcal{F}_{t-1}$ and conditionally independent of $\boldsymbol{s}_{t}$ given $\mathcal{F}_{t-1}$. The probabilities $\alpha_{m,t}$ are referred to as the mixing weights.

For each $m=1,...,M$, the AR matrices $A_{m,i}$, $i=1,...,p$, are assumed to satisfy the usual stability condition of VAR models. Namely, denoting
$$\boldsymbol{A}_m= \begin{bmatrix} A_{m,1} & A_{m,2} & \cdots & A_{m,p-1} & A_{m,p} \\ I_d & 0 & \cdots & 0 & 0 \\ 0 & I_d & \cdots & 0 & 0 \\ \vdots & & \ddots & 0 & 0 \\ 0 & 0 & \cdots & I_d & 0 \end{bmatrix} \enspace (dp\times dp),$$
we assume that the moduli of all eigenvalues of the matrix $\boldsymbol{A}_m$ are smaller than one, for each $m=1,...,M$.
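This condition is easy to check numerically. Below is a minimal sketch in plain base R (not a gmvarkit function) that forms the companion matrix $\boldsymbol{A}_m$ for one regime with $d=2$ and $p=2$ and inspects the moduli of its eigenvalues; the AR matrices are made-up values.

```{r, eval = FALSE}
d <- 2; p <- 2
A_m1 <- matrix(c(0.3, 0.1, 0.0, 0.2), nrow = d)    # A_{m,1} (hypothetical values)
A_m2 <- matrix(c(0.2, 0.0, 0.1, 0.1), nrow = d)    # A_{m,2} (hypothetical values)
companion <- rbind(cbind(A_m1, A_m2),               # [A_{m,1}  A_{m,2}]
                   cbind(diag(d), matrix(0, d, d))) # [I_d      0      ]
all(Mod(eigen(companion)$values) < 1)  # TRUE if the regime satisfies the stability condition
```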

Based on the above definition, the conditional density function of $y_t$ given $\mathcal{F}_{t-1}$, $f(\cdot|\mathcal{F}_{t-1})$, is given as
$$f(y_t|\mathcal{F}_{t-1}) = \sum_{m=1}^M\alpha_{m,t}(2\pi)^{-d/2}\det(\Omega_m)^{-1/2}\exp\left\lbrace -\frac{1}{2}(y_t-\mu_{m,t})'\Omega_m^{-1}(y_t-\mu_{m,t}) \right\rbrace .$$
The distribution of $y_t$ given its past is thus a mixture of multivariate normal distributions with time-varying mixing weights $\alpha_{m,t}$. The conditional mean and covariance matrix of $y_t$ given $\mathcal{F}_{t-1}$ are obtained as
$$\text{E}[y_t|\mathcal{F}_{t-1}]=\sum_{m=1}^M\alpha_{m,t}\mu_{m,t}, \quad \text{Cov}[y_t|\mathcal{F}_{t-1}]=\sum_{m=1}^M\alpha_{m,t}\Omega_m + \sum_{m=1}^M\alpha_{m,t}\left(\mu_{m,t} - \sum_{n=1}^M\alpha_{n,t}\mu_{n,t} \right)\left(\mu_{m,t} - \sum_{n=1}^M\alpha_{n,t}\mu_{n,t} \right)'.$$

In order to define the mixing weights $\alpha_{m,t}$, consider first auxiliary stationary Gaussian VAR processes $z_{m,t}$ and the related $dp$-dimensional random vectors $\boldsymbol{z}_{t,m}=(z_{m,t},...,z_{m,t-p+1})$. The density of $\boldsymbol{z}_{t,m}$ is
\begin{equation}\label{dpdensities}
n_{dp}(\boldsymbol{z}_{t,m};\mu_m,\boldsymbol{\Sigma}_{m,p})=(2\pi)^{-dp/2}\det(\boldsymbol{\Sigma}_{m,p})^{-1/2}\exp\left\lbrace -\frac{1}{2}(\boldsymbol{z}_{t,m} - \boldsymbol{1}_p\otimes\mu_m)'\boldsymbol{\Sigma}_{m,p}^{-1}(\boldsymbol{z}_{t,m} - \boldsymbol{1}_p\otimes\mu_m)\right\rbrace ,
\end{equation}
where $\boldsymbol{1}_p = (1,...,1)$ $(p\times 1)$, $\mu_m=(I_d - \sum_{i=1}^pA_{m,i})^{-1}\varphi_{m,0}$, and $\boldsymbol{\Sigma}_{m,p}$ is obtained from $vec(\boldsymbol{\Sigma}_{m,p})=(I_{(dp)^2} - \boldsymbol{A}_m\otimes\boldsymbol{A}_m)^{-1}vec(\Sigma_{m,\varepsilon})$, where
$$\Sigma_{m,\varepsilon} = \begin{bmatrix} \Omega_m & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \enspace (dp\times dp).$$

Denoting $\boldsymbol{y}_{t-1}=(y_{t-1},...,y_{t-p})$ $(dp\times 1)$, the mixing weights of the GMVAR model are defined as
$$\alpha_{m,t} = \frac{\alpha_m n_{dp}(\boldsymbol{y}_{t-1};\mu_m,\boldsymbol{\Sigma}_{m,p})}{\sum_{n=1}^M\alpha_n n_{dp}(\boldsymbol{y}_{t-1};\mu_n,\boldsymbol{\Sigma}_{n,p})},$$
where the mixing weight parameters $\alpha_m$ are assumed to satisfy $\sum_{m=1}^M\alpha_{m}=1$ and the $dp$-dimensional normal densities are as in (\ref{dpdensities}). The probability of each regime occurring therefore depends on the level, variability, and temporal dependence of the past $p$ observations. This is a convenient feature for forecasting, but it also enables the researcher to associate certain characteristics with different regimes.
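To make the formula concrete, here is a minimal sketch in plain base R (not a gmvarkit function; gmvarkit computes the mixing weights internally) that evaluates $\alpha_{m,t}$ from hypothetical mixing weight parameters and the $dp$-dimensional normal densities, with `mvtnorm::dmvnorm` playing the role of $n_{dp}$.

```{r, eval = FALSE}
library(mvtnorm)
# alphas: mixing weight parameters (sum to one); y_past: the dp-vector
# (y_{t-1},...,y_{t-p}); means, sigmas: lists holding the regimewise
# stationary means (1_p %x% mu_m) and covariance matrices Sigma_{m,p}.
mixing_weights <- function(alphas, y_past, means, sigmas) {
  dens <- mapply(function(mu, Sig) dmvnorm(y_past, mean = mu, sigma = Sig),
                 means, sigmas)
  alphas * dens / sum(alphas * dens)  # alpha_{m,t}, m = 1,...,M
}
```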

## Some theoretical properties of the GMVAR model

### Stationary distribution

**Theorem 1.** Consider the GMVAR process $y_t$ that satisfies the model assumptions. Then $\boldsymbol{y}_t=(y_t,...,y_{t-p+1})$ is a Markov chain on $\mathbb{R}^{dp}$ with a stationary distribution characterized by the density
$$f(\boldsymbol{y};\boldsymbol{\theta})=\sum_{m=1}^M\alpha_m n_{dp}(\boldsymbol{y};\upsilon_m).$$
Moreover, $\boldsymbol{y}_t$ is ergodic.

Theorem 1 shows that the stationary distribution of $\boldsymbol{y}_t$ is a mixture of $M$ multivariate normal distributions with constant mixing weights $\alpha_m$. Consequently, all moments of the stationary distribution exist and are finite.

### Interpretation of the mixing weights $\alpha_m$ and $\alpha_{m,t}$

According to Theorem 1, $\alpha_m$ has an immediate interpretation as the unconditional probability of the random vector $\boldsymbol{y}_t$ being generated from the $m$th mixture component. Consequently, $\alpha_m$ also represents the unconditional probability of the component $y_t$ being generated from the $m$th mixture component (see Kalliovirta et al. 2016, Section 3.3 for more details).

As noted, the mixing weights $\alpha_{m,t}$ are the conditional probabilities of the random vector $y_t$ being generated from the $m$th mixture component. According to the definition of the mixing weights, regimes with a higher likelihood of the past $p$ observations are more likely to generate the new observation, but the likelihoods are also scaled by the mixing weight parameters $\alpha_m$.

### Properties of the maximum likelihood estimator

Under general conditions (see Kalliovirta et al. 2016, Assumption 1), the maximum likelihood estimator for the parameters of the GMVAR model is strongly consistent.

Under general conditions (namely, Assumptions 1 and 2 in Kalliovirta et al. 2016), the maximum likelihood estimator has the conventional limiting distribution (normal).

## General notations and unconstrained models

In this section and the next, we cover the notation for the parameter vectors (the same as in Kalliovirta et al. 2016 for unconstrained models) and explain how to impose linear constraints in gmvarkit.

### General notations

Specifying the order of a GMVAR model requires specifying the autoregressive order $p$ and the number of mixture components $M$. The number of time series in the system is denoted by $d$, and it's assumed that $d$ is larger than one.[^1] The form of the parameter vector depends on whether an unconstrained or a constrained model is considered.

### Parameter vector of an unconstrained model

The parameter vector of an unconstrained model has size $((M(d+pd^2+d(d+1)/2+1)-1)\times 1)$ and the form
$$\boldsymbol{\theta} = (\boldsymbol{\upsilon}_1,...,\boldsymbol{\upsilon}_M,\alpha_1,...,\alpha_{M-1}), \enspace \text{where}$$
$$\boldsymbol{\upsilon}_m=(\phi_{m,0},\boldsymbol{\phi}_m,\sigma_m) \enspace ((d+pd^2+d(d+1)/2)\times 1),$$
$$\boldsymbol{\phi}_m=(vec(A_{m,1}),...,vec(A_{m,p})) \enspace (pd^2\times 1), \enspace \text{and}$$
$$\sigma_m=vech(\Omega_m) \enspace ((d(d+1)/2)\times 1), \enspace m=1,...,M.$$

Above, $\phi_{m,0}$ denotes the intercept parameter of the $m$th mixture component, $A_{m,i}$ denotes the coefficient matrix of the $m$th mixture component and $i$th lag, $\Omega_m$ denotes the (positive definite) error term covariance matrix, and $\alpha_m$ denotes the mixing weight parameter of the $m$th mixture component. $vec()$ is a vectorization operator that stacks the columns of a matrix into a vector, and $vech()$ stacks the columns of a matrix from the main diagonal downwards (including the main diagonal) into a vector.

The parameter vector above uses the "intercept" parametrization, referring to the intercept terms $\phi_{m,0}$. However, it's also possible to use the "mean" parametrization, where the intercept terms are simply replaced by the regimewise means $\mu_m=(I_d-\sum_{i=1}^pA_{m,i})^{-1}\phi_{m,0}$.
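As a concrete illustration, the following sketch (plain base R, with made-up parameter values) assembles the parameter vector $\boldsymbol{\theta}$ of an unconstrained GMVAR(1,2) model with $d=2$:

```{r, eval = FALSE}
vech <- function(S) S[lower.tri(S, diag = TRUE)]  # the vech() operator
phi10 <- c(0.5, 0.1);  A11 <- matrix(c(0.3, 0.1, 0.0, 0.2), nrow = 2)
Omega1 <- matrix(c(1.0, 0.2, 0.2, 0.8), nrow = 2)
phi20 <- c(-0.2, 0.3); A21 <- matrix(c(0.1, 0.0, 0.1, 0.4), nrow = 2)
Omega2 <- matrix(c(0.5, 0.0, 0.0, 0.6), nrow = 2)
alpha1 <- 0.6  # alpha_2 = 1 - alpha_1 is left out of the vector
theta <- c(phi10, as.vector(A11), vech(Omega1),  # upsilon_1
           phi20, as.vector(A21), vech(Omega2),  # upsilon_2
           alpha1)
length(theta)  # M*(d + p*d^2 + d*(d+1)/2 + 1) - 1 = 19
```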

## Models with linear constraints

### General linear constraints and the parameter vector

Imposing linear constraints on the autoregressive parameters of a GMVAR model is straightforward in gmvarkit. The constraints are expressed in a rather general form that allows arbitrary linear constraints, but one needs to take the time to construct the constraint matrix carefully for each particular case.

We consider constraints of the form
$$(\boldsymbol{\phi}_1,...,\boldsymbol{\phi}_M) = \boldsymbol{C}\boldsymbol{\psi}, \enspace \text{where}$$
$$\boldsymbol{\phi}_m=(vec(A_{m,1}),...,vec(A_{m,p})) \enspace (pd^2\times 1), \enspace m=1,...,M,$$
$\boldsymbol{C}$ is a known $(Mpd^2\times q)$ constraint matrix (of full column rank), and $\boldsymbol{\psi}$ is an unknown $(q\times 1)$ parameter vector.

The parameter vector of a constrained model has size $((M(d+d(d+1)/2+1)+q-1)\times 1)$ and the form
$$\boldsymbol{\theta} = (\phi_{1,0},...,\phi_{M,0},\boldsymbol{\psi},\sigma_1,...,\sigma_M,\alpha_1,...,\alpha_{M-1}),$$
where $\boldsymbol{\psi}$ is the $(q\times 1)$ parameter vector containing the constrained autoregressive parameters and the $\sigma_m=vech(\Omega_m)$ are as before. As in the case of regular models, instead of the intercept parametrization that makes use of the intercept terms $\phi_{m,0}$, one may use the mean parametrization with the regimewise means $\mu_m$ instead ($m=1,...,M$).

### Examples of linear constraints

Consider the following two common uses of linear constraints: restricting the autoregressive parameters to be the same for all regimes, and constraining some parameters to zero. Of course, other constraints may also be useful, but we chose to show illustrative examples of these two because they occur in the cited article by Kalliovirta et al. (2016).

#### Restricting AR parameters to be the same for all regimes

To restrict the AR parameters to be the same for all regimes, we want $\boldsymbol{\phi}_m$ to be the same for all $m=1,...,M$. The parameter vector $\boldsymbol{\psi}$ $(q\times 1)$ then corresponds to any of the $\boldsymbol{\phi}_m=\boldsymbol{\phi}$, and therefore $q=pd^2$. For the constraint matrix we choose
$$\boldsymbol{C} = [I_{pd^2}:\cdots:I_{pd^2}]' \enspace (Mpd^2\times pd^2),$$
that is, $M$ copies of the $(pd^2\times pd^2)$ identity matrix stacked on top of each other, because then $\boldsymbol{C}\boldsymbol{\psi}=(\boldsymbol{\psi},...,\boldsymbol{\psi})=(\boldsymbol{\phi},...,\boldsymbol{\phi})$.
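In R, this constraint matrix is a one-liner (a plain base R sketch; the matrix is assumed to be supplied to fitGMVAR or GMVAR through their constraints argument, see the documentation):

```{r, eval = FALSE}
p <- 2; M <- 2; d <- 2  # example dimensions
C <- do.call(rbind, replicate(M, diag(p*d^2), simplify = FALSE))
dim(C)  # (M*p*d^2) x (p*d^2) = 16 x 8
```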

#### Restricting AR parameters to be the same for all regimes and constraining non-diagonal elements of coefficient matrices to be zero

The previous example shows how to restrict the AR parameters to be the same for all regimes, but suppose we also want to constrain the non-diagonal elements of the coefficient matrices $A_{m,i}$ ($m=1,...,M$, $i=1,...,p$) to be zero. We have the constrained parameter vector $\boldsymbol{\psi}$ $(q\times 1)$ representing the unconstrained parameters $(\boldsymbol{\phi}_1,...,\boldsymbol{\phi}_M)$, where by assumption $\boldsymbol{\phi}_m=\boldsymbol{\phi}=(vec(A_1),...,vec(A_p))$ $(pd^2\times 1)$ and the elements of $vec(A_i)$ ($i=1,...,p$) corresponding to the off-diagonal elements are zero.

For illustrative purposes, let's consider a GMVAR model with autoregressive degree $p=2$, number of mixture components $M=2$, and number of time series in the system $d=2$. Then we have
$$\boldsymbol{\phi}=(A_{1,(1,1)},0,0,A_{1,(2,2)},A_{2,(1,1)},0,0,A_{2,(2,2)}) \enspace (8\times 1) \enspace \text{and}$$
$$\boldsymbol{\psi}=(A_{1,(1,1)},A_{1,(2,2)},A_{2,(1,1)},A_{2,(2,2)}) \enspace (4\times 1).$$
By a direct calculation, we can see that choosing the constraint matrix
$$\boldsymbol{C}=\begin{bmatrix} \boldsymbol{\tilde{c}} \\ \boldsymbol{\tilde{c}} \end{bmatrix} \enspace (Mpd^2\times 4), \enspace \text{where}$$
$$\boldsymbol{\tilde{c}}=\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \enspace (pd^2\times 4),$$
satisfies $\boldsymbol{C}\boldsymbol{\psi}=(\boldsymbol{\phi},...,\boldsymbol{\phi})$.
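The same matrix can be built programmatically (a plain base R sketch; row $k$ of $\boldsymbol{\tilde{c}}$ picks the element of $\boldsymbol{\psi}$ that appears in position $k$ of $\boldsymbol{\phi}$):

```{r, eval = FALSE}
c_tilde <- matrix(0, nrow = 8, ncol = 4)
c_tilde[1, 1] <- 1  # A_{1,(1,1)}
c_tilde[4, 2] <- 1  # A_{1,(2,2)}
c_tilde[5, 3] <- 1  # A_{2,(1,1)}
c_tilde[8, 4] <- 1  # A_{2,(2,2)}
C <- rbind(c_tilde, c_tilde)  # (M*p*d^2) x 4 = 16 x 4
```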

## Functions in gmvarkit

### Estimating a GMVAR model

gmvarkit provides the function fitGMVAR for estimating a GMVAR model. The maximum likelihood estimation is performed in two phases: in the first phase, fitGMVAR uses a genetic algorithm to find starting values for a gradient-based variable metric algorithm, which it then uses to finalize the estimation in the second phase. It's important to keep in mind that the numerical estimation algorithms are not guaranteed to end up in the global maximum point rather than a local one. Because of the multimodality and challenging surface of the log-likelihood function, it's actually expected that most estimation rounds won't find the global maximum point. For this reason, one should always perform multiple estimation rounds. Parallel computing is used to obtain the results faster. The number of CPU cores used can be set with the argument ncores, and the number of estimation rounds can be controlled with the argument ncalls.
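A typical call might look as follows (a hedged sketch: gdpdef is assumed to be a data set bundled with gmvarkit, and the order p = 2, M = 2 is chosen purely for illustration):

```{r, eval = FALSE}
library(gmvarkit)
# Run 16 estimation rounds in parallel on 4 CPU cores
fit <- fitGMVAR(gdpdef, p = 2, M = 2, ncalls = 16, ncores = 4)
fit  # the print method shows the estimates in an easy-to-read form
```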

If the model estimates poorly, it is often because the number of mixture components M is chosen too large. One reason for this is that the genetic algorithm is (with the default settings) designed to find solutions with non-zero mixing weights for all regimes. One may also adjust the settings of the genetic algorithm, or set up an initial population with guesses for the estimates. This can be done by passing arguments through fitGMVAR to the function GAfit, which implements the genetic algorithm. To check the available settings, read the documentation ?GAfit. If the iteration limit is reached when estimating the model, the function iterate_more can be used to finish the estimation.
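For example (assuming the fit object from above):

```{r, eval = FALSE}
# If the variable metric algorithm hit its iteration limit, continue the
# estimation from where it stopped (see ?iterate_more for the details).
fit <- iterate_more(fit)
```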

The parameters of the estimated model are printed in an illustrative and easy-to-read form. In order to easily compare the approximate standard errors to the corresponding estimates, one can print the standard errors in the same form with the function print_std_errors. Numerical approximations of the gradient and Hessian matrix of the log-likelihood function at the estimates can be obtained conveniently with the functions get_gradient (or get_foc) and get_hessian, and the eigenvalues of the Hessian can be obtained with the function get_soc.
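For example, with the function names listed above applied to the estimated model object:

```{r, eval = FALSE}
print_std_errors(fit)  # approximate standard errors, same layout as print(fit)
get_foc(fit)  # gradient of the log-likelihood at the estimate (first order condition)
get_soc(fit)  # eigenvalues of the Hessian (all negative at a local maximum)
```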

Profile log-likelihood functions of the parameters around the estimates can be plotted with the function profile_logliks.

The estimated objects have their own print, plot, and summary methods.

The function alt_gmvar can be used to build a GMVAR model based on an arbitrary estimation round. This makes it possible to examine estimates corresponding to local maximum points instead of the ML estimate.
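A hedged sketch (the argument name which_round follows the documentation ?alt_gmvar and should be checked against your package version):

```{r, eval = FALSE}
fit2 <- alt_gmvar(fit, which_round = 2)  # model from the second estimation round
```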

### Model diagnostics

gmvarkit provides model diagnostics based on a multivariate extension of quantile residuals (see Kalliovirta and Saikkonen 2010), which are, under correct model specification, asymptotically standard multivariate normal distributed. The quantile residual tests introduced by Kalliovirta and Saikkonen (2010) can be performed with the function quantile_residual_tests by providing the estimated model (that is, a class gmvar object) as an argument.

For graphical diagnostics, one may use the function diagnostic_plot, which enables one to plot the quantile residual time series, auto- and cross-correlation functions of the quantile residuals or their squares, or quantile residual histograms and normal QQ-plots.
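For example:

```{r, eval = FALSE}
quantile_residual_tests(fit)  # quantile residual tests (Kalliovirta & Saikkonen 2010)
diagnostic_plot(fit)          # graphical diagnostics; see ?diagnostic_plot for options
```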

### Constructing a GMVAR model without estimation

One may wish to construct an arbitrary GMVAR model without any estimation, for example, in order to simulate from a particular process of interest. An arbitrary model can be created with the function GMVAR. If one wants to add data to the model or update its data afterwards, it's advisable to use the function add_data.
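A hedged sketch (the argument names p, M, d, and params follow ?GMVAR and should be verified; theta is a parameter vector constructed as in the notation section above):

```{r, eval = FALSE}
mod <- GMVAR(p = 1, M = 2, d = 2, params = theta)  # model without data
mod <- add_data(gdpdef, mod)                       # attach data afterwards
```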

### Simulating from a GMVAR process

The function simulateGMVAR is the one for the job. As its main argument, it takes a gmvar object created with fitGMVAR or GMVAR.
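For example (the argument name nsimu follows ?simulateGMVAR and should be checked against your package version):

```{r, eval = FALSE}
sim <- simulateGMVAR(mod, nsimu = 200)  # simulate 200 observations from the model
```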

### Forecasting a GMVAR process

The package gmvarkit provides the predict method predict.gmvar for forecasting GMVAR processes. One-step predictions using the exact formula for the conditional mean are supported, but forecasts further ahead than that are based on independent simulations. The point predictions are either sample means or medians, and the confidence intervals are based on sample quantiles. The objects generated by predict.gmvar have their own plot method. We also encourage using the function simulateGMVAR directly for forecasting (see the forecasting example in the documentation ?simulateGMVAR).
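A hedged sketch (the argument name n_ahead follows ?predict.gmvar and should be checked against your package version):

```{r, eval = FALSE}
pred <- predict(fit, n_ahead = 10)  # simulation-based 10-step-ahead forecast
plot(pred)                          # point forecasts with prediction intervals
```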

### Univariate analysis

Use the package uGMAR for analysing univariate time series.

## References

  • Kalliovirta L., Meitz M. and Saikkonen P. (2016) Gaussian mixture vector autoregression. Journal of Econometrics, 192, 485-498.
  • Kalliovirta L. and Saikkonen P. (2010) Reliable Residuals for Multivariate Nonlinear Time Series Models. Unpublished Revision of HECER Discussion Paper No. 247.

[^1]: For univariate analysis, one may use the package uGMAR.
