2020-12-26 14:39:24
$$
\begin{align}
h(x) &= \int \left( \frac{f(x) + g(x)}{1+ f^{2}(x)}
+ \frac{1+ f(x)g(x)}{\sqrt{1 - \sin x}}
\right) \, dx\label{E:longInt}\\
&= \int \frac{1 + f(x)}{1 + g(x) } \, dx
- 2 \tan^{-1}(x-2)\notag
\end{align}
$$

2020-12-26 14:39:55
$$
f(x)=
\begin{cases}
-x^{2}, &\text{if $x < 0$;}\\
\alpha + x, &\text{if $0 \leq x \leq 1$;}\\
x^{2}, &\text{otherwise.}
\end{cases}
$$

2020-12-26 15:38:08
$$\left| \frac{a + b}{2} \right|, \quad \left\| A^{2} \right\|,
\quad \left( \frac{a}{2}, b \right],
\quad \left. F(x) \right|_{a}^{b}$$

2020-12-26 15:38:45
$$
\begin{alignat*}{2}
(A + B C)x &+ &C &y = 0,\\
Ex &+ &(F + G)&y = 23.
\end{alignat*}
$$

2020-12-26 15:39:08
$$
f(x) \overset{ \text{def} }{=} x^{2} - 1
$$

2020-12-26 15:40:42
$$
\begin{alignat}{4}
a_{11}x_1 &+ a_{12}x_2 &&+ a_{13}x_3 &&
&&= y_1,\\
a_{21}x_1 &+ a_{22}x_2 && &&+ a_{24}x_4
&&= y_2,\\
a_{31}x_1 & &&+ a_{33}x_3 &&+ a_{34}x_4
&&= y_3.
\end{alignat}
$$

2020-12-26 15:41:30
$$
\left(
\begin{matrix}
a + b + c & uv & x - y & 27\\
a+b &u+v&z & 1340
\end{matrix}
\right) =
\left(
\begin{matrix}
1 & 100 & 115 & 27\\
201 & 0 & 1 & 1340
\end{matrix}
\right)
$$

2020-12-27 18:03:09
Strictly speaking, Shannon's formula is a measure of uncertainty, which increases with the number of bits needed to optimally encode a sequence of realizations of $J$.
To measure the information flow between two processes, Shannon entropy is combined with the concept of the Kullback-Leibler distance [@KL51], under the assumption that the underlying processes evolve over time according to a Markov process [@schreiber2000].
Let $I$ and $J$ denote two discrete random variables with marginal probability distributions $p(i)$ and $p(j)$ and joint probability distribution $p(i,j)$, whose dynamical structures correspond to stationary Markov processes of order $k$ (process $I$) and $l$ (process $J$).
The Markov property implies that the probability of observing $I$ at time $t+1$ in state $i$, conditional on the $k$ previous observations, satisfies $p(i_{t+1}|i_t,...,i_{t-k+1})=p(i_{t+1}|i_t,...,i_{t-k})$.
The average number of bits needed to encode the observation at $t+1$, if the previous $k$ values are known, is given by
  
$$
  h_I(k)=- \sum_i p\left(i_{t+1}, i_t^{(k)}\right) \cdot \log \left(p\left(i_{t+1}|i_t^{(k)}\right)\right),
$$

2020-12-27 18:03:26
where $i^{(k)}_t=(i_t,...,i_{t-k+1})$. $h_J(l)$ can be derived analogously for process $J$.
In the bivariate case, information flow from process $J$ to process $I$ is measured by quantifying the deviation from the generalized Markov property $p(i_{t+1}| i_t^{(k)})=p(i_{t+1}| i_t^{(k)},j_t^{(l)})$ relying on the Kullback-Leibler distance [@schreiber2000].
Thus, (Shannon) transfer entropy is given by
  
$$
  T_{J \rightarrow I}(k,l) = \sum_{i,j} p\left(i_{t+1}, i_t^{(k)}, j_t^{(l)}\right) \cdot \log \left(\frac{p\left(i_{t+1}| i_t^{(k)}, j_t^{(l)}\right)}{p\left(i_{t+1}|i_t^{(k)}\right)}\right),
$$

2020-12-27 18:03:44
where $T_{J\rightarrow I}$ consequently measures the information flow from $J$ to $I$ ($T_{I \rightarrow J}$, the information flow from $I$ to $J$, can be derived analogously).

Transfer entropy can also be based on Rényi entropy [@R70] rather than Shannon entropy.
Rényi entropy introduces a weighting parameter $q>0$ for the individual probabilities $p(j)$ and can be calculated as
$$
  H^q_J = \frac{1}{1-q} \log \left(\sum_j p^q(j)\right).
$$

2020-12-27 18:04:03
For $q\rightarrow 1$, Rényi entropy converges to Shannon entropy.
For $0<q<1$, events that have a low probability of occurring receive more weight, while for $q>1$ the weights induce a preference for outcomes $j$ with a higher initial probability.
Consequently, Rényi entropy provides a more flexible tool for estimating uncertainty, since different areas of a distribution can be emphasized, depending on the parameter $q$.

Using the escort distribution [for more information, see @BeckS93] $\phi_q(j)=\frac{p^q(j)}{\sum_j p^q(j)}$ with $q >0$ to normalize the weighted distributions, @JKS12 derive the Rényi transfer entropy measure as
$$
  RT_{J \rightarrow I}(k,l) = \frac{1}{1-q} \log \left(\frac{\sum_i \phi_q\left(i_t^{(k)}\right)p^q\left(i_{t+1}|i^{(k)}_t\right)}{\sum_{i,j} \phi_q\left(i^{(k)}_t,j^{(l)}_t\right)p^q\left(i_{t+1}|i^{(k)}_t,j^{(l)}_t \right)}\right).
$$

2020-12-27 18:04:24
Analogously to (Shannon) transfer entropy, Rényi transfer entropy measures the information flow from $J$ to $I$.
Note that, contrary to Shannon transfer entropy, the calculation of Rényi transfer entropy can result in negative values.
In such a situation, knowing the history of $J$ reveals even greater risk than is indicated by the history of $I$ alone. For more details on this issue see @JKS12.
  
The above transfer entropy estimates are commonly biased due to small sample effects.
A remedy is provided by the effective transfer entropy [@MK02], which is computed in the following way:
  
$$
  ET_{J \rightarrow I}(k,l)=  T_{J \rightarrow I}(k,l)- T_{J_{\text{shuffled}} \rightarrow I}(k,l),
$$
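To make the shuffling correction concrete, here is a minimal base-R sketch of the idea (illustrative only: the RTransferEntropy package performs this correction internally, and `te_estimate` below is a hypothetical placeholder for any transfer entropy estimator returning a scalar):

```r
# Effective transfer entropy: estimate TE on repeatedly shuffled
# versions of the J series and subtract the average as a bias proxy.
effective_te <- function(i_series, j_series, te_estimate, n_shuffles = 100) {
  te <- te_estimate(i_series, j_series)
  # sample() permutes J, destroying its serial dependence and its link to I
  te_shuffled <- replicate(n_shuffles,
                           te_estimate(i_series, sample(j_series)))
  te - mean(te_shuffled)  # bias-corrected (effective) estimate
}
```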

2020-12-27 18:04:42
where $T_{J_{\text{shuffled}} \rightarrow I}(k,l)$ indicates the transfer entropy using a shuffled version of the time series of $J$.
Shuffling implies randomly drawing values from the time series of $J$ and realigning them to generate a new time series.
This procedure destroys the serial dependencies of $J$ as well as the statistical dependencies between $J$ and $I$.
As a result, $T_{J_{\text{shuffled}} \rightarrow I}(k,l)$ converges to zero with increasing sample size, and any nonzero value is due to small sample effects.
The transfer entropy estimates from shuffled data can therefore be used as an estimator for the bias induced by these small sample effects.
To derive a consistent estimator, shuffling is repeated many times and the average of the resulting shuffled transfer entropy estimates across all replications is subtracted from the Shannon or Rényi transfer entropy estimate to obtain a bias corrected effective transfer entropy estimate.

2020-12-27 18:05:06
In order to assess the statistical significance of transfer entropy estimates, we rely on  a Markov block bootstrap as proposed by @Dimpfl2013.
In contrast to shuffling, the Markov block bootstrap preserves the dependencies within each time series.
It thereby generates the distribution of transfer entropy estimates under the null hypothesis of no information transfer: randomly drawn blocks of process $J$ are realigned to form a simulated series, which retains the univariate dependencies of $J$ but eliminates the statistical dependencies between $J$ and $I$.
Shannon or Rényi transfer entropy is then estimated based on the simulated time series.
Repeating this procedure yields the distribution of the transfer entropy estimate under the null of no information flow.
The p-value associated with the null hypothesis of no information transfer is given by $1-\hat{q}_{TE}$, where $\hat{q}_{TE}$ denotes the quantile of the simulated distribution that corresponds to the original transfer entropy estimate.
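In the RTransferEntropy package this bootstrap is exposed through the main estimation function; a hedged usage sketch (argument names `nboot` and `shuffles` as I recall them from the package documentation, to be verified against `?transfer_entropy`):

```r
library(RTransferEntropy)
# x and y are the two observed series; nboot sets the number of
# Markov-block-bootstrap replications behind the reported p-values,
# shuffles the number of shuffles used for the bias correction.
te <- transfer_entropy(x, y, lx = 1, ly = 1, nboot = 300, shuffles = 100)
```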

2020-12-27 18:05:27
Before we turn to different applications below, we provide a simple example to demonstrate what the outputs of the different functions look like.
Let us consider a linear relationship between two random variables $X$ and $Y$, where $Y$ depends on $X$ with one lag and $X$ is independent of $Y$:


$$
\begin{split}
x_t &= 0.2x_{t-1} + \varepsilon_{x,t}, \\
y_t &= x_{t-1} + \varepsilon_{y,t},
\end{split}
$$
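As a sketch, this system can be simulated and estimated in a few lines of R (assuming the RTransferEntropy package; `transfer_entropy()` with Markov orders `lx`/`ly` follows the package interface as I recall it):

```r
set.seed(42)
n <- 2500
x <- y <- numeric(n)
# y_t depends on x_{t-1}; x evolves autonomously
for (t in 2:n) {
  x[t] <- 0.2 * x[t - 1] + rnorm(1)
  y[t] <- x[t - 1] + rnorm(1)
}

library(RTransferEntropy)
te_lin <- transfer_entropy(x, y, lx = 1, ly = 1)
te_lin  # expect a significant flow X -> Y and none in reverse
```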

2020-12-27 18:05:57
Consider again the above example, where the results show a significant information flow from $X$ to $Y$, but not vice versa.
Similar conclusions could be drawn from a vector autoregressive model combined with a test for Granger causality.
However, the main advantage of using transfer entropy is that it is not limited to linear relationships.
Consider the following nonlinear relation between $X$ and $Y$, where, again, only $Y$ depends on $X$:
$$
\begin{split}
x_t &= 0.2x_{t-1} + \varepsilon_{x,t}, \\
y_t &= \sqrt{|x_{t-1}|} + \varepsilon_{y,t},
\end{split}
$$
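Continuing the sketch above, only the equation for $y$ changes; the same estimator applies unchanged, since transfer entropy does not presuppose a linear model:

```r
# Nonlinear link: y_t = sqrt(|x_{t-1}|) + noise, reusing x and n from above
y_nl <- numeric(n)
for (t in 2:n) y_nl[t] <- sqrt(abs(x[t - 1])) + rnorm(1)
transfer_entropy(x, y_nl, lx = 1, ly = 1)
```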

2020-12-27 18:06:24
To illustrate the use of Rényi transfer entropy, we simulate data for which the dependence of $Y$ on $X$ changes with the level of the innovation.

$$
\begin{split}
x_t &= 0.2x_{t-1} + \varepsilon_{x,t}, \\
y_t &= \begin{cases} \phantom{0.2}x_{t-1} + \varepsilon_{y,t} & \text{if } |\varepsilon_{y,t}| > s, \\ 0.2x_{t-1} + \varepsilon_{y,t} & \text{if } |\varepsilon_{y,t}| < s, \end{cases}
\end{split}
$$
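A hedged sketch of the corresponding simulation and the Rényi estimator (the `entropy = "Renyi"` and `q` arguments are as I recall them from the package; the threshold `s` below is an arbitrary illustrative choice):

```r
s <- 1
y_r <- numeric(n)
for (t in 2:n) {
  eps <- rnorm(1)
  # strong dependence on x_{t-1} only for large innovations
  y_r[t] <- if (abs(eps) > s) x[t - 1] + eps else 0.2 * x[t - 1] + eps
}
# q < 1 puts extra weight on low-probability (tail) events
transfer_entropy(x, y_r, lx = 1, ly = 1, entropy = "Renyi", q = 0.5)
```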

2020-12-28 14:11:09
Even though this is the vanilla formulation, we can still consider solving it directly in nonconvex form. Consider the risk expression:
$$R(\mathbf{w}) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}-b_i\right)^{2}.$$
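For concreteness, a small helper that evaluates $R(\mathbf{w})$ directly (plain base R, not part of any package API):

```r
# Risk concentration R(w): squared deviations of the relative
# risk contributions from the budget targets b
risk_concentration <- function(w, Sigma, b) {
  Sw <- as.numeric(Sigma %*% w)
  rrc <- (w * Sw) / sum(w * Sw)  # note: w' Sigma w = sum_i w_i (Sigma w)_i
  sum((rrc - b)^2)
}
```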

2020-12-28 14:11:39
Consider now the risk expression:
$$R(\mathbf{w}) = \sum_{i=1}^{N}\left(\frac{w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}}-b_i\sqrt{\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}}\right)^{2} = \sum_{i=1}^{N}\left(\frac{r_i}{\sqrt{\mathbf{1}^T\mathbf{r}}}-b_i\sqrt{\mathbf{1}^T\mathbf{r}}\right)^{2},$$
where $\mathbf{r}$ collects the terms $r_i = w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i$, so that $\mathbf{1}^T\mathbf{r} = \mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}$.

2020-12-28 14:13:28
From Euler's theorem, the volatility of the portfolio $\sigma\left(\mathbf{w}\right)=\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}$ can be decomposed as
$$\sigma\left(\mathbf{w}\right)=\sum_{i=1}^{N}w_i\frac{\partial\sigma}{\partial w_i}
= \sum_{i=1}^N\frac{w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{i}}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}.$$

The **risk contribution (RC)** from the $i$th asset to the total risk $\sigma(\mathbf{w})$ is defined as
$${\sf RC}_i =\frac{w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}$$

2020-12-28 14:13:48
which satisfies $\sum_{i=1}^{N}{\sf RC}_i=\sigma\left(\mathbf{w}\right)$.

The **relative risk contribution (RRC)** is a normalized version:
$${\sf RRC}_i = \frac{w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}$$

2020-12-28 14:14:16
so that $\sum_{i=1}^{N}{\sf RRC}_i=1$.
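A short base-R sketch of these quantities and their identities (no package assumed):

```r
# Risk contributions RC and relative risk contributions RRC of a portfolio w
risk_contributions <- function(w, Sigma) {
  Sw <- as.numeric(Sigma %*% w)
  sigma_w <- sqrt(sum(w * Sw))      # portfolio volatility
  list(RC  = w * Sw / sigma_w,      # sums to sigma_w (Euler decomposition)
       RRC = w * Sw / sigma_w^2)    # sums to 1
}
```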

The **risk parity portfolio (RPP)** attempts to “equalize” the risk contributions:
$${\sf RC}_i = \frac{1}{N}\sigma(\mathbf{w})\quad\text{or}\quad{\sf RRC}_i = \frac{1}{N}.$$

More generally, the **risk budgeting portfolio (RBP)** attempts to allocate the risk according to the risk profile determined by the weights $\mathbf{b}$ (with $\mathbf{1}^T\mathbf{b}=1$ and $\mathbf{b}\ge \mathbf{0}$):
$${\sf RC}_i = b_i \sigma(\mathbf{w})\quad\text{or}\quad{\sf RRC}_i = b_i.$$

2020-12-28 14:14:34
In practice, one can express the condition ${\sf RC}_i = \frac{1}{N}\sigma(\mathbf{w})$ in different equivalent ways such as
$$w_i(\boldsymbol{\Sigma} \mathbf{w})_{i} = w_j(\boldsymbol{\Sigma} \mathbf{w})_{j}, \quad\forall i, j.$$ The budget condition ${\sf RC}_i = b_i \sigma(\mathbf{w})$ can also be expressed as
$$w_i (\boldsymbol{\Sigma} \mathbf{w})_i = b_i \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}, \quad\forall i.$$

2020-12-28 14:15:03
Assuming that the assets are uncorrelated, i.e., that $\boldsymbol{\Sigma}$ is diagonal, and simply using the volatilities $\boldsymbol{\sigma} = \sqrt{{\sf diag}(\boldsymbol{\Sigma})}$, one obtains
$$\mathbf{w} = \frac{\boldsymbol{\sigma}^{-1}}{\mathbf{1}^T\boldsymbol{\sigma}^{-1}}$$

2020-12-28 14:15:30
or, more generally,
$$\mathbf{w} = \frac{\sqrt{\mathbf{b}}\odot\boldsymbol{\sigma}^{-1}}{\mathbf{1}^T\left(\sqrt{\mathbf{b}}\odot\boldsymbol{\sigma}^{-1}\right)}.$$
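In R these closed-form diagonal solutions are one-liners (a sketch assuming a covariance matrix `Sigma` and budget vector `b` are already defined):

```r
sigma <- sqrt(diag(Sigma))                          # asset volatilities
w_ivp <- (1 / sigma) / sum(1 / sigma)               # inverse volatility (b = 1/N)
w_rbp <- (sqrt(b) / sigma) / sum(sqrt(b) / sigma)   # general budgets b
```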

2020-12-28 14:16:16
## Vanilla convex formulation
Suppose we only have the constraints $\mathbf{1}^T\mathbf{w}=1$ and $\mathbf{w} \ge \mathbf{0}$. Then, after the change of variable $\mathbf{x}=\mathbf{w}/\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}$, the equations $w_i (\boldsymbol{\Sigma} \mathbf{w})_i = b_i \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}$ become $x_i\left(\boldsymbol{\Sigma}\mathbf{x}\right)_i = b_i$ or, more compactly in vector form,
$$\boldsymbol{\Sigma}\mathbf{x} = \mathbf{b}/\mathbf{x}$$

with $\mathbf{x} \ge \mathbf{0}$ and we can always recover the portfolio by normalizing: $\mathbf{w} = \mathbf{x}/(\mathbf{1}^T\mathbf{x})$.

At this point, one could use a nonlinear multivariate root finder for $\boldsymbol{\Sigma}\mathbf{x} = \mathbf{b}/\mathbf{x}$. For example, in R we can use the package [rootSolve](https://CRAN.R-project.org/package=rootSolve).
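A hedged sketch with rootSolve's `multiroot()` (a general multivariate Newton-type root finder; note that this bare-bones version has no safeguard keeping the iterates strictly positive):

```r
library(rootSolve)

# Root of g(x) = Sigma x - b/x, i.e., the risk budgeting equation
g  <- function(x) as.numeric(Sigma %*% x) - b / x
x0 <- rep(1, length(b))        # strictly positive starting point
x  <- multiroot(g, start = x0)$root
w  <- x / sum(x)               # normalize to recover the portfolio
```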


With the goal of designing risk budget portfolios, Spinu proposed in [@Spinu2013] to solve the
following convex optimization problem:
$$\underset{\mathbf{x}\ge\mathbf{0}}{\textsf{minimize}} \quad \frac{1}{2}\mathbf{x}^{T}\boldsymbol{\Sigma}\mathbf{x} - \sum_{i=1}^{N}b_i\log(x_i),$$

2020-12-28 14:17:14
where the portfolio can be recovered as $\mathbf{w} = \mathbf{x}/(\mathbf{1}^T\mathbf{x})$.

Indeed, Spinu observed in [@Spinu2013] that the risk budgeting equation $\boldsymbol{\Sigma}\mathbf{x} = \mathbf{b}/\mathbf{x}$ is precisely the gradient of the convex function $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^{T}\boldsymbol{\Sigma}\mathbf{x} - \mathbf{b}^T\log(\mathbf{x})$ set to zero:
$$\nabla f(\mathbf{x}) = \boldsymbol{\Sigma}\mathbf{x} - \mathbf{b}/\mathbf{x} = \mathbf{0}.$$

Thus, a convenient way to solve the problem is by solving the following convex optimization problem:
$$\underset{\mathbf{x}\ge\mathbf{0}}{\textsf{minimize}} \quad \frac{1}{2}\mathbf{x}^{T}\boldsymbol{\Sigma}\mathbf{x} - \mathbf{b}^T\log(\mathbf{x})$$

which has optimality condition $\boldsymbol{\Sigma}\mathbf{x} = \mathbf{b}/\mathbf{x}$.
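As a sketch, this convex problem can also be handed to a generic solver such as base R's `optim()` (the riskParityPortfolio package ships dedicated, much faster solvers; the tiny lower bound below is an ad hoc device to keep the iterates strictly positive):

```r
# f(x) = 0.5 x' Sigma x - b' log(x), with gradient Sigma x - b/x
f  <- function(x) 0.5 * sum(x * (Sigma %*% x)) - sum(b * log(x))
gr <- function(x) as.numeric(Sigma %*% x) - b / x
res <- optim(rep(1, length(b)), f, gr, method = "L-BFGS-B",
             lower = rep(1e-8, length(b)))
w <- res$par / sum(res$par)  # normalize so that 1'w = 1
```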

2020-12-28 14:18:03
## General nonconvex formulation
The previous methods are based on a convex reformulation of the problem so they are guaranteed to converge to the optimal risk budgeting solution. However, they can only be employed for the simplest risk budgeting formulation with a simplex constraint set (i.e., $\mathbf{1}^T\mathbf{w}=1$ and $\mathbf{w} \ge \mathbf{0}$). They cannot be used if

- we have additional constraints, such as allowing short selling or imposing box constraints $l_i \le w_i \le u_i$;
- on top of the risk budgeting constraints $w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i = b_i \;\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}$, we have other objectives, such as maximizing the expected return $\mathbf{w}^T\boldsymbol{\mu}$ or minimizing the overall variance $\mathbf{w}^T\boldsymbol{\Sigma}\mathbf{w}$.

For those more general cases, we need more sophisticated formulations, which unfortunately are not convex. The idea is to try to achieve equal risk contributions
${\sf RC}_i = \frac{w_i\left(\boldsymbol{\Sigma}\mathbf{w}\right)_i}{\sqrt{\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}}}$
by penalizing the differences between the terms $w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{i}$.

There are many reformulations possible. For illustrative purposes, one such formulation is
$$\begin{array}{ll}
\underset{\mathbf{w}}{\textsf{minimize}} & \sum_{i,j=1}^{N}\left(w_{i}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{i}-w_{j}\left(\boldsymbol{\Sigma}\mathbf{w}\right)_{j}\right)^{2} \color{blue}{- \;F(\mathbf{w})}\\
\textsf{subject to} & \mathbf{w} \ge \mathbf{0}, \quad\mathbf{1}^T\mathbf{w}=1, \quad\color{blue}{\mathbf{w}\in\cal{W}}
\end{array}$$

where $F(\mathbf{w})$ denotes some additional objective function and $\cal{W}$ denotes an arbitrary convex set of constraints. More expressions for the risk concentration terms are listed in

2020-12-28 14:18:20
## RPP with additional variance term
Similarly, the `riskParityPortfolio` package allows us to include the variance $\mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}$
as an objective term:
$$\begin{array}{ll}
\underset{\mathbf{w}}{\textsf{minimize}} &
R(\mathbf{w}) + \lambda_{\sf var} \mathbf{w}^{T}\boldsymbol{\Sigma}\mathbf{w}\\
\textsf{subject to} & \mathbf{1}^T\mathbf{w}=1, \mathbf{w} \ge \mathbf{0},
\end{array}$$
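A hedged usage sketch (the argument name `lmd_var` is as I recall it from the package documentation; verify with `?riskParityPortfolio`):

```r
library(riskParityPortfolio)
# Trade off risk parity against the portfolio variance via lmd_var
rpp_var <- riskParityPortfolio(Sigma, lmd_var = 1)
rpp_var$w  # resulting portfolio weights
```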

2020-12-28 14:18:40
## RPP with general linear constraints
In version 2.0, we added support for general linear constraints, i.e., `riskParityPortfolio` is now able
to solve the following problem:
$$\begin{array}{ll}
\underset{\mathbf{w}}{\textsf{minimize}} &
R(\mathbf{w}) + \lambda F(\mathbf{w})\\
\textsf{subject to} & \mathbf{C}\mathbf{w} = \mathbf{c},~~\mathbf{D}\mathbf{w} \leq \mathbf{d}.
\end{array}$$
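A hypothetical call under that interface (the argument names `Cmat`, `cvec`, `Dmat`, `dvec` are assumed from the version 2.0 interface and should be checked against the current documentation):

```r
N <- nrow(Sigma)
# Full investment (1'w = 1), long-only, and a 20% cap per asset,
# all written as Cw = c and Dw <= d
rpp_lc <- riskParityPortfolio(Sigma,
                              Cmat = matrix(1, 1, N), cvec = 1,
                              Dmat = rbind(-diag(N), diag(N)),
                              dvec = c(rep(0, N), rep(0.2, N)))
```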
