# Local sampler

## Generic model

A model must be an immutable type with an associated `gradloglik` function. This function should be coded *as efficiently as possible*, since it is called a large number of times in any simulation.
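To make the interface concrete, here is a minimal sketch in Python. It is an illustration only, not the package's own code: the type `IsotropicGaussian` is hypothetical, and only the name `gradloglik` comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances immutable
class IsotropicGaussian:
    """Hypothetical model: independent N(mean_i, 1) components."""
    mean: tuple


def gradloglik(model, x):
    """Gradient of the log-likelihood of `model` at the point `x`."""
    # For N(mean, I), the gradient of the log-density is -(x - mean).
    return [m - xi for m, xi in zip(model.mean, x)]


model = IsotropicGaussian(mean=(0.0, 1.0))
print(gradloglik(model, [0.5, 0.5]))  # [-0.5, 0.5]
```

Keeping the type immutable means the gradient function can treat the model parameters as fixed for the whole simulation.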

## Multivariate Gaussian

### Hierarchy of types

Multiple parametrisations are possible; some are more efficient, while others are perhaps more intuitive.

```
MvGaussian (abstract)
| — MvGaussianStandard
| — MvDiagonalGaussian
| — MvGaussianCanon
| — MvGaussianNatural
```

In the sequel we write $\mu$ for the mean, $\Sigma$ for the covariance matrix and $\Omega = \Sigma^{-1}$ for the precision matrix. The different ways to parametrise the distribution are as follows:

* `MvGaussianStandard`, direct: $(\mu, \Sigma)$, indirect: $(\Omega\mu,\Omega)$
* `MvDiagonalGaussian`, direct: $(\mu, (\sigma_i))$, indirect: $(\sigma_i^2)$
* `MvGaussianCanon`, direct: $(\mu, \Omega)$, indirect: $(\Omega\mu)$
* `MvGaussianNatural`, direct: $(\Omega\mu,-\Omega)$

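The preferred parametrisation is the "canonical" one (`MvGaussianCanon`). It is the most efficient because the gradient of the log-likelihood only involves $\Omega$ and $\Omega\mu$, both of which are stored.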

**Note**: "direct" means that these are the parameters passed to the constructor while "indirect" means that these values are computed when the constructor is called.
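The distinction between direct and indirect parameters can be sketched as follows. This is an illustrative Python sketch, not the package's code: `GaussianCanon` and `matvec` are hypothetical names, and the matrix is stored as a tuple of rows only to keep the example free of dependencies.

```python
from dataclasses import dataclass, field


def matvec(A, x):
    """Multiply a matrix (tuple of rows) by a vector."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)


@dataclass(frozen=True)
class GaussianCanon:
    """Hypothetical analogue of the canonical parametrisation."""
    mu: tuple                          # direct: mean
    prec: tuple                        # direct: precision matrix Omega
    precmu: tuple = field(init=False)  # indirect: Omega * mu

    def __post_init__(self):
        # The instance is frozen, so the derived field has to be set
        # through object.__setattr__ during construction.
        object.__setattr__(self, "precmu", matvec(self.prec, self.mu))


g = GaussianCanon(mu=(1.0, 2.0), prec=((2.0, 0.0), (0.0, 0.5)))
print(g.precmu)  # (2.0, 1.0)
```

Computing the indirect values once in the constructor avoids repeating the same product every time the gradient is evaluated.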

### Auxiliary functions

Internally, the types mentioned above are shortened to `MvGS`, `MvDG`, etc. A number of helper functions are then defined to simplify the computation of the log-likelihood and of its gradient:

* `mvg_mu` to recover $\mu$
* `mvg_precmu` to recover $\Omega\mu$
* `mvg_precmult` taking a point and multiplying it by $\Omega$

`gradloglik` is then trivial to compute: at a point $x$ it is $\Omega\mu - \Omega x$.
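As a rough illustration of why such helpers make the gradient generic, the Python sketch below dispatches `mvg_precmu` and `mvg_precmult` on the parametrisation, so that a single `gradloglik` serves both. The types `Canon` and `Diag` are hypothetical stand-ins, and the use of `functools.singledispatch` is an assumption made for this sketch rather than a description of the package.

```python
from functools import singledispatch
from typing import NamedTuple


class Canon(NamedTuple):
    """Hypothetical stand-in for the canonical parametrisation."""
    mu: tuple
    prec: tuple    # precision matrix Omega, as a tuple of rows
    precmu: tuple  # Omega * mu, assumed precomputed at construction


class Diag(NamedTuple):
    """Hypothetical stand-in for the diagonal parametrisation."""
    mu: tuple
    sigma2: tuple  # variances sigma_i^2


def mvg_mu(g):
    return g.mu


@singledispatch
def mvg_precmult(g, x):
    raise NotImplementedError(type(g))


@mvg_precmult.register(Canon)
def _(g, x):
    # Full matrix-vector product Omega * x.
    return [sum(a * xi for a, xi in zip(row, x)) for row in g.prec]


@mvg_precmult.register(Diag)
def _(g, x):
    # Omega is diagonal with entries 1 / sigma_i^2.
    return [xi / s2 for xi, s2 in zip(x, g.sigma2)]


@singledispatch
def mvg_precmu(g):
    raise NotImplementedError(type(g))


@mvg_precmu.register(Canon)
def _(g):
    return list(g.precmu)


@mvg_precmu.register(Diag)
def _(g):
    return mvg_precmult(g, g.mu)


def gradloglik(g, x):
    # Gradient of log N(x; mu, Omega^{-1}) is Omega*mu - Omega*x.
    return [a - b for a, b in zip(mvg_precmu(g), mvg_precmult(g, x))]


canon = Canon(mu=(1.0, 2.0), prec=((2.0, 0.0), (0.0, 0.5)), precmu=(2.0, 1.0))
diag = Diag(mu=(1.0, 2.0), sigma2=(0.5, 2.0))
print(gradloglik(canon, [0.0, 0.0]))  # [2.0, 1.0]
print(gradloglik(diag, [0.0, 0.0]))   # [2.0, 1.0]
```

The two objects describe the same distribution, so the gradients agree; only the cost of `mvg_precmult` differs between parametrisations.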

## Logistic Regression

The logistic regression model is defined by a feature matrix `X`, a response `y`, the associated Lipschitz constant and dimensionality parameters.

### Auxiliary functions

A number of auxiliary functions are defined to prevent numerical instabilities and ensure that the computation of the log-likelihood and gradient of the log-likelihood can be expressed simply.

The `gradloglik_cv` function computes a control-variate gradient developed around a given point (see this paper for more details).

**Note**: the response is in $\{-1,1\}$.
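The kind of numerical safeguard involved can be sketched in Python as below, for a response in $\{-1,1\}$. The helper names `logistic`, `log1pexp` and `dot` are hypothetical and chosen for this sketch; the package's own auxiliary functions may be organised differently.

```python
import math


def logistic(t):
    """Numerically stable 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)  # t < 0, so exp(t) cannot overflow
    return e / (1.0 + e)


def log1pexp(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def loglik(X, y, w):
    # Sum over observations of log logistic(y_i * <x_i, w>).
    return -sum(log1pexp(-yi * dot(xi, w)) for xi, yi in zip(X, y))


def gradloglik(X, y, w):
    # Gradient in w: sum_i y_i * logistic(-y_i * <x_i, w>) * x_i.
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        c = yi * logistic(-yi * dot(xi, w))
        for j, xij in enumerate(xi):
            grad[j] += c * xij
    return grad


X = [[1.0, 2.0], [3.0, 4.0]]
y = [1, -1]
print(gradloglik(X, y, [0.0, 0.0]))  # [-1.0, -1.0]
```

Branching on the sign of the argument keeps every call to `math.exp` on a non-positive value, so large scores do not overflow.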

## Probabilistic Matrix Factorisation

This model places a normal distribution on every entry $r_{ij}$ of a matrix:

\begin{equation} \mathcal N(r_{ij}; \langle u,v\rangle , \sigma^2) \end{equation}
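For a single entry, the gradient of the log-density with respect to the two factor vectors follows directly from the expression above. The Python sketch below is an illustration under that single-entry assumption; the function name `pmf_gradloglik` is hypothetical and not taken from the package.

```python
def pmf_gradloglik(u, v, r, sigma2):
    """Gradient of log N(r; <u, v>, sigma2) with respect to (u, v)."""
    # Residual between the observed entry and the inner product.
    resid = r - sum(ui * vi for ui, vi in zip(u, v))
    # d/du of -(r - <u, v>)^2 / (2 * sigma2) is resid * v / sigma2,
    # and symmetrically for v.
    grad_u = [resid * vi / sigma2 for vi in v]
    grad_v = [resid * ui / sigma2 for ui in u]
    return grad_u, grad_v


grad_u, grad_v = pmf_gradloglik([1.0, 2.0], [3.0, 4.0], r=12.0, sigma2=2.0)
print(grad_u, grad_v)  # [1.5, 2.0] [0.5, 1.0]
```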

The resulting intensity can be shown to be a truncated cubic for which we can in fact also do exact sampling.

The `pmf_case*` functions correspond to the various possible cases, depending on where the roots of the cubic are.