In the previous tutorial we used variables and methods to find properties and traits of the Normal distribution. Before that we looked at statistical methods and construction of the Normal distribution. In this tutorial we look at multivariate distributions, which should feel similar to univariate distributions.

Constructing a Multivariate Distribution

Constructing a multivariate distribution works the same way as constructing a univariate one, except that at least one of the parameters is likely to require a vector input. In keeping with our running Normal example, we will now use the Multivariate Normal distribution.

MN <- MultivariateNormal$new(mean = c(0,0), cov = c(1,0,0,1))
MultivariateNormal$new() # This is in fact the default
#> MultiNorm(mean = c(0, 0), cov = c(1, 0, 0, 1), prec = c(1, 0, 0, 1))

Notice how this is almost identical to constructing a univariate Normal distribution. We even allow multiple parameterisations:

MultivariateNormal$new(mean = c(0,0), prec = c(1,0,0,1))
#> MultiNorm(mean = c(0, 0), cov = c(1, 0, 0, 1), prec = c(1, 0, 0, 1))
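The two constructions above describe the same distribution because the precision matrix is the inverse of the covariance matrix. A minimal base R sketch of that relationship follows. The non-identity covariance values and the variable names are hypothetical, chosen only for illustration, and are not taken from the tutorial.

```r
# Hypothetical 2x2 covariance matrix (illustrative values only)
cov_mat <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)

# The precision matrix is the inverse of the covariance matrix
prec_mat <- solve(cov_mat)

# Inverting the precision recovers the covariance (up to rounding)
all.equal(solve(prec_mat), cov_mat)

# Flattened to plain vectors, in the same style as the cov/prec inputs above
cov_vec <- as.numeric(cov_mat)
prec_vec <- as.numeric(prec_mat)
```

For the identity covariance used in this tutorial the inverse is again the identity, which is why both printed outputs match.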

d/p/q/r

The biggest difference between univariate and multivariate distributions is in how arguments are passed to the d/p/q/r methods. This differs slightly from R stats. For example, to evaluate the pdf of the Multinomial distribution at (1,2) in R stats we would run:

dmultinom(c(1,2), size = 3, prob = c(0.2,0.8))
#> [1] 0.384

Whereas in distr6, each coordinate of the point is passed as its own argument:

Multinomial$new(size = 3, probs = c(0.2,0.8))$pdf(1,2)
#> [1] 0.384
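Both calls return 0.384, which can be checked by hand from the multinomial probability mass function, n! / (x1! x2!) * p1^x1 * p2^x2. The short base R check below reproduces that arithmetic; the variable names are illustrative only.

```r
x <- c(1, 2)         # observed counts, i.e. the point (1,2)
size <- 3            # number of trials
prob <- c(0.2, 0.8)  # category probabilities

# Multinomial coefficient times the product of probabilities raised to the counts
manual <- factorial(size) / prod(factorial(x)) * prod(prob^x)
manual
#> [1] 0.384

# Agrees with the R stats result
all.equal(manual, dmultinom(x, size = size, prob = prob))
```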

There is a very important reason for this: vectorisation. In R stats, dmultinom evaluates only a single point per call, whereas in distr6 several points can be evaluated, or several samples drawn, in one call:

MN$pdf(c(1,2), c(2,3))
#> [1] 0.0130642333 0.0002392798
MN$rand(5)
#>           V1          V2
#> 1: 1.3709584 -0.56469817
#> 2: 0.3631284  0.63286260
#> 3: 0.4042683 -0.10612452
#> 4: 1.5115220 -0.09465904
#> 5: 2.0184237 -0.06271410
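The two density values above can be reproduced from the closed form of the standard bivariate Normal density, exp(-(x^2 + y^2) / 2) / (2 * pi), which applies here because MN has zero mean and identity covariance. The sketch below uses base R only. The helper name std_bvn_pdf is hypothetical, and it assumes, in line with the Multinomial call earlier, that each argument holds one coordinate, so the points evaluated are (1,2) and (2,3).

```r
# Density of a bivariate Normal with zero mean and identity covariance,
# vectorised over the two coordinate vectors x and y
std_bvn_pdf <- function(x, y) {
  exp(-(x^2 + y^2) / 2) / (2 * pi)
}

# First coordinates c(1, 2), second coordinates c(2, 3):
# evaluates the density at the points (1,2) and (2,3)
dens <- std_bvn_pdf(c(1, 2), c(2, 3))
dens
```

The result should agree with the output of MN$pdf(c(1,2), c(2,3)) shown above.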

Note: cdf() and quantile() are often omitted from multivariate distributions in distr6 as no closed-form analytic expression exists.

Summary

In this tutorial we looked at multivariate distributions and discussed the difference between distr6 and R stats in using the d/p/q/r functions. The next tutorial concludes the ‘Basic’ set of tutorials with a look at listing in distr6 to help you navigate the package more easily.