`vignettes/webs/constructing_a_distribution.rmd`


First and foremost, distr6 is a package for probability distributions in R. Currently 36 distributions and a further 11 kernels are implemented in distr6. These tutorials follow the journey of using, editing and analysing a probability distribution. Don’t worry if you’ve never used R6 before; these tutorials assume no prior knowledge of it. If you’re interested in learning the basics of R6, see the tutorial “Introduction to R6”, which is particularly useful for understanding how to copy (or clone) distributions in R.

We will use the Normal probability distribution as a running example.

All distributions are constructed using the distribution name followed by `$new(...)`, where `...` are arguments to the constructor. This is different from base R, in which a function is called directly. To construct a Standard Normal distribution:

    N <- Normal$new()
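To make the contrast with base R concrete, the sketch below evaluates the standard Normal density at 1 in both styles. In base R the parameters are passed on every call to `dnorm()`; with distr6 they are stored in the object once and then queried. The `$pdf()` call is wrapped in `requireNamespace()` so the snippet still runs when distr6 is not installed. Treat it as an illustration rather than a reference for the distr6 API.

```r
# Base R: call the density function directly, passing parameters each time.
base_density <- dnorm(1, mean = 0, sd = 1)

# distr6: construct the object once, then query it.
# Guarded so the snippet does not fail when distr6 is unavailable.
if (requireNamespace("distr6", quietly = TRUE)) {
  N <- distr6::Normal$new()
  obj_density <- N$pdf(1)
  # The two approaches should agree up to floating-point tolerance.
  print(all.equal(as.numeric(obj_density), base_density))
}
```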

Printing the distribution shows the distribution name and parameterisation. Notice how printing works the same as always in R, i.e. with `print()`.

    print(N)
    #> Norm(mean = 0, var = 1)

To access useful statistics, distr6 includes a summary method:

    summary(N)
    #> Normal Probability Distribution. Parameterised with:
    #> c("mean", "var") = c(0, 1)
    #>
    #> Quick Statistics
    #> Mean: 0
    #> Variance: 1
    #> Skewness: 0
    #> Ex. Kurtosis: 0
    #>
    #> Support: ℝ Scientific Type: ℝ
    #>
    #> Traits: continuous; univariate
    #> Properties: symmetric; mesokurtic; no skew
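The Quick Statistics reported for the standard Normal can be checked independently of distr6. The sketch below uses only base R: it computes raw moments by numerical integration with `integrate()` and derives the mean, variance, skewness and excess kurtosis from them. The helper name `moment()` is introduced here for illustration only.

```r
# Raw moment E[X^k] of the standard Normal by numerical integration.
moment <- function(k) {
  integrate(function(x) x^k * dnorm(x), lower = -Inf, upper = Inf)$value
}

mu       <- moment(1)                  # mean
sigma2   <- moment(2) - mu^2           # variance
skewness <- moment(3) / sigma2^1.5     # raw third moment suffices because mu is 0
ex_kurt  <- moment(4) / sigma2^2 - 3   # excess kurtosis: subtract the Normal's 3

round(c(mean = mu, var = sigma2, skew = skewness, ex_kurt = ex_kurt), 3)
```

Up to integration error, the values should match the summary: mean 0, variance 1, skewness 0 and excess kurtosis 0, which is what the "mesokurtic; no skew" properties describe.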

A shortened form is also available:

    summary(N, full = FALSE)
    #> Norm(mean = 0, var = 1)
    #> Scientific Type: ℝ See $traits for more
    #> Support: ℝ See $properties for more

The distribution parameterisation refers to the parameters that are used to define the distribution. The choice of parameterisation often depends on how you want the distribution to be interpreted and whether fitting procedures will be used.

In the case of the Normal distribution, several parameterisations are possible: the mean can be paired with the variance, the standard deviation or the precision. The parameterisation is specified in construction by naming the parameters, for example mean and standard deviation:

    Normal$new(mean = 2, sd = 2)
    #> Norm(mean = 2, sd = 2)

Or mean and precision:

    Normal$new(mean = 2, prec = 2)
    #> Norm(mean = 2, prec = 2)
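The three scale parameters are linked by var = sd^2 = 1 / prec, so each describes the same spread on a different scale. The base R sketch below works through the arithmetic for `sd = 2` and confirms that all three routes give identical densities. The variable names are illustrative, and `dnorm()` is used because it only accepts a standard deviation.

```r
# The same Normal distribution described three ways: var = sd^2 = 1 / prec.
sd_value   <- 2
var_value  <- sd_value^2      # 4
prec_value <- 1 / var_value   # 0.25

# Base R's dnorm() only accepts sd, so convert the other scales back to sd.
x <- c(0, 2, 5)
from_sd   <- dnorm(x, mean = 2, sd = sd_value)
from_var  <- dnorm(x, mean = 2, sd = sqrt(var_value))
from_prec <- dnorm(x, mean = 2, sd = sqrt(1 / prec_value))

# All three parameterisations yield the same densities.
stopifnot(isTRUE(all.equal(from_sd, from_var)),
          isTRUE(all.equal(from_sd, from_prec)))
```

Note that the two constructions above are not the same distribution: `sd = 2` implies a variance of 4, whereas `prec = 2` implies a variance of 1/2.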

Notice how the parameterisation chosen in construction is shown in the print method. Finally, be careful not to construct a distribution using conflicting parameterisations. For example:

    Normal$new(mean = 2, var = 2, sd = 3, prec = 4)
    #> Norm(mean = 2, prec = 4)

Only the mean and precision arguments are used. If you’re unsure which parameters conflict, see the help page for the distribution, `?Normal`. To have the distribution ‘tell’ you which parameters are used in construction, add the `verbose = TRUE` argument:

    Normal$new(verbose = TRUE)
    #> Parameterised with mean and var.
    #> Norm(mean = 0, var = 1)

    Normal$new(mean = 2, var = 2, sd = 3, prec = 4, verbose = TRUE)
    #> Parameterised with mean and prec.
    #> Norm(mean = 2, prec = 4)
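The outcome above is consistent with a simple precedence rule in which precision overrides standard deviation, which in turn overrides variance. The base R sketch below mimics that rule with a hypothetical helper, `resolve_scale()`, which is not part of distr6. The ranking of `sd` above `var` is an assumption, since the example only demonstrates that `prec` wins, so check `?Normal` before relying on it.

```r
# Hypothetical helper (not a distr6 function): pick which scale parameter
# would be used when several are supplied, ASSUMING prec > sd > var.
resolve_scale <- function(var = NULL, sd = NULL, prec = NULL) {
  if (!is.null(prec)) {
    list(used = "prec", value = prec, implied_var = 1 / prec)
  } else if (!is.null(sd)) {
    list(used = "sd", value = sd, implied_var = sd^2)
  } else if (!is.null(var)) {
    list(used = "var", value = var, implied_var = var)
  } else {
    # Mirrors the default construction shown above: var = 1.
    list(used = "var", value = 1, implied_var = 1)
  }
}

chosen <- resolve_scale(var = 2, sd = 3, prec = 4)
chosen$used         # "prec"
chosen$implied_var  # 0.25
```

Under this rule, the conflicting call with `var = 2, sd = 3, prec = 4` describes a distribution with variance 1/4, not 2 or 9, which is why passing a single scale parameter is the safer habit.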

In this tutorial we looked at constructing a distribution with different parameterisations. In the next tutorial we look at getting and setting parameters, and at how this interacts with your chosen parameterisation.