A generalised distribution object for defining custom probability distributions as well as serving as the parent class to specific, familiar distributions.

Value

Returns an R6 object of class Distribution.

Public fields

name

Full name of distribution.

short_name

Short name of distribution for printing.

description

Brief description of the distribution.

Active bindings

decorators

Returns decorators currently used to decorate the distribution.

traits

Returns distribution traits.

valueSupport

Deprecated, use $traits$valueSupport.

variateForm

Deprecated, use $traits$variateForm.

type

Deprecated, use $traits$type.

properties

Returns distribution properties, including skewness type and symmetry.

support

Deprecated, use $properties$support.

symmetry

Deprecated, use $properties$symmetry.

sup

Returns supremum (upper bound) of the distribution support.

inf

Returns infimum (lower bound) of the distribution support.

dmax

Returns maximum of the distribution support.

dmin

Returns minimum of the distribution support.

kurtosisType

Deprecated, use $properties$kurtosis.

skewnessType

Deprecated, use $properties$skewness.

Methods

Public methods


Method new()

Creates a new instance of this R6 class.

Usage

Distribution$new(
  name = NULL,
  short_name = NULL,
  type,
  support = NULL,
  symmetric = FALSE,
  pdf = NULL,
  cdf = NULL,
  quantile = NULL,
  rand = NULL,
  parameters = NULL,
  decorators = NULL,
  valueSupport = NULL,
  variateForm = NULL,
  description = NULL,
  .suppressChecks = FALSE
)

Arguments

name

character(1)
Full name of distribution.

short_name

character(1)
Short name of distribution for printing.

type

([set6::Set])
Distribution type.

support

([set6::Set])
Distribution support.

symmetric

logical(1)
Symmetry type of the distribution.

pdf

function(1)
Probability density function of the distribution. At least one of pdf and cdf must be provided.

cdf

function(1)
Cumulative distribution function of the distribution. At least one of pdf and cdf must be provided.

quantile

function(1)
Quantile (inverse-cdf) function of the distribution.

rand

function(1)
Simulation function for drawing random samples from the distribution.

parameters

([ParameterSet])
Parameter set for defining the parameters in the distribution, which should be set before construction.

decorators

(character())
Decorators to add to the distribution during construction.

valueSupport

(character(1))
The support type of the distribution, one of "discrete", "continuous", "mixture". If NULL, determined automatically.

variateForm

(character(1))
The variate type of the distribution, one of "univariate", "multivariate", "matrixvariate". If NULL, determined automatically.

description

(character(1))
Optional short description of the distribution.

.suppressChecks

(logical(1))
Used internally.


Method strprint()

Printable string representation of the Distribution. Primarily used internally.

Usage

Distribution$strprint(n = 2)

Arguments

n

(integer(1))
Number of parameters to display when printing.


Method print()

Prints the Distribution.

Usage

Distribution$print(n = 2, ...)

Arguments

n

(integer(1))
Passed to $strprint.

...

ANY
Unused. Added for consistency.


Method summary()

Prints a summary of the Distribution.

Usage

Distribution$summary(full = TRUE, ...)

Arguments

full

(logical(1))
If TRUE (default) prints a long summary of the distribution, otherwise prints a shorter summary.

...

ANY
Unused. Added for consistency.


Method parameters()

Returns the full parameter details for the supplied parameter.

Usage

Distribution$parameters(id = NULL)

Arguments

id

character()
id of parameter value to return.


Method getParameterValue()

Returns the value of the supplied parameter.

Usage

Distribution$getParameterValue(id, error = "warn")

Arguments

id

character()
id of parameter value to return.

error

(character(1))
If "warn", a warning is raised on error; if "stop", execution stops with an error.


Method setParameterValue()

Sets the value(s) of the given parameter(s).

Usage

Distribution$setParameterValue(..., lst = NULL, error = "warn")

Arguments

...

ANY
Named arguments of parameters to set values for. See examples.

lst

(list(1))
Alternative argument for passing parameters. List names should be parameter names and list values are the new values to set.

error

(character(1))
If "warn", a warning is raised on error; if "stop", execution stops with an error.

Examples

b = Binomial$new()
b$setParameterValue(size = 4, prob = 0.4)
b$setParameterValue(lst = list(size = 4, prob = 0.4))


Method pdf()

For discrete distributions the probability mass function (pmf) is returned, defined as $$p_X(x) = P(X = x)$$ For continuous distributions the probability density function (pdf), \(f_X\), is returned, defined as $$f_X(x) = P(x < X \le x + dx)$$ for some infinitesimally small \(dx\).

If available, a pdf will be returned using an analytic expression. Otherwise, if the distribution has not been decorated with FunctionImputation, NULL is returned.

Usage

Distribution$pdf(..., log = FALSE, simplify = TRUE, data = NULL)

Arguments

...

(numeric())
Points to evaluate the function at. Arguments do not need to be named. The length of each argument corresponds to the number of points to evaluate; the number of arguments corresponds to the number of variables in the distribution. See examples.

log

(logical(1))
If TRUE returns the logarithm of the probabilities. Default is FALSE.

simplify

logical(1)
If TRUE (default) simplifies the return if possible to a numeric, otherwise returns a data.table::data.table.

data

array
Alternative method to specify points to evaluate. If univariate then rows correspond with number of points to evaluate and columns correspond with number of variables to evaluate. In the special case of VectorDistributions of multivariate distributions, then the third dimension corresponds to the distribution in the vector to evaluate.

Examples

b <- Binomial$new()
b$pdf(1:10)
b$pdf(1:10, log = TRUE)
b$pdf(data = matrix(1:10))

mvn <- MultivariateNormal$new()
mvn$pdf(1, 2)
mvn$pdf(1:2, 3:4)
mvn$pdf(data = matrix(1:4, nrow = 2), simplify = FALSE)
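As a cross-check of the pmf definition above, the following base R sketch evaluates \(p_X(x) = P(X = x)\) directly for a binomial variable. It does not call this package; the parameters size = 10 and prob = 0.5 are an assumption, chosen because they are consistent with the output printed in the Examples section.

```r
# Assumption: a Binomial with size = 10 and prob = 0.5, which matches the
# printed example output; this page does not state the defaults explicitly.
size <- 10
prob <- 0.5
x <- 1:10

# pmf from the definition: P(X = x) = choose(n, x) * p^x * (1 - p)^(n - x)
pmf_manual <- choose(size, x) * prob^x * (1 - prob)^(size - x)

# Base R equivalent, plus the log-scale analogue of `log = TRUE`
pmf_base <- dbinom(x, size = size, prob = prob)
log_pmf <- dbinom(x, size = size, prob = prob, log = TRUE)

print(pmf_base)
```

The first value, 0.009765625, agrees with the first entry shown for b$pdf(1:10).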


Method cdf()

The (lower tail) cumulative distribution function, \(F_X\), is defined as $$F_X(x) = P(X \le x)$$ If lower.tail is FALSE then \(1 - F_X(x)\) is returned, also known as the survival function.

If available, a cdf will be returned using an analytic expression. Otherwise, if the distribution has not been decorated with FunctionImputation, NULL is returned.

Usage

Distribution$cdf(
  ...,
  lower.tail = TRUE,
  log.p = FALSE,
  simplify = TRUE,
  data = NULL
)

Arguments

...

(numeric())
Points to evaluate the function at. Arguments do not need to be named. The length of each argument corresponds to the number of points to evaluate; the number of arguments corresponds to the number of variables in the distribution. See examples.

lower.tail

(logical(1))
If TRUE (default), probabilities are P(X <= x), otherwise P(X > x).

log.p

(logical(1))
If TRUE returns the logarithm of the probabilities. Default is FALSE.

simplify

logical(1)
If TRUE (default) simplifies the return if possible to a numeric, otherwise returns a data.table::data.table.

data

array
Alternative method to specify points to evaluate. If univariate then rows correspond with number of points to evaluate and columns correspond with number of variables to evaluate. In the special case of VectorDistributions of multivariate distributions, then the third dimension corresponds to the distribution in the vector to evaluate.

Examples

b <- Binomial$new()
b$cdf(1:10)
b$cdf(1:10, log.p = TRUE, lower.tail = FALSE)
b$cdf(data = matrix(1:10))
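The relationship between the lower tail, the survival function, and the log scale can be illustrated with base R alone. This is a sketch under the assumption of a Binomial with size = 10 and prob = 0.5 (consistent with the printed example output); it uses pbinom rather than this class.

```r
# Assumption: Binomial(size = 10, prob = 0.5), matching the example output.
size <- 10
prob <- 0.5
x <- 1:10

lower <- pbinom(x, size, prob)                      # F_X(x) = P(X <= x)
upper <- pbinom(x, size, prob, lower.tail = FALSE)  # survival: 1 - F_X(x)

# Analogue of `log.p = TRUE, lower.tail = FALSE`
log_upper <- pbinom(x, size, prob, lower.tail = FALSE, log.p = TRUE)

# The cdf is the running sum of the pmf over the support 0, 1, ..., size
cdf_from_pmf <- cumsum(dbinom(0:size, size, prob))[x + 1]

print(log_upper)
```

At x = 10 the survival probability is exactly 0, which is why the log-scale value is -Inf in the documented output.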


Method quantile()

The quantile function, \(q_X\), is the inverse cdf, i.e. $$q_X(p) = F^{-1}_X(p) = \inf\{x \in \mathbb{R}: F_X(x) \ge p\}$$

If lower.tail is FALSE then \(q_X(1-p)\) is returned.

If available, a quantile will be returned using an analytic expression. Otherwise, if the distribution has not been decorated with FunctionImputation, NULL is returned.

Usage

Distribution$quantile(
  ...,
  lower.tail = TRUE,
  log.p = FALSE,
  simplify = TRUE,
  data = NULL
)

Arguments

...

(numeric())
Points to evaluate the function at. Arguments do not need to be named. The length of each argument corresponds to the number of points to evaluate; the number of arguments corresponds to the number of variables in the distribution. See examples.

lower.tail

(logical(1))
If TRUE (default), probabilities are P(X <= x), otherwise P(X > x).

log.p

(logical(1))
If TRUE returns the logarithm of the probabilities. Default is FALSE.

simplify

logical(1)
If TRUE (default) simplifies the return if possible to a numeric, otherwise returns a data.table::data.table.

data

array
Alternative method to specify points to evaluate. If univariate then rows correspond with number of points to evaluate and columns correspond with number of variables to evaluate. In the special case of VectorDistributions of multivariate distributions, then the third dimension corresponds to the distribution in the vector to evaluate.

Examples

b <- Binomial$new()
b$quantile(0.42)
b$quantile(log(0.42), log.p = TRUE, lower.tail = TRUE)
b$quantile(data = matrix(c(0.1,0.2)))
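The infimum in the definition of \(q_X\) can be made concrete for a discrete distribution: it is the smallest support point whose cdf value reaches p. The sketch below uses base R only; quantile_manual is a hypothetical helper, and the Binomial(size = 10, prob = 0.5) parameters are an assumption consistent with the printed example output.

```r
# Assumption: Binomial(size = 10, prob = 0.5), matching the example output.
size <- 10
prob <- 0.5
support <- 0:size
cdf_vals <- pbinom(support, size, prob)

# Hypothetical helper: generalised inverse, the smallest x with F_X(x) >= p
quantile_manual <- function(p) {
  vapply(p, function(p_i) support[which(cdf_vals >= p_i)[1]], numeric(1))
}

p <- c(0.42, 0.1, 0.2)
q_manual <- quantile_manual(p)
q_base <- qbinom(p, size, prob)

# Log-scale input, as with `log.p = TRUE`
q_log <- qbinom(log(0.42), size, prob, log.p = TRUE)

print(q_manual)
```

Both approaches give 5, 3 and 4 for these probabilities, in line with the values shown in the Examples section.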


Method rand()

The rand function draws n simulations from the distribution.

If available, simulations will be returned using an analytic expression. Otherwise, if the distribution has not been decorated with FunctionImputation, NULL is returned.

Usage

Distribution$rand(n, simplify = TRUE)

Arguments

n

(numeric(1))
Number of points to simulate from the distribution. If length is greater than \(1\), then n <- length(n).

simplify

logical(1)
If TRUE (default) simplifies the return if possible to a numeric, otherwise returns a data.table::data.table.

Examples

b <- Binomial$new()
b$rand(10)

mvn <- MultivariateNormal$new()
mvn$rand(5)


Method prec()

Returns the precision of the distribution as 1/self$variance().

Usage

Distribution$prec()


Method stdev()

Returns the standard deviation of the distribution as sqrt(self$variance()).

Usage

Distribution$stdev()
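The two derived quantities, precision (1/variance) and standard deviation (square root of the variance), can be worked through by hand. This is a base R sketch assuming a Binomial with size = 10 and prob = 0.5, whose variance is size * prob * (1 - prob); it does not call this class.

```r
# Assumption: Binomial(size = 10, prob = 0.5); variance = n * p * (1 - p).
size <- 10
prob <- 0.5

variance <- size * prob * (1 - prob)  # 2.5
stdev <- sqrt(variance)               # mirrors sqrt(self$variance())
prec <- 1 / variance                  # mirrors 1 / self$variance()

print(c(variance = variance, stdev = stdev, prec = prec))
```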


Method median()

Returns the median of the distribution. If an analytical expression is available, returns the distribution median; otherwise, if the distribution is symmetric, returns self$mean; otherwise returns self$quantile(0.5).

Usage

Distribution$median(na.rm = NULL, ...)

Arguments

na.rm

(logical(1))
Ignored, added for consistency.

...

ANY
Ignored, added for consistency.
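The fallback order described for median() (analytic value, then mean if symmetric, then the 0.5 quantile) can be sketched as a small standalone function. median_fallback and its arguments are hypothetical names for illustration only and are not part of this class.

```r
# Hypothetical helper illustrating the documented fallback order.
median_fallback <- function(analytic = NULL, symmetric = FALSE,
                            mean_fn, quantile_fn) {
  if (!is.null(analytic)) return(analytic)  # 1. analytic expression
  if (symmetric) return(mean_fn())          # 2. symmetric: median = mean
  quantile_fn(0.5)                          # 3. otherwise the 0.5 quantile
}

# Symmetric case: Binomial(10, 0.5) has mean 10 * 0.5 = 5
med_binom <- median_fallback(
  symmetric = TRUE,
  mean_fn = function() 10 * 0.5,
  quantile_fn = function(p) qbinom(p, 10, 0.5)
)

# Asymmetric case: Exponential(rate = 1) falls through to the quantile
med_exp <- median_fallback(
  symmetric = FALSE,
  mean_fn = function() 1,
  quantile_fn = function(p) qexp(p, rate = 1)
)

print(c(med_binom, med_exp))
```

For the exponential case the result is log(2), not the mean of 1, which shows why the symmetry check matters.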


Method iqr()

Inter-quartile range of the distribution. Estimated as self$quantile(0.75) - self$quantile(0.25).

Usage

Distribution$iqr()
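The inter-quartile range formula can be checked with base R quantile functions. The sketch below assumes a Binomial with size = 10 and prob = 0.5 and, for comparison, a standard Normal; neither call uses this class.

```r
# Assumption: Binomial(size = 10, prob = 0.5) and a standard Normal.
iqr_binom <- qbinom(0.75, 10, 0.5) - qbinom(0.25, 10, 0.5)  # 6 - 4
iqr_norm <- qnorm(0.75) - qnorm(0.25)

print(c(binomial = iqr_binom, normal = iqr_norm))
```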


Method correlation()

If univariate returns 1, otherwise returns the distribution correlation.

Usage

Distribution$correlation()


Method liesInSupport()

Tests if the given values lie in the support of the distribution. Uses [set6::Set]$contains.

Usage

Distribution$liesInSupport(x, all = TRUE, bound = FALSE)

Arguments

x

ANY
Values to test.

all

logical(1)
If TRUE (default) returns TRUE if all x are in the distribution, otherwise returns a vector of logicals corresponding to each element in x.

bound

logical(1)
If TRUE then tests if x lie between the upper and lower bounds of the distribution, otherwise tests if x lie between the maximum and minimum of the distribution.
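The effect of the all argument can be illustrated without this class. The sketch below is base R only and assumes a discrete support of the integers 0 to 10 (as for a Binomial with size = 10); it mimics the two return shapes rather than calling liesInSupport itself.

```r
# Assumption: support is the integer set {0, 1, ..., 10}.
support <- 0:10
x <- c(0, 5, 11)

# Element-wise membership, analogous to all = FALSE
in_support <- x %in% support

# Single logical, analogous to all = TRUE (the default)
all_in_support <- all(in_support)

print(in_support)
print(all_in_support)
```

Here 11 lies outside the support, so the element-wise result ends in FALSE and the aggregated result is FALSE.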


Method liesInType()

Tests if the given values lie in the type of the distribution. Uses [set6::Set]$contains.

Usage

Distribution$liesInType(x, all = TRUE, bound = FALSE)

Arguments

x

ANY
Values to test.

all

logical(1)
If TRUE (default) returns TRUE if all x are in the distribution, otherwise returns a vector of logicals corresponding to each element in x.

bound

logical(1)
If TRUE then tests if x lie between the upper and lower bounds of the distribution, otherwise tests if x lie between the maximum and minimum of the distribution.


Method workingSupport()

Returns an estimate of the computational support of the distribution. If an analytical cdf is available, this is computed as the smallest interval on which the cdf is 0 at the lower bound and 1 at the upper bound, with bounds incremented in 10^i intervals. If no analytical cdf is available, it is computed as the smallest interval on which the pdf is 0 at both the lower and upper bounds; this is much less precise and more prone to error. Used primarily by decorators.

Usage

Distribution$workingSupport()
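The cdf-based search described above can be sketched as follows. This is only an illustration of the idea of widening bounds in powers of 10 until the cdf evaluates to 0 and 1; working_support_sketch and max_power are hypothetical, and the library's actual stopping rule and step scheme may differ.

```r
# Hypothetical sketch: widen the bounds as -10^i and 10^i until the cdf
# evaluates to exactly 0 at the lower bound and exactly 1 at the upper bound.
working_support_sketch <- function(cdf, max_power = 10) {
  lower <- NA_real_
  upper <- NA_real_
  for (i in 0:max_power) {
    if (is.na(lower) && cdf(-10^i) == 0) lower <- -10^i
    if (is.na(upper) && cdf(10^i) == 1) upper <- 10^i
    if (!is.na(lower) && !is.na(upper)) break
  }
  c(lower = lower, upper = upper)
}

# Standard Normal: the cdf only reaches 0 and 1 through floating-point limits
ws_norm <- working_support_sketch(function(x) pnorm(x))

# Binomial(size = 10, prob = 0.5): the cdf is exactly 0 below 0 and 1 from 10
ws_binom <- working_support_sketch(function(x) pbinom(x, 10, 0.5))

print(ws_norm)
print(ws_binom)
```

For the Normal case the bounds are wider than the region holding almost all probability mass, which is expected: the search stops only when the cdf is numerically indistinguishable from 0 or 1.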


Method clone()

The objects of this class are cloneable with this method.

Usage

Distribution$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.

Examples

## ------------------------------------------------
## Method `Distribution$setParameterValue`
## ------------------------------------------------

b = Binomial$new()
b$setParameterValue(size = 4, prob = 0.4)
b$setParameterValue(lst = list(size = 4, prob = 0.4))

## ------------------------------------------------
## Method `Distribution$pdf`
## ------------------------------------------------

b <- Binomial$new()
b$pdf(1:10)
#> [1] 0.0097656250 0.0439453125 0.1171875000 0.2050781250 0.2460937500
#> [6] 0.2050781250 0.1171875000 0.0439453125 0.0097656250 0.0009765625
b$pdf(1:10, log = TRUE)
#> [1] -4.628887 -3.124809 -2.143980 -1.584364 -1.402043 -1.584364 -2.143980
#> [8] -3.124809 -4.628887 -6.931472
b$pdf(data = matrix(1:10))
#> [1] 0.0097656250 0.0439453125 0.1171875000 0.2050781250 0.2460937500
#> [6] 0.2050781250 0.1171875000 0.0439453125 0.0097656250 0.0009765625

mvn <- MultivariateNormal$new()
mvn$pdf(1, 2)
#> [1] 0.01306423
mvn$pdf(1:2, 3:4)
#> [1] 1.072378e-03 7.225623e-06
mvn$pdf(data = matrix(1:4, nrow = 2), simplify = FALSE)
#>       MultiNorm
#> 1: 1.072378e-03
#> 2: 7.225623e-06

## ------------------------------------------------
## Method `Distribution$cdf`
## ------------------------------------------------

b <- Binomial$new()
b$cdf(1:10)
#> [1] 0.01074219 0.05468750 0.17187500 0.37695313 0.62304687 0.82812500
#> [7] 0.94531250 0.98925781 0.99902344 1.00000000
b$cdf(1:10, log.p = TRUE, lower.tail = FALSE)
#> [1] -0.01080030 -0.05623972 -0.18859117 -0.47313352 -0.97563444 -1.76098781
#> [7] -2.90612011 -4.53357653 -6.93147181        -Inf
b$cdf(data = matrix(1:10))
#> [1] 0.01074219 0.05468750 0.17187500 0.37695313 0.62304687 0.82812500
#> [7] 0.94531250 0.98925781 0.99902344 1.00000000

## ------------------------------------------------
## Method `Distribution$quantile`
## ------------------------------------------------

b <- Binomial$new()
b$quantile(0.42)
#> [1] 5
b$quantile(log(0.42), log.p = TRUE, lower.tail = TRUE)
#> [1] 5
b$quantile(data = matrix(c(0.1,0.2)))
#> [1] 3 4

## ------------------------------------------------
## Method `Distribution$rand`
## ------------------------------------------------

b <- Binomial$new()
b$rand(10)
#> [1] 7 4 2 4 5 4 5 3 5 8

mvn <- MultivariateNormal$new()
mvn$rand(5)
#>            V1          V2
#> 1: -0.5536994  0.62898204
#> 2:  2.0650249 -1.63098940
#> 3:  0.5124269 -1.86301149
#> 4: -0.5220125 -0.05260191
#> 5:  0.5429963 -0.91407483