vignettes/webs/create_sdistribution.rmd
create_sdistribution.rmd
This tutorial assumes that you have a good knowledge of R6 and so we will not be going through the basics of inheritance and private/public methods/variables.
All implemented probability distributions (excluding Kernels) in distr6 inherit from the SDistribution
class. This means they share a common interface. The only differences between these distributions is that some will have methods missing as no analytic results are available. See the uml diagram for an overview of how this all fits in together. A core design principle in distr6 is that only analytical methods are defined in the SDistribution child classes, all numerical results are available through decorators. See the decorators tutorial for more information on decorators and the analytical and numerical article for further discussions on analytical and numerical methods. The summary is that when creating your own SDistribution class, please do not put any numerical methods in the core interface, if a closed form expression cannot be found, omit the method entirely and it can be imputed with a decorator. If your desired method is not available in one of our decorators but you think it is useful, see the creating a decorator extension guidelines.
Every class inheriting from SDistribution must have the following public variables:
For the Normal distribution, the above all looks like
Normal <- R6::R6Class("Normal", inherit = SDistribution, lock_objects = F)
Normal$set("public","name","Normal")
Normal$set("public","short_name","Norm")
Normal$set("public","description","Normal Probability Distribution.")
Normal$set("public","package","stats")
Note:
lock_objects=F
is not left out as it ensures decorators work correctly.set
method to add private/public variables/methodsFor the full list of methods to (optimally) include see the ‘Statistical Methods’ section in the help pages of SDistribution: ?SDistribution
. This does not include pdf/cdf/quantile/rand, these are defined in the constructor and not in the class. Once again, if there is no closed form analytical expression possible, omit the method completely.
The following methods are included by default and can therefore by omitted from the class definition:
Additionally the pgf
method returns NaN if omitted but this can be overloaded by including the method in the class definition. Below is an example of adding four methods to the Normal distribution
Normal$set("public","mean",function(){
return(self$getParameterValue("mean"))
})
Normal$set("public","variance",function(){
return(self$getParameterValue("var"))
})
Normal$set("public","skewness",function(){
return(0)
})
Normal$set("public", "mgf", function(t){
return(exp((self$getParameterValue("mean") * t) + (self$getParameterValue("var") * t^2 * 0.5)))
})
Note:
self
keyword and the getParameterValue
method, we looked at this in the custom distribution tutorial
?SDistribution
help page, this ensures that the automated S3 dispatch methods run correctlyThe constructor for all SDistribution objects looks the same, below is the constructor for the Normal distribution, which we will talk through as an example.
Normal$set("public","initialize",function(mean = 0, var = 1, sd = NULL, prec = NULL,
decorators = NULL, verbose = FALSE){
private$.parameters <- getParameterSet(self, mean, var, sd, prec, verbose)
self$setParameterValue(mean = mean, var = var, sd = sd, prec = prec)
pdf <- function(x1) dnorm(x1, self$getParameterValue("mean"), self$getParameterValue("sd"))
cdf <- function(x1) pnorm(x1, self$getParameterValue("mean"), self$getParameterValue("sd"))
quantile <- function(p) qnorm(p, self$getParameterValue("mean"), self$getParameterValue("sd"))
rand <- function(n) rnorm(n, self$getParameterValue("mean"), self$getParameterValue("sd"))
super$initialize(decorators = decorators, pdf = pdf, cdf = cdf, quantile = quantile,
rand = rand, support = Reals$new(zero = T),
symmetric = TRUE, type = Reals$new(),
valueSupport = "continuous", variateForm = "univariate")
invisible(self)
})
Note the following:
getParameterSet
and setParameterValue
methods, we will return to these later.In a separate script called getParameterSet.R we have the generic and dispatch methods for every SDistribution. This function validates the initial choice of parameterisation, tells the user the parameterisation (if verbose == TRUE
), constructs and returns the ParameterSet for the distribution. Below is the getParameterSet
method for the Normal distribution
getParameterSet.Normal <- function(x, mean, var, sd = NULL, prec = NULL, verbose = FALSE){
var.bool = sd.bool = prec.bool = FALSE
if(!is.null(prec)){
if(verbose) message("Parameterised with mean and prec.")
prec.bool = TRUE
} else if(!is.null(sd)){
if(verbose) message("Parameterised with mean and sd.")
sd.bool = TRUE
} else{
if(verbose) message("Parameterised with mean and var.")
var.bool = TRUE
}
ps <- ParameterSet$new(id = list("mean","var","sd","prec"),
value = list(0, 1, 1, 1),
support = list(Reals$new(), PosReals$new(), PosReals$new(), PosReals$new()),
settable = list(TRUE, var.bool, sd.bool, prec.bool),
description = list("Mean - Location Parameter",
"Variance - Squared Scale Parameter",
"Standard Deviation - Scale Parameter",
"Precision - Inverse Squared Scale Parameter"))
ps$addDeps("var", "sd", function(self) self$getParameterValue('var')^0.5)
ps$addDeps("var", "prec", function(self) self$getParameterValue('var')^-1)
ps$addDeps("sd", "var", function(self) self$getParameterValue('sd')^2)
ps$addDeps("sd", "prec", function(self) self$getParameterValue('sd')^-2)
ps$addDeps("prec", "sd", function(self) self$getParameterValue('prec')^-0.5)
ps$addDeps("prec", "var", function(self) self$getParameterValue('prec')^-1)
return(ps)
}
Note:
getParameterSet.Normal
TRUE
whilst all others are FALSE
.addDeps
.That’s everything that is required to create your own SDistribution class. In summary the different components include
?SDistribution
getParameterSet
dispatch method, written in the getParameterSet.R
script