Transformers and Other Unsupervised Models

Several unsupervised models used to perform common transformations, such as one-hot encoding, are available in MLJ out-of-the-box. These are detailed in Built-in transformers below.

A transformer is static if it has no learned parameters. While such a transformer is tantamount to an ordinary function, realizing it as an MLJ static transformer (subtype of Static <: Unsupervised) can be useful, especially if the function depends on parameters the user would like to manipulate (which become hyper-parameters of the model). The necessary syntax for defining your own static transformers is described in Static transformers below.

Some unsupervised models, such as clustering algorithms, have a predict method in addition to a transform method. We give an example of this in Transformers that also predict below.

Finally we note that models that fit a distribution, or more generally a sampler object, to some data, which are sometimes viewed as unsupervised, are treated in MLJ as supervised models. See Models that learn a probability distribution for an example.

Built-in transformers

MLJModels.Standardizer (Type)
Standardizer

A model type for constructing a standardizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

Standardizer = @load Standardizer pkg=MLJModels

Do model = Standardizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in Standardizer(features=...).

Use this model to standardize (whiten) a Continuous vector, or relevant columns of a table. The rescalings applied by this transformer to new data are always those learned during the training phase, which are generally different from what would actually standardize the new data.
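
To make the last point concrete, here is a minimal sketch in plain Julia (no MLJ required) of what standardizing with learned rescalings amounts to. The helper names fit_standardizer, standardize and unstandardize are hypothetical and are not part of MLJ. Using the corrected sample standard deviation is an assumption that agrees with the Examples below; the actual implementation may differ in details.

```julia
using Statistics

# "Training": learn the mean and standard deviation from the training vector.
fit_standardizer(xtrain) = (mu = mean(xtrain), sigma = std(xtrain))

# Apply the *learned* rescaling to any vector, training or new.
standardize(params, x) = (x .- params.mu) ./ params.sigma

# Undo the rescaling.
unstandardize(params, z) = z .* params.sigma .+ params.mu

xtrain = [10.0, 20.0, 30.0]
params = fit_standardizer(xtrain)

ztrain = standardize(params, xtrain)      # has mean 0 and standard deviation 1
znew = standardize(params, [40.0, 50.0])  # uses the training mean and deviation
xback = unstandardize(params, znew)       # recovers [40.0, 50.0] up to rounding
```

Note that znew does not itself have zero mean or unit standard deviation, because the rescaling was learned from xtrain only.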

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table or any abstract vector with Continuous element scitype (any abstract float vector). Only features in a table with Continuous scitype can be standardized; check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated below:

    • [] (empty, the default): standardize all features (columns) having Continuous element scitype

    • non-empty vector of feature names (symbols): standardize only the Continuous features in the vector (if ignore=false) or Continuous features not named in the vector (ignore=true).

    • function or other callable: standardize a feature if the callable returns true on its name. For example, Standardizer(features = name -> name in [:x1, :x3], ignore = true, count=true) has the same effect as Standardizer(features = [:x1, :x3], ignore = true, count=true), namely to standardize all Continuous and Count features, with the exception of :x1 and :x3.

    Note this behavior is further modified if the ordered_factor or count flags are set to true; see below

  • ignore=false: whether to ignore or standardize specified features, as explained above

  • ordered_factor=false: if true, standardize any OrderedFactor feature wherever a Continuous feature would be standardized, as described above

  • count=false: if true, standardize any Count feature wherever a Continuous feature would be standardized, as described above

Operations

  • transform(mach, Xnew): return Xnew with relevant features standardized according to the rescalings learned during fitting of mach.

  • inverse_transform(mach, Z): apply the inverse transformation to Z, so that inverse_transform(mach, transform(mach, Xnew)) is approximately the same as Xnew; unavailable if ordered_factor or count flags were set to true.

Fitted parameters

fitted_params(mach) is a dictionary of the rescaling parameters, keyed on feature name. In each value the first component is the training data mean, the second the standard deviation.

Warning: This format for fitted_params(mach) is not standard and may change in the future.

Report

The fields of report(mach) are:

  • features_fit: the names of features that will be standardized

Examples

using MLJ

X = (ordinal1 = [1, 2, 3],
     ordinal2 = coerce([:x, :y, :x], OrderedFactor),
     ordinal3 = [10.0, 20.0, 30.0],
     ordinal4 = [-20.0, -30.0, -40.0],
     nominal = coerce(["Your father", "he", "is"], Multiclass));

julia> schema(X)
┌──────────┬──────────────────┐
│ names    │ scitypes         │
├──────────┼──────────────────┤
│ ordinal1 │ Count            │
│ ordinal2 │ OrderedFactor{2} │
│ ordinal3 │ Continuous       │
│ ordinal4 │ Continuous       │
│ nominal  │ Multiclass{3}    │
└──────────┴──────────────────┘

stand1 = Standardizer();

julia> transform(fit!(machine(stand1, X)), X)
(ordinal1 = [1, 2, 3],
 ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],
 ordinal3 = [-1.0, 0.0, 1.0],
 ordinal4 = [1.0, 0.0, -1.0],
 nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

stand2 = Standardizer(features=[:ordinal3, ], ignore=true, count=true);

julia> transform(fit!(machine(stand2, X)), X)
(ordinal1 = [-1.0, 0.0, 1.0],
 ordinal2 = CategoricalValue{Symbol,UInt32}[:x, :y, :x],
 ordinal3 = [10.0, 20.0, 30.0],
 ordinal4 = [1.0, 0.0, -1.0],
 nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

See also OneHotEncoder, ContinuousEncoder.

MLJModels.OneHotEncoder (Type)
OneHotEncoder

A model type for constructing a one-hot encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

OneHotEncoder = @load OneHotEncoder pkg=MLJModels

Do model = OneHotEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in OneHotEncoder(features=...).

Use this model to one-hot encode the Multiclass and OrderedFactor features (columns) of some table, leaving other columns unchanged.

New data to be transformed may lack features present in the fit data, but no new features can be present.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To ensure all features are transformed into Continuous features, or dropped, use ContinuousEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of symbols (column names). If empty (default) then all Multiclass and OrderedFactor features are encoded. Otherwise, encoding is further restricted to the specified features (ignore=false) or the unspecified features (ignore=true). This default behavior can be modified by the ordered_factor flag.

  • ordered_factor=false: when true, OrderedFactor features are universally excluded

  • drop_last=true: whether to drop the column corresponding to the final class of encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.
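
The effect of drop_last can be sketched in plain Julia, without MLJ. The helper onehot below is hypothetical and is not part of MLJ. The feature__level naming follows the example output shown further down, and the sketch takes the list of levels as given rather than learning it from data.

```julia
# Hypothetical helper: one-hot encode a vector given its ordered list of levels.
# Returns a vector of `name => column` pairs, one per retained level.
function onehot(name::Symbol, x::AbstractVector, levels::AbstractVector;
                drop_last::Bool=false)
    kept = drop_last ? levels[1:end-1] : levels
    return [Symbol(name, "__", level) => Float64.(x .== level) for level in kept]
end

grade = ["A", "B", "A", "C"]
grade_levels = ["A", "B", "C"]

full = onehot(:grade, grade, grade_levels)                     # three new features
reduced = onehot(:grade, grade, grade_levels; drop_last=true)  # two new features
```

With drop_last=true no information is lost: an observation in the final class is the one for which all retained columns are zero.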

Fitted parameters

The fields of fitted_params(mach).fitresult are:

  • all_features: names of all features encountered in training

  • fitted_levels_given_feature: dictionary of the levels associated with each feature encoded, keyed on the feature name

  • ref_name_pairs_given_feature: dictionary of pairs r => ftr (such as 0x00000001 => :grad__A) where r is a CategoricalArrays.jl reference integer representing a level, and ftr the corresponding new feature name; the dictionary is keyed on the names of features that are encoded

Warning: fitted_params(mach) does not have a standard form and may change in the future.

Report

The fields of report(mach) are:

  • features_to_be_encoded: names of input features to be encoded

  • new_features: names of all output features

Example

using MLJ

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
     grade=categorical(["A", "B", "A", "C"], ordered=true),
     height=[1.85, 1.67, 1.5, 1.67],
     n_devices=[3, 2, 4, 3])

julia> schema(X)
┌───────────┬──────────────────┐
│ names     │ scitypes         │
├───────────┼──────────────────┤
│ name      │ Multiclass{4}    │
│ grade     │ OrderedFactor{3} │
│ height    │ Continuous       │
│ n_devices │ Count            │
└───────────┴──────────────────┘

hot = OneHotEncoder(drop_last=true)
mach = fit!(machine(hot, X))
W = transform(mach, X)

julia> schema(W)
┌──────────────┬────────────┐
│ names        │ scitypes   │
├──────────────┼────────────┤
│ name__Danesh │ Continuous │
│ name__John   │ Continuous │
│ name__Lee    │ Continuous │
│ grade__A     │ Continuous │
│ grade__B     │ Continuous │
│ height       │ Continuous │
│ n_devices    │ Count      │
└──────────────┴────────────┘

See also ContinuousEncoder.

MLJModels.ContinuousEncoder (Type)
ContinuousEncoder

A model type for constructing a continuous encoder, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

ContinuousEncoder = @load ContinuousEncoder pkg=MLJModels

Do model = ContinuousEncoder() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in ContinuousEncoder(drop_last=...).

Use this model to arrange all features (columns) of a table to have Continuous element scitype, by applying the following protocol to each feature ftr:

  • If ftr is already Continuous retain it.

  • If ftr is Multiclass, one-hot encode it.

  • If ftr is OrderedFactor, replace it with coerce(ftr, Continuous) (vector of floating point integers), unless one_hot_ordered_factors=true is specified, in which case one-hot encode it.

  • If ftr is Count, replace it with coerce(ftr, Continuous).

  • If ftr has some other element scitype, or was not observed in fitting the encoder, drop it from the table.

Warning: This transformer assumes that levels(col) for any Multiclass or OrderedFactor column, col, is the same for training data and new data to be transformed.

To selectively one-hot-encode categorical features (without dropping columns) use OneHotEncoder instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any Tables.jl compatible table. Columns can be of mixed type but only those with element scitype Multiclass or OrderedFactor can be encoded. Check column scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • drop_last=true: whether to drop the column corresponding to the final class of one-hot encoded features. For example, a three-class feature is spawned into three new features if drop_last=false, but just two features otherwise.

  • one_hot_ordered_factors=false: whether to one-hot any feature with OrderedFactor element scitype, or to instead coerce it directly to a (single) Continuous feature using the order

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: names of features that will not be dropped from the table

  • one_hot_encoder: the OneHotEncoder model instance for handling the one-hot encoding

  • one_hot_encoder_fitresult: the fitted parameters of the OneHotEncoder model

Report

  • features_to_keep: names of input features that will not be dropped from the table

  • new_features: names of all output features

Example

using MLJ

X = (name=categorical(["Danesh", "Lee", "Mary", "John"]),
     grade=categorical(["A", "B", "A", "C"], ordered=true),
     height=[1.85, 1.67, 1.5, 1.67],
     n_devices=[3, 2, 4, 3],
     comments=["the force", "be", "with you", "too"])

julia> schema(X)
┌───────────┬──────────────────┐
│ names     │ scitypes         │
├───────────┼──────────────────┤
│ name      │ Multiclass{4}    │
│ grade     │ OrderedFactor{3} │
│ height    │ Continuous       │
│ n_devices │ Count            │
│ comments  │ Textual          │
└───────────┴──────────────────┘

encoder = ContinuousEncoder(drop_last=true)
mach = fit!(machine(encoder, X))
W = transform(mach, X)

julia> schema(W)
┌──────────────┬────────────┐
│ names        │ scitypes   │
├──────────────┼────────────┤
│ name__Danesh │ Continuous │
│ name__John   │ Continuous │
│ name__Lee    │ Continuous │
│ grade        │ Continuous │
│ height       │ Continuous │
│ n_devices    │ Continuous │
└──────────────┴────────────┘

julia> setdiff(schema(X).names, report(mach).features_to_keep) # dropped features
1-element Vector{Symbol}:
 :comments

See also OneHotEncoder.

MLJModels.FillImputer (Type)
FillImputer

A model type for constructing a fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FillImputer = @load FillImputer pkg=MLJModels

Do model = FillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FillImputer(features=...).

Use this model to impute missing values in tabular data. A fixed "filler" value is learned from the training data, one for each column of the table.

For imputing missing values in a vector, use UnivariateFillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features (eg, a DataFrame) whose columns each have element scitypes Union{Missing, T}, where T is a subtype of Continuous, Multiclass, OrderedFactor or Count. Check scitypes with schema(X).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: a vector of names of features (symbols) for which imputation is to be attempted; default is empty, which is interpreted as "impute all".

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values
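
As a rough illustration of the default fillers described above, the following plain-Julia sketch computes one filler per column and imputes with it. The helpers continuous_filler, count_filler, finite_filler and impute are hypothetical names, not part of MLJ. How the median is rounded for Count data, and how ties are broken when computing the mode, are assumptions of this sketch.

```julia
using Statistics

# Default-style fillers, computed after skipping missing values.
continuous_filler(x) = median(skipmissing(x))
count_filler(x) = round(Int, median(skipmissing(x)))

# Mode without extra packages: the most frequent non-missing value
# (ties resolved in favor of the value seen first).
function finite_filler(x)
    vals = collect(skipmissing(x))
    counts = Dict{eltype(vals), Int}()
    for v in vals
        counts[v] = get(counts, v, 0) + 1
    end
    best = first(vals)
    for v in vals
        counts[v] > counts[best] && (best = v)
    end
    return best
end

# Replace each missing entry with the learned filler.
impute(x, filler) = [ismissing(v) ? filler : v for v in x]

a = [1.0, 2.0, missing, 3.0, missing]
b = ["y", "n", "y", missing, "y"]
c = [1, 1, 2, missing, 3]

fillers = (a = continuous_filler(a), b = finite_filler(b), c = count_filler(c))
imputed_a = impute(a, fillers.a)
```

The fillers are computed once, from the training columns, and the same values are then reused for any new data.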

Operations

  • transform(mach, Xnew): return Xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • features_seen_in_fit: the names of features (columns) encountered during training

  • univariate_transformer: the univariate model applied to determine the fillers (its fields contain the functions defining the filler computations)

  • filler_given_feature: dictionary of filler values, keyed on feature (column) names

Examples

using MLJ
imputer = FillImputer()

X = (a = [1.0, 2.0, missing, 3.0, missing],
     b = coerce(["y", "n", "y", missing, "y"], Multiclass),
     c = [1, 1, 2, missing, 3])

julia> schema(X)
┌───────┬───────────────────────────────┐
│ names │ scitypes                      │
├───────┼───────────────────────────────┤
│ a     │ Union{Missing, Continuous}    │
│ b     │ Union{Missing, Multiclass{2}} │
│ c     │ Union{Missing, Count}         │
└───────┴───────────────────────────────┘

mach = machine(imputer, X)
fit!(mach)

julia> fitted_params(mach).filler_given_feature
Dict{Symbol, Any} with 3 entries:
  :a => 2.0
  :b => "y"
  :c => 2

julia> transform(mach, X)
(a = [1.0, 2.0, 2.0, 3.0, 2.0],
 b = CategoricalValue{String, UInt32}["y", "n", "y", "y", "y"],
 c = [1, 1, 2, 2, 3],)

See also UnivariateFillImputer.

MLJModels.UnivariateFillImputer (Type)
UnivariateFillImputer

A model type for constructing a single variable fill imputer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateFillImputer = @load UnivariateFillImputer pkg=MLJModels

Do model = UnivariateFillImputer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateFillImputer(continuous_fill=...).

Use this model to impute missing values in a vector with a fixed value learned from the non-missing values of the training vector.

For imputing missing values in tabular data, use FillImputer instead.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Union{Missing, T} where T is a subtype of Continuous, Multiclass, OrderedFactor or Count; check scitype using scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • continuous_fill: function or other callable to determine value to be imputed in the case of Continuous (abstract float) data; default is to apply median after skipping missing values

  • count_fill: function or other callable to determine value to be imputed in the case of Count (integer) data; default is to apply rounded median after skipping missing values

  • finite_fill: function or other callable to determine value to be imputed in the case of Multiclass or OrderedFactor data (categorical vectors); default is to apply mode after skipping missing values

Operations

  • transform(mach, xnew): return xnew with missing values imputed with the fill values learned when fitting mach

Fitted parameters

The fields of fitted_params(mach) are:

  • filler: the fill value to be imputed in all new data

Examples

using MLJ
imputer = UnivariateFillImputer()

x_continuous = [1.0, 2.0, missing, 3.0]
x_multiclass = coerce(["y", "n", "y", missing, "y"], Multiclass)
x_count = [1, 1, 1, 2, missing, 3, 3]

mach = machine(imputer, x_continuous)
fit!(mach)

julia> fitted_params(mach)
(filler = 2.0,)

julia> transform(mach, [missing, missing, 101.0])
3-element Vector{Float64}:
 2.0
 2.0
 101.0

mach2 = machine(imputer, x_multiclass) |> fit!

julia> transform(mach2, x_multiclass)
5-element CategoricalArray{String,1,UInt32}:
 "y"
 "n"
 "y"
 "y"
 "y"

mach3 = machine(imputer, x_count) |> fit!

julia> transform(mach3, [missing, missing, 5])
3-element Vector{Int64}:
 2
 2
 5

For imputing tabular data, use FillImputer.

MLJModels.FeatureSelector (Type)
FeatureSelector

A model type for constructing a feature selector, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

FeatureSelector = @load FeatureSelector pkg=MLJModels

Do model = FeatureSelector() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in FeatureSelector(features=...).

Use this model to select features (columns) of a table, usually as part of a model Pipeline.

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, X)

where

  • X: any table of input features, where "table" is in the sense of Tables.jl

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • features: one of the following, with the behavior indicated:

    • [] (empty, the default): filter out all features (columns) which were not encountered in training

    • non-empty vector of feature names (symbols): keep only the specified features (ignore=false) or keep only unspecified features (ignore=true)

    • function or other callable: keep a feature if the callable returns true on its name. For example, specifying FeatureSelector(features = name -> name in [:x1, :x3], ignore = true) has the same effect as FeatureSelector(features = [:x1, :x3], ignore = true), namely to select all features, with the exception of :x1 and :x3.

  • ignore: whether to ignore or keep specified features, as explained above
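
The selection rules above can be sketched in plain Julia for a column table represented as a named tuple of vectors. The helper select_features is hypothetical and not part of MLJ. It ignores the distinction between features seen in training and features present in new data, so an empty features vector simply keeps every column.

```julia
# Hypothetical helper mimicking the selection rules for a named tuple of columns.
function select_features(X::NamedTuple, features; ignore::Bool=false)
    all_names = collect(keys(X))
    if features isa AbstractVector && isempty(features)
        keep = all_names
    elseif features isa AbstractVector
        # keep named features (ignore=false) or unnamed features (ignore=true)
        keep = filter(n -> (n in features) != ignore, all_names)
    else
        # a callable: with ignore=true, drop the features on which it returns true
        keep = filter(n -> features(n) != ignore, all_names)
    end
    return NamedTuple{Tuple(keep)}(X)
end

X = (x1 = [1, 2], x2 = [3.0, 4.0], x3 = ["a", "b"])

by_vector = select_features(X, [:x1, :x3]; ignore=true)
by_callable = select_features(X, name -> name in [:x1, :x3]; ignore=true)
```

Both calls return a table containing only the column x2, mirroring the equivalence stated for the vector and callable forms.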

Operations

  • transform(mach, Xnew): select features from the table Xnew as specified by the model, taking features seen during training into account, if relevant

Fitted parameters

The fields of fitted_params(mach) are:

  • features_to_keep: the features that will be selected

Example

using MLJ

X = (ordinal1 = [1, 2, 3],
     ordinal2 = coerce(["x", "y", "x"], OrderedFactor),
     ordinal3 = [10.0, 20.0, 30.0],
     ordinal4 = [-20.0, -30.0, -40.0],
     nominal = coerce(["Your father", "he", "is"], Multiclass));

selector = FeatureSelector(features=[:ordinal3, ], ignore=true);

julia> transform(fit!(machine(selector, X)), X)
(ordinal1 = [1, 2, 3],
 ordinal2 = CategoricalValue{String,UInt32}["x", "y", "x"],
 ordinal4 = [-20.0, -30.0, -40.0],
 nominal = CategoricalValue{String,UInt32}["Your father", "he", "is"],)

MLJModels.UnivariateBoxCoxTransformer (Type)
UnivariateBoxCoxTransformer

A model type for constructing a single variable Box-Cox transformer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateBoxCoxTransformer = @load UnivariateBoxCoxTransformer pkg=MLJModels

Do model = UnivariateBoxCoxTransformer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateBoxCoxTransformer(n=...).

Box-Cox transformations attempt to make data look more normally distributed. This can improve performance and assist in the interpretation of models which suppose that data is generated by a normal distribution.

A Box-Cox transformation (with shift) is of the form

x -> ((x + c)^λ - 1)/λ

for some constant c and real λ, unless λ = 0, in which case the above is replaced with

x -> log(x + c)

Given user-specified hyper-parameters n::Integer and shift::Bool, the present implementation learns the parameters c and λ from the training data as follows: If shift=true and zeros are encountered in the data, then c is set to 0.2 times the data mean. If there are no zeros, then no shift is applied. Finally, n different values of λ between -0.4 and 3 are considered, with λ fixed to the value maximizing normality of the transformed data.

Reference: Wikipedia entry for power transform.
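
The two formulas above, together with their inverses, can be written directly in plain Julia. The function names boxcox and inverse_boxcox are hypothetical and not part of MLJ. The sketch only evaluates the transformation for given λ and c; the search over λ for the value maximizing normality is not reproduced here.

```julia
# Box-Cox transformation with exponent λ and shift c, applied to a scalar.
boxcox(x, λ, c=0.0) = λ == 0 ? log(x + c) : ((x + c)^λ - 1) / λ

# Corresponding inverse, recovering x from z.
inverse_boxcox(z, λ, c=0.0) = λ == 0 ? exp(z) - c : (λ * z + 1)^(1 / λ) - c

x = [0.5, 1.0, 2.0, 4.0]
z_sqrt = boxcox.(x, 0.5)               # λ = 0.5: a shifted and rescaled square root
z_log = boxcox.(x, 0.0)                # λ = 0: the logarithm
x_back = inverse_boxcox.(z_sqrt, 0.5)  # recovers x up to rounding
```

The case λ = 0 is the limit of the general formula as λ tends to zero, which is why it is handled by a separate branch.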

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with element scitype Continuous; check the scitype with scitype(x)

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n=171: number of values of the exponent λ to try

  • shift=false: whether to include a preliminary constant translation in transformations, in the presence of zeros

Operations

  • transform(mach, xnew): apply the Box-Cox transformation learned when fitting mach

  • inverse_transform(mach, z): reconstruct the vector x whose transformation learned by mach is z

Fitted parameters

The fields of fitted_params(mach) are:

  • λ: the learned Box-Cox exponent

  • c: the learned shift

Examples

using MLJ
using UnicodePlots
using Random
Random.seed!(123)

transf = UnivariateBoxCoxTransformer()

x = randn(1000).^2

mach = machine(transf, x)
fit!(mach)

z = transform(mach, x)

julia> histogram(x)
                ┌                                        ┐
   [ 0.0,  2.0) ┤███████████████████████████████████  848
   [ 2.0,  4.0) ┤████▌ 109
   [ 4.0,  6.0) ┤█▍ 33
   [ 6.0,  8.0) ┤▍ 7
   [ 8.0, 10.0) ┤▏ 2
   [10.0, 12.0) ┤  0
   [12.0, 14.0) ┤▏ 1
                └                                        ┘
                                 Frequency

julia> histogram(z)
                ┌                                        ┐
   [-5.0, -4.0) ┤█▎ 8
   [-4.0, -3.0) ┤████████▊ 64
   [-3.0, -2.0) ┤█████████████████████▊ 159
   [-2.0, -1.0) ┤█████████████████████████████▊ 216
   [-1.0,  0.0) ┤███████████████████████████████████  254
   [ 0.0,  1.0) ┤█████████████████████████▊ 188
   [ 1.0,  2.0) ┤████████████▍ 90
   [ 2.0,  3.0) ┤██▊ 20
   [ 3.0,  4.0) ┤▎ 1
                └                                        ┘
                                 Frequency

MLJModels.UnivariateDiscretizer (Type)
UnivariateDiscretizer

A model type for constructing a single variable discretizer, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateDiscretizer = @load UnivariateDiscretizer pkg=MLJModels

Do model = UnivariateDiscretizer() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateDiscretizer(n_classes=...).

Discretization converts a Continuous vector into an OrderedFactor vector. In particular, the output is a CategoricalVector (whose reference type is optimized).

The transformation is chosen so that the vector on which the transformer is fit has, in transformed form, an approximately uniform distribution of values. Specifically, if n_classes is the level of discretization, then 2*n_classes - 1 ordered quantiles are computed, the odd quantiles being used for transforming (discretization) and the even quantiles for inverse transforming.
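
The idea of quantile-based discretization can be sketched in plain Julia as follows. This is a simplification under stated assumptions, not the MLJ implementation: the helpers fit_cutpoints and discretize are hypothetical, only the cut points needed for transforming are computed, and the second set of quantiles used for inverse transforming is omitted.

```julia
using Statistics

# Cut points: the n_classes - 1 interior quantiles of the training vector.
function fit_cutpoints(xtrain, n_classes)
    probs = [k / n_classes for k in 1:(n_classes - 1)]
    return quantile(xtrain, probs)
end

# Class of a value: one plus the number of cut points strictly below it.
discretize(cutpoints, x) = searchsortedfirst(cutpoints, x)

xtrain = collect(1.0:100.0)
cutpoints = fit_cutpoints(xtrain, 4)
classes = [discretize(cutpoints, x) for x in xtrain]
```

Because the cut points are quantiles of the training vector, each of the four classes receives the same number of training observations, which is the approximately uniform distribution referred to above.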

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector with Continuous element scitype; check scitype with scitype(x).

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • n_classes: number of discrete classes in the output

Operations

  • transform(mach, xnew): discretize xnew according to the discretization learned when fitting mach

  • inverse_transform(mach, z): attempt to reconstruct from z a vector that transforms to give z

Fitted parameters

The fields of fitted_params(mach).fitresult include:

  • odd_quantiles: quantiles used for transforming (length is n_classes - 1)

  • even_quantiles: quantiles used for inverse transforming (length is n_classes)

Warning: fitted_params(mach) does not have a standard form and may change in the future.

Example

using MLJ
using Random
Random.seed!(123)

discretizer = UnivariateDiscretizer(n_classes=100)
mach = machine(discretizer, randn(1000))
fit!(mach)

julia> x = rand(5)
5-element Vector{Float64}:
 0.6342070799721164
 0.8681793651724181
 0.43780421808821424
 0.5740792503574783
 0.22444170437768007

julia> z = transform(mach, x)
5-element CategoricalArrays.CategoricalArray{UInt8,1,UInt8}:
 0x49
 0x50
 0x43
 0x47
 0x3a

julia> x_approx = inverse_transform(mach, z)
5-element Vector{Float64}:
 0.6333797607904535
 0.855839325856769
 0.433203047224622
 0.5662624832429449
 0.222065923759177

MLJModels.UnivariateTimeTypeToContinuous (Type)
UnivariateTimeTypeToContinuous

A model type for constructing a single variable transformer that creates continuous representations of temporally typed data, based on MLJModels.jl, and implementing the MLJ model interface.

From MLJ, the type can be imported using

UnivariateTimeTypeToContinuous = @load UnivariateTimeTypeToContinuous pkg=MLJModels

Do model = UnivariateTimeTypeToContinuous() to construct an instance with default hyper-parameters. Provide keyword arguments to override hyper-parameter defaults, as in UnivariateTimeTypeToContinuous(zero_time=...).

Use this model to convert vectors with a TimeType element type to vectors of Float64 type (Continuous element scitype).

Training data

In MLJ or MLJBase, bind an instance model to data with

mach = machine(model, x)

where

  • x: any abstract vector whose element type is a subtype of Dates.TimeType

Train the machine using fit!(mach, rows=...).

Hyper-parameters

  • zero_time: the time that is to correspond to 0.0 under transformations, with the type coinciding with the training data element type. If unspecified, the earliest time encountered in training is used.

  • step::Period=Hour(24): time interval to correspond to one unit under transformation
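
The roles of zero_time and step can be illustrated with the Dates standard library alone. The helper time_to_continuous is hypothetical and not part of MLJ. The sketch assumes that step is a fixed-length period (such as Hour, Day or Week), so that it can be converted to milliseconds.

```julia
using Dates

# Hypothetical helper: number of `step`s elapsed since `zero_time`.
function time_to_continuous(x, zero_time, step::Period)
    step_ms = Dates.value(convert(Millisecond, step))
    return [Dates.value(convert(Millisecond, t - zero_time)) / step_ms for t in x]
end

x = [Date(2001, 1, 1) + Day(i) for i in 0:4]
z = time_to_continuous(x, Date(2000, 1, 1), Week(1))
```

Here the first entry of z is 366/7, since 366 days (the year 2000 is a leap year) separate zero_time from the first date and one unit corresponds to one week.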

Operations

  • transform(mach, xnew): apply the encoding inferred when mach was fit

Fitted parameters

fitted_params(mach).fitresult is the tuple (zero_time, step) actually used in transformations, which may differ from the user-specified hyper-parameters.

Example

using MLJ
using Dates

x = [Date(2001, 1, 1) + Day(i) for i in 0:4]

encoder = UnivariateTimeTypeToContinuous(zero_time=Date(2000, 1, 1),
                                         step=Week(1))

mach = machine(encoder, x)
fit!(mach)

julia> transform(mach, x)
5-element Vector{Float64}:
 52.285714285714285
 52.42857142857143
 52.57142857142857
 52.714285714285715
 52.857142

Static transformers

The main use-case for static transformers is for insertion into Linear Pipelines or other exported learning networks (see Composing Models). If a static transformer has no hyper-parameters, it is tantamount to an ordinary function. An ordinary function can be inserted directly into a pipeline; the situation for learning networks is only slightly more complicated; see Static operations on nodes.

The following example defines a new model type Averager to perform the weighted average of two vectors (target predictions, for example). We suppose the weighting is normalized, and therefore controlled by a single hyper-parameter, mix.

mutable struct Averager <: Static
    mix::Float64
end

MLJ.transform(a::Averager, _, y1, y2) = (1 - a.mix)*y1 + a.mix*y2

Important. Note the sub-typing <: Static.

Such static transformers with (unlearned) parameters can have arbitrarily many inputs, but only one output. In the single input case an inverse_transform can also be defined. Since they have no real learned parameters, you bind a static transformer to a machine without specifying training arguments.

mach = machine(Averager(0.5)) |> fit!
julia> transform(mach, [1, 2, 3], [3, 2, 1])
3-element Vector{Float64}:
 2.0
 2.0
 2.0

Let's see how we can include our Averager in a learning network (see Composing Models) to mix the predictions of two regressors, with one-hot encoding of the inputs:

X = source()
y = source()

ridge = (@load RidgeRegressor pkg=MultivariateStats)()
knn = (@load KNNRegressor)()
averager = Averager(0.5)

hotM = machine(OneHotEncoder(), X)
W = transform(hotM, X) # one-hot encode the input

ridgeM = machine(ridge, W, y)
y1 = predict(ridgeM, W)

knnM = machine(knn, W, y)
y2 = predict(knnM, W)

averagerM = machine(averager)

julia> yhat = transform(averagerM, y1, y2)
Node
  args:
    1:	Node
    2:	Node
  formula:
    transform(
      machine(Averager(mix = 0.5), …), 
      predict(
        machine(RidgeRegressor(lambda = 1.0, …), …), 
        transform(
          machine(OneHotEncoder(features = Symbol[], …), …), 
          Source @352)),
      predict(
        machine(KNNRegressor(K = 5, …), …), 
        transform(
          machine(OneHotEncoder(features = Symbol[], …), …), 
          Source @352)))

Now we export the learning network to obtain a Deterministic composite model type, and then instantiate the composite model:

julia> learning_mach = machine(Deterministic(), X, y; predict=yhat)
Machine{DeterministicSurrogate} @772 trained 0 times.
  args:
    1:	Source @415 ⏎ `Unknown`
    2:	Source @389 ⏎ `Unknown`


@from_network learning_mach struct DoubleRegressor
    regressor1=ridge
    regressor2=knn
    averager=averager
end

julia> composite = DoubleRegressor()
DoubleRegressor(
    regressor1 = RidgeRegressor(
            lambda = 1.0),
    regressor2 = KNNRegressor(
            K = 5,
            algorithm = :kdtree,
            metric = Distances.Euclidean(0.0),
            leafsize = 10,
            reorder = true,
            weights = :uniform),
    averager = Averager(
            mix = 0.5)) @301

which can be evaluated like any other model:

composite.averager.mix = 0.25 # adjust mix from default of 0.5

julia> evaluate(composite, (@load_reduced_ames)..., measure=rms)
Evaluating over 6 folds: 100%[=========================] Time: 0:00:00
┌───────────┬───────────────┬────────────────────────────────────────────────────────┐
│ _.measure │ _.measurement │ _.per_fold                                             │
├───────────┼───────────────┼────────────────────────────────────────────────────────┤
│ rms       │ 26800.0       │ [21400.0, 23700.0, 26800.0, 25900.0, 30800.0, 30700.0] │
└───────────┴───────────────┴────────────────────────────────────────────────────────┘
_.per_observation = [missing]
_.fitted_params_per_fold = [ … ]
_.report_per_fold = [ … ]

Transformers that also predict

Some clustering algorithms learn to label data by identifying a collection of "centroids" in the training data. Any new input observation is labeled with the cluster to which it is closest (this is the output of predict) while the vector of all distances from the centroids defines a lower-dimensional representation of the observation (the output of transform). In the following example a K-means clustering algorithm assigns one of three labels 1, 2, 3 to the input features of the iris data set and compares them with the actual species recorded in the target (not seen by the algorithm).

import Random.seed!
seed!(123)

X, y = @load_iris;
KMeans = @load KMeans pkg=ParallelKMeans
kmeans = KMeans()
mach = machine(kmeans, X) |> fit!

# transforming:
Xsmall = transform(mach);
julia> selectrows(Xsmall, 1:4) |> pretty
┌─────────────────────┬────────────────────┬────────────────────┐
│ x1                  │ x2                 │ x3                 │
│ Float64             │ Float64            │ Float64            │
│ Continuous          │ Continuous         │ Continuous         │
├─────────────────────┼────────────────────┼────────────────────┤
│ 0.0215920000000267  │ 25.314260355029603 │ 11.645232464391299 │
│ 0.19199200000001326 │ 25.882721893491123 │ 11.489658693899486 │
│ 0.1699920000000077  │ 27.58656804733728  │ 12.674412792260142 │
│ 0.26919199999998966 │ 26.28656804733727  │ 11.64392098898145  │
└─────────────────────┴────────────────────┴────────────────────┘

# predicting:
yhat = predict(mach);
compare = zip(yhat, y) |> collect;
julia> compare[1:8]
8-element Array{Tuple{CategoricalValue{Int64,UInt32},CategoricalString{UInt32}},1}:
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")
 (1, "setosa")

julia> compare[51:58]
8-element Array{Tuple{CategoricalValue{Int64,UInt32},CategoricalString{UInt32}},1}:
 (2, "versicolor")
 (3, "versicolor")
 (2, "versicolor")
 (3, "versicolor")
 (3, "versicolor")
 (3, "versicolor")
 (3, "versicolor")
 (3, "versicolor")

julia> compare[101:108]
8-element Array{Tuple{CategoricalValue{Int64,UInt32},CategoricalString{UInt32}},1}:
 (2, "virginica")
 (3, "virginica")
 (2, "virginica")
 (2, "virginica")
 (2, "virginica")
 (2, "virginica")
 (3, "virginica")
 (2, "virginica")
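
The distinction between predict and transform for a centroid-based clusterer can also be sketched in plain Julia, independently of any clustering package. The centroids below are made up for illustration, and the helper names are hypothetical. Whether a given implementation reports distances or squared distances in transform is implementation-dependent; plain Euclidean distances are assumed here.

```julia
# Three hypothetical centroids in the plane, one per column.
centroids = [0.0 10.0  0.0;
             0.0  0.0 10.0]

# Euclidean distance from observation x to each centroid.
distances(x) = [sqrt(sum(abs2, x .- centroids[:, j])) for j in 1:size(centroids, 2)]

# `transform`-like output: the vector of distances to all centroids.
kmeans_transform(x) = distances(x)

# `predict`-like output: the label of the closest centroid.
kmeans_predict(x) = argmin(distances(x))

xnew = [9.0, 1.0]
d = kmeans_transform(xnew)    # one distance per centroid
label = kmeans_predict(xnew)  # index of the smallest distance
```

The predicted label is always recoverable from the transformed representation, as the position of its smallest entry, but not the other way around.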