Controlling Iterative Models

Iterative supervised machine learning models are usually trained until an out-of-sample estimate of the performance satisfies some stopping criterion, such as k consecutive deteriorations of the performance (see Patience below). A more sophisticated kind of control might dynamically mutate parameters, such as a learning rate, in response to the behavior of these estimates.

Some iterative model implementations enable some form of automated control, with the method and options for doing so varying from model to model. But sometimes it is up to the user to arrange control, which in the crudest case reduces to manually experimenting with the iteration parameter.

In response to this ad hoc state of affairs, MLJ provides a uniform and feature-rich interface for controlling any iterative model that exposes its iteration parameter as a hyper-parameter, and which implements the "warm restart" behavior described in Machines.

Basic use

As in Tuning models, iteration control in MLJ is implemented as a model wrapper, which allows composition with other meta-algorithms. Ordinarily, the wrapped model behaves just like the original model, but with the training occurring on a subset of the provided data (to allow computation of an out-of-sample loss) and with the iteration parameter automatically determined by the controls specified in the wrapper.

By setting retrain=true one can ask that the wrapped model retrain on all supplied data, after learning the appropriate number of iterations from the controlled training phase:

using MLJ

X, y = make_moons(100, rng=123, noise=0.5)
EvoTreeClassifier = @load EvoTreeClassifier verbosity=0

iterated_model = IteratedModel(model=EvoTreeClassifier(rng=123, η=0.005),
                               resampling=Holdout(),
                               measures=log_loss,
                               controls=[Step(5),
                                         Patience(2),
                                         NumberLimit(100)],
                               retrain=true)

mach = machine(iterated_model, X, y)
julia> fit!(mach)
[ Info: Training Machine{ProbabilisticIteratedModel{EvoTreeClassifier{Float64,…}},…} @638.
[ Info: final loss: 0.45796842397751225
[ Info: Stop triggered by Patience(2) stopping criterion. 
[ Info: Retraining on all provided data. To suppress, specify `retrain=false`. 
[ Info: Total of 210 iterations. 
Machine{ProbabilisticIteratedModel{EvoTreeClassifier{Float64,…}},…} @638 trained 1 time; does not cache data
  args:
    1:	Source @370 ⏎ `Table{AbstractVector{Continuous}}`
    2:	Source @875 ⏎ `AbstractVector{Multiclass{2}}`

As detailed under IteratedModel below, the specified controls are repeatedly applied in sequence to a training machine, constructed under the hood, until one of the controls triggers a stop. Here Step(5) means "Compute 5 more iterations" (in this case starting from none); Patience(2) means "Stop at the end of the control cycle if there have been 2 consecutive drops in the log loss"; and NumberLimit(100) is a safeguard ensuring a stop after 100 control cycles (500 iterations). See Controls provided below for a complete list.

Because iteration is implemented as a wrapper, the "self-iterating" model can be evaluated using cross-validation, say, and the number of iterations on each fold will generally be different:

e = evaluate!(mach, resampling=CV(nfolds=3), measure=log_loss, verbosity=0);
map(e.report_per_fold) do r
    r.n_iterations
end
3-element Vector{Int64}:
 225
 500
 500

Alternatively, one might wrap the self-iterating model in a tuning strategy, using TunedModel; see Tuning models. In this way, the optimization of some other hyper-parameter is realized simultaneously with that of the iteration parameter, which will frequently be more efficient than a direct two-parameter search.

Controls provided

In the table below, mach is the training machine being iterated, constructed by binding the supplied data to the model specified in the IteratedModel wrapper, but trained in each iteration on a subset of the data, according to the value of the resampling hyper-parameter of the wrapper.

controldescriptioncan trigger a stop
Step(n=1)Train model for n more iterationsno
TimeLimit(t=0.5)Stop after t hoursyes
NumberLimit(n=100)Stop after n applications of the controlyes
NumberSinceBest(n=6)Stop when best loss occurred n control applications agoyes
InvalidValue()Stop when NaN, Inf or -Inf loss/training loss encounteredyes
Threshold(value=0.0)Stop when loss < valueyes
GL(alpha=2.0)† Stop after the "generalization loss (GL)" exceeds alphayes
PQ(alpha=0.75, k=5)† Stop after "progress-modified GL" exceeds alphayes
Patience(n=5)† Stop after n consecutive loss increasesyes
Info(f=identity)Log to Info the value of f(mach), where mach is current machineno
Warn(predicate; f="")Log to Warn the value of f or f(mach), if predicate(mach) holdsno
Error(predicate; f="")Log to Error the value of f or f(mach), if predicate(mach) holds and then stopyes
Callback(f=mach->nothing)Call f(mach)yes
WithNumberDo(f=n->@info(n))Call f(n + 1) where n is the number of complete control cycles so faryes
WithIterationsDo(f=i->@info("num iterations: $i"))Call f(i), where i is total number of iterationsyes
WithLossDo(f=x->@info("loss: $x"))Call f(loss) where loss is the current lossyes
WithTrainingLossesDo(f=v->@info(v))Call f(v) where v is the current batch of training lossesyes
Save(filename="machine.jlso")* Save current machine to machine1.jlso, machine2.jslo, etcyes

Table 1. Atomic controls. Some advanced options omitted.

† For more these controls see Prechelt, Lutz (1998): "Early Stopping - But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer.

  • If using MLJIteration without MLJ, then Save is not available unless one is also using MLJSerialization.

Stopping option. All the following controls trigger a stop if the provided function f returns true and stop_if_true=true is specified in the constructor: Callback, WithNumberDo, WithLossDo, WithTrainingLossesDo.

There are also three control wrappers to modify a control's behavior:

wrapperdescription
IterationControl.skip(control, predicate=1)Apply control every predicate applications of the control wrapper (can also be a function; see doc-string)
IterationControl.louder(control, by=1)Increase the verbosity level of control by the specified value (negative values lower verbosity)
IterationControl.debug(control)Apply control but also log its state to Info (irrespective of verbosity level)
IterationControl.composite(controls...)Apply each control in controls in sequence; used internally by IterationControl.jl

Table 2. Wrapped controls

Using training losses, and controlling model tuning

Some iterative models report a training loss, as a byproduct of a fit! call, and these can be used in two ways:

  1. To supplement an out-of-sample estimate of the loss in deciding when to stop, as in the PQ stopping criterion (see Prechelt, Lutz (1998))); or

  2. As a (generally less reliable) substitute for an out-of-sample loss, when wishing to train excusivley on all supplied data.

To have IteratedModel bind all data to the training machine and use training losses in place of an out-of-sample loss, specify resampling=nothing. To check if MyFavoriteIterativeModel reports training losses, load the model code and inspect supports_training_losses(MyFavoriteIterativeModel) (or do info("MyFavoriteIterativeModel"))

Controlling model tuning

An example of scenario 2 occurs when controlling hyper-parameter optimization (model tuning). Recall that MLJ's TunedModel wrapper is implemented as an iterative model. Moreover, this wrapper reports, as a training loss, the lowest value of the optimization objective function so far (typically the lowest value of an out-of-sample loss, or -1 times an out-of-sample score). One may want to simply end the hyper-parameter search when this value meets the NumberSinceBest stopping criterion discussed below, say, rather than introduce an extra layer of resampling to first "learn" the optimal value of the iteration parameter.

In the following example we conduct a RandomSearch for the optimal value of the regularization parameter lambda in a RidgeRegressor using 6-fold cross-validation. By wrapping our "self-tuning" version of the regressor as an IteratedModel, with resampling=nothing and NumberSinceBest(20) in the controls, we terminate the search when the number of lambda values tested since the previous best cross-validation loss reaches 20.

using MLJ

X, y = @load_boston;
RidgeRegressor = @load RidgeRegressor pkg=MLJLinearModels verbosity=0
model = RidgeRegressor()
r = range(model, :lambda, lower=-1, upper=2, scale=x->10^x)
self_tuning_model = TunedModel(model=model,
                               tuning=RandomSearch(rng=123),
                               resampling=CV(nfolds=6),
                               range=r,
                               measure=mae);
iterated_model = IteratedModel(model=self_tuning_model,
                               resampling=nothing,
                               control=[Step(1), NumberSinceBest(20), NumberLimit(1000)])
mach = machine(iterated_model, X, y)
julia> fit!(mach)
[ Info: Training Machine{DeterministicIteratedModel{DeterministicTunedModel{RandomSearch,…}},…} @177.
[ Info: final loss: 3.8417263706934257
[ Info: final training loss: 3.8417263706934257
[ Info: Stop triggered by NumberSinceBest(20) stopping criterion. 
[ Info: Total of 41 iterations. 
Machine{DeterministicIteratedModel{DeterministicTunedModel{RandomSearch,…}},…} @177 trained 1 time; does not cache data
  args:
    1:	Source @245 ⏎ `Table{AbstractVector{Continuous}}`
    2:	Source @345 ⏎ `AbstractVector{Continuous}`
julia> report(mach).model_report.best_model
RidgeRegressor(
    lambda = 13.109345058947303,
    fit_intercept = true,
    penalize_intercept = false,
    solver = nothing) @935

We can use mach here to directly obtain predictions using the optimal model (trained on all data), as in

julia> predict(mach, selectrows(X, 1:4))
4-element Vector{Float64}:
 29.539102434295277
 24.743126919561234
 31.299067261061705
 29.857334853812457

Custom controls

Under the hood, control in MLJIteration is implemented using IterationControl.jl. Rather than iterating a training machine directly, we iterate a wrapped version of this object, which includes other information that controls may want to access, such the MLJ evaluation object. This information is summarized under The training machine wrapper below.

Controls must implement two update! methods, one for initializing the control's state on the first application of the control (this state being external to the control struct) and one for all subsequent control applications, which generally updates state also. There are two optional methods: done, for specifying conditions triggering a stop, and takedown for specifying actions to perform at the end of controlled training.

We summarize the training algorithm, as it relates to controls, after giving a simple example.

Example 1 - Non-uniform iteration steps

Below we define a control, IterateFromList(list), to train, on each application of the control, until the iteration count reaches the next value in a user-specified list, triggering a stop when the list is exhausted. For example, to train on iteration counts on a log scale, one might use IterateFromList([round(Int, 10^x) for x in range(1, 2, length=10)].

In the code, wrapper is an object that wraps the training machine (see above). The variable n is a counter for control cycles (unused in this example).


import IterationControl # or MLJ.IterationControl

struct IterateFromList
    list::Vector{<:Int} # list of iteration parameter values
    IterateFromList(v) = new(unique(sort(v)))
end

function IterationControl.update!(control::IterateFromList, wrapper, verbosity, n)
    Δi = control.list[1]
    verbosity > 1 && @info "Training $Δi more iterations. "
    MLJIteration.train!(wrapper, Δi) # trains the training machine
    return (index = 2, )
end

function IterationControl.update!(control::IterateFromList, wrapper, verbosity, n, state)
    index = state.positioin_in_list
    Δi = control.list[i] - wrapper.n_iterations
    verbosity > 1 && @info "Training $Δi more iterations. "
    MLJIteration.train!(wrapper, Δi)
    return (index = index + 1, )
end

The first update method will be called the first time the control is applied, returning an initialized state = (index = 2,), which is passed to the second update method, which is called on subsequent control applications (and which returns the updated state).

A done method articulates the criterion for stopping:

IterationControl.done(control::IterateFromList, state) =
    state.index > length(control.list)

For the sake of illustration, we'll implement a takedown method; its return value is included in the IteratedModel report:

IterationControl.takedown(control::IterateFromList, verbosity, state)
    verbosity > 1 && = @info "Stepped through these values of the "*
                              "iteration parameter: $(control.list)"
    return (iteration_values=control.list, )
end

The training machine wrapper

A training machine wrapper has these properties:

  • wrapper.machine - the training machine, type Machine

  • wrapper.model - the mutable atomic model, coinciding with wrapper.machine.model

  • wrapper.n_cycles - the number IterationControl.train!(wrapper, _) calls so far; generally the current control cycle count

  • wrapper.n_iterations - the total number of iterations applied to the model so far

  • wrapper.Δiterations - the number of iterations applied in the last IterationControl.train!(wrapper, _) call

  • wrapper.loss - the out-of-sample loss (based on the first measure in measures)

  • wrapper.training_losses - the last batch of training losses (if reported by model), an abstract vector of length wrapper.Δiteration.

  • wrapper.evaluation - the complete MLJ performance evaluation object, which has the following properties: measure, measurement, per_fold, per_observation, fitted_params_per_fold, report_per_fold (here there is only one fold). For further details, see Evaluating Model Performance.

The training algorithm

Here now is a simplified description of the training of an IteratedModel. First, the atomic model is bound in a machine - the training machine above - to a subset of the supplied data, and then wrapped in an object called wrapper below. To train the training machine machine for i more iterations, and update the other data in the wrapper, requires the call MLJIteration.train!(wrapper, i). Only controls can make this call (e.g., Step(...), or IterateFromList(...) above). If we assume for simplicity there is only a single control, called control, then training proceeds as follows:

n = 1 # initialize control cycle counter
state = update!(control, wrapper, verbosity, n)
finished = done(control, state)

# subsequent training events:
while !finished
    n += 1
    state = update!(control, wrapper, verbosity, n, state)
    finished = done(control, state)
end

# finalization:
return takedown(control, verbosity, state)

Example 2 - Cyclic learning rates

The control below implements a triangular cyclic learning rate policy, close to that described in L. N. Smith (2019): "Cyclical Learning Rates for Training Neural Networks," 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, pp. 464-472. [In that paper learning rates are mutated (slowly) during each training iteration (epoch) while here mutations can occur once per iteration of the model, at most.]

For the sake of illustration, we suppose the iterative model, model, specified in the IteratedModel constructor, has a field called :learning_parameter, and that mutating this parameter does not trigger cold-restarts.

struct CycleLearningRate{F<:AbstractFloat}
    stepsize::Int
    lower::F
    upper::F
end

# return one cycle of learning rate values:
function one_cycle(c::CycleLearningRate)
    rise = range(c.lower, c.upper, length=c.stepsize + 1)
    fall = reverse(rise)
    return vcat(rise[1:end - 1], fall[1:end - 1])
end

function IterationControl.update!(control::CycleLearningRate,
                                  wrapper,
                                  verbosity,
                                  n,
                                  state = (learning_rates=nothing, ))
    rates = n == 0 ? one_cycle(control) : state.learning_rates
    index = mod(n, length(rates)) + 1
    r = rates[index]
    verbosity > 1 && @info "learning rate: $r"
    wrapper.model.iteration_control = r
    return (learning_rates = rates,)
end

API Reference

MLJIteration.IteratedModelFunction
IteratedModel(model=nothing,
              controls=Any[Step(10), Patience(5), GL(2.0), TimeLimit(Dates.Millisecond(108000)), InvalidValue()],
              retrain=false,
              resampling=Holdout(),
              measure=nothing,
              weights=nothing,
              class_weights=nothing,
              operation=predict,
              verbosity=1,
              check_measure=true,
              iteration_parameter=nothing,
              cache=true)

Wrap the specified model <: Supervised in the specified iteration controls. Training a machine bound to the wrapper iterates a corresonding machine bound to model. Here model should support iteration.

To list all controls, do MLJIteration.CONTROLS. Controls are summarized at https://alan-turing-institute.github.io/MLJ.jl/dev/getting_started/ but query individual doc-strings for details and advanced options. For creating your own controls, refer to the documentation just cited.

To make out-of-sample losses available to the controls, the machine bound to model is only trained on part of the data, as iteration proceeds. See details on training below. Specify retrain=true to ensure the model is retrained on all available data, using the same number of iterations, once controlled iteration has stopped.

Specify resampling=nothing if all data is to be used for controlled iteration, with each out-of-sample loss replaced by the most recent training loss, assuming this is made available by the model (supports_training_losses(model) == true). Otherwise, resampling must have type Holdout (eg, Holdout(fraction_train=0.8, rng=123)).

Assuming retrain=true or resampling=nothing, iterated_model behaves exactly like the original model but with the iteration parameter automatically selected. If retrain=false (default) and resampling is not nothing, then iterated_model behaves like the original model trained on a subset of the provided data.

Controlled iteration can be continued with new fit! calls (warm restart) by mutating a control, or by mutating the iteration parameter of model, which is otherwise ignored.

Training

Given an instance iterated_model of IteratedModel, calling fit!(mach) on a machine mach = machine(iterated_model, data...) performs the following actions:

  • Assuming resampling !== nothing, the data is split into train and test sets, according to the specified resampling strategy, which must have type Holdout.

  • A clone of the wrapped model, iterated_model.model, is bound to the train data in an internal machine, train_mach. If resampling === nothing, all data is used instead. This machine is the object to which controls are applied. For example, Callback(fitted_params |> print) will print the value of fitted_params(train_mach).

  • The iteration parameter of the clone is set to 0.

  • The specified controls are repeatedly applied to train_mach in sequence, until one of the controls triggers a stop. Loss-based controls (eg, Patience(), GL(), Threshold(0.001)) use an out-of-sample loss, obtained by applying measure to predictions and the test target values. (Specifically, these predictions are those returned by operation(train_mach).) If resampling === nothing then the most recent training loss is used instead. Some controls require both out-of-sample and training losses (eg, PQ()).

  • Once a stop has been triggered, a clone of model is bound to all data in a machine called mach_production below, unless retrain == false or resampling === nothing, in which case mach_production coincides with train_mach.

Prediction

Calling predict(mach, Xnew) returns predict(mach_production, Xnew). Similar similar statements hold for predict_mean, predict_mode, predict_median.

Controls

A control is permitted to mutate the fields (hyper-parameters) of train_mach.model (the clone of model). For example, to mutate a learning rate one might use the control

Callback(mach -> mach.model.eta = 1.05*mach.model.eta)

However, unless model supports warm restarts with respect to changes in that parameter, this will trigger retraining of train_mach from scratch, with a different training outcome, which is not recommended.

Warm restarts

If iterated_model is mutated and fit!(mach) is called again, then a warm restart is attempted if the only parameters to change are model or controls or both.

Specifically, train_mach.model is mutated to match the current value of iterated_model.model and the iteration parameter of the latter is updated to the last value used in the preceding fit!(mach) call. Then repeated application of the (updated) controls begin anew.

Controls

IterationControl.StepType
Step(; n=1)

An iteration control, as in, Step(2).

Train for n more iterations. Will never trigger a stop.

EarlyStopping.TimeLimitType
TimeLimit(; t=0.5)

An early stopping criterion for loss-reporting iterative algorithms.

Stopping is triggered after t hours have elapsed since the stopping criterion was initiated.

Any Julia built-in Real type can be used for t. Subtypes of Period may also be used, as in TimeLimit(t=Minute(30)).

Internally, t is rounded to nearest millisecond. ``

EarlyStopping.NumberLimitType
NumberLimit(; n=100)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered by n consecutive loss updates, excluding "training" loss updates.

If wrapped in a stopper::EarlyStopper, this is the number of calls to done!(stopper).

EarlyStopping.NumberSinceBestType
NumberSinceBest(; n=6)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the number of calls to the control, since the lowest value of the loss so far, is n.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

EarlyStopping.InvalidValueType
InvalidValue()

An early stopping criterion for loss-reporting iterative algorithms.

Stop if a loss (or training loss) is NaN, Inf or -Inf (or, more precisely, if isnan(loss) or isinf(loss) is true).

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

EarlyStopping.ThresholdType
Threshold(; value=0.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered as soon as the loss drops below value.

For a customizable loss-based stopping criterion, use WithLossDo or WithTrainingLossesDo with the stop_if_true=true option.

EarlyStopping.GLType
GL(; alpha=2.0)

An early stopping criterion for loss-reporting iterative algorithms.

A stop is triggered when the (rescaled) generalization loss exceeds the threshold alpha.

Terminology. Suppose $E_1, E_2, ..., E_t$ are a sequence of losses, for example, out-of-sample estimates of the loss associated with some iterative machine learning algorithm. Then the generalization loss at time t, is given by

$GL_t = 100 (E_t - E_{opt}) \over |E_{opt}|$

where $E_{opt}$ is the minimum value of the sequence.

Reference: Prechelt, Lutz (1998): "Early Stopping- But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..

EarlyStopping.PQType
PQ(; alpha=0.75, k=5, tol=eps(Float64))

A stopping criterion for training iterative supervised learners.

A stop is triggered when Prechelt's progress-modified generalization loss exceeds the threshold alpha, or if the training progress drops below tol. Here k is the maximum number of training (in-sample) losses to be used to estimate the training progress.

Context and explanation of terminology. The progress-modified loss is defined in the following scenario: Estimates, $E_1, E_2, ..., E_t$, of the out-of-sample loss of an iterative supervised learner are being computed, but not necessarily at every iteration. However, training losses for every iteration are being made available (usually as a by-product of training) which can be used to quantify recent training progress, as follows.

Fix a time $j$, corresponding to some out-of-sample loss $E_j$, and let $F_1$ be the corresponding training loss, $F_2$ the training loss in the previous interation of the model, $F_3$, the training loss two iterations previously, and so on. Let $K$ denote the number of model iterations since the last out-of-sample loss $E_{j-1}$ was computed, or k, whichever is the smaller. Then the training progress at time $j$ is defined by

$P_j = 1000 |M - m|/|m|$

where $M$ is the mean of the training losses $F_1, F_2, \ldots , F_K$ and $m$ the minimum value of those losses.

The progress-modified generalization loss at time $t$ is given by

$PQ_t = GL_t / P_t$

where $GL_t$ is the generalization loss at time $t$; see GL.

Reference: Prechelt, Lutz (1998): "Early Stopping- But When?", in Neural Networks: Tricks of the Trade, ed. G. Orr, Springer..

IterationControl.InfoType
Info(f=identity)

An iteration control, as in, Info(my_loss_function).

Log to Info the value of f(m), where m is the object being iterated. If IterativeControl.expose(m) has been overloaded, then log f(expose(m)) instead.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Warn, Error.

IterationControl.WarnType
Warn(predicate; f="")

An iteration control, as in, Warn(m -> length(m.cache) > 100, f="Memory low").

If predicate(m) is true, then log to Warn the value of f (or f(IterationControl.expose(m)) if f is a function). Here m is the object being iterated.

Can be suppressed by setting the global verbosity level sufficiently low.

See also Info, Error.

IterationControl.ErrorType
Error(predicate; f="", exception=nothing))

An iteration control, as in, Error(m -> isnan(m.bias), f="Bias overflow!").

If predicate(m) is true, then log at the Error level the value of f (or f(IterationControl.expose(m)) if f is a function) and stop iteration at the end of the current control cycle. Here m is the object being iterated.

Specify exception=... to throw an immediate execption, without waiting to the end of the control cycle.

See also Info, Warn.

IterationControl.CallbackType
Callback(f=_->nothing, stop_if_true=false, stop_message=nothing, raw=false)

An iteration control, as in, Callback(m->put!(v, my_loss_function(m)).

Call f(IterationControl.expose(m)), where m is the object being iterated, unless raw=true, in which case call f(m) (guaranteed if expose has not been overloaded.) If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

IterationControl.WithNumberDoType
WithNumberDo(f=n->@info("number: $n"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithNumberDo(n->put!(my_channel, n)).

Call f(n + 1), where n is the number of complete control cycles. of the control (so, n = 1, 2, 3, ..., unless control is wrapped in a IterationControl.skip)`.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

MLJIteration.WithIterationsDoType
WithIterationsDo(f=n->@info("num iterations: $n"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithIterationsDo(i->put!(my_channel, i)).

Call f(i), where i is the current number of model iterations (generally more than the number of control cycles). If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

IterationControl.WithLossDoType
WithLossDo(f=x->@info("loss: $x"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithLossDo(x->put!(my_losses, x)).

Call f(loss), where loss is current loss.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

IterationControl.WithTrainingLossesDoType
WithTrainingLossesDo(f=v->@info("training: $v"), stop_if_true=false, stop_message=nothing)

An iteration control, as in, WithTrainingLossesDo(v->put!(my_losses, last(v)).

Call f(training_losses), where training_losses is the vector of most recent batch of training losses.

If stop_if_true is true, then trigger an early stop if the value returned by f is true, logging the stop_message if specified.

MLJSerialization.SaveType
Save(filename="machine.jlso"; kwargs...)

An iteration control, as in, Save("run3/machine.jlso", compression=:gzip).

Save the current state of the machine being iterated to disk, using the provided filename, decorated with a number, as in "run3/machine_42.jlso". The specified kwargs are passed to the model-specific serializer (JLSO for most Julia models).

For more on what is meant by "the machine being iterated", see IteratedModel.

Control wrappers

IterationControl.skipFunction
IterationControl.skip(control, predicate=1)

An iteration control wrapper.

If predicate is an integer, k: Apply control on every k calls to apply the wrapped control, starting with the kth call.

If predicate is a function: Apply control as usual when predicate(n + 1) is true but otherwise skip. Here n is the number of control cycles applied so far.

IterationControl.louderFunction
IterationControl.louder(control, by=1)

Wrap control to make in more (or less) verbose. The same as control, but as if the global verbosity were increased by the value by.

IterationControl.debugFunction
IterationControl.debug(control)

Wrap control for debugging purposes. Acts exactly like control except that the internal state of control is logged to Info at every update.

IterationControl.compositeFunction
composite(controls...)

Construct an iteration control that applies the specified controls in sequence.