More on Probabilistic Predictors

Although one can call predict_mode on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using BinaryThresholdPredictor, which allows the user to specify the probability threshold for predicting the positive class. The wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling levels on the training target y.
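If the positive class is not already the second level, it can be promoted before training. A minimal sketch, using levels! from CategoricalArrays and assuming y is a two-level categorical vector (illustrative only; not part of the example below):

using CategoricalArrays
levels(y)                   # e.g. ["pos", "neg"], wrong order
levels!(y, ["neg", "pos"])  # make "pos" the second, i.e. positive, class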

MLJModels.BinaryThresholdPredictor - Type
BinaryThresholdPredictor(model; threshold=0.5)

Wrap the Probabilistic model, model, assumed to support binary classification, as a Deterministic model, by applying the specified threshold to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate UnivariateFinite distributions - that is, models that subtype AbstractProbabilisticUnsupervisedDetector or AbstractProbabilisticSupervisedDetector.

By convention the positive class is the second class returned by levels(y), where y is the target.

If threshold=0.5 then calling predict on the wrapped model is equivalent to calling predict_mode on the atomic model.
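This equivalence is easy to check with the built-in ConstantClassifier on a toy target (a sketch; not part of the example below):

using MLJ
ytoy = coerce(["no", "no", "yes", "yes", "yes"], OrderedFactor)  # "yes" is the positive class
Xtoy = (x = rand(5),)
atom = ConstantClassifier()
wrapped = BinaryThresholdPredictor(atom)  # default threshold=0.5
predict_mode(machine(atom, Xtoy, ytoy) |> fit!, Xtoy) ==
    predict(machine(wrapped, Xtoy, ytoy) |> fit!, Xtoy)  # true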

Example

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the threshold parameter, with balanced accuracy as the objective. The target class distribution is 268 positives to 500 negatives.

Loading the data:

using MLJ, Random
rng = Xoshiro(123)

diabetes = OpenML.load(43582)  # Pima Indian diabetes dataset
outcome, X = unpack(diabetes, ==(:Outcome), rng=rng);  # split off the target, shuffling rows
y = coerce(Int.(outcome), OrderedFactor);  # levels [0, 1]; 1 is the positive class
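As a sanity check on the class order and distribution (countmap is from StatsBase, which the example does not otherwise load):

import StatsBase
levels(y)             # [0, 1], so 1 (diabetic) is the positive class
StatsBase.countmap(y) # expect 500 zeros and 268 ones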

Choosing a probabilistic classifier:

EvoTreesClassifier = @load EvoTreesClassifier
prob_predictor = EvoTreesClassifier()

Wrapping the probabilistic model in BinaryThresholdPredictor to get a deterministic classifier, with threshold as a new hyperparameter:

point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)
mach = machine(point_predictor, X, y) |> fit!
predict(mach, X)[1:3] # [0, 0, 0]
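To see the thresholding at work, compare with the atomic model's positive-class probabilities. A sketch, assuming that refitting prob_predictor on the same data reproduces the atomic fit above:

prob_mach = machine(prob_predictor, X, y) |> fit!
yhat_prob = predict(prob_mach, X)  # vector of UnivariateFinite distributions
pdf.(yhat_prob, 1)[1:3] .>= 0.6    # true entries correspond to predictions of 1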

Estimating performance (with adjusted=true, balanced accuracy is rescaled so that chance-level performance scores zero):

balanced = BalancedAccuracy(adjusted=true)
e = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])
e.measurement[1] # 0.405 ± 0.089
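The evaluation object also records results broken down by fold:

e.per_fold[1]  # balanced accuracy on each of the six folds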

Wrapping in a tuning strategy to learn the threshold that maximizes balanced accuracy:

r = range(point_predictor, :threshold, lower=0.1, upper=0.9)
tuned_point_predictor = TunedModel(
    point_predictor,
    tuning=RandomSearch(rng=rng),
    resampling=CV(nfolds=6),
    range=r,
    measure=balanced,
    n=30,
)
mach2 = machine(tuned_point_predictor, X, y) |> fit!
optimized_point_predictor = report(mach2).best_model
optimized_point_predictor.threshold # 0.260
predict(mach2, X)[1:3] # [1, 1, 0]
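The full search history is also available from the report. For example, to inspect the best evaluation (a sketch; best_history_entry is the relevant MLJTuning report field):

entry = report(mach2).best_history_entry
entry.measurement  # cross-validated balanced accuracy at the optimal threshold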

Estimating the performance of the auto-thresholding model (nested resampling here):

e = evaluate!(mach2, resampling=CV(nfolds=6), measures=[balanced, accuracy])
e.measurement[1] # 0.477 ± 0.110