# More on Probabilistic Predictors

Although one can call `predict_mode`

on a probabilistic binary classifier to get deterministic predictions, a more flexible strategy is to wrap the model using `BinaryThresholdPredictor`

, as this allows the user to specify the threshold probability for predicting a positive class. This wrapping converts a probabilistic classifier into a deterministic one.

The positive class is always the second class returned when calling `levels`

on the training target `y`

.

`MLJModels.BinaryThresholdPredictor`

— Type`BinaryThresholdPredictor(model; threshold=0.5)`

Wrap the `Probabilistic`

model, `model`

, assumed to support binary classification, as a `Deterministic`

model, by applying the specified `threshold`

to the positive class probability. In addition to conventional supervised classifiers, it can also be applied to outlier detection models that predict normalized scores - in the form of appropriate `UnivariateFinite`

distributions - that is, models that subtype `AbstractProbabilisticUnsupervisedDetector`

or `AbstractProbabilisticSupervisedDetector`

.

By convention the positive class is the second class returned by `levels(y)`

, where `y`

is the target.

If `threshold=0.5`

then calling `predict`

on the wrapped model is equivalent to calling `predict_mode`

on the atomic model.

**Example**

Below is an application to the well-known Pima Indian diabetes dataset, including optimization of the `threshold`

parameter, with a high balanced accuracy the objective. The target class distribution is 500 positives to 268 negatives.

Loading the data:

```
using MLJ, Random
rng = Xoshiro(123)
diabetes = OpenML.load(43582)
outcome, X = unpack(diabetes, ==(:Outcome), rng=rng);
y = coerce(Int.(outcome), OrderedFactor);
```

Choosing a probabilistic classifier:

```
EvoTreesClassifier = @load EvoTreesClassifier
prob_predictor = EvoTreesClassifier()
```

Wrapping in `TunedModel`

to get a deterministic classifier with `threshold`

as a new hyperparameter:

```
point_predictor = BinaryThresholdPredictor(prob_predictor, threshold=0.6)
Xnew, _ = make_moons(3, rng=rng)
mach = machine(point_predictor, X, y) |> fit!
predict(mach, X)[1:3] # [0, 0, 0]
```

Estimating performance:

```
balanced = BalancedAccuracy(adjusted=true)
e = evaluate!(mach, resampling=CV(nfolds=6), measures=[balanced, accuracy])
e.measurement[1] # 0.405 ± 0.089
```

Wrapping in tuning strategy to learn `threshold`

that maximizes balanced accuracy:

```
r = range(point_predictor, :threshold, lower=0.1, upper=0.9)
tuned_point_predictor = TunedModel(
point_predictor,
tuning=RandomSearch(rng=rng),
resampling=CV(nfolds=6),
range = r,
measure=balanced,
n=30,
)
mach2 = machine(tuned_point_predictor, X, y) |> fit!
optimized_point_predictor = report(mach2).best_model
optimized_point_predictor.threshold # 0.260
predict(mach2, X)[1:3] # [1, 1, 0]
```

Estimating the performance of the auto-thresholding model (nested resampling here):

```
e = evaluate!(mach2, resampling=CV(nfolds=6), measure=[balanced, accuracy])
e.measurement[1] # 0.477 ± 0.110
```