Threshold Optimiser
Description
Threshold Optimiser adjusts decision thresholds for different demographic groups after model training to satisfy specific fairness constraints. This post-processing technique optimises group-specific thresholds by analysing the probability distribution of model outputs, allowing practitioners to pursue fairness goals such as demographic parity or equal opportunity without modifying the underlying model. For each group, the optimiser selects a threshold value that balances the fairness requirement against overall model performance, which makes the technique particularly useful when fairness considerations arise after model deployment.
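To make the mechanism concrete, the sketch below is a minimal illustration rather than a reference implementation: it assumes a trained binary classifier whose scores lie in [0, 1] and a held-out validation set with group labels, and it grid-searches one threshold per group so that each group's selection rate matches a common target (a demographic-parity-style constraint), breaking ties in favour of accuracy.

```python
import numpy as np

def fit_group_thresholds(scores, y_true, groups, target_rate=None, grid=None):
    """Pick one decision threshold per group on a held-out validation set.

    scores      : model scores/probabilities in [0, 1]
    y_true      : true binary labels (0/1)
    groups      : group identifier for each example
    target_rate : selection rate every group should match; defaults to the
                  overall positive-prediction rate at threshold 0.5
    """
    scores, y_true, groups = map(np.asarray, (scores, y_true, groups))
    grid = np.linspace(0.01, 0.99, 99) if grid is None else grid
    if target_rate is None:
        target_rate = float(np.mean(scores >= 0.5))

    thresholds = {}
    for g in np.unique(groups):
        g_scores, g_labels = scores[groups == g], y_true[groups == g]
        best_t, best_key = 0.5, None
        for t in grid:
            preds = (g_scores >= t).astype(int)
            rate_gap = abs(preds.mean() - target_rate)  # fairness term
            accuracy = (preds == g_labels).mean()       # performance term
            key = (rate_gap, -accuracy)                 # fairness first, then accuracy
            if best_key is None or key < best_key:
                best_key, best_t = key, float(t)
        thresholds[g] = best_t
    return thresholds

def predict_with_group_thresholds(scores, groups, thresholds):
    """Apply the fitted per-group thresholds to new scores."""
    cut = np.array([thresholds[g] for g in np.asarray(groups)])
    return (np.asarray(scores) >= cut).astype(int)
```

Practical implementations, including those described in the papers listed under Resources, treat this selection problem more carefully (randomised thresholds, multiple constraints, convergence guarantees), but the ingredients are the same: group-wise score distributions, a fairness constraint and an accuracy objective.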
Example Use Cases
Fairness
Adjusting hiring decision thresholds in a recruitment system to ensure equal opportunity across gender and ethnicity groups, where the model outputs probability scores but different demographic groups require different thresholds to achieve equitable outcomes (a minimal code sketch of this kind of setup follows these examples).
Optimising credit approval thresholds for different demographic groups in loan applications to satisfy regulatory requirements for equal treatment whilst maintaining acceptable default rates across all groups.
Calibrating medical diagnosis thresholds across age and gender groups to ensure diagnostic accuracy is maintained whilst preventing systematic over-diagnosis or under-diagnosis in specific populations.
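For scenarios like these, one readily available open-source implementation of the technique is ThresholdOptimizer from the fairlearn library. The snippet below is an illustrative sketch on synthetic stand-in data; the dataset, constraint choice and variable names are assumptions for demonstration, not recommendations for a real hiring or credit system.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from fairlearn.postprocessing import ThresholdOptimizer

# Synthetic stand-in for a binary decision task with a sensitive attribute.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
group = rng.choice(["A", "B"], size=2000)
y = (X[:, 0] + 0.5 * (group == "A") + rng.normal(scale=0.5, size=2000) > 0).astype(int)

X_train, X_val, y_train, y_val, g_train, g_val = train_test_split(
    X, y, group, test_size=0.5, random_state=0
)

# Train any probabilistic classifier; the post-processor only needs its scores.
model = LogisticRegression().fit(X_train, y_train)

postprocessor = ThresholdOptimizer(
    estimator=model,
    constraints="true_positive_rate_parity",  # equal opportunity; "demographic_parity" also available
    objective="accuracy_score",
    prefit=True,
    predict_method="predict_proba",
)
postprocessor.fit(X_val, y_val, sensitive_features=g_val)

# Group-aware decisions for new individuals (here, the validation split again).
decisions = postprocessor.predict(X_val, sensitive_features=g_val, random_state=0)
```

Because the post-processor needs only scores and group membership, it can be wrapped around an already deployed model without retraining, which is what makes it attractive in the settings above.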
Limitations
- Requires a held-out dataset with known group memberships to determine optimal thresholds for each demographic group.
- Threshold values may need recalibration when input data distributions shift or model performance changes over time; a simple drift check is sketched after this list.
- Using different decision thresholds per group can raise legal or ethical concerns in deployment contexts where equal treatment is mandated.
- Performance depends on the quality and representativeness of the calibration dataset for each demographic group.
- May reduce overall accuracy, since the optimisation trades off some predictive performance for group-level fairness.
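As a rough illustration of the recalibration point above, the sketch below compares each group's score distribution at calibration time against what is observed in production, using a two-sample Kolmogorov-Smirnov test from scipy. The test choice and the significance cut-off are assumptions made for illustration; they are not part of the technique itself.

```python
import numpy as np
from scipy.stats import ks_2samp

def check_score_drift(calib_scores, calib_groups, live_scores, live_groups, alpha=0.01):
    """Flag groups whose live score distribution has drifted since calibration.

    Returns {group: (ks_statistic, p_value, recalibration_suggested)}.
    """
    calib_scores, calib_groups = np.asarray(calib_scores), np.asarray(calib_groups)
    live_scores, live_groups = np.asarray(live_scores), np.asarray(live_groups)

    report = {}
    for g in np.unique(calib_groups):
        a = calib_scores[calib_groups == g]
        b = live_scores[live_groups == g]
        if len(b) == 0:          # group not yet seen in production
            continue
        stat, p_value = ks_2samp(a, b)
        report[g] = (float(stat), float(p_value), p_value < alpha)
    return report
```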
Resources
Research Papers
Group-Aware Threshold Adaptation for Fair Classification
Fairness in machine learning is receiving increasing attention as its applications continue to expand and diversify. To mitigate discriminatory model behaviour between demographic groups, we introduce a novel post-processing method that optimises over multiple fairness constraints through group-aware threshold adaptation. We propose to learn adaptive classification thresholds for each demographic group by optimising the confusion matrix estimated from the probability distribution of a classification model's output. Because we only need an estimated probability distribution of the model output rather than the classification model structure, our post-processing method can be applied to a wide range of classification models, improving fairness in a model-agnostic manner and preserving privacy. This even allows us to post-process existing fairness methods to further improve the trade-off between accuracy and fairness. Moreover, our model has low computational cost. We provide rigorous theoretical analysis of the convergence of our optimisation algorithm and of the accuracy-fairness trade-off of our method. Our method theoretically achieves a better upper bound on near-optimality than existing methods under the same conditions. Experimental results demonstrate that our method outperforms state-of-the-art methods and obtains results closest to the theoretical accuracy-fairness trade-off boundary.
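The quantity this paper optimises can be illustrated with a short sketch: for a candidate threshold, a group's confusion matrix is estimated from held-out scores and labels, and a fairness-relevant rate (here the true positive rate, which equal opportunity asks to match across groups) is read off from it. This is a simplified illustration of the building block, not the paper's adaptive optimisation procedure or its distribution estimate.

```python
import numpy as np

def group_confusion_matrix(scores, y_true, threshold):
    """Estimate a group's confusion matrix at a given decision threshold."""
    preds = np.asarray(scores) >= threshold
    y = np.asarray(y_true).astype(bool)
    tp = int(np.sum(preds & y))
    fp = int(np.sum(preds & ~y))
    fn = int(np.sum(~preds & y))
    tn = int(np.sum(~preds & ~y))
    return tp, fp, fn, tn

def true_positive_rate_at(scores, y_true, threshold):
    """TPR (recall) at a threshold; equal opportunity asks groups' TPRs to match."""
    tp, _, fn, _ = group_confusion_matrix(scores, y_true, threshold)
    return tp / (tp + fn) if (tp + fn) else float("nan")
```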
Equality of Opportunity in Supervised Learning
We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. In line with other studies, our notion is oblivious: it depends only on the joint statistics of the predictor, the target and the protected attribute, but not on interpretation of individual features. We study the inherent limits of defining and identifying biases based on such oblivious measures, outlining what can and cannot be inferred from different oblivious tests. We illustrate our notion using a case study of FICO credit scores.
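A quick way to audit whether a post-processed predictor satisfies the equal opportunity (or equalised odds) criterion described here is to compare error rates across groups. The snippet below does this with fairlearn's MetricFrame on tiny placeholder data; it is an illustrative check, not the paper's derivation.

```python
import numpy as np
from fairlearn.metrics import MetricFrame, true_positive_rate, false_positive_rate

# Tiny placeholder audit set: true outcomes, post-processed decisions, group labels.
y_true    = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred    = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])
sensitive = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

frame = MetricFrame(
    metrics={"tpr": true_positive_rate, "fpr": false_positive_rate},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=sensitive,
)
print(frame.by_group)      # per-group TPR and FPR
print(frame.difference())  # largest between-group gap for each metric
# Equal opportunity: TPR gap near zero. Equalised odds: both gaps near zero.
```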