Reject Option Classification

Description

A post-processing fairness technique that modifies predictions only where the model is uncertain, leaving confident predictions unchanged. The method defines a 'rejection region' (also called the critical region) of low-confidence scores, typically a band around the decision boundary, and reassigns the predictions inside it: instances from the disadvantaged group receive the favourable outcome, and instances from the advantaged group receive the unfavourable one. Because only borderline cases are changed, this approach can improve fairness metrics such as demographic parity or equalised odds whilst preserving accuracy on the cases where the model is certain.
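The reassignment rule described above can be illustrated with a minimal sketch. It assumes a binary classifier that outputs the probability of the favourable outcome, a single protected attribute, and a symmetric band around the threshold; the function name `reject_option_predict` and the `threshold` and `margin` parameters are hypothetical and are not taken from any particular library.

```python
import numpy as np

def reject_option_predict(scores, is_unprivileged, threshold=0.5, margin=0.1):
    """Illustrative reject option rule (hypothetical helper, not a library API).

    scores          : predicted probability of the favourable outcome (label 1)
    is_unprivileged : boolean mask, True for the disadvantaged group
    threshold       : decision boundary used for confident predictions
    margin          : half-width of the rejection region around the threshold
    """
    scores = np.asarray(scores, dtype=float)
    is_unprivileged = np.asarray(is_unprivileged, dtype=bool)

    # Ordinary thresholded prediction for every instance.
    labels = (scores > threshold).astype(int)

    # The rejection region: scores within `margin` of the threshold.
    in_region = np.abs(scores - threshold) <= margin

    # Inside the region, the disadvantaged group gets the favourable label
    # and the advantaged group gets the unfavourable label.
    labels[in_region & is_unprivileged] = 1
    labels[in_region & ~is_unprivileged] = 0
    return labels, in_region

scores = np.array([0.95, 0.58, 0.45, 0.55, 0.42, 0.10])
is_unprivileged = np.array([False, False, True, True, False, True])
labels, in_region = reject_option_predict(scores, is_unprivileged)
print(labels.tolist())     # [1, 0, 1, 1, 0, 0]
print(in_region.tolist())  # [False, True, True, True, True, False]
```

The first and last instances lie outside the region and keep their thresholded labels; only the four borderline scores are reassigned.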

Example Use Cases

Fairness

Adjusting hiring algorithm predictions in the uncertainty region where candidate scores are close to the threshold, reassigning borderline cases to ensure equal selection rates across gender and ethnicity groups whilst maintaining decisions for clearly qualified or unqualified candidates.

Reliability

Improving reliability of loan approval systems by identifying applications where the model is uncertain and adjusting these edge cases to ensure consistent approval rates across demographic groups, reducing the risk of systematic discrimination in borderline creditworthiness assessments.

Transparency

Creating transparent bail decision systems that clearly document which predictions fall within the rejection region and how adjustments are made, providing courts with explainable fairness interventions that show exactly when and why decisions were modified for equity.
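One way to make such adjustments traceable is to store a record for every decision stating whether it fell inside the rejection region and why the final label was chosen. The sketch below is illustrative only: the `Adjustment` record, the `audit_decision` helper, the group names, and the convention that label 1 is the favourable outcome are all assumptions made for the example.

```python
from dataclasses import dataclass, asdict

@dataclass
class Adjustment:
    """Hypothetical audit record for a single decision."""
    case_id: str
    score: float
    group: str
    original_label: int
    final_label: int
    in_rejection_region: bool
    reason: str

def audit_decision(case_id, score, group, unprivileged_group,
                   threshold=0.5, margin=0.1):
    # Label 1 is assumed to be the favourable outcome.
    original = int(score > threshold)
    in_region = abs(score - threshold) <= margin

    if not in_region:
        final = original
        reason = "outside rejection region; prediction kept"
    elif group == unprivileged_group:
        final = 1
        reason = "inside rejection region; disadvantaged group given favourable label"
    else:
        final = 0
        reason = "inside rejection region; advantaged group given unfavourable label"

    return Adjustment(case_id, score, group, original, final, in_region, reason)

record = audit_decision("case-001", 0.45, "B", unprivileged_group="B")
confident = audit_decision("case-002", 0.90, "B", unprivileged_group="B")
print(asdict(record))
```

Each record keeps the original and final labels side by side, so a reviewer can see exactly which decisions were modified and on what basis.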

Limitations

  • Requires models that provide reliable uncertainty estimates or probability scores, so it cannot be applied to classifiers that output only hard labels without confidence values.
  • Selection of the rejection region threshold is subjective and requires careful tuning to balance fairness improvements with accuracy preservation.
  • If the rejection region is set too wide, a large portion of predictions may fall inside it and be reassigned, reducing the model's practical utility.
  • Cannot address bias in confident predictions outside the rejection region, limiting effectiveness when discrimination occurs in high-certainty cases.
  • Performance depends on the quality of uncertainty estimates, which may be poorly calibrated in some models, leading to inappropriate rejection regions.
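The trade-off behind the region-width limitations above can be explored by sweeping the width and recording how many predictions are affected alongside a fairness measure. The sketch below uses synthetic scores and a hypothetical `sweep_margins` helper with the same symmetric-band assumption; the data, group proportions, and margin values are invented for illustration and carry no empirical weight.

```python
import numpy as np

def sweep_margins(scores, is_unprivileged, margins, threshold=0.5):
    """For each margin, return (margin, fraction of predictions inside the
    rejection region, selection-rate gap: advantaged minus disadvantaged)."""
    scores = np.asarray(scores, dtype=float)
    is_unprivileged = np.asarray(is_unprivileged, dtype=bool)
    rows = []
    for margin in margins:
        labels = (scores > threshold).astype(int)
        in_region = np.abs(scores - threshold) <= margin
        labels[in_region & is_unprivileged] = 1
        labels[in_region & ~is_unprivileged] = 0
        gap = labels[~is_unprivileged].mean() - labels[is_unprivileged].mean()
        rows.append((margin, in_region.mean(), gap))
    return rows

# Synthetic data: the disadvantaged group receives lower scores on average.
rng = np.random.default_rng(0)
is_unprivileged = rng.random(2000) < 0.4
group_mean = np.where(is_unprivileged, 0.45, 0.55)
scores = np.clip(rng.normal(group_mean, 0.15), 0.0, 1.0)

rows = sweep_margins(scores, is_unprivileged, margins=[0.0, 0.05, 0.1, 0.2])
for margin, fraction, gap in rows:
    print(f"margin={margin:.2f}  in_region={fraction:.1%}  gap={gap:+.3f}")
```

The fraction of affected predictions grows with the margin, and a region that is too wide can push the gap past zero in the other direction, so the width is best chosen against a target fairness metric on held-out validation data.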

Resources

  • Machine Learning with a Reject Option: A survey. Kilian Hendrickx et al. (Documentation)
  • aif360.algorithms.postprocessing.RejectOptionClassification ... (Documentation)
  • Survey on Leveraging Uncertainty Estimation Towards Trustworthy Deep Neural Networks: The Case of Reject Option and Post-training Processing. M. Hasan et al. (Documentation)

Tags

Applicable Models:
Assurance Goal Category:
Data Type:
Expertise Needed:
Fairness Approach:
Lifecycle Stage:
Technique Type: