Multi-Accuracy Boosting

Description

An in-processing fairness technique that employs boosting algorithms to improve accuracy uniformly across demographic groups by iteratively correcting errors where the model performs poorly for certain subgroups. Drawing on multi-calibration, the method trains weak learners that target the prediction errors of underperforming groups, so that no group suffers systematically worse accuracy. The boosting loop repeats until the residual error in every audited group falls below a tolerance, or an iteration budget is exhausted; in practice this drives the groups towards accuracy parity whilst maintaining overall model performance.
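The loop described above can be sketched in a few lines. The sketch below is illustrative rather than the implementation used by any named package: it post-processes an initial model's probabilities, "audits" a list of (possibly overlapping) subgroup masks for the largest mean residual, and applies a constant logit-space correction to the offending group in place of a trained weak learner. The function name, parameters, and damping factor `eta` are all assumptions.

```python
import numpy as np

def multiaccuracy_boost(scores, y, groups, eta=4.0, tol=1e-3, max_iter=100):
    """Illustrative multi-accuracy boosting loop.

    scores : initial predicted probabilities, shape (n,)
    y      : binary labels in {0, 1}, shape (n,)
    groups : list of boolean masks, one per (possibly overlapping) subgroup
    """
    s = np.clip(np.asarray(scores, dtype=float), 1e-6, 1 - 1e-6)
    logits = np.log(s / (1 - s))
    for _ in range(max_iter):
        preds = 1 / (1 + np.exp(-logits))
        residual = y - preds
        # "audit" step: find the subgroup with the largest mean residual
        biases = [residual[g].mean() for g in groups]
        k = int(np.argmax(np.abs(biases)))
        if abs(biases[k]) < tol:
            break  # no audited group shows a systematic error: stop
        # weak-learner stand-in: a constant logit shift on the offending
        # group, scaled by eta (a Newton-like step on the group's bias)
        logits[groups[k]] += eta * biases[k]
    return 1 / (1 + np.exp(-logits))
```

In full implementations the constant correction is replaced by a learned auditor (e.g. a small regression model fit to the residuals), which lets the update generalise beyond pre-specified group boundaries.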

Example Use Cases

Fairness

Training a medical diagnosis model that achieves equal accuracy across age, gender, and ethnicity groups by using boosting to specifically target prediction errors for underrepresented patient demographics, supporting more equitable healthcare outcomes across patient populations.

Reliability

Building a robust fraud detection system that maintains consistent accuracy across different customer segments by iteratively correcting errors where the model performs poorly for specific demographic or geographic groups, supporting reliable fraud prevention across all user types.

Transparency

Developing a transparent hiring algorithm that provides clear evidence of equal performance across candidate demographics by using multi-accuracy boosting to systematically address group-specific prediction errors, enabling auditable fair recruitment processes.

Limitations

  • Requires identifying and defining relevant subgroups or error regions, which may be challenging when group boundaries are unclear or overlapping.
  • Could increase model complexity significantly as the boosting process adds multiple weak learners, potentially affecting interpretability and computational efficiency.
  • May overfit to training data if very granular corrections are made, particularly when subgroups are small or the boosting process continues for too many iterations.
  • Performance depends on the quality of subgroup identification, and may fail to achieve fairness if important demographic intersections are not properly captured.
  • Convergence to equal accuracy across groups is not guaranteed, especially when there are fundamental differences in data distributions between groups.
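Since the last two limitations mean parity should be verified rather than assumed, a simple post-training check is to compare per-group accuracy directly. A minimal sketch (the function name and interface are illustrative, not part of any named package):

```python
import numpy as np

def group_accuracy_gap(y_true, y_pred, group_labels):
    """Per-group accuracy plus the largest pairwise gap: a simple
    post-hoc check of whether accuracy parity was actually reached."""
    accs = {}
    for g in np.unique(group_labels):
        mask = group_labels == g
        accs[g] = float((y_true[mask] == y_pred[mask]).mean())
    vals = list(accs.values())
    return accs, max(vals) - min(vals)
```

Reporting the gap alongside overall accuracy makes it visible when boosting has failed to converge for a particular subgroup.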

Resources

Research Papers

mcboost: Multi-Calibration Boosting for R
Bernd Bischl et al., Aug 24, 2021
Multigroup Robustness
Lunjia Hu, Charlotte Peale, and Judy Hanwen Shen, May 1, 2024

To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions about people from a rich collection of overlapping subpopulations, we initiate the study of multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation. When the data corruption is not distributed uniformly over subpopulations, our algorithms provide more meaningful robustness guarantees than standard guarantees that are oblivious to how the data corruption and the affected subpopulations are related. Our techniques establish a new connection between multigroup fairness and robustness.

Software Packages

mcboost
Dec 28, 2020

Multi-Calibration & Multi-Accuracy Boosting for R

Tags

Data Type:
Expertise Needed:
Fairness Approach:
Lifecycle Stage:
Technique Type: