Multi-Accuracy Boosting
Description
An in-processing fairness technique that uses boosting to improve accuracy uniformly across demographic groups by iteratively correcting the model's errors on underperforming subgroups. Drawing on ideas from multi-calibration, the method trains weak learners that target the prediction errors of specific groups, so that no group experiences systematically worse accuracy. The boosting process repeats until accuracy parity is achieved across all groups whilst maintaining overall model performance.
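The audit-and-correct loop described above can be sketched as follows. This is a minimal illustration on synthetic data, not the published algorithm: the weak learners are least-squares linear auditors, the subgroups are assumed known, and names such as `eta` and `tol` are choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two subgroups whose features relate to the label differently.
n = 2000
group = rng.integers(0, 2, size=n)        # subgroup membership (assumed known)
X = rng.normal(size=(n, 3))
true_w = np.where(group[:, None] == 0,
                  np.array([1.0, -1.0, 0.5]),
                  np.array([-0.5, 1.0, 1.0]))
y = ((X * true_w).sum(axis=1) + 0.3 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# A deliberately biased starting model: least-squares fit on group 0 only,
# so group 1 is initially underserved.
w0 = np.linalg.lstsq(X[group == 0], y[group == 0], rcond=None)[0]
logits = X @ w0
acc_before = {g: np.mean((sigmoid(logits)[group == g] > 0.5) == y[group == g])
              for g in (0, 1)}

eta, tol = 0.5, 0.01
for _ in range(50):
    residual = y - sigmoid(logits)
    best_corr, best_g, best_h = -np.inf, None, None
    # Audit step: fit a linear weak learner to each subgroup's residuals and
    # flag the subgroup where the learner still finds the most structure.
    for g in (0, 1):
        mask = group == g
        h_w = np.linalg.lstsq(X[mask], residual[mask], rcond=None)[0]
        h = X[mask] @ h_w
        corr = np.mean(h * residual[mask])   # error the auditor can still predict
        if corr > best_corr:
            best_corr, best_g, best_h = corr, g, h
    if best_corr < tol:                      # all subgroups pass the audit: stop
        break
    logits[group == best_g] += eta * best_h  # correct only the flagged subgroup

acc_after = {g: np.mean((sigmoid(logits)[group == g] > 0.5) == y[group == g])
             for g in (0, 1)}
```

Because each update targets only the subgroup that fails the audit, the correction raises accuracy for the worst-served group without retraining the base model from scratch.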
Example Use Cases
Fairness
Training a medical diagnosis model that achieves equal accuracy across age, gender, and ethnicity groups by using boosting to specifically target prediction errors for underrepresented patient demographics, ensuring equitable healthcare outcomes for all populations.
Reliability
Building a robust fraud detection system that maintains consistent accuracy across different customer segments by iteratively correcting errors where the model performs poorly for specific demographic or geographic groups, ensuring reliable fraud prevention across all user types.
Transparency
Developing a transparent hiring algorithm that provides clear evidence of equal performance across candidate demographics by using multi-accuracy boosting to systematically address group-specific prediction errors, enabling auditable fair recruitment processes.
Limitations
- Requires identifying and defining relevant subgroups or error regions, which may be challenging when group boundaries are unclear or overlapping.
- Could increase model complexity significantly as the boosting process adds multiple weak learners, potentially affecting interpretability and computational efficiency.
- May overfit to training data if very granular corrections are made, particularly when subgroups are small or the boosting process continues for too many iterations.
- Performance depends on the quality of subgroup identification; the method may fail to achieve fairness if important demographic intersections are not properly captured.
- Convergence to equal accuracy across groups is not guaranteed, especially when there are fundamental differences in data distributions between groups.
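Several of these limitations hinge on being able to monitor per-group accuracy during and after boosting. A small helper for quantifying the accuracy gap might look like the following; the function name and return shape are illustrative, not drawn from any particular library.

```python
import numpy as np

def accuracy_gap(y_true, y_pred, groups):
    """Per-group accuracies and the largest pairwise accuracy difference."""
    accs = {g: float(np.mean(y_pred[groups == g] == y_true[groups == g]))
            for g in np.unique(groups)}
    return max(accs.values()) - min(accs.values()), accs

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
gap, accs = accuracy_gap(y_true, y_pred, groups)
# Both groups have accuracy 0.75 here, so the gap is 0.0.
```

Tracking this gap across boosting iterations makes the convergence and overfitting concerns above observable: a gap that stops shrinking signals that parity may not be achievable on the given data, while a gap that vanishes only on training data signals overfitting to small subgroups.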
Resources
Research Papers
Multigroup Robustness
To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions about people from a rich collection of overlapping subpopulations, we initiate the study of multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation. When the data corruption is not distributed uniformly over subpopulations, our algorithms provide more meaningful robustness guarantees than standard guarantees that are oblivious to how the data corruption and the affected subpopulations are related. Our techniques establish a new connection between multigroup fairness and robustness.