Average Odds Difference
Description
Average Odds Difference measures fairness by averaging two signed gaps between demographic groups: the difference in false positive rates and the difference in true positive rates. This metric captures how consistently a model performs across groups for both positive and negative predictions. A value of 0 indicates perfect fairness under the equalized odds criterion, while larger absolute values indicate greater disparities in model performance between groups.
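A minimal sketch of the signed formulation (as used, for example, in toolkits such as AIF360); the function name and the 0/1 group encoding are illustrative assumptions, not part of any specific library's API:

```python
import numpy as np

def average_odds_difference(y_true, y_pred, group):
    """Signed average of the FPR gap and the TPR gap between group 1
    (unprivileged) and group 0 (privileged). 0 means equalized odds."""
    def rates(mask):
        yt, yp = y_true[mask], y_pred[mask]
        # With binary 0/1 predictions, the mean over positives is the TPR
        # and the mean over negatives is the FPR.
        tpr = np.mean(yp[yt == 1]) if np.any(yt == 1) else 0.0
        fpr = np.mean(yp[yt == 0]) if np.any(yt == 0) else 0.0
        return tpr, fpr

    tpr0, fpr0 = rates(group == 0)
    tpr1, fpr1 = rates(group == 1)
    return 0.5 * ((fpr1 - fpr0) + (tpr1 - tpr0))
```

For instance, if the unprivileged group has a TPR of 1.0 and FPR of 0.5 while the privileged group has a TPR of 0.5 and FPR of 0.0, the metric reports 0.5, flagging a substantial disparity.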
Example Use Cases
Fairness
Evaluating criminal risk assessment tools to ensure equal false positive rates (wrongly flagging low-risk individuals as high-risk) and true positive rates (correctly identifying high-risk individuals) across racial and ethnic groups.
Auditing hiring algorithms to verify that both the rate of correctly advancing qualified candidates (true positive rate) and the rate of incorrectly advancing unqualified candidates (false positive rate) remain consistent across gender and demographic groups.
Reliability
Monitoring loan approval systems to ensure reliable performance by checking that both approval rates for creditworthy applicants and rejection rates for non-creditworthy applicants are consistent across protected demographic categories.
Testing medical diagnostic models to validate that diagnostic accuracy (both correctly identifying disease and correctly ruling out disease) remains consistent across patient demographics, ensuring reliable healthcare delivery.
Limitations
- Averaging effect can mask important disparities when false positive and true positive rate differences compensate for each other, potentially hiding significant bias in one direction.
- Requires access to ground truth labels and sensitive attribute information, which may not be available in all deployment scenarios or may be subject to privacy constraints.
- Does not account for base rate differences between groups, meaning equal error rates may not translate to equal treatment when group prevalences differ significantly.
- Focuses solely on prediction accuracy disparities without considering whether the underlying decision-making process or feature selection introduces systematic bias against certain groups.
- May encourage optimization for fairness metrics at the expense of overall model performance, potentially reducing utility for the primary prediction task.
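The averaging effect in the first limitation can be made concrete with a short numeric sketch (the rate gaps below are illustrative, not from any real system):

```python
# Hypothetical signed rate gaps between two groups:
fpr_diff = 0.20   # unprivileged group's false positive rate is 20 points higher
tpr_diff = -0.20  # unprivileged group's true positive rate is 20 points lower

# Signed average: the two disparities cancel exactly, reporting "fairness"
aod = 0.5 * (fpr_diff + tpr_diff)

# A companion metric, average absolute odds difference, exposes the hidden gap
avg_abs_odds = 0.5 * (abs(fpr_diff) + abs(tpr_diff))
```

Here `aod` comes out to 0.0 even though both error rates differ by 20 points, while `avg_abs_odds` reports 0.2; checking the absolute variant (or each rate gap separately) guards against this masking.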
Resources
Equality of Opportunity in Supervised Learning
Foundational paper introducing equalized odds and related fairness metrics including average odds difference
FairBalance: How to Achieve Equalized Odds With Data Pre-processing
Research on achieving equalized odds through data preprocessing techniques with practical implementation guidance