Average Odds Difference

Description

Average Odds Difference measures fairness as the average of the differences in false positive rate and true positive rate between demographic groups. This metric captures how consistently a model performs across groups for both positive and negative predictions. A value of 0 indicates perfect fairness under the equalized odds criterion, while larger absolute values indicate greater disparities in model performance between groups.
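The metric above can be sketched as a small function. This is a minimal illustration in NumPy, assuming the sign convention used by toolkits such as AIF360 (unprivileged rate minus privileged rate); the function name and the 0/1 group encoding are illustrative choices, not a standard API:

```python
import numpy as np

def average_odds_difference(y_true, y_pred, group):
    """Average of the FPR gap and TPR gap between the unprivileged
    (group == 0) and privileged (group == 1) groups:
        AOD = 1/2 * [(FPR_unpriv - FPR_priv) + (TPR_unpriv - TPR_priv)]
    A value of 0 indicates parity under equalized odds."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    def rates(mask):
        yt, yp = y_true[mask], y_pred[mask]
        # TPR: fraction of actual positives predicted positive.
        tpr = yp[yt == 1].mean() if (yt == 1).any() else 0.0
        # FPR: fraction of actual negatives predicted positive.
        fpr = yp[yt == 0].mean() if (yt == 0).any() else 0.0
        return fpr, tpr

    fpr_u, tpr_u = rates(group == 0)
    fpr_p, tpr_p = rates(group == 1)
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))
```

A negative value under this convention means the unprivileged group receives fewer (correct or incorrect) positive predictions on average than the privileged group.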

Example Use Cases

Fairness

Evaluating criminal risk assessment tools to ensure equal false positive rates (wrongly flagging low-risk individuals as high-risk) and true positive rates (correctly identifying high-risk individuals) across racial and ethnic groups.

Auditing hiring algorithms to verify that both the rate of correctly advancing qualified candidates (true positive rate) and the rate of incorrectly advancing unqualified candidates (false positive rate) remain consistent across gender and demographic groups.

Reliability

Monitoring loan approval systems to ensure reliable performance by checking that both approval rates for creditworthy applicants and rejection rates for non-creditworthy applicants are consistent across protected demographic categories.

Testing medical diagnostic models to validate that diagnostic accuracy (both correctly identifying disease and correctly ruling out disease) remains consistent across patient demographics, ensuring reliable healthcare delivery.

Limitations

  • Averaging effect can mask important disparities when false positive and true positive rate differences compensate for each other, potentially hiding significant bias in one direction.
  • Requires access to ground truth labels and sensitive attribute information, which may not be available in all deployment scenarios or may be subject to privacy constraints.
  • Does not account for base rate differences between groups, meaning equal error rates may not translate to equal treatment when group prevalences differ significantly.
  • Focuses solely on prediction accuracy disparities without considering whether the underlying decision-making process or feature selection introduces systematic bias against certain groups.
  • May encourage optimization for fairness metrics at the expense of overall model performance, potentially reducing utility for the primary prediction task.
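The first limitation above, the averaging effect, can be seen with a small numeric sketch. The per-group rates below are hypothetical, chosen so that the false positive rate gap and true positive rate gap cancel exactly:

```python
# Hypothetical per-group error rates illustrating the averaging effect.
fpr_unpriv, fpr_priv = 0.50, 0.25   # unprivileged group is flagged far more often
tpr_unpriv, tpr_priv = 0.25, 0.50   # and correctly identified far less often

fpr_gap = fpr_unpriv - fpr_priv     # +0.25
tpr_gap = tpr_unpriv - tpr_priv     # -0.25

aod = 0.5 * (fpr_gap + tpr_gap)
print(aod)  # prints 0.0 -- a "perfect" score despite large disparities in both rates
```

Reporting the two gaps separately (or their absolute values) alongside the averaged score avoids being misled by this cancellation.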

Resources

Equality of Opportunity in Supervised Learning
Research Paper · Hardt, Moritz; Price, Eric; and Srebro, Nathan · Oct 7, 2016

Foundational paper introducing equalized odds and related fairness metrics including average odds difference

FairBalance: How to Achieve Equalized Odds With Data Pre-processing
Research Paper · Yu, Zhe; Chakraborty, Joymallya; and Menzies, Tim · Jul 17, 2021

Research on achieving equalized odds through data preprocessing techniques with practical implementation guidance

AIF360: Average Odds Difference Documentation
Documentation

IBM AIF360 toolkit implementation and documentation for computing average odds difference metrics

Fairlearn: A toolkit for assessing and improving fairness in machine learning
Software Package

Microsoft's comprehensive fairness toolkit with implementations of various fairness metrics including average odds difference
