Average Odds Difference

Description

Average Odds Difference measures fairness as the average of the differences in false positive rate and true positive rate between demographic groups. This metric captures how consistently a model performs across groups for both positive and negative predictions. A value of 0 indicates perfect fairness under the equalized odds criterion, while larger absolute values indicate greater disparities in model performance between groups.
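The metric above can be sketched as a small function. This is a minimal illustration in NumPy, assuming the sign convention used by toolkits such as AIF360 (unprivileged rate minus privileged rate); the function name and the 0/1 group encoding are illustrative choices, not a standard API:

```python
import numpy as np

def average_odds_difference(y_true, y_pred, group):
    """Average of the FPR gap and TPR gap between the unprivileged
    (group == 0) and privileged (group == 1) groups:
        AOD = 1/2 * [(FPR_unpriv - FPR_priv) + (TPR_unpriv - TPR_priv)]
    A value of 0 indicates parity under equalized odds."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    def rates(mask):
        yt, yp = y_true[mask], y_pred[mask]
        # TPR: fraction of actual positives predicted positive.
        tpr = yp[yt == 1].mean() if (yt == 1).any() else 0.0
        # FPR: fraction of actual negatives predicted positive.
        fpr = yp[yt == 0].mean() if (yt == 0).any() else 0.0
        return fpr, tpr

    fpr_u, tpr_u = rates(group == 0)
    fpr_p, tpr_p = rates(group == 1)
    return 0.5 * ((fpr_u - fpr_p) + (tpr_u - tpr_p))
```

A negative value under this convention means the unprivileged group receives fewer (correct or incorrect) positive predictions on average than the privileged group.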

Example Use Cases

Fairness

Evaluating criminal risk assessment tools to ensure equal false positive rates (wrongly flagging low-risk individuals as high-risk) and true positive rates (correctly identifying high-risk individuals) across racial and ethnic groups.

Auditing hiring algorithms to verify that both the rate of correctly advancing qualified candidates (true positive rate) and the rate of incorrectly advancing unqualified candidates (false positive rate) remain consistent across gender and demographic groups.

Reliability

Monitoring loan approval systems to ensure reliable performance by checking that both approval rates for creditworthy applicants and rejection rates for non-creditworthy applicants are consistent across protected demographic categories.

Testing medical diagnostic models to validate that diagnostic accuracy (both correctly identifying disease and correctly ruling out disease) remains consistent across patient demographics, ensuring reliable healthcare delivery.

Limitations

  • Averaging effect can mask important disparities when false positive and true positive rate differences compensate for each other, potentially hiding significant bias in one direction.
  • Requires access to ground truth labels and sensitive attribute information, which may not be available in all deployment scenarios or may be subject to privacy constraints.
  • Does not account for base rate differences between groups, meaning equal error rates may not translate to equal treatment when group prevalences differ significantly.
  • Focuses solely on prediction accuracy disparities without considering whether the underlying decision-making process or feature selection introduces systematic bias against certain groups.
  • May encourage optimization for fairness metrics at the expense of overall model performance, potentially reducing utility for the primary prediction task.
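The first limitation above, the averaging effect, can be seen with a small numeric sketch. The per-group rates below are hypothetical, chosen so that the false positive rate gap and true positive rate gap cancel exactly:

```python
# Hypothetical per-group error rates illustrating the averaging effect.
fpr_unpriv, fpr_priv = 0.50, 0.25   # unprivileged group is flagged far more often
tpr_unpriv, tpr_priv = 0.25, 0.50   # and correctly identified far less often

fpr_gap = fpr_unpriv - fpr_priv     # +0.25
tpr_gap = tpr_unpriv - tpr_priv     # -0.25

aod = 0.5 * (fpr_gap + tpr_gap)
print(aod)  # prints 0.0 -- a "perfect" score despite large disparities in both rates
```

Reporting the two gaps separately (or their absolute values) alongside the averaged score avoids being misled by this cancellation.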

Resources

Equality of Opportunity in Supervised Learning
Research Paper · Hardt, Moritz; Price, Eric; and Srebro, Nathan · Oct 7, 2016

Foundational paper introducing equalized odds and related fairness metrics including average odds difference

FairBalance: How to Achieve Equalized Odds With Data Pre-processing
Research Paper · Yu, Zhe; Chakraborty, Joymallya; and Menzies, Tim · Jul 17, 2021

Research on achieving equalized odds through data preprocessing techniques with practical implementation guidance

AIF360: Average Odds Difference Documentation
Documentation

IBM AIF360 toolkit implementation and documentation for computing average odds difference metrics

Fairlearn: A toolkit for assessing and improving fairness in machine learning
Software Package

Microsoft's comprehensive fairness toolkit with implementations of various fairness metrics including average odds difference
