Description

Quantile regression estimates specific quantiles (percentiles) of the target variable, conditional on the input features, rather than only the conditional mean. For example, instead of predicting 'average house price = £300,000', it can predict 'there's a 10% chance the price will be below £250,000, a 50% chance it will be below £300,000, and a 90% chance it will be below £380,000'. The technique also reveals how input features affect different parts of the outcome distribution: property size might strongly influence luxury homes (90th percentile) but barely affect budget properties (10th percentile). By estimating several quantiles, quantile regression approximates the conditional distribution, conveys how uncertain a prediction is, and yields prediction intervals directly (for instance, the 10th to 90th percentile range forms an 80% interval).
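A quantile model is usually fitted by minimising the pinball (check) loss, which penalises under- and over-prediction asymmetrically according to the quantile level q. The sketch below is illustrative only: the price list is made up, and `pinball_loss` and `best_constant` are hypothetical helper names, not part of any library. It shows that the constant minimising the pinball loss over a sample is the corresponding empirical quantile.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (check) loss of predictions y_pred at quantile level q."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) costs q per unit,
        # over-prediction costs (1 - q) per unit.
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def best_constant(values, q):
    """Observed value that minimises the pinball loss when used as a constant prediction."""
    return min(values, key=lambda c: pinball_loss(values, [c] * len(values), q))


# Hypothetical house prices in thousands of pounds (illustration only).
prices = [210, 250, 270, 300, 310, 330, 380, 420, 500]

for q in (0.25, 0.5, 0.75):
    print(q, best_constant(prices, q))
```

With q = 0.5 the loss reduces to half the mean absolute error, so the minimiser is the median; other values of q shift the balance point towards the lower or upper part of the sample.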

Example Use Cases

Reliability

Predicting patient recovery times after surgery by estimating multiple quantiles (e.g., 10th, 50th, 90th percentiles). This lets doctors communicate realistic timeframes, such as 'Most patients recover within 2-4 weeks, but some may take up to 8 weeks', and gives treatment planning an explicit range rather than a single point estimate.

Transparency

Showing how the effect of education on income varies across income quantiles. If, for example, an additional year of education is associated with a much larger gain at the 90th percentile than at the 10th, the model makes visible a distributional effect that an average-only regression would hide, providing transparent insight into systemic inequalities.

Fairness

Checking whether loan amount predictions are equitable across demographic groups by comparing the spread of predicted loan amounts (the difference between the 90th and 10th percentiles) across protected groups. A markedly wider or narrower range for one group flags potentially discriminatory lending ranges for further review.
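The spread comparison described above might look like the following sketch. The predictions and group labels are hypothetical placeholders, and `interval_width_by_group` is an illustrative helper rather than a library function; in practice the quantile predictions would come from a fitted model.

```python
import numpy as np


def interval_width_by_group(q_low, q_high, groups):
    """Mean width of the predicted [q_low, q_high] interval for each group."""
    q_low, q_high, groups = map(np.asarray, (q_low, q_high, groups))
    widths = q_high - q_low
    return {str(g): float(widths[groups == g].mean()) for g in np.unique(groups)}


# Hypothetical predicted 10th and 90th percentiles of loan amount (in thousands).
pred_q10 = [10.0, 12.0, 11.0, 9.0]
pred_q90 = [20.0, 22.0, 26.0, 24.0]
group = ["A", "A", "B", "B"]

widths = interval_width_by_group(pred_q10, pred_q90, group)
ratio = max(widths.values()) / min(widths.values())
print(widths, ratio)
```

A ratio well above 1 indicates that the predicted range is noticeably wider for one group. What counts as an acceptable ratio is a policy decision and is not implied by this sketch.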

Limitations

  • Computationally intensive when many quantiles are needed, especially for large datasets or complex models, because each quantile level is typically fitted as a separate optimization.
  • May produce crossing quantiles when each quantile is fitted independently without monotonicity constraints, e.g. a predicted 90th percentile below the predicted 50th percentile, which yields logically inconsistent prediction intervals.
  • Unstable at extreme quantiles (e.g., 5th or 95th percentiles): although central quantiles such as the median are robust to outliers in the target, estimates far in the tails depend on a handful of observations and can swing widely under heavy-tailed distributions.
  • Requires careful selection of quantile levels and may need domain expertise to interpret results meaningfully, as different quantiles may reveal conflicting patterns in feature relationships.
  • Less effective with small datasets where extreme quantiles cannot be reliably estimated due to insufficient data points in the tails of the distribution.
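The crossing problem listed above can be detected and repaired after fitting. One common post-hoc remedy, which is not described in this entry and is shown here only as a sketch, is to sort each row of predicted quantiles so that they are non-decreasing (often called rearrangement). The prediction matrix below is made up for illustration.

```python
import numpy as np

quantile_levels = [0.1, 0.5, 0.9]

# Hypothetical predictions: one row per sample, one column per quantile level.
preds = np.array([
    [250.0, 300.0, 380.0],  # consistent
    [310.0, 295.0, 400.0],  # crossing: 10th percentile above 50th
    [180.0, 220.0, 215.0],  # crossing: 50th percentile above 90th
])

# A row crosses if any quantile is lower than the one before it.
crossed = np.any(np.diff(preds, axis=1) < 0, axis=1)
print("rows with crossing quantiles:", np.flatnonzero(crossed))

# Sorting each row restores monotonicity across quantile levels.
fixed = np.sort(preds, axis=1)
print(fixed)
```

Sorting only repairs the ordering. Frequent crossings usually suggest that the individual quantile fits are noisy, so the rearranged intervals still deserve a coverage check on held-out data.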

Resources

Research Papers

Quantile Regression in Machine Learning: A Survey
Anshul Kumar et al., Jan 1, 2023

The mean regression methodology may not be successful in developing precise mathematical models when working with data that has a significant number of outliers. The reason for this is that the outliers have a considerable impact on the mean value, making it untrustworthy for the intended usage. A different model termed Quantile Regression (QR) has become more popular as a solution to this problem. Quantile regression is more resistant to outliers and can handle data with a wide range of distributions. It has emerged as a promising approach for practical applications, offering a more comprehensive view than mean regression. In this paper, we delve deeper into this strategy, with a particular focus on its application in various fields, including the prediction of wind power as well as its usage in finance and economics research. Rather than just providing a technical explanation of the theory or a comprehensive analysis of recent developments, we begin by examining important applications to illustrate the usefulness of the approach. We then discuss these topics in further detail with a brief introduction to their mathematics. Finally, we provide an overview of current research topics that make substantial use of this method, later concluding why we need it.

Software Packages

statsmodels
Jun 12, 2011

Statsmodels: statistical modeling and econometrics in Python

Documentation

Tutorial for conformalized quantile regression (CQR) — MAPIE 0.8.5 ...
Mapie Developers, Jan 1, 2021
Quantile Regression Forest — sklearn_quantile 0.1.1 documentation
Sklearn-quantile Developers, Jan 1, 2021
Quantile machine learning models for python — sklearn_quantile ...
Sklearn-quantile Developers, Jan 1, 2021

Tags