Description

UMAP (Uniform Manifold Approximation and Projection) is a non-linear dimensionality reduction technique that models the data's underlying manifold structure and projects it into a low-dimensional space, most commonly 2D or 3D for visualisation. Compared with t-SNE, UMAP tends to preserve more of the global structure while still retaining local neighbourhood relationships, drawing on ideas from topological data analysis and Riemannian geometry. This often produces more interpretable cluster layouts in which the relative placement of clusters carries some meaning, although embedded distances remain approximate. These properties make it particularly valuable for exploratory data analysis and for understanding complex dataset structures.
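To make the "model of the manifold" more concrete, the sketch below computes the smoothed nearest-neighbour membership weights for a single point, following the definition given in the UMAP paper: the nearest neighbour receives weight 1, and a per-point scale sigma is chosen so that the weights sum to log2(k). The function name, tolerance, and iteration cap are illustrative assumptions, and this is not the umap-learn implementation, which differs in details such as how the point itself is counted.

```python
import math


def membership_weights(neighbor_dists, tol=1e-5, max_iter=64):
    """Return (rho, sigma, weights) for one point's k nearest-neighbour distances.

    Hypothetical helper for illustration only. Expects the distances from a
    point to its k nearest neighbours (the point itself excluded), with k >= 3.
    """
    k = len(neighbor_dists)
    if k < 3:
        raise ValueError("need at least 3 neighbour distances")
    target = math.log2(k)

    # rho: distance to the closest neighbour at non-zero distance
    positive = [d for d in neighbor_dists if d > 0]
    rho = min(positive) if positive else 0.0

    def total(sigma):
        return sum(math.exp(-max(0.0, d - rho) / sigma) for d in neighbor_dists)

    # Search for sigma such that the weights sum to log2(k).
    # total(sigma) grows with sigma, so a bisection-style search is enough.
    lo, hi, sigma = 0.0, math.inf, 1.0
    for _ in range(max_iter):
        s = total(sigma)
        if abs(s - target) < tol:
            break
        if s > target:
            hi = sigma
            sigma = (lo + hi) / 2
        else:
            lo = sigma
            sigma = sigma * 2 if math.isinf(hi) else (lo + hi) / 2

    weights = [math.exp(-max(0.0, d - rho) / sigma) for d in neighbor_dists]
    return rho, sigma, weights


# Example: distances from one point to its 5 nearest neighbours
rho, sigma, weights = membership_weights([0.5, 0.7, 1.0, 1.6, 2.5])
print(rho, sigma, weights)
```

Because sigma is fitted separately for each point, dense and sparse regions are put on a comparable local scale, which is one reason embedded distances are not directly comparable across the plot.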

Example Use Cases

Explainability

Analysing single-cell RNA sequencing data to visualise how different cell types cluster based on gene expression patterns, revealing developmental trajectories and identifying previously unknown cell subtypes in tissue samples.

Exploring customer segmentation by reducing hundreds of behavioural and demographic features to 2D space, showing how different customer groups relate to each other and identifying transition zones where customers might move between segments.

Limitations

  • Hyperparameter choices (n_neighbors, min_dist, metric) significantly influence the embedding structure and can lead to very different interpretations of the same data.
  • While preserving global structure better than t-SNE, distances in the reduced space still don't directly correspond to distances in the original feature space.
  • Embedding quality can be sensitive to the choice of distance metric, and the appropriate metric may not be obvious for complex or mixed data types.
  • Like other manifold learning techniques, it assumes the data lies on a lower-dimensional manifold, which may not hold for all datasets.
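
One rough way to check how far an embedding departs from the original geometry is to compare the rank ordering of pairwise distances before and after reduction. The sketch below is a minimal, library-free illustration using a Spearman-style rank correlation. The function names are hypothetical, the metric is assumed to be Euclidean, and the quadratic number of pairs makes it suitable only for small samples.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance_rank_agreement(original, embedding):
    """Spearman-style correlation between original and embedded pairwise distances.

    Needs at least three points whose pairwise distances are not all equal.
    """
    return pearson(
        average_ranks(pairwise_distances(original)),
        average_ranks(pairwise_distances(embedding)),
    )


# Toy data: a uniformly scaled copy keeps the distance ordering,
# while a shuffled layout does not.
original = [(0.0,), (1.0,), (3.0,), (7.0,)]
scaled = [(0.0, 0.0), (2.0, 0.0), (6.0, 0.0), (14.0, 0.0)]
shuffled = [(0.0, 0.0), (7.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
print(distance_rank_agreement(original, scaled))
print(distance_rank_agreement(original, shuffled))
```

With the umap-learn package installed, `embedding` would typically be the result of `umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)`, converted with `.tolist()`. Repeating the check across several `n_neighbors`, `min_dist`, and `metric` settings gives a rough sense of how much the hyperparameters change the picture.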

Resources

lmcinnes/umap
Software Package
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Research Paper · Leland McInnes, John Healy, and James Melville · Feb 9, 2018
Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey
Documentation · Benyamin Ghojogh et al. · Aug 25, 2021
How UMAP Works — umap 0.5.8 documentation
Tutorial
