6 changes: 3 additions & 3 deletions docs/conf.py
@@ -7,8 +7,8 @@
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "UQ Decision Tree"
copyright = "2026, UQ Tree Contributors"
author = "UQ Tree Contributors"
copyright = "2026, Moussa Kassem Sbeyti, Peter Steinbach, Alina Bazarova, Athar Khodabakhsh, Leon Tim Engelbert Sievers, and Contributors"
author = "Moussa Kassem Sbeyti, Peter Steinbach, Alina Bazarova, Athar Khodabakhsh, Leon Tim Engelbert Sievers, and Contributors"
release = "0.1.0"

# -- General configuration ---------------------------------------------------
@@ -45,7 +45,7 @@

html_theme = "sphinx_rtd_theme"
html_static_path = ["_static"]
html_title = "A Decision Tree for Practitioners Needing Uncertainty Quantification"
html_title = "A Decision Tree for Uncertainty Quantification"

# -- Options for LaTeX output ------------------------------------------------

64 changes: 64 additions & 0 deletions docs/decision-tree.md
@@ -1,5 +1,69 @@
# Decision Tree

## The Very First Steps

While UQ methods are potentially powerful, their effectiveness and suitability for a particular use case depend heavily on the model setup and the intended goals of the study. In this section, we present a series of guiding questions, each with a short explanation, to help readers assess in advance whether UQ is likely to benefit their research. As general guidance, it is important to collect all available prior knowledge about the problem and to clearly define the expected role and objectives of uncertainty quantification.

### Purpose and Use Cases

**Is Uncertainty Quantification necessary, and if so, what for?**

This question is essential because the value of uncertainty estimates depends on the context: in high-stakes or exploratory applications, UQ provides crucial information about confidence, risk, and the reliability of decisions. Conversely, UQ is not universally necessary or equally valuable across all tasks, so applying it without a clearly defined use case is problematic. Working with UQ methods also raises a number of issues, including increased model complexity and computational cost, the problem-specific choice of UQ method, and the difficulty of evaluation.

**How complex is the problem under consideration?**

Evaluating the complexity of the problem helps balance the benefits of UQ against its computational and methodological costs. For example, simple, well-characterized tasks may not benefit from UQ, while complex problems with high-dimensional, nonlinear, or noisy data often exhibit significant predictive uncertainty. Note, however, that overly complex models can produce very wide uncertainty estimates, which may lead to inconclusive results.

**What is the significance of the UQ metric for your application?**

Determining the importance of the UQ metric is essential because it establishes whether uncertainty estimates will meaningfully affect decisions and contribute to the interpretation of results.

### Data Considerations

**Is real-world data available for your application?**

Without real-world data, uncertainty estimates may be unreliable, as results obtained from simulated or toy data can differ significantly from the true uncertainty present in the system, undermining the practical usefulness of UQ.

**Are there known issues or limitations with the data, such as missing values, noise, bias, or measurement errors?**

Data quality issues can directly compromise the reliability of uncertainty estimates. Recognizing and addressing such problems is essential to ensure that UQ reflects the true uncertainty in predictions rather than artifacts of flawed data.

**Is ground truth or benchmark uncertainty available for your problem, and how do the estimated uncertainty values compare to these references?**

Benchmark or reference uncertainty values, when available, provide a useful point of comparison for assessing the reliability of predicted uncertainty, though UQ methods can still offer qualitative insights even without such references.

### Uncertainty Types

**Is it possible to distinguish between the sources of uncertainty?**

This question is typically applicable to advanced projects, in which the practitioner is familiar with the factors contributing to uncertainty in the model or system. Two primary sources of uncertainty are aleatoric (inherent data noise) and epistemic (model or knowledge uncertainty). The ability to distinguish between them is crucial for selecting appropriate UQ methods, interpreting predictions correctly, and making informed decisions. If this distinction is not possible, the interpretability, actionable insight, and overall usefulness of the UQ results may be limited.

### How Much Effort is the Intervention in My Project?

#### Method Complexity

- Is your method complex?
- Are you trying to explicitly model densities?
- Is the method fully Bayesian?
- Is it a post-hoc method? Is it approximate Bayesian?
- Is your task mean/variance estimation?
- Do you apply variational inference?
- Are you using a CNN?

#### Explicit Modeling

- Is it model-directed?
- How many parameters does the model have?
- Is your likelihood tractable?
- Are your priors restrictive?
- Are there issues with the model complexity?
- How does the quality of the model impact UQ?

#### Computation

- How much time do you have to train?
- Is the method computationally expensive?

## The UQ Decision Framework

This section presents our decision tree framework for selecting appropriate uncertainty quantification methods.
4 changes: 3 additions & 1 deletion docs/index.md
@@ -1,4 +1,4 @@
# A Decision Tree for Practitioners Needing Uncertainty Quantification for Their Deep Learning Project
# A Decision Tree for Uncertainty Quantification

Welcome to the documentation for the UQ Decision Tree paper.

@@ -14,6 +14,7 @@ caption: Contents
introduction
methodology
decision-tree
method-cards
case-studies
conclusions
references
@@ -24,6 +25,7 @@ references
- [Introduction](introduction.md)
- [Methodology](methodology.md)
- [Decision Tree](decision-tree.md)
- [Method Cards](method-cards.md)
- [Case Studies](case-studies.md)
- [Conclusions](conclusions.md)

35 changes: 10 additions & 25 deletions docs/introduction.md
@@ -1,35 +1,20 @@
# Introduction

## Background
Forecasting and parameter estimation lie at the core of modern Machine Learning (ML) systems that support real-world decision-making and policy formation. While substantial progress has been made in improving the predictive performance of data-driven and Deep Learning (DL) models, point estimates alone remain an incomplete representation of model outputs. This has motivated the development of uncertainty quantification (UQ) methods, as well as the adaptation of classical approaches to modern machine-learning architectures, with the goal of characterizing uncertainty arising from data, model assumptions, and limited generalization. In recent years, uncertainty quantification has become a prominent topic within the ML research community, with dedicated workshops on UQ at flagship conferences such as ICML and ICLR, UQ-related main conference talks and significant numbers of accepted papers at NeurIPS, as well as domain-specific survey articles.

Uncertainty quantification (UQ) in deep learning has become increasingly important as machine learning models are deployed in high-stakes applications such as medical diagnosis, autonomous driving, and financial forecasting.
This increasing focus on uncertainty quantification signals a maturation of machine learning from accuracy-driven prediction toward trustworthy decision support. By explicitly modeling uncertainty, ML systems become more reliable, interpretable, and suitable for deployment in real-world and high-stakes settings.

## Motivation

While numerous uncertainty quantification methods exist, practitioners often struggle to determine which approach is most suitable for their specific use case. This paper aims to provide a practical guide for selecting appropriate UQ methods based on project requirements, constraints, and characteristics.
This work aims to give ML researchers with different backgrounds and use cases a structured way to quantify the uncertainties in their models and data.

## Scope
Many practitioners have working ML systems and encounter real-life cases in which they want to go beyond point estimates. These practitioners, who want to explore UQ methods and add them to their projects, are our audience, for instance:

This work focuses on:
1. **Researchers working on tabular data** who train ML models and want to provide reliable uncertainty estimates for their domain expert collaborators.
2. **Researchers working on time-series data** who use neural networks for deployment on small systems or products.
3. **Researchers from domain sciences**, such as the natural sciences, who apply data science to different data formats (images, numerical data, text, etc.) and who need to account for the uncertainty introduced by ML tooling in addition to experimental errors and uncertainties.
4. **Researchers with quantitative backgrounds** who want to use DL and explore suitable ways of including UQ in their inference.

- **Epistemic uncertainty**: Uncertainty due to lack of knowledge or data
- **Aleatoric uncertainty**: Uncertainty inherent in the data or process
- **Practical considerations**: Computational cost, implementation complexity, and interpretability
## How to Use This Document

## Target Audience

This guide is designed for:

- Machine learning engineers deploying models in production
- Data scientists working on safety-critical applications
- Researchers seeking to incorporate uncertainty quantification in their work
- Technical managers making decisions about UQ implementation

## Organization of This Paper

The remainder of this paper is organized as follows:

- **Methodology**: Our approach to developing the decision tree
- **Decision Tree**: The core decision tree framework
- **Case Studies**: Real-world examples demonstrating the use of the decision tree
- **Conclusions**: Summary and future directions
To guide the reader through this document, we recommend beginning with **The Very First Steps** (in [Decision Tree](decision-tree.md)), which poses a set of motivating questions and discusses the necessity and practical utility of uncertainty quantification methods across different scenarios. Second, we survey a series of uncertainty quantification methods in the **Method Cards** section, with the aim of supporting method selection for particular problem settings.
89 changes: 89 additions & 0 deletions docs/method-cards.md
@@ -0,0 +1,89 @@
# Method Cards

This section provides detailed descriptions of uncertainty quantification methods to support method selection for particular problem settings. Each method is evaluated across multiple dimensions to help practitioners choose the most appropriate approach for their use case.

## Method Overview Table

The following table provides a quick overview of each method's characteristics. Ratings are indicated with stars (★), where more stars indicate better performance in that dimension.

| Method Name | Easy to Apply | Data Compatibility | Task Compatibility | Ease of Integration | Computationally Cheap |
|-------------|---------------|-------------------|-------------------|---------------------|----------------------|
| MVE/Loss Attenuation | ★★★★☆ | ★★★★☆ | ★★★☆☆ | ★★★★☆ | ★★★★★ |

## MVE/Loss Attenuation

### Abstract

Mean Variance Estimation (MVE) is a technique where a neural network is modified to output both a mean prediction $\mu(x)$ and a variance estimate $\sigma^2(x)$, modeling the target as a Gaussian distribution.

By training with a negative log-likelihood (NLL) loss instead of the standard MSE loss (taking regression as the example), the network learns heteroscedastic (input-dependent) aleatoric uncertainty.

A key property is "loss attenuation": the model automatically downweights high-uncertainty samples during training, making it robust to label noise.
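
To make the loss-attenuation property concrete, the per-sample Gaussian NLL and its gradient with respect to the mean can be written out. This is the standard form under the Gaussian assumption stated above; the symbols $\mathcal{L}$ (per-sample loss) and $y$ (observed target) are notation introduced here for illustration.

$$
\mathcal{L}(x, y) = \frac{1}{2}\log \sigma^2(x) + \frac{\left(y - \mu(x)\right)^2}{2\,\sigma^2(x)} + \frac{1}{2}\log 2\pi,
\qquad
\frac{\partial \mathcal{L}}{\partial \mu(x)} = -\frac{y - \mu(x)}{\sigma^2(x)}
$$

The residual is divided by the predicted variance, so samples with large $\sigma^2(x)$ contribute a smaller gradient to the mean. The $\log \sigma^2(x)$ term in turn penalizes inflating the variance everywhere.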

**Key Resources:**

- **Original Paper:** [Estimating the Mean and Variance of the Target Probability Distribution (Nix & Weigend, ICNN 1994)](https://ieeexplore.ieee.org/document/374138)
- **Key Work:** [What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? (Kendall & Gal, NeurIPS 2017)](https://arxiv.org/abs/1703.04977)
- **Implementation:** Straightforward in PyTorch/TensorFlow: double the output size to account for the variance and use a Gaussian NLL loss.
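
The implementation note above can be sketched in a framework-agnostic way. The snippet below is a minimal illustration in plain Python, not a reference implementation: the function names `split_outputs` and `gaussian_nll`, the layout of the doubled output vector, and the choice to predict log-variance (to keep the variance positive) are all assumptions made for this example.

```python
import math


def split_outputs(raw_outputs):
    """Split a doubled output vector into means and log-variances.

    Assumption for this sketch: the first half holds the means and the
    second half holds log-variances, which keeps the variances positive.
    """
    half = len(raw_outputs) // 2
    return raw_outputs[:half], raw_outputs[half:]


def gaussian_nll(targets, means, log_vars):
    """Gaussian negative log-likelihood, averaged over target dimensions."""
    total = 0.0
    for y, mu, log_var in zip(targets, means, log_vars):
        var = math.exp(log_var)
        total += 0.5 * (log_var + (y - mu) ** 2 / var + math.log(2.0 * math.pi))
    return total / len(targets)


# Hypothetical raw output of a network head for a 2-dimensional target:
# means = [1.0, -0.5], log-variances = [0.0, 0.0].
raw = [1.0, -0.5, 0.0, 0.0]
means, log_vars = split_outputs(raw)
loss = gaussian_nll(targets=[1.0, 0.5], means=means, log_vars=log_vars)
```

In PyTorch, `torch.nn.GaussianNLLLoss` offers a comparable loss; it takes the variance directly rather than its logarithm and omits the constant term by default.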

### Easy to Apply ★★★★☆

MVE is conceptually simple and requires only minor architectural modifications: extending the output and changing the loss function to Gaussian NLL. No specialized libraries are needed.

However, training can be sensitive to hyperparameters in more complex deep learning tasks such as object detection, where a weight has to be assigned to the loss. A "warm-up" period (training only the mean first) is often necessary to avoid convergence issues where the variance inflates to explain poor mean estimates.
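
One possible way to realize such a warm-up is to switch the loss at a fixed epoch. The sketch below assumes this simple schedule; the function name `mve_training_loss` and the hyperparameter `warmup_epochs` are hypothetical, and other schedules (for example, freezing the variance head) would serve the same purpose.

```python
import math


def mve_training_loss(epoch, y, mu, log_var, warmup_epochs=10):
    """Per-sample loss with a mean-only warm-up phase.

    warmup_epochs is a hypothetical hyperparameter for this sketch.
    """
    squared_error = (y - mu) ** 2
    if epoch < warmup_epochs:
        # Warm-up: plain squared error, so the variance output is ignored.
        return 0.5 * squared_error
    # Afterwards: Gaussian NLL (constant term dropped); variance is trained too.
    return 0.5 * (log_var + squared_error / math.exp(log_var))


# During warm-up the predicted log-variance has no influence on the loss.
early = mve_training_loss(epoch=0, y=2.0, mu=0.0, log_var=3.0)
# After warm-up the residual is weighted by the predicted variance.
late = mve_training_loss(epoch=10, y=2.0, mu=0.0, log_var=0.0)
```

Because the variance output receives no gradient during warm-up, it cannot grow to absorb the errors of a still poorly fitted mean.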

### Data Compatibility ★★★★☆

MVE works with any continuous regression target and is compatible with various input types (images, tabular, time series). It naturally handles heteroscedastic data where noise varies across inputs. It is not directly applicable to classification without modifications, and it requires sufficient data to reliably learn the variance function.

### Task Compatibility ★★★☆☆

Primarily designed for regression tasks. It has been successfully extended to localization in object detection. The method captures aleatoric (data) uncertainty but does not capture epistemic (model) uncertainty. For that, MVE is typically combined with ensembles or MC Dropout. OOD detection is limited as the model may predict low variance for OOD inputs if they resemble training data. Calibration of variance estimates may require post-hoc adjustment, especially for small datasets or complex architectures.
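
The combination with ensembles mentioned above is commonly done by treating the ensemble members as a uniform mixture of Gaussians. The following sketch assumes that setup; the function name `combine_mve_ensemble` and the example numbers are illustrative only. The average predicted variance is read as aleatoric uncertainty, and the spread of the predicted means as epistemic uncertainty.

```python
from statistics import fmean, pvariance


def combine_mve_ensemble(member_means, member_variances):
    """Combine per-member MVE outputs for a single input.

    Returns (predictive mean, aleatoric variance, epistemic variance),
    treating the ensemble as a uniform mixture of Gaussians.
    """
    mean = fmean(member_means)
    aleatoric = fmean(member_variances)  # average predicted data noise
    epistemic = pvariance(member_means)  # disagreement between members
    return mean, aleatoric, epistemic


# Hypothetical outputs of three ensemble members for one input.
mean, aleatoric, epistemic = combine_mve_ensemble(
    member_means=[1.0, 2.0, 3.0],
    member_variances=[0.5, 0.5, 0.5],
)
total_variance = aleatoric + epistemic
```

The sum of both terms is the variance of the mixture, so the decomposition does not change the total predictive uncertainty; it only attributes it to two sources.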

### Ease of Integration ★★★★☆

MVE integrates easily into existing pipelines. The full architecture does not need to be retrained from scratch; one can often fine-tune an existing model. It works with any architecture where the user has access to the final layer, and it can be combined with other UQ methods (ensembles, dropout) for comprehensive uncertainty estimation. It cannot be combined directly with evidential deep learning.

### Computationally Cheap ★★★★★

MVE is very efficient. It runs in real time and only doubles the parameters in the last layer. Inference requires a single forward pass, adding negligible overhead compared to standard networks. Training cost is essentially identical to standard training. This makes MVE one of the most computationally attractive UQ methods available.

### Caveats

- **Only captures aleatoric uncertainty:** MVE estimates data noise, not model uncertainty. Must be combined with other methods (e.g., ensembles, MC Dropout) to capture epistemic uncertainty.
- **Training instability:** Without proper initialization or warm-up periods, the variance can inflate to "explain away" prediction errors, compromising mean accuracy.
- **Overconfidence on OOD data:** The model may predict low variance for out-of-distribution inputs that superficially resemble training data.
- **Gaussian assumption:** Assumes normally distributed errors, which may not hold for all datasets. Heavy-tailed or multimodal noise distributions are not well captured.
- **Calibration required:** Raw variance estimates may need post-hoc calibration to provide reliable uncertainty quantification.
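
As an illustration of the post-hoc calibration mentioned in the last caveat, one simple option is to rescale all predicted variances by a single factor fitted on held-out data. This is a sketch of that one technique, chosen here for illustration and not prescribed by the text above; the function name `fit_variance_scale` and the data are hypothetical. For a Gaussian NLL, the scale $s$ that minimizes the loss of the recalibrated variances $s \cdot \sigma^2(x)$ has the closed form used below: the mean of the squared standardized residuals.

```python
def fit_variance_scale(targets, means, variances):
    """Closed-form NLL-optimal scalar s for recalibrated variances s * var."""
    ratios = [(y - mu) ** 2 / var for y, mu, var in zip(targets, means, variances)]
    return sum(ratios) / len(ratios)


# Hypothetical held-out predictions whose variances are too small.
targets = [1.0, 2.0, 3.0]
means = [0.0, 0.0, 0.0]
variances = [0.25, 1.0, 2.25]

scale = fit_variance_scale(targets, means, variances)
calibrated = [scale * var for var in variances]
```

A scale above 1 indicates overconfident variances and a scale below 1 indicates underconfident ones. A single scalar cannot correct miscalibration that varies across inputs.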

## Additional Methods

### Method Name Template

#### Abstract

[Description of the method and its key characteristics]

#### Easy to Apply

[Rating and explanation of how easy it is to implement and use]

#### Data Compatibility

[Rating and explanation of what types of data the method works with]

#### Task Compatibility

[Rating and explanation of what tasks the method is suitable for]

#### Ease of Integration

[Rating and explanation of how easily it integrates with existing systems]

#### Computationally Cheap

[Rating and explanation of computational costs]

#### Caveats

[List of limitations and considerations]
14 changes: 12 additions & 2 deletions docs/references.md
@@ -1,5 +1,13 @@
# References

## Recent Survey Articles

1. **Lopez, R., et al.** (2025). Uncertainty Quantification in Machine Learning. *PMLR*, Volume 287.

2. **Shi, Y., et al.** (2025). Advances in Uncertainty Quantification for Deep Learning. *Journal of Machine Learning Research*.

3. **Xu, X., et al.** (2025). Uncertainty quantification methods in machine learning: A comprehensive review. *Information Sciences*, 115817.

## Key Papers on Uncertainty Quantification

### Foundational Work
@@ -34,9 +42,11 @@

### Aleatoric and Epistemic Uncertainty

12. **Kendall, A., & Gal, Y.** (2017). What uncertainties do we need in Bayesian deep learning for computer vision? *Advances in Neural Information Processing Systems* (NeurIPS).
12. **Kendall, A., & Gal, Y.** (2017). What uncertainties do we need in Bayesian deep learning for computer vision? *Advances in Neural Information Processing Systems* (NeurIPS). [arXiv:1703.04977](https://arxiv.org/abs/1703.04977)

13. **Nix, D. A., & Weigend, A. S.** (1994). Estimating the mean and variance of the target probability distribution. *IEEE International Conference on Neural Networks* (ICNN). [IEEE Xplore](https://ieeexplore.ieee.org/document/374138)

13. **Malinin, A., & Gales, M.** (2018). Predictive uncertainty estimation via prior networks. *Advances in Neural Information Processing Systems* (NeurIPS).
14. **Malinin, A., & Gales, M.** (2018). Predictive uncertainty estimation via prior networks. *Advances in Neural Information Processing Systems* (NeurIPS).

### Applications and Case Studies
