Migrate this LaTeX document to Sphinx documentation. There is no need to keep the old documents. Do not touch the docs/model_cards/ folder.
\documentclass{article}
\usepackage{graphicx} % Required for inserting images
\usepackage{multirow}
\usepackage{amsfonts}
\usepackage{hyperref}
\usepackage{amssymb}
\usepackage{stackengine}
\usepackage{scalerel}
\usepackage{xcolor}
\newcommand\openbigstar[1][0.7]{%
\scalerel*{%
\stackinset{c}{-.125pt}{c}{}{\scalebox{#1}{\color{white}{$\bigstar$}}}{%
$\bigstar$}%
}{\bigstar}
}
% PS: we can change the title later
\title{A decision tree for Uncertainty Quantification}
\author{Moussa Kassem Sbeyti, Peter Steinbach, Alina Bazarova, Athar Khodabakhsh, Leon Tim Engelbert Sievers, <your name>}
\date{January 2026}
\begin{document}
\maketitle
\section{Introduction}
% - UQ ubiquitous in ML and stats literature
Forecasting and parameter estimation lie at the core of modern Machine Learning (ML) systems that support real-world decision-making and policy formation. While substantial progress has been made in improving the predictive performance of data-driven and Deep Learning (DL) models, point estimates alone remain an incomplete representation of model outputs. This has motivated the development of uncertainty quantification (UQ) methods, as well as the adaptation of classical approaches to modern machine-learning architectures, with the goal of characterizing uncertainty arising from data, model assumptions, and limited generalization. In recent years, uncertainty quantification has become a prominent topic within the ML research community, with dedicated workshops on UQ at flagship conferences such as ICML and ICLR, UQ-related main conference talks and significant numbers of accepted papers at NeurIPS, as well as domain-specific survey articles (\cite{pmlr-v287-lopez25a,Shi2025,XU2025115817}).
This increasing focus on uncertainty quantification signals a maturation of machine learning from accuracy-driven prediction toward trustworthy decision support. By explicitly modeling uncertainty, ML systems become more reliable, interpretable, and suitable for deployment in real-world and high-stakes settings.
% - many papers published (Till's account of neurips'25)
% - sign of maturity of ML as a tool -> trustworthiness
\subsection{Motivation}
This work aims to give ML researchers with different backgrounds and use cases a structured way to quantify the uncertainties of their models and data.
\begin{itemize}
\item Many practitioners already have working ML systems.
\item Examples of real-life cases in which a practitioner might want to explore beyond point estimates.
\item Our audience consists of those who want to explore UQ methods and add them to their projects, for instance:
\begin{enumerate}
\item Researchers who train ML models on tabular data and want to provide reliable uncertainty estimates to their domain-expert collaborators.
\item Researchers who work with time-series data and deploy neural networks on small systems or products.
\item Researchers from domains such as the natural sciences who apply data science to different data formats (images, numerical analysis, text, etc.) and who, besides experimental errors and uncertainties, need to account for the uncertainty introduced by ML tooling.
\item Researchers with quantitative backgrounds who want to use DL and explore suitable paths for including UQ in their inferences.
\end{enumerate}
\end{itemize}
\subsection{How to use this paper}
To guide the reader through this manuscript, we recommend beginning with Section~\nameref{sec:first}, which poses a set of motivating questions and discusses the necessity and practical utility of uncertainty quantification methods across different scenarios. Second, we survey a series of uncertainty quantification methods in Section~\nameref{sec:methods}, with the aim of supporting method selection for particular problem settings.
% - first go to "the very first steps", pick answers\\
% - second select methods from "method cards" alongside the dimensions XYZ\\
% - how the methods are sorted: difficulty, field, datatype mb?\\
\section{The very first steps}\label{sec:first}
While UQ methods are potentially powerful, their effectiveness and suitability for a particular use case are highly dependent on the model setup and the intended goals of the study. In this section, we present a series of guiding questions along with their explanations, intended to help readers assess in advance whether UQ is likely to benefit their research. As general guidance, we note that it is important to collect all available prior knowledge about the problem and to clearly define the expected role and objectives of uncertainty quantification.
%\subsubsection {\textbf{Purpose/Use-cases}}
\paragraph{Is Uncertainty Quantification necessary, and if so, what for?} This question is essential because the value of uncertainty estimates depends on the context: in high-stakes or exploratory applications, UQ provides crucial information about confidence, risk, and decision-making reliability. On the other hand, the absence of a clearly defined use-case is a problem for UQ, since it is not universally necessary or equally valuable across all tasks. A number of issues need to be considered when working with UQ methods: increase in model complexity and computational cost, problem-specific choice of UQ method, difficulty in evaluation, to name just a few.
\paragraph{How complex is the problem under consideration?} Evaluating the problem complexity can help balance the benefits of UQ against its computational and methodological costs. For example, simple well-characterized tasks may not benefit from UQ, while complex problems with high-dimensional, nonlinear, or noisy data often generate significant predictive uncertainty. Note, however, that overly complex models can produce very wide uncertainty estimates, which may lead to inconclusive results.
\paragraph{What is the significance of the UQ metric for your application?} Determining the importance of the UQ metric is essential because it establishes whether uncertainty estimates will meaningfully impact decisions and contribute to the interpretation of the results.
% \begin{itemize}
% \item Do you have use-cases? what task did you need uncertainty for? What type of task did you tackle?
% \item Why you problem is complex?
% \item How important is UQ metric for your application?
% \end{itemize}
%\subsubsection {\textbf{Data}}
\paragraph{Is real-world data available for your application?} Without real-world data, uncertainty estimates may be unreliable, as results obtained from simulated or toy data can differ significantly from the true uncertainty present in the system, undermining the practical usefulness of UQ.
\paragraph{Are there known issues or limitations with the data, such as missing values, noise, bias, or measurement errors?} Data quality issues can directly compromise the reliability of uncertainty estimates. Recognizing and addressing such problems is essential to ensure that UQ reflects the true uncertainty in predictions rather than artifacts of flawed data.
\paragraph{Is ground truth or benchmark uncertainty available for your problem, and how do the estimated uncertainty values compare to these references?} Benchmark or reference uncertainty values, when available, provide a useful point of comparison for assessing the reliability of predicted uncertainty, though UQ methods can still offer qualitative insights even without such references.
% \begin{itemize}
% \item do you have real data for your use-case?
% %\item did you use a public data set?
% %\item How big is you dataset?
% \item was there a problem with the data?
% %\item Were the observations high dimensional?
% %\item How many dimensions did your input/observations have?
% %\item If you have UQ on network, would you not have to check the data?
% %\item Which kind of data did you use?
% %\item Are you classifying images?
% %\item Which type of images did you use?
% \item Do you have ground truth? Do you have uncertainty labels?
% \item can you give a benchmark UQ values? if the observe a bigger UQ, then we can not trust it.
% %\item how to find the boundary of the UQ?
% %\item Is it a classification/regression/segmentation model?
% %\item Do you have more than two classes?
% %\item Is it supervised?
% %\item Have you tried Self-supervised method?
% %\item What is your regression target?
% \end{itemize}
%\subsubsection {\textbf{Uncertainty type}}
\paragraph{Is it possible to distinguish between the sources of uncertainty?} This question is typically applicable to advanced projects, in which the practitioner is familiar with the factors contributing to uncertainty in the model or system. The two primary sources of uncertainty are aleatoric (inherent data noise) and epistemic (model or knowledge uncertainty). The ability to distinguish between them is crucial for selecting appropriate UQ methods, interpreting predictions correctly, and making informed decisions. If this distinction is not possible, the interpretability, actionable insight, and overall usefulness of the UQ results may be limited.
% \begin{itemize}
% \item Is it possible to distinguish between the two sources of uncertainty?
% \item Did you focus on epistemic or aleatoric uncertainty?
% \item Were there trainable uncertainties directly on the weights
% \end{itemize}
% - prior knowledge and expectations
% - downstream use of UQ: academic rigor, risk management, decision taking, OOD detection
% - different types of uncertainties (data versus model capacity) are hard to disentangle, if so, this happens in well advanced projects only
\subsection{How much effort does the intervention require in my project?}
\subsubsection{\textbf{Method}}
\begin{itemize}
\item Is your method complex?
\item Are you trying to explicitly model densities?
\item Is the method fully Bayesian?
\item Is it a post-hoc method? Is it approximate Bayesian?
\item Is your task mean/variance estimation?
\item Do you apply variational inference?
\item Are you using a CNN?
\end{itemize}
\subsubsection{\textbf{Explicit Modeling}}
\begin{itemize}
\item Is it model-directed?
\item How many parameters does the model have?
\item Is your likelihood tractable?
\item Are your priors restrictive?
\item Are there issues with the model complexity?
\item How does the quality of the model impact UQ?
\end{itemize}
\subsubsection{\textbf{Computation}}
\begin{itemize}
\item How much time did you have to train?
\item Is the method computationally expensive?
\end{itemize}
% - Am I willing to start a new codebase for that if necessary?
% - Am I willing to learn the theory behind or am I to use a blackbox?
\section{Method Cards}\label{sec:methods}
% \subsection{Method Name}
% \subsubsection{Abstract}
% \subsubsection{Easy to apply}
% \subsubsection{Data Compatibility}
% \subsubsection{Task Compatibility}
% \subsubsection{Ease of integration}
% \subsubsection{Computationally Cheap}
% \subsubsection{Caveats}
\begin{table}[h]
\centering
\caption{Idea is to have a table for all methods for the overview, let me know if you dislike it}
\resizebox{\textwidth}{!}{%
\begin{tabular}{|c|c|c|c|c|c|}
\hline
\multirow{2}{*}{\shortstack{Method \\ Name}} & \multirow{2}{*}{\shortstack{Easy to \\ Apply}} & \multirow{2}{*}{\shortstack{Data \\ Compatibility}} & \multirow{2}{*}{\shortstack{Task \\ Compatibility}} & \multirow{2}{*}{\shortstack{Ease of \\ Integration}} & \multirow{2}{*}{\shortstack{Computationally \\ Cheap}} \\
& & & & & \\[0.5ex]
\hline
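MVE/Loss Attenuation & $\bigstar\bigstar\bigstar\bigstar\openbigstar$ & $\bigstar\bigstar\bigstar\bigstar\openbigstar$ & $\bigstar\bigstar\bigstar\openbigstar\openbigstar$ & $\bigstar\bigstar\bigstar\bigstar\openbigstar$ & $\bigstar\bigstar\bigstar\bigstar\bigstar$ \\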
\hline
\end{tabular}
}
\end{table}
\subsection{MVE/Loss Attenuation}
\subsubsection{Abstract}
Mean Variance Estimation (MVE) is a technique in which a neural network is modified to output both a mean prediction \(\mu(x)\) and a variance estimate \(\sigma^2(x)\), modeling the target as a Gaussian distribution. In regression, for example, training with the negative log-likelihood (NLL) loss instead of the standard mean squared error (MSE) lets the network learn heteroscedastic (input-dependent) aleatoric uncertainty. A key property is ``loss attenuation'': the model automatically downweights high-uncertainty samples during training, making it more robust to label noise.
\begin{itemize}
\item \textbf{Original Paper:} \href{https://ieeexplore.ieee.org/document/374138}{Estimating the Mean and Variance of the Target Probability Distribution (Nix \& Weigend, ICNN 1994)}
\item \textbf{Key Work:} \href{https://arxiv.org/abs/1703.04977}{What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? (Kendall \& Gal, NeurIPS 2017)}
\item \textbf{Implementation:} Straightforward in PyTorch/TensorFlow, double the output size to account for variance and use Gaussian NLL loss.
\end{itemize}
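As a sketch of the objective described above, the Gaussian NLL can be written as follows. The symbols \(N\) (number of training samples), \(y_i\) (regression target), and \(s_i\) (log-variance) are notation introduced here for illustration, and the additive constant \(\frac{1}{2}\log 2\pi\) is dropped.
\[
\mathcal{L}_{\mathrm{NLL}} = \frac{1}{N}\sum_{i=1}^{N}\left[\frac{\bigl(y_i-\mu(x_i)\bigr)^2}{2\,\sigma^2(x_i)} + \frac{1}{2}\log\sigma^2(x_i)\right]
\]
The factor \(1/\sigma^2(x_i)\) scales each squared residual, so samples with a large predicted variance contribute less to the gradient of the mean (the loss attenuation), while the \(\log\sigma^2(x_i)\) term penalizes inflating the variance everywhere. For numerical stability, Kendall \& Gal suggest letting the network predict \(s_i = \log\sigma^2(x_i)\), in which case the summand becomes \(\frac{1}{2}\exp(-s_i)\bigl(y_i-\mu(x_i)\bigr)^2 + \frac{1}{2}s_i\).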
\subsubsection{Easy to apply $\bigstar\bigstar\bigstar\bigstar\openbigstar$}
MVE is conceptually simple and requires only minor architectural modifications: extending the output and changing the loss function to Gaussian NLL. No specialized libraries are needed. However, training can be sensitive to hyperparameters in more complex deep learning tasks such as object detection, where a weight has to be assigned to the loss. In addition, a ``warm-up'' period (training only the mean first) is often necessary to avoid convergence issues in which the variance inflates to explain poor mean estimates.
\subsubsection{Data Compatibility $\bigstar\bigstar\bigstar\bigstar\openbigstar$}
MVE works with any continuous regression target and is compatible with various input types (images, tabular, time series). It naturally handles heteroscedastic data where noise varies across inputs. Not directly applicable to classification without modifications. Requires sufficient data to reliably learn the variance function.
\subsubsection{Task Compatibility $\bigstar\bigstar\bigstar\openbigstar\openbigstar$}
Primarily designed for regression tasks. It has been successfully extended to localization in object detection. The method captures aleatoric (data) uncertainty but does not capture epistemic (model) uncertainty. For that, MVE is typically combined with ensembles or MC Dropout. OOD detection is limited as the model may predict low variance for OOD inputs if they resemble training data. Calibration of variance estimates may require post-hoc adjustment, especially for small datasets or complex architectures.
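
One common way to combine MVE with an ensemble, sketched here under the assumption of \(M\) independently trained MVE networks with outputs \(\mu_m(x)\) and \(\sigma_m^2(x)\) (notation introduced for illustration), is to treat the ensemble as a uniform mixture of Gaussians and match its first two moments:
\[
\mu_*(x) = \frac{1}{M}\sum_{m=1}^{M}\mu_m(x), \qquad
\sigma_*^2(x) = \underbrace{\frac{1}{M}\sum_{m=1}^{M}\sigma_m^2(x)}_{\mathrm{aleatoric}} + \underbrace{\frac{1}{M}\sum_{m=1}^{M}\bigl(\mu_m(x)-\mu_*(x)\bigr)^2}_{\mathrm{epistemic}}
\]
The first term averages the predicted data noise, while the second measures the disagreement between the members' means. With MC Dropout, the same expression can be applied to \(M\) stochastic forward passes of a single network.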
\subsubsection{Ease of integration $\bigstar\bigstar\bigstar\bigstar\openbigstar$}
MVE integrates easily into existing pipelines. No retraining of the full architecture is needed. One can often fine-tune an existing model. Works with any architecture where the user has access to the final layer. Can be combined with other UQ methods (ensembles, dropout) for comprehensive uncertainty estimation. It cannot be combined with evidential deep learning directly.
\subsubsection{Computationally Cheap $\bigstar\bigstar\bigstar\bigstar\bigstar$}
Very efficient. It works in real-time and only doubles the parameters in the last layer. It only requires a single forward pass during inference, adding no overhead compared to standard networks. Training cost is essentially identical to standard training. This makes MVE one of the most computationally attractive UQ methods available.
\subsubsection{Caveats}
\begin{itemize}
\item \textbf{Only captures aleatoric uncertainty:} MVE estimates data noise, not model uncertainty. Must be combined with other methods (e.g., ensembles, MC Dropout) to capture epistemic uncertainty.
\item \textbf{Training instability:} Without proper initialization or warm-up periods, the variance can inflate to ``explain away'' prediction errors, compromising mean accuracy.
\item \textbf{Overconfidence on OOD data:} The model may predict low variance for out-of-distribution inputs that superficially resemble training data.
\item \textbf{Gaussian assumption:} Assumes normally distributed errors, which may not hold for all datasets. Heavy-tailed or multimodal noise distributions are not well captured.
\item \textbf{Calibration required:} Raw variance estimates may need post-hoc calibration to provide reliable uncertainty quantification.
\end{itemize}
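As one simple option for the post-hoc calibration mentioned above (an illustrative choice, not necessarily the procedure intended here), a single scale factor \(s^2\) can be fitted on a held-out set of \(N\) samples by minimizing the Gaussian NLL with the variance replaced by \(s^2\sigma^2(x_i)\). The symbols \(s^2\), \(N\), \(y_i\), and \(\sigma^2_{\mathrm{cal}}\) are introduced here for illustration. Setting the derivative with respect to \(s^2\) to zero gives the closed form
\[
s^2 = \frac{1}{N}\sum_{i=1}^{N}\frac{\bigl(y_i-\mu(x_i)\bigr)^2}{\sigma^2(x_i)}, \qquad
\sigma^2_{\mathrm{cal}}(x) = s^2\,\sigma^2(x).
\]
A value \(s^2>1\) indicates that the raw variances were too small on the held-out data (overconfidence), and \(s^2<1\) that they were too large.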
\subsection{Method Name}
\subsubsection{Abstract}
\subsubsection{Easy to apply}
\subsubsection{Data Compatibility}
\subsubsection{Task Compatibility}
\subsubsection{Ease of integration}
\subsubsection{Computationally Cheap}
\subsubsection{Caveats}
\section{Discussion/Conclusion}
\bibliographystyle{plain}
\bibliography{literature}
\end{document}