[Feature] <title>[Feature] Implement Bayesian (MCMC) parameter estimation for the Standard Tofts Model

### Feature description

### Background & Motivation
Hello OSIPI team! I am highly interested in contributing to `osipy`, particularly regarding the blood-brain barrier module and robust parameter reconstruction initiatives. As I am preparing a proposal for GSoC 2026, I have been reviewing the current optimization methods in the repository. 

Currently, the parameter estimation (e.g., for $K^{trans}$ and $v_e$) relies primarily on deterministic least-squares fitting. While standard, these methods can be highly sensitive to noise in DCE-MRI data and do not naturally provide uncertainty quantification. I would like to propose—and volunteer to implement—a Bayesian framework to address this.

### Proposed Solution
I propose adding a probabilistic parameter estimation module using Markov Chain Monte Carlo (MCMC) sampling. This will allow users to define physiological priors and obtain full posterior distributions for permeability parameters, improving the robustness of the fits.



### Proposed Implementation Workflow

If given the green light, I plan to execute this in four distinct phases:

**Phase 1: Architecture & Dependency Alignment **
* Open a discussion in the draft PR regarding the preferred probabilistic backend. My initial thought is to use `emcee` as it is lightweight and widely used in scientific Python, or `PyMC` if a more comprehensive framework is desired.
* Review the OSIPI CAPLEX lexicon to ensure all new variable names (e.g., posteriors, priors) match the official nomenclature.

**Phase 2: Core Mathematical Implementation **
* Implement a Bayesian log-likelihood wrapper around the existing Standard Tofts Model forward equation:
    $$C_t(t) = K^{trans} \int_0^t C_p(\tau) e^{-k_{ep}(t-\tau)} d\tau$$
* Build a function to accept user-defined prior distributions for $K^{trans}$ and $v_e$ (e.g., uniform or truncated normal bounds based on known human physiology).
* Integrate the MCMC sampler to return both the Maximum A Posteriori (MAP) estimates and the raw sampler chains.

**Phase 3: Validation & Testing **
* Generate synthetic noisy tissue concentration curves using `osipy`'s forward models.
* Write unit tests in the `tests/` directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.
* Ensure all new code passes `pre-commit` hooks (black, flake8) and maintains the project's >95% code coverage requirement via `check_coverage.py`.

**Phase 4: Documentation & Tutorial **
* Add comprehensive docstrings to the new classes/functions.
* Create a `tutorial_bayesian_tofts.ipynb` notebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.

---
I am looking forward to any feedback on this approach. Please let me know if I should adjust the scope or if there is a specific branch I should target for this work!

**Even if I start early it will take me at least 5-6 do finish if things go well**

### Describe the solution


**Phase 1: Architecture & Dependency Alignment **
* Open a discussion in the draft PR regarding the preferred probabilistic backend. My initial thought is to use `emcee` as it is lightweight and widely used in scientific Python, or `PyMC` if a more comprehensive framework is desired.
* Review the OSIPI CAPLEX lexicon to ensure all new variable names (e.g., posteriors, priors) match the official nomenclature.

**Phase 2: Core Mathematical Implementation **
* Implement a Bayesian log-likelihood wrapper around the existing Standard Tofts Model forward equation:
    $$C_t(t) = K^{trans} \int_0^t C_p(\tau) e^{-k_{ep}(t-\tau)} d\tau$$
* Build a function to accept user-defined prior distributions for $K^{trans}$ and $v_e$ (e.g., uniform or truncated normal bounds based on known human physiology).
* Integrate the MCMC sampler to return both the Maximum A Posteriori (MAP) estimates and the raw sampler chains.

**Phase 3: Validation & Testing **
* Generate synthetic noisy tissue concentration curves using `osipy`'s forward models.
* Write unit tests in the `tests/` directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.
* Ensure all new code passes `pre-commit` hooks (black, flake8) and maintains the project's >95% code coverage requirement via `check_coverage.py`.

**Phase 4: Documentation & Tutorial **
* Add comprehensive docstrings to the new classes/functions.
* Create a `tutorial_bayesian_tofts.ipynb` notebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.

### Describe alternatives

Instead of full MCMC sampling, we can add a regularization framework to the existing optimization pipeline. By incorporating prior physiological distributions as penalty terms in the objective function, we can perform Maximum A Posteriori (MAP) estimation. This stabilizes the fitting of $K^{trans}$ and $v_e$ against noise without sacrificing computational speed.

The workflow for this: -

**Phase 1: Architecture **
* Review the current `osipy` fitting infrastructure.
* Design an extensible `Prior` class (or module) that allows users to define expected physiological means and variances for parameters (e.g., Gaussian or Log-Normal priors). 
* Ensure the approach relies strictly on existing dependencies (`numpy` and `scipy`), avoiding the need for external probabilistic libraries.

**Phase 2: Core Mathematical Implementation **
* Modify the standard Sum of Squared Errors (SSE) objective function to include a regularization term based on the priors. The new objective function to minimize will look like:
  $$\text{Cost} = \sum_{t} \left( C_{data}(t) - C_{model}(t, \theta) \right)^2 - \lambda \sum_{k} \log(P(\theta_k))$$
  *(Where $\theta$ represents the Tofts parameters, $P(\theta)$ is the prior probability, and $\lambda$ is a weighting factor).*
* Integrate this custom cost function with `scipy.optimize.minimize` (using bounds-supported methods like L-BFGS-B or trust-constr).

**Phase 3: Validation & Testing **
* Generate synthetic noisy tissue concentration curves using the Tofts forward model.
* Write unit tests in `tests/` comparing the standard unpenalized fit vs. the MAP fit at high noise levels, proving that the MAP estimator stays closer to ground-truth values.
* Ensure >95% code coverage and verify `pre-commit` hooks pass.

**Phase 4: Documentation & Tutorial **
* Update docstrings to CAPLEX standards.
* Create a `tutorial_MAP_tofts.ipynb` notebook demonstrating how users can apply physiological priors to stabilize their fits on noisy data.

**This approach will take me a little longer than the other one but it will work well too if given the permission to work on this**  

### Additional context

_No response_

### Are you working on this?

Yes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] <title>[Feature] Implement Bayesian (MCMC) parameter estimation for the Standard Tofts Model #76

Feature description

Background & Motivation

Proposed Solution

Proposed Implementation Workflow

Describe the solution

Describe alternatives

Additional context

Are you working on this?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature] <title>[Feature] Implement Bayesian (MCMC) parameter estimation for the Standard Tofts Model #76

Description

Feature description

Background & Motivation

Proposed Solution

Proposed Implementation Workflow

Describe the solution

Describe alternatives

Additional context

Are you working on this?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions