Feature description
Background & Motivation
Hello OSIPI team! I am highly interested in contributing to osipy, particularly regarding the blood-brain barrier module and robust parameter reconstruction initiatives. As I am preparing a proposal for GSoC 2026, I have been reviewing the current optimization methods in the repository.
Currently, the parameter estimation (e.g., for $K^{trans}$ and $v_e$) relies primarily on deterministic least-squares fitting. While standard, these methods can be highly sensitive to noise in DCE-MRI data and do not naturally provide uncertainty quantification. I would like to propose—and volunteer to implement—a Bayesian framework to address this.
Proposed Solution
I propose adding a probabilistic parameter estimation module using Markov Chain Monte Carlo (MCMC) sampling. This will allow users to define physiological priors and obtain full posterior distributions for permeability parameters, improving the robustness of the fits.
Proposed Implementation Workflow
If given the green light, I plan to execute this in four distinct phases:
**Phase 1: Architecture & Dependency Alignment **
- Open a discussion in the draft PR regarding the preferred probabilistic backend. My initial thought is to use
emcee as it is lightweight and widely used in scientific Python, or PyMC if a more comprehensive framework is desired.
- Review the OSIPI CAPLEX lexicon to ensure all new variable names (e.g., posteriors, priors) match the official nomenclature.
**Phase 2: Core Mathematical Implementation **
- Implement a Bayesian log-likelihood wrapper around the existing Standard Tofts Model forward equation:
$$C_t(t) = K^{trans} \int_0^t C_p(\tau) e^{-k_{ep}(t-\tau)} d\tau$$
- Build a function to accept user-defined prior distributions for $K^{trans}$ and $v_e$ (e.g., uniform or truncated normal bounds based on known human physiology).
- Integrate the MCMC sampler to return both the Maximum A Posteriori (MAP) estimates and the raw sampler chains.
**Phase 3: Validation & Testing **
- Generate synthetic noisy tissue concentration curves using
osipy's forward models.
- Write unit tests in the
tests/ directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.
- Ensure all new code passes
pre-commit hooks (black, flake8) and maintains the project's >95% code coverage requirement via check_coverage.py.
**Phase 4: Documentation & Tutorial **
- Add comprehensive docstrings to the new classes/functions.
- Create a
tutorial_bayesian_tofts.ipynb notebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.
I am looking forward to any feedback on this approach. Please let me know if I should adjust the scope or if there is a specific branch I should target for this work!
Even if I start early it will take me at least 5-6 do finish if things go well
Describe the solution
**Phase 1: Architecture & Dependency Alignment **
- Open a discussion in the draft PR regarding the preferred probabilistic backend. My initial thought is to use
emcee as it is lightweight and widely used in scientific Python, or PyMC if a more comprehensive framework is desired.
- Review the OSIPI CAPLEX lexicon to ensure all new variable names (e.g., posteriors, priors) match the official nomenclature.
**Phase 2: Core Mathematical Implementation **
- Implement a Bayesian log-likelihood wrapper around the existing Standard Tofts Model forward equation:
$$C_t(t) = K^{trans} \int_0^t C_p(\tau) e^{-k_{ep}(t-\tau)} d\tau$$
- Build a function to accept user-defined prior distributions for $K^{trans}$ and $v_e$ (e.g., uniform or truncated normal bounds based on known human physiology).
- Integrate the MCMC sampler to return both the Maximum A Posteriori (MAP) estimates and the raw sampler chains.
**Phase 3: Validation & Testing **
- Generate synthetic noisy tissue concentration curves using
osipy's forward models.
- Write unit tests in the
tests/ directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.
- Ensure all new code passes
pre-commit hooks (black, flake8) and maintains the project's >95% code coverage requirement via check_coverage.py.
**Phase 4: Documentation & Tutorial **
- Add comprehensive docstrings to the new classes/functions.
- Create a
tutorial_bayesian_tofts.ipynb notebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.
Describe alternatives
Instead of full MCMC sampling, we can add a regularization framework to the existing optimization pipeline. By incorporating prior physiological distributions as penalty terms in the objective function, we can perform Maximum A Posteriori (MAP) estimation. This stabilizes the fitting of $K^{trans}$ and $v_e$ against noise without sacrificing computational speed.
The workflow for this: -
**Phase 1: Architecture **
- Review the current
osipy fitting infrastructure.
- Design an extensible
Prior class (or module) that allows users to define expected physiological means and variances for parameters (e.g., Gaussian or Log-Normal priors).
- Ensure the approach relies strictly on existing dependencies (
numpy and scipy), avoiding the need for external probabilistic libraries.
**Phase 2: Core Mathematical Implementation **
- Modify the standard Sum of Squared Errors (SSE) objective function to include a regularization term based on the priors. The new objective function to minimize will look like:
$$\text{Cost} = \sum_{t} \left( C_{data}(t) - C_{model}(t, \theta) \right)^2 - \lambda \sum_{k} \log(P(\theta_k))$$
(Where $\theta$ represents the Tofts parameters, $P(\theta)$ is the prior probability, and $\lambda$ is a weighting factor).
- Integrate this custom cost function with
scipy.optimize.minimize (using bounds-supported methods like L-BFGS-B or trust-constr).
**Phase 3: Validation & Testing **
- Generate synthetic noisy tissue concentration curves using the Tofts forward model.
- Write unit tests in
tests/ comparing the standard unpenalized fit vs. the MAP fit at high noise levels, proving that the MAP estimator stays closer to ground-truth values.
- Ensure >95% code coverage and verify
pre-commit hooks pass.
**Phase 4: Documentation & Tutorial **
- Update docstrings to CAPLEX standards.
- Create a
tutorial_MAP_tofts.ipynb notebook demonstrating how users can apply physiological priors to stabilize their fits on noisy data.
This approach will take me a little longer than the other one but it will work well too if given the permission to work on this
Additional context
No response
Are you working on this?
Yes
Feature description
Background & Motivation
Hello OSIPI team! I am highly interested in contributing to
osipy, particularly regarding the blood-brain barrier module and robust parameter reconstruction initiatives. As I am preparing a proposal for GSoC 2026, I have been reviewing the current optimization methods in the repository.Currently, the parameter estimation (e.g., for$K^{trans}$ and $v_e$ ) relies primarily on deterministic least-squares fitting. While standard, these methods can be highly sensitive to noise in DCE-MRI data and do not naturally provide uncertainty quantification. I would like to propose—and volunteer to implement—a Bayesian framework to address this.
Proposed Solution
I propose adding a probabilistic parameter estimation module using Markov Chain Monte Carlo (MCMC) sampling. This will allow users to define physiological priors and obtain full posterior distributions for permeability parameters, improving the robustness of the fits.
Proposed Implementation Workflow
If given the green light, I plan to execute this in four distinct phases:
**Phase 1: Architecture & Dependency Alignment **
emceeas it is lightweight and widely used in scientific Python, orPyMCif a more comprehensive framework is desired.**Phase 2: Core Mathematical Implementation **
**Phase 3: Validation & Testing **
osipy's forward models.tests/directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.pre-commithooks (black, flake8) and maintains the project's >95% code coverage requirement viacheck_coverage.py.**Phase 4: Documentation & Tutorial **
tutorial_bayesian_tofts.ipynbnotebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.I am looking forward to any feedback on this approach. Please let me know if I should adjust the scope or if there is a specific branch I should target for this work!
Even if I start early it will take me at least 5-6 do finish if things go well
Describe the solution
**Phase 1: Architecture & Dependency Alignment **
emceeas it is lightweight and widely used in scientific Python, orPyMCif a more comprehensive framework is desired.**Phase 2: Core Mathematical Implementation **
**Phase 3: Validation & Testing **
osipy's forward models.tests/directory to ensure the MCMC sampler correctly recovers the ground-truth parameters within an acceptable margin of error.pre-commithooks (black, flake8) and maintains the project's >95% code coverage requirement viacheck_coverage.py.**Phase 4: Documentation & Tutorial **
tutorial_bayesian_tofts.ipynbnotebook in the examples folder. This notebook will visually compare a standard least-squares fit against the Bayesian fit on noisy data, plotting the posterior corner plots to visualize parameter uncertainty.Describe alternatives
Instead of full MCMC sampling, we can add a regularization framework to the existing optimization pipeline. By incorporating prior physiological distributions as penalty terms in the objective function, we can perform Maximum A Posteriori (MAP) estimation. This stabilizes the fitting of$K^{trans}$ and $v_e$ against noise without sacrificing computational speed.
The workflow for this: -
**Phase 1: Architecture **
osipyfitting infrastructure.Priorclass (or module) that allows users to define expected physiological means and variances for parameters (e.g., Gaussian or Log-Normal priors).numpyandscipy), avoiding the need for external probabilistic libraries.**Phase 2: Core Mathematical Implementation **
(Where $\theta$ represents the Tofts parameters, $P(\theta)$ is the prior probability, and $\lambda$ is a weighting factor).
scipy.optimize.minimize(using bounds-supported methods like L-BFGS-B or trust-constr).**Phase 3: Validation & Testing **
tests/comparing the standard unpenalized fit vs. the MAP fit at high noise levels, proving that the MAP estimator stays closer to ground-truth values.pre-commithooks pass.**Phase 4: Documentation & Tutorial **
tutorial_MAP_tofts.ipynbnotebook demonstrating how users can apply physiological priors to stabilize their fits on noisy data.This approach will take me a little longer than the other one but it will work well too if given the permission to work on this
Additional context
No response
Are you working on this?
Yes