Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a degree of freedom (dof) flag (--n-independent-echos) to allow users to set the dof used in fstat calculations #1177

Merged
merged 39 commits into from
Mar 21, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
554faa9
DOF changes made
katielamar Feb 23, 2025
e940df1
passing style test
katielamar Feb 23, 2025
05acdc9
lint and pytest passing
katielamar Feb 24, 2025
2673ea2
echo_dof update ready
katielamar Feb 24, 2025
bd63bf7
Adding --n-independent-echos flag. See Issue #1168
katielamar Feb 24, 2025
eb83997
Rerun tests
katielamar Feb 25, 2025
cfde9a4
rerun tests 2
katielamar Feb 25, 2025
6551f05
updating documentation
katielamar Feb 25, 2025
731737a
added adaptive mask warning
katielamar Feb 27, 2025
7a5df43
updated getfbound calls to use echo_dof
katielamar Mar 3, 2025
c209c6a
adding tests. simplifying getfbounds
handwerkerd Mar 3, 2025
b57dbb7
added more tests. fixed t2smap CLI. fixed adaptive mask warnings.
handwerkerd Mar 4, 2025
eb4691a
fixed bug I created in calc_kappa_elbow and calc_rho_elbow
handwerkerd Mar 4, 2025
53c46d0
fixed comments
handwerkerd Mar 6, 2025
ab07715
Merge pull request #1 from handwerkerd/EPTI
katielamar Mar 8, 2025
deb431a
fixing echo_dof fstat map test. t One test still failing
katielamar Mar 8, 2025
a126d7f
Update tedana/tests/test_metrics.py
handwerkerd Mar 10, 2025
a5964cf
Update tedana/tests/test_metrics.py
handwerkerd Mar 10, 2025
6b846cf
Update tedana/tests/test_metrics.py
handwerkerd Mar 11, 2025
4dedd74
Update tedana/selection/tedpca.py
handwerkerd Mar 13, 2025
c591eaa
Update tedana/utils.py
handwerkerd Mar 13, 2025
b7a832a
Update tedana/utils.py
handwerkerd Mar 13, 2025
9881a66
Update tedana/utils.py
handwerkerd Mar 13, 2025
44c46b2
Update tedana/utils.py
handwerkerd Mar 13, 2025
1e3c973
Update tedana/utils.py
handwerkerd Mar 13, 2025
d57b297
Update tedana/utils.py
handwerkerd Mar 13, 2025
c804136
aligned with PR and fixed independAnt
handwerkerd Mar 13, 2025
0ab073a
changed echo_dof to n_independent_echos
handwerkerd Mar 13, 2025
3622878
Merge branch 'EPTI' into EPTI
handwerkerd Mar 13, 2025
a59cc42
Merge pull request #2 from handwerkerd/EPTI
katielamar Mar 14, 2025
65f381e
fixing style and concat error
katielamar Mar 14, 2025
8ced854
Update tedana/utils.py
handwerkerd Mar 14, 2025
2c0c514
Update tedana/utils.py
handwerkerd Mar 17, 2025
b369db0
Update tedana/selection/selection_nodes.py
handwerkerd Mar 17, 2025
47fdad7
Update tedana/workflows/t2smap.py
handwerkerd Mar 18, 2025
3225528
cleaned up f_thresh in generate_metrics
handwerkerd Mar 18, 2025
6a2092e
Merge branch 'ME-ICA:main' into EPTI
handwerkerd Mar 18, 2025
ca6d74a
Merge pull request #3 from handwerkerd/EPTI
katielamar Mar 20, 2025
e0e751a
Update tedana/metrics/collect.py
handwerkerd Mar 21, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -123,3 +123,4 @@ ENV/

# vim swap files
*.swp

2 changes: 1 addition & 1 deletion docs/faq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -125,7 +125,7 @@ The clear failure cases are extreme. That is getting less than 1/5 the number of
compared to time points or having nearly as many components as time points.
We are working on identifying why this happens and adding better solutions.
Our current guess is that most of the above methods assume data are
independant and identically distributed (IID),
independent and identically distributed (IID),
and signal leakage from in-slice and multi-slice accelleration may violate this assumption.

We have one option that is generally useful and is also a partial solution.
Expand Down
8 changes: 8 additions & 0 deletions tedana/decomposition/pca.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ def tedpca(
adaptive_mask,
io_generator,
tes,
n_independent_echos=None,
algorithm="aic",
kdaw=10.0,
rdaw=1.0,
Expand All @@ -79,6 +80,10 @@ def tedpca(
The output generation object for this workflow
tes : :obj:`list`
List of echo times associated with `data_cat`, in milliseconds
n_independent_echos : :obj:`int`, optional
Number of independent echoes to use in goodness of fit metrics (fstat).
Primarily used for EPTI acquisitions.
If None, number of echoes will be used. Default is None.
algorithm : {'kundu', 'kundu-stabilize', 'mdl', 'aic', 'kic', float}, optional
Method with which to select components in TEDPCA. PCA
decomposition with the mdl, kic and aic options are based on a Moving Average
Expand Down Expand Up @@ -355,6 +360,7 @@ def tedpca(
mixing=comp_ts,
adaptive_mask=adaptive_mask,
tes=tes,
n_independent_echos=n_independent_echos,
io_generator=io_generator,
label="PCA",
external_regressors=None,
Expand All @@ -377,6 +383,7 @@ def tedpca(
component_table, metric_metadata = kundu_tedpca(
component_table,
n_echos,
n_independent_echos,
kdaw,
rdaw,
stabilize=False,
Expand All @@ -385,6 +392,7 @@ def tedpca(
component_table, metric_metadata = kundu_tedpca(
component_table,
n_echos,
n_independent_echos,
kdaw,
rdaw,
stabilize=True,
Expand Down
14 changes: 12 additions & 2 deletions tedana/metrics/collect.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ def generate_metrics(
mixing: npt.NDArray,
adaptive_mask: npt.NDArray,
tes: Union[List[int], List[float], npt.NDArray],
n_independent_echos: int = None,
io_generator: io.OutputGenerator,
label: str,
external_regressors: Union[pd.DataFrame, None] = None,
Expand All @@ -53,6 +54,10 @@ def generate_metrics(
For more information on thresholding, see `make_adaptive_mask`.
tes : list
List of echo times associated with `data_cat`, in milliseconds
n_independent_echos : int
Number of independent echoes to use in goodness of fit metrics (fstat).
Primarily used for EPTI acquisitions.
If None, number of echoes will be used. Default is None.
io_generator : tedana.io.OutputGenerator
The output generator object for this workflow
label : str in ['ICA', 'PCA']
Expand Down Expand Up @@ -122,6 +127,10 @@ def generate_metrics(
# Ensure that echo times are in an array, rather than a list
tes = np.asarray(tes)

# use either the inputted number of indie echoes or the total number of echoes
# to calculate the threshold for f tests
f_thresh, _, _ = getfbounds(n_independent_echos or len(tes))

# Get reference image from io_generator
ref_img = io_generator.reference_img

Expand Down Expand Up @@ -196,6 +205,7 @@ def generate_metrics(
mixing=mixing,
adaptive_mask=adaptive_mask,
tes=tes,
n_independent_echos=n_independent_echos,
)
metric_maps["map FT2"] = m_t2
metric_maps["map FS0"] = m_s0
Expand Down Expand Up @@ -224,7 +234,7 @@ def generate_metrics(

if "map FT2 clusterized" in required_metrics:
LGR.info("Calculating T2* F-statistic maps")
f_thresh, _, _ = getfbounds(len(tes))

metric_maps["map FT2 clusterized"] = dependence.threshold_map(
maps=metric_maps["map FT2"],
mask=mask,
Expand All @@ -234,7 +244,7 @@ def generate_metrics(

if "map FS0 clusterized" in required_metrics:
LGR.info("Calculating S0 F-statistic maps")
f_thresh, _, _ = getfbounds(len(tes))

metric_maps["map FS0 clusterized"] = dependence.threshold_map(
maps=metric_maps["map FS0"],
mask=mask,
Expand Down
15 changes: 13 additions & 2 deletions tedana/metrics/dependence.py
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,7 @@ def calculate_f_maps(
mixing: np.ndarray,
adaptive_mask: np.ndarray,
tes: np.ndarray,
n_independent_echos=None,
f_max: float = 500,
) -> typing.Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
"""Calculate pseudo-F-statistic maps for TE-dependence and -independence models.
Expand All @@ -153,6 +154,10 @@ def calculate_f_maps(
"good signal". Limited to masked voxels.
tes : (E) array_like
Echo times in milliseconds, in the same order as the echoes in data_cat.
n_independent_echos : int
Number of independent echoes to use in goodness of fit metrics (fstat).
Primarily used for EPTI acquisitions.
If None, number of echoes will be used. Default is None.
f_max : float, optional
Maximum F-statistic, used to crop extreme values. Values in the
F-statistic maps greater than this value are set to it.
Expand Down Expand Up @@ -201,7 +206,10 @@ def calculate_f_maps(
pred_s0 = x1[:j_echo, :] * np.tile(coeffs_s0, (j_echo, 1))
sse_s0 = (comp_betas[:j_echo] - pred_s0) ** 2
sse_s0 = sse_s0.sum(axis=0) # (S,) prediction error map
f_s0 = (alpha - sse_s0) * (j_echo - 1) / (sse_s0)
if n_independent_echos is None or n_independent_echos >= j_echo:
f_s0 = (alpha - sse_s0) * (j_echo - 1) / (sse_s0)
else:
f_s0 = (alpha - sse_s0) * (n_independent_echos - 1) / (sse_s0)
f_s0[f_s0 > f_max] = f_max
f_s0_maps[mask_idx, i_comp] = f_s0[mask_idx]

Expand All @@ -212,7 +220,10 @@ def calculate_f_maps(
pred_t2 = x2[:j_echo] * np.tile(coeffs_t2, (j_echo, 1))
sse_t2 = (comp_betas[:j_echo] - pred_t2) ** 2
sse_t2 = sse_t2.sum(axis=0)
f_t2 = (alpha - sse_t2) * (j_echo - 1) / (sse_t2)
if n_independent_echos is None or n_independent_echos >= j_echo:
f_t2 = (alpha - sse_t2) * (j_echo - 1) / (sse_t2)
else:
f_t2 = (alpha - sse_t2) * (n_independent_echos - 1) / (sse_t2)
f_t2[f_t2 > f_max] = f_max
f_t2_maps[mask_idx, i_comp] = f_t2[mask_idx]

Expand Down
1 change: 1 addition & 0 deletions tedana/selection/component_selector.py
Original file line number Diff line number Diff line change
Expand Up @@ -505,6 +505,7 @@ def select(
f"Step {self.current_node_idx_}: Running function {node['functionname']} "
f"with parameters: {all_params}"
)

# run the decision node function
self = fcn(self, **params, **kwargs)

Expand Down
24 changes: 22 additions & 2 deletions tedana/selection/selection_nodes.py
Original file line number Diff line number Diff line change
Expand Up @@ -714,10 +714,20 @@ def calc_kappa_elbow(
This also means the kappa elbow should be calculated before those two other functions
are called
"""
if (
"n_independent_echos" in selector.cross_component_metrics_.keys()
and selector.cross_component_metrics_["n_independent_echos"]
):
n_independent_echos = selector.cross_component_metrics_["n_independent_echos"]
else:
# Number of independent echoes is the number of echoes if not otherwise specified
n_independent_echos = selector.cross_component_metrics_["n_echos"]

outputs = {
"decision_node_idx": selector.current_node_idx_,
"node_label": None,
"n_echos": selector.cross_component_metrics_["n_echos"],
"n_independent_echos": n_independent_echos,
"used_metrics": {"kappa"},
"calc_cross_comp_metrics": [
"kappa_elbow_kundu",
Expand Down Expand Up @@ -777,7 +787,7 @@ def calc_kappa_elbow(
outputs["varex_upper_p"],
) = kappa_elbow_kundu(
selector.component_table_,
selector.cross_component_metrics_["n_echos"],
n_independent_echos,
comps2use=comps2use,
)
selector.cross_component_metrics_["kappa_elbow_kundu"] = outputs["kappa_elbow_kundu"]
Expand Down Expand Up @@ -846,10 +856,20 @@ def calc_rho_elbow(
f"It is {rho_elbow_type} "
)

if (
"n_independent_echos" in selector.cross_component_metrics_.keys()
and selector.cross_component_metrics_["n_independent_echos"]
):
n_independent_echos = selector.cross_component_metrics_["n_independent_echos"]
else:
# DOF is number of echoes if not otherwise specified
n_independent_echos = selector.cross_component_metrics_["n_echos"]

outputs = {
"decision_node_idx": selector.current_node_idx_,
"node_label": None,
"n_echos": selector.cross_component_metrics_["n_echos"],
"n_independent_echos": n_independent_echos,
"calc_cross_comp_metrics": [
elbow_name,
"rho_allcomps_elbow",
Expand Down Expand Up @@ -904,7 +924,7 @@ def calc_rho_elbow(
outputs["elbow_f05"],
) = rho_elbow_kundu_liberal(
selector.component_table_,
selector.cross_component_metrics_["n_echos"],
n_independent_echos,
rho_elbow_type=rho_elbow_type,
comps2use=comps2use,
subset_comps2use=subset_comps2use,
Expand Down
25 changes: 16 additions & 9 deletions tedana/selection/selection_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -580,7 +580,7 @@ def getelbow(arr, return_val=False):
return k_min_ind


def kappa_elbow_kundu(component_table, n_echos, comps2use=None):
def kappa_elbow_kundu(component_table, n_independent_echos, comps2use=None):
"""
Calculate an elbow for kappa.

Expand All @@ -592,8 +592,10 @@ def kappa_elbow_kundu(component_table, n_echos, comps2use=None):
Component metric table. One row for each component, with a column for
each metric. The index should be the component number.
Only the 'kappa' column is used in this function
n_echos : :obj:`int`
The number of echos in the multi-echo data
n_independent_echos : :obj:`int`
Number of independent echoes to use in goodness of fit metrics (fstat).
Typically the number of echos in the multi-echo data
May be a lower value for EPTI acquisitions.
comps2use : :obj:`list[int]`
A list of component indices used to calculate the elbow
default=None which means use all components
Expand Down Expand Up @@ -633,7 +635,7 @@ def kappa_elbow_kundu(component_table, n_echos, comps2use=None):
kappas2use = component_table.loc[comps2use, "kappa"].to_numpy()

# low kappa threshold
_, _, f01 = getfbounds(n_echos)
_, _, f01 = getfbounds(n_independent_echos)
# get kappa values for components below a significance threshold
kappas_nonsig = kappas2use[kappas2use < f01]

Expand Down Expand Up @@ -670,7 +672,11 @@ def kappa_elbow_kundu(component_table, n_echos, comps2use=None):


def rho_elbow_kundu_liberal(
component_table, n_echos, rho_elbow_type="kundu", comps2use=None, subset_comps2use=-1
component_table,
n_independent_echos,
rho_elbow_type="kundu",
comps2use=None,
subset_comps2use=-1,
):
"""
Calculate an elbow for rho.
Expand All @@ -684,8 +690,10 @@ def rho_elbow_kundu_liberal(
Component metric table. One row for each component, with a column for
each metric. The index should be the component number.
Only the 'kappa' column is used in this function
n_echos : :obj:`int`
The number of echos in the multi-echo data
n_independent_echos : :obj:`int`
Number of independent echoes to use in goodness of fit metrics (fstat).
Typically the number of echos in the multi-echo data
May be a lower value for EPTI acquisitions.
rho_elbow_type : :obj:`str`
The algorithm used to calculate the rho elbow. Current options are
'kundu' and 'liberal'.
Expand Down Expand Up @@ -753,8 +761,7 @@ def rho_elbow_kundu_liberal(
].tolist()

# One rho elbow threshold set just on the number of echoes
elbow_f05, _, _ = getfbounds(n_echos)

elbow_f05, _, _ = getfbounds(n_independent_echos)
# One rho elbow threshold set using all componets in comps2use
rhos_comps2use = component_table.loc[comps2use, "rho"].to_numpy()
rho_allcomps_elbow = getelbow(rhos_comps2use, return_val=True)
Expand Down
11 changes: 9 additions & 2 deletions tedana/selection/tedpca.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,9 @@
F_MAX = 500


def kundu_tedpca(component_table, n_echos, kdaw=10.0, rdaw=1.0, stabilize=False):
def kundu_tedpca(
component_table, n_echos, n_independent_echos=None, kdaw=10.0, rdaw=1.0, stabilize=False
):
"""Select PCA components using Kundu's decision tree approach.

Parameters
Expand All @@ -25,6 +27,10 @@ def kundu_tedpca(component_table, n_echos, kdaw=10.0, rdaw=1.0, stabilize=False)
variance explained. Component number should be the index.
n_echos : :obj:`int`
Number of echoes in dataset.
n_independent_echos : int
Number of independent echoes to use in goodness of fit metrics (fstat).
Primarily used for EPTI acquisitions.
If None, number of echoes will be used. Default is None.
kdaw : :obj:`float`, optional
Kappa dimensionality augmentation weight. Must be a non-negative float,
or -1 (a special value). Default is 10.
Expand Down Expand Up @@ -59,8 +65,9 @@ def kundu_tedpca(component_table, n_echos, kdaw=10.0, rdaw=1.0, stabilize=False)
+ 1
]
varex_norm_cum = np.cumsum(component_table["normalized variance explained"])
n_independent_echos = n_independent_echos or n_echos
fmin, fmid, fmax = getfbounds(n_independent_echos)

fmin, fmid, fmax = getfbounds(n_echos)
if int(kdaw) == -1:
lim_idx = (
utils.andb([component_table["kappa"] < fmid, component_table["kappa"] > fmin]) == 2
Expand Down
14 changes: 8 additions & 6 deletions tedana/stats.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,24 +11,26 @@
RepLGR = logging.getLogger("REPORT")


def getfbounds(n_echos):
def getfbounds(n_independent_sources):
"""
Get F-statistic boundaries based on number of echos.

Parameters
----------
n_echos : :obj:`int`
Number of echoes
n_independent_sources : :obj:`int`
The number of independent sources to calculate DOF for goodness of fit metrics (fstat).
Typically the number of echos in the multi-echo data
May be a lower value for EPTI acquisitions.

Returns
-------
fmin, fmid, fmax : :obj:`float`
F-statistic thresholds for alphas of 0.05, 0.025, and 0.01,
respectively.
"""
f05 = stats.f.ppf(q=(1 - 0.05), dfn=1, dfd=(n_echos - 1))
f025 = stats.f.ppf(q=(1 - 0.025), dfn=1, dfd=(n_echos - 1))
f01 = stats.f.ppf(q=(1 - 0.01), dfn=1, dfd=(n_echos - 1))
f05 = stats.f.ppf(q=(1 - 0.05), dfn=1, dfd=(n_independent_sources - 1))
f025 = stats.f.ppf(q=(1 - 0.025), dfn=1, dfd=(n_independent_sources - 1))
f01 = stats.f.ppf(q=(1 - 0.01), dfn=1, dfd=(n_independent_sources - 1))
return f05, f025, f01


Expand Down
3 changes: 3 additions & 0 deletions tedana/tests/test_integration.py
Original file line number Diff line number Diff line change
Expand Up @@ -126,11 +126,13 @@ def test_integration_five_echo(skip_integration):
suffix = ".sm.nii.gz"
datalist = [prepend + str(i + 1) + suffix for i in range(5)]
echo_times = [15.4, 29.7, 44.0, 58.3, 72.6]
# adding n_independent_echos=4 to test workflow code using n_independent_echos is executed
tedana_cli.tedana_workflow(
data=datalist,
tes=echo_times,
ica_method="robustica",
n_robust_runs=4,
n_independent_echos=4,
out_dir=out_dir,
tedpca=0.95,
fittype="curvefit",
Expand Down Expand Up @@ -632,6 +634,7 @@ def test_integration_t2smap(skip_integration):
+ [str(te) for te in echo_times]
+ ["--out-dir", out_dir, "--fittype", "curvefit"]
+ ["--masktype", "dropout", "decay"]
+ ["--n-independent-echos", "4"]
)
t2smap_cli._main(args)

Expand Down
Loading