Add --ctab parameter to ica_reclassify #1184

Closed
wants to merge 4 commits into from

Conversation

tsalo
Member

@tsalo tsalo commented Mar 9, 2025

Closes #1182.

Changes proposed in this pull request:

  • Add a --ctab parameter to ica_reclassify that accepts a component table.
  • If --manacc, --manrej, and --ctab are not provided, then raise an error (was being raised if --manacc and --manrej weren't provided).
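The check in the second bullet can be sketched as follows. This is only an illustration, assuming each option is parsed into a value that is empty or `None` when omitted; the function name `check_reclassify_inputs` is hypothetical and not taken from the tedana code base.

```python
def check_reclassify_inputs(manacc=None, manrej=None, ctab=None):
    """Raise an error when no source of reclassification is supplied.

    Hypothetical helper: the arguments stand in for the parsed values of
    --manacc, --manrej, and --ctab.
    """
    if not manacc and not manrej and not ctab:
        raise ValueError(
            "At least one of --manacc, --manrej, or --ctab must be provided."
        )
```

Before this PR, the equivalent condition would only have looked at `manacc` and `manrej`.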

@tsalo tsalo added the enhancement issues describing possible enhancements to the project label Mar 9, 2025
@tsalo tsalo marked this pull request as ready for review March 9, 2025 17:16
@tsalo tsalo requested review from handwerkerd and eurunuela March 10, 2025 12:58
Collaborator

@eurunuela eurunuela left a comment

LGTM. I do think we should update the docs before merging this PR in.

@tsalo
Member Author

tsalo commented Mar 10, 2025

Is ica_reclassify documented in detail outside of the argparse-based documentation in https://tedana.readthedocs.io/en/stable/usage.html#running-the-ica-reclassify-workflow?

@tsalo
Member Author

tsalo commented Mar 10, 2025

Ironically, it looks like the documentation on using RICA is outdated and actually matches what this PR proposes:

Once the .tsv file containing the result of manual component classification is obtained, it is necessary to re-run the tedana workflow (see Running the ica_reclassify workflow) passing the manual_classification.tsv file with the --ctab option.

@eurunuela
Collaborator

Is ica_reclassify documented in detail outside of the argparse-based documentation in https://tedana.readthedocs.io/en/stable/usage.html#running-the-ica-reclassify-workflow?

You may be right. I think that may be the only documentation we have. For some reason I thought I remembered a different section.

@handwerkerd
Member

Sorry I didn't notice the open issue a few days ago before you started to work on the PR. I do have some concerns about this.

  • The key aspect of inputting a list of accepted and rejected component numbers through the selector object is that it updates the component status table tsv & the processed steps in the decision tree json. By just swapping out the base component table, you're making substantial changes to classifications without logging the changes.

selector.select(

  • This PR is replacing one component table with another. One minor risk is that this could remove all existing information from the table (e.g. other metrics, existing classification tags, etc). This puts the onus on whoever is making the new component table not to mess up existing info.

I have two alternate proposals to address the above:

  1. Instead of adding ctab, add an option for a classification tag. Assuming the goal is to reject components that are rejected by AROMA, you could make a list of all those component numbers and submit them through --manrej. You'd also pass --rej-tag AROMA so that the reason for the rejection would be more clear. The strength of this approach is that the edits would be very light & it would fit within the existing approach. The weakness of this approach is that, if AROMA is also saving multiple metrics for each component, those would be stored in a different component table.

  2. If AROMA has other metrics that you want to save in a combined component table, then the ctab approach you have here is better, but the input could just be the AROMA results. One would need code to include the union of all columns in the component table dataframe (which I assume you already wrote to make the combined input you're providing to ctab). If there's a matching column label for a metric, throw an error if the values don't match. The classification column could be converted to a list of components to accept or reject (with a way to add classification tags) & the existing code could be used to update classifications & track provenance.

Thoughts?

@handwerkerd
Member

handwerkerd commented Mar 11, 2025

Adding more details to my proposal 2 above:

  • The inputted component table needs to have columns labelled precisely Component, classification, and classification_tags
  • Component values are ICA_00, ICA_01, etc and need to have exactly the same number of rows and the same values as the original output of tedana
  • classification values must be accepted or rejected (The current full system permits more options, but I don't think it's needed here)
    • Convert the rejected values to a list of component numbers for the --manrej options & run the selector
    • Are there current scenarios where we want one accept to supersede another rejection? That is, if tedana rejects & AROMA accepts, that should remain rejected and we should not currently give an option to use the --manacc option when combining tables. This is my preference. If one is doing complex decisions that balance two acceptance processes, that should be a single decision tree within tedana.
    • I briefly checked the code and I think, if a list of components in --manrej only includes already rejected components, it should still run, but I'd want to confirm.
  • classification_tags can be any relevant labels in a comma separated list (e.g. rejected AROMA, accepted AROMA)
    • The union of classification tags from both tables will be included with the combined table (i.e. a component's tag could be Likely BOLD, accepted AROMA)
    • This can be done by adding an optional classification_tags option to add_manual with the default being manual reclassify
    • def add_manual(self, indices: List[int], classification: str):
  • Other columns are allowed.
    • If a column has the same label as one in tedana & the row values are identical to tedana's component table to 6 decimals, then keep the tedana values. Otherwise throw an error
    • If it's a new column label, merge it into the combined component table

I'm including a lot of details, because I wanted to specify everything, but most of the mechanics for the above already exists in the code (plus adding ctab that's already in this PR). The only non-trivial new thing would be a function to merge columns in two component tables that checks for problems. If you decide to follow this approach, I can help/contribute.
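A rough sketch of what such a merge function might look like, following the rules listed above. It is not the tedana implementation: the name `merge_component_tables` is hypothetical, each table is represented as a list of row dictionaries (for example as read with `csv.DictReader` using a tab delimiter), and "identical to 6 decimals" is interpreted here as equality after rounding.

```python
REQUIRED = ("Component", "classification", "classification_tags")


def _same_value(a, b, decimals):
    """Compare numerically after rounding; fall back to exact equality."""
    try:
        return round(float(a), decimals) == round(float(b), decimals)
    except (TypeError, ValueError):
        return a == b


def _union_tags(a, b):
    """Combine two comma-separated tag strings, dropping duplicates."""
    tags = [t.strip() for t in f"{a},{b}".split(",") if t.strip()]
    return ", ".join(dict.fromkeys(tags))


def merge_component_tables(tedana_rows, external_rows, decimals=6):
    """Merge an external table into tedana's table (hypothetical helper).

    Returns the merged rows and the row positions the external table
    rejects, which could then be handed to the existing --manrej logic.
    """
    ted_ids = [row["Component"] for row in tedana_rows]
    ext_ids = [row.get("Component") for row in external_rows]
    if ted_ids != ext_ids:
        raise ValueError("Component rows must match the tedana table exactly.")

    merged, to_reject = [], []
    for idx, (ted, ext) in enumerate(zip(tedana_rows, external_rows)):
        missing = [col for col in REQUIRED if col not in ext]
        if missing:
            raise ValueError(f"External table is missing columns: {missing}")
        if ext["classification"] not in ("accepted", "rejected"):
            raise ValueError("classification must be accepted or rejected.")

        row = dict(ted)  # tedana's values are kept for shared columns
        for col, value in ext.items():
            if col in REQUIRED:
                continue
            if col in ted and not _same_value(ted[col], value, decimals):
                raise ValueError(f"Column {col!r} differs for {ted['Component']}.")
            row.setdefault(col, value)  # new column labels are merged in

        row["classification_tags"] = _union_tags(
            ted["classification_tags"], ext["classification_tags"]
        )
        if ext["classification"] == "rejected":
            to_reject.append(idx)
        merged.append(row)
    return merged, to_reject
```

The classification column itself is left untouched in this sketch, on the assumption that the selector would apply the rejections from the returned list so that the status table and decision tree record the change.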

@tsalo
Member Author

tsalo commented Mar 11, 2025

By just swapping out the base component table, you're making substantial changes to classifications without logging the changes.

I hadn't considered that this PR would ignore the status table from the previous run. For my use case, I don't need that status table- it's still available in the original tedana directory, and I'm not planning to delete that. I just want to apply a component table + mixing matrix to my data to get an updated orthogonalized mixing matrix that I can use for later denoising. It would be nice to have some of the other bells and whistles (esp. an HTML report with component plots), but that's not a dealbreaker. It might just be easier for me to write a short script to handle this for my data.

Instead of adding ctab, add an option for a classification tag.

AROMA does include its own metrics and has multiple classification tags that may apply to each component, so I think this would end up being cluttered at the command line and would lose some information.

If AROMA has other metrics that you want to save in a combined component table, then the ctab approach you have here is better, but the input could just be the AROMA results.

I think this is a solid idea. I suppose, with that approach, one could provide multiple component tables, and even have an option for the decision strategy (e.g., reject if any component table rejects, accept if any component table accepts, or apply a rejection threshold based on the percentage of component tables that reject the component).
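To make the three decision strategies concrete, here is a minimal sketch. The function name `combine_classifications` and the strategy labels are made up for illustration, and each component table is reduced to a plain list of "accepted"/"rejected" labels, one per component.

```python
def combine_classifications(decisions, strategy="any_reject", threshold=0.5):
    """Combine per-table classifications into one label per component.

    decisions: list of lists, one inner list of "accepted"/"rejected"
    labels per component table (hypothetical representation).
    """
    if len({len(table) for table in decisions}) > 1:
        raise ValueError("All tables must describe the same components.")

    n_tables = len(decisions)
    combined = []
    for votes in zip(*decisions):
        n_rejected = sum(vote == "rejected" for vote in votes)
        if strategy == "any_reject":
            # Reject if any component table rejects.
            reject = n_rejected > 0
        elif strategy == "any_accept":
            # Accept if any component table accepts.
            reject = n_rejected == n_tables
        elif strategy == "threshold":
            # Reject when the fraction of rejecting tables reaches the threshold.
            reject = n_rejected / n_tables >= threshold
        else:
            raise ValueError(f"Unknown strategy: {strategy!r}")
        combined.append("rejected" if reject else "accepted")
    return combined
```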

@handwerkerd
Member

handwerkerd commented Mar 11, 2025

For my use case, I don't need that status table- it's still available in the original tedana directory, and I'm not planning to delete that. I just want to apply a component table + mixing matrix to my data to get an updated orthogonalized mixing matrix that I can use for later denoising. It would be nice to have some of the other bells and whistles (esp. an HTML report with component plots), but that's not a dealbreaker. It might just be easier for me to write a short script to handle this for my data.

If that's your main goal, then I'd recommend just sending the new list of components to reject to tedana and adding functionality to have an AROMA rejected classification tag. This is a really easy change.

You'll still have the component table you created, but it wouldn't mess with how the component table created by tedana interacts with other parts. If you want a fully combined component table, then we should do something slightly more expansive that merges the two component tables within tedana.

[update]
I just checked the key points of the code and confirmed that just adding options to ica_reclassify for --acctag and --rejtag to specify classification tags for listed components would be very easy & would be the first step in a better integration of a --ctab input. @tsalo, if this would fit your current use case, I can create another PR.

With this option, your workflow would be:

  1. Run AROMA
  2. Run tedana with AROMA's mixing matrix as input.
  3. Instead of merging 2 component tables, extract the list of rejected components from AROMA & run ica_reclassify --manrej [component list] --rejtag 'AROMA rejected'
  • Unless you have a reason to need a single combined component table, this seems like similar or less effort from the pipeline creation side.
  • If AROMA has multiple classification tags for rejections, then you'd either run ica_reclassify multiple times or we can allow a list of labels as input to --manrej (which would be more work to parse).
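Step 3 could be scripted roughly as below. Note the assumptions: `--rejtag` is the option proposed in this thread and may not exist in any released version, AROMA's labels are assumed to be available as a list of "accepted"/"rejected" strings in component order, and the registry path passed as the final argument is a placeholder. The helper name `build_reclassify_command` is hypothetical.

```python
import shlex


def build_reclassify_command(aroma_labels, registry, tag="AROMA rejected"):
    """Build an ica_reclassify call that rejects AROMA's rejected components.

    aroma_labels: "accepted"/"rejected" labels in component order (assumed).
    registry: placeholder path to the registry file from the tedana run.
    """
    # Component numbers are taken to be the zero-based row positions.
    rejected = [
        str(idx) for idx, label in enumerate(aroma_labels) if label == "rejected"
    ]
    # --rejtag is the flag proposed above, not a confirmed option.
    command = ["ica_reclassify", "--manrej", *rejected, "--rejtag", tag, registry]
    return shlex.join(command)
```

`shlex.join` takes care of quoting the tag, so a label containing a space is passed as a single argument.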

@tsalo
Member Author

tsalo commented Mar 21, 2025

We discussed this a bit on Mattermost, so I'll try to summarize our conversation.

For most users, allowing ica_reclassify to apply a single tag to the rejected and/or accepted components will probably be enough. For my use case, it's just not enough information. To be honest, I find the current classification tags not particularly informative: a component being "unlikely BOLD" isn't useful, given that the component was rejected, so it has to be "unlikely BOLD". With my AROMA+tedana combination, I want to know which particular criteria resulted in each classification, so I need the AROMA rationales. Each component may be rejected for multiple reasons, and the reasons will vary across components, which would be difficult to translate to command line arguments. I recognize that most users won't want that level of information, which is part of why we moved away from the rationale codes to broader classification tags, so I'll stick with my custom code for this step.

Basically, I'll close this PR in favor of a followup PR from @handwerkerd that adds classification tag parameters, and I'll use custom code for my AROMA+tedana denoising.

@tsalo tsalo closed this Mar 21, 2025
@eurunuela
Collaborator

I think your custom code could still be useful for others in the community. Maybe we could add a link to it in our docs if you think that's a good idea.

Labels
enhancement issues describing possible enhancements to the project
Development

Successfully merging this pull request may close these issues.

Accept --ctab argument in ica_reclassify
3 participants