Bug: move_next, move_publish break on updated dataset but no new version is generated #245

Open
zoidy opened this issue Dec 10, 2021 · 2 comments
Assignees
Labels
bug Something isn't working p1 Issues affecting production, multiple users

Comments

Collaborator

zoidy commented Dec 10, 2021

Describe the bug
When a dataset is updated, not all changes trigger the generation of a new version of the Dataset in ReDATA (see the Figshare documentation for what does and doesn't trigger new versions).

When a dataset is updated and a new version is not triggered, putting the dataset into the curation workflow generates a v02 folder. However, the LD-Cool-P commands move_next, move_publish, etc. still treat it as v1, causing conflicts with the existing v1 folders for that deposit.

The solution approach is currently unclear. We may need to modify the folder naming convention for versions, e.g., by using major and minor versions: v01, v01.1, v02, ...

Note: there is a request_number field in the curation metadata json that can be examined. The request_number always increments by one every time the file is submitted for curation, no matter if a new version is generated or not.
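To make the major/minor naming idea above concrete, here is a minimal sketch of how the next folder label could be chosen. It is not LD-Cool-P code: the function names and the `new_version` flag are hypothetical, and the flag is assumed to be supplied by the curator after inspecting the changes.

```python
import re

# Matches labels such as "v01", "v02", or "v01.1" (hypothetical naming scheme).
_LABEL = re.compile(r"^v(\d{2,})(?:\.(\d+))?$")


def parse_label(label):
    """Split a folder label into (major, minor); a missing minor counts as 0."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized version label: {label!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def next_label(existing, new_version):
    """Return the next folder label given the labels already on disk.

    new_version: True if the update produces a new Figshare version,
    False if it is a curation request that leaves the version unchanged.
    """
    if not existing:
        return "v01"
    major, minor = max(parse_label(label) for label in existing)
    if new_version:
        return f"v{major + 1:02d}"
    return f"v{major:02d}.{minor + 1}"


print(next_label(["v01"], new_version=False))
print(next_label(["v01", "v01.1"], new_version=True))
```

With this scheme a review that does not produce a new version gets its own folder label instead of colliding with an existing one.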

Version information

  • LD_Cool-P version: 1.1.6

Additional note on versioning
When get_data pulls down a dataset and creates the folder structure, LD_Cool-P generates the version by simply adding 1 to the existing version, without considering that not all curation reviews result in a new version, as stated above. This behavior would need to be modified at the same time this bug is addressed. See this line of code. For the time being, simply adding 1 is beneficial since it allows the data to be pulled down, even though the other commands don't work.

@zoidy zoidy added bug Something isn't working p1 Issues affecting production, multiple users labels Dec 10, 2021
@yhan818 yhan818 self-assigned this Apr 16, 2022
Contributor

yhan818 commented Apr 16, 2022

See "What is a new version?" in the Figshare documentation: https://help.figshare.com/article/can-i-edit-or-delete-my-research-after-it-has-been-made-public#:~:text=Figshare%20supports%20versioning%20for%20both,different%20between%20items%20and%20collections

alias move_next="$ldcoolp_root/ldcoolp/scripts/perform_move --config $ldcoolp_config --direction next --article_id "
alias move_back="$ldcoolp_root/ldcoolp/scripts/perform_move --config $ldcoolp_config --direction back --article_id "
alias move_publish="$ldcoolp_root/ldcoolp/scripts/perform_move --config $ldcoolp_config --direction publish --article_id "

So perform_move is the script to debug. The problem has to do with the current folder. When no new version is generated, the script must try to overwrite the current version (basically a replace), which either succeeds or fails.

Some of these actions (i.e., anything other than changing the "title" or "author", or updating/deleting files) will not generate a new version. In the JSON response from Figshare (https://docs.figsh.com/#account_institution_curation), two fields are updated in that case: "request_number" and "modified_date".
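As an aid to inspecting such a response, the following sketch compares two curation records and lists the fields that changed, ignoring the two fields named above. The record layout is an assumption (flat dictionaries loaded from the JSON), and the helper names are hypothetical. It only reports differences; it does not decide whether Figshare will create a new version.

```python
# Fields reported in this thread as changing on every curation request.
ALWAYS_UPDATED = {"request_number", "modified_date"}


def changed_fields(old, new):
    """Return the set of top-level keys whose values differ between records."""
    keys = set(old) | set(new)
    return {key for key in keys if old.get(key) != new.get(key)}


def fields_to_inspect(old, new):
    """Changed fields, minus those that are updated on every request."""
    return sorted(changed_fields(old, new) - ALWAYS_UPDATED)


# Hypothetical records; a real curation record has many more fields.
previous = {"request_number": 1, "modified_date": "2021-12-01",
            "description": "A smal dataset"}
current = {"request_number": 2, "modified_date": "2021-12-10",
           "description": "A small dataset"}

print(fields_to_inspect(previous, current))
```

Nested values are compared as a whole, so a change anywhere inside a nested object flags only its top-level key.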

Collaborator Author

zoidy commented May 25, 2023

Update: the current practice has been to not use LD-Cool-P at all when a dataset will not generate a new version. Instead, we add the updated information to the same folder of the existing version, and manually copy a DepositReview template, renaming it with a .1 appended. This works because updates that do not generate a new version never involve changes to the Title, Authors, or Files. Unless the change is large enough (e.g. a significant update to the description), it's not worth generating a new readme (which would sidestep this issue).

Note: there is no way to tell whether an item in curation will generate a new version or not without a manual inspection of the changes.

For example: assume a dataset is at v1. A user corrects a small spelling error in the description and submits the dataset for review.

  1. The item is received as an update to the dataset in the curation dashboard
  2. The curator must inspect it carefully to see what the changes are and whether those changes will trigger a new version
  3. If the changes do not trigger a new version and the changes are small, proceed to the next step. If the changes are significant, do not continue with this process. Instead, edit the existing readme file and upload it to the dataset. This will trigger a new version and one can proceed with LD-Cool-P as normal.
  4. Go to the v01 folder in the curation server. Manually create a copy of the existing v01 review report and increase the version in the file name to v01.1.
  5. Record the curation process in the report as usual.
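Step 4 above could be scripted along these lines. This is only a sketch: the report file name used in the example (`DepositReview_v01.docx`) and the function name are hypothetical, and the actual layout of the curation server is not described in this thread.

```python
import shutil
from pathlib import Path


def copy_review_report(report, old_tag="v01", new_tag="v01.1"):
    """Copy a review report next to itself, bumping the version in its name.

    Refuses to overwrite an existing file so earlier reports are never lost.
    """
    report = Path(report)
    if old_tag not in report.name:
        raise ValueError(f"{report.name!r} does not contain {old_tag!r}")
    target = report.with_name(report.name.replace(old_tag, new_tag, 1))
    if target.exists():
        raise FileExistsError(target)
    shutil.copy2(report, target)  # copy2 also preserves file timestamps
    return target


# Example call (hypothetical path):
# copy_review_report("/path/to/deposit/v01/DepositReview_v01.docx")
```

The original report is left untouched, matching the manual practice of keeping the v01 report and adding a v01.1 copy beside it.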
