Describe how to handle duplicate datasets with delete-from-index #6

Closed
mobb opened this issue May 17, 2018 · 0 comments

mobb commented May 17, 2018

related to: EDIorg/ECC#10

A public search turns up duplicate datasets, which is not intended. The deprecated dataset scope.docid-A.rev is supposed to be replaced by dataset scope.docid-B.rev.

The solution to the duplicate is technically called a "delete", although datasets are only "deleted-from-index": they are not actually deleted, only archived.

PASTA can handle revisions within the same docid, but not across docids. So if there is to be a trace from B back to A, that housekeeping task has to be handled by the site.
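To make the same-docid vs. cross-docid distinction concrete, here is a minimal sketch that splits a `scope.docid.rev` identifier and reports whether a replacement falls outside the original docid series, which is the case the site has to track itself. The names `PackageId`, `parse_package_id` and `needs_site_linkage` are hypothetical, as are the example identifiers; none of this is part of PASTA.

```python
from typing import NamedTuple


class PackageId(NamedTuple):
    scope: str
    docid: int
    rev: int


def parse_package_id(package_id: str) -> PackageId:
    """Split a 'scope.docid.rev' string into its three parts."""
    # rsplit from the right so only the last two dots are treated as separators.
    scope, docid, rev = package_id.rsplit(".", 2)
    return PackageId(scope, int(docid), int(rev))


def needs_site_linkage(deprecated: str, replacement: str) -> bool:
    """Return True when the two IDs belong to different docid series.

    Revisions within one series (same scope and docid) are the case the
    repository already handles; anything else needs a site-managed link.
    """
    a = parse_package_id(deprecated)
    b = parse_package_id(replacement)
    return (a.scope, a.docid) != (b.scope, b.docid)
```

For example, `needs_site_linkage("myscope.10.2", "myscope.25.1")` is true because the docids differ, while `needs_site_linkage("myscope.10.2", "myscope.10.3")` is false because it is an ordinary revision.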

This process needs to get written up. The site has two steps:

  1. Make the linkage between a deprecated A and its replacement B.
  2. Request that A be deleted-from-index (I think this is a manual process for the PASTA devs to do).
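The two steps could be sketched roughly as follows. This is only an illustration: where the linkage lives in metadata and how the reason is worded are still open questions, so the `Deprecation` record, both function names, and the wording of the generated text are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deprecation:
    deprecated: str   # package ID of A, in scope.docid.rev form
    replacement: str  # package ID of B, in scope.docid.rev form
    reason: str       # free text; how to describe the reason is still TBD


def linkage_note(d: Deprecation) -> str:
    """Step 1: a human-readable statement linking B back to A."""
    return f"This dataset replaces {d.deprecated}. Reason: {d.reason}"


def delete_from_index_request(d: Deprecation) -> str:
    """Step 2: text of a request asking that A be deleted-from-index."""
    return (
        f"Please delete-from-index {d.deprecated}; "
        f"it is replaced by {d.replacement}."
    )
```

Keeping both strings derived from one record means the linkage and the request cannot disagree about which package is A and which is B.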

TBD:

  • where in metadata?
    • human readable?
    • machine readable?
  • how is reason described?
  • use cases: add examples and a list of reasons why B is likely to be deprecating A.
@mobb mobb closed this as completed Oct 9, 2019
@mobb mobb transferred this issue from EDIorg/data-package-best-practices Oct 29, 2019