Add support for NIST Public Data Repository DOIs #442

Open
wants to merge 2 commits into base: main

jat255 commented Nov 7, 2024

This PR implements a DOI downloader for the NIST public data repository at https://data.nist.gov

This is my first time contributing to pooch, so please let me know if any changes are needed. I have included tests, and it works as far as I can tell. Happy to iterate as needed.

Relevant issues/PRs:

Fixes #441
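
For readers unfamiliar with how pooch addresses DOI-hosted files: its DOI URLs take the form `doi:<DOI>/<file name>`. The snippet below is a minimal sketch of how such a URL splits into its two parts. It is written independently of pooch's internals and is not the code in this PR. The helper name `split_doi_url` and the file name `example-data.csv` are hypothetical, and the `10.18434` DOI prefix is assumed here for illustration.

```python
def split_doi_url(url):
    """Split a pooch-style ``doi:<DOI>/<file name>`` URL into (doi, file_name)."""
    prefix = "doi:"
    if not url.startswith(prefix):
        raise ValueError(f"Expected a URL starting with 'doi:', got '{url}'.")
    # Everything before the last slash is the DOI; the last segment is the file.
    doi, sep, file_name = url[len(prefix):].rpartition("/")
    if not sep or not doi or not file_name:
        raise ValueError(f"URL '{url}' must look like 'doi:<DOI>/<file name>'.")
    return doi, file_name


# Hypothetical URL: the DOI prefix and the file name are placeholders.
doi, file_name = split_doi_url("doi:10.18434/mds2-3408/example-data.csv")
print(doi)        # 10.18434/mds2-3408
print(file_name)  # example-data.csv
```

Splitting on the last slash matters because the DOI itself contains a slash, so a plain `split("/")` would cut it in the wrong place.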

@jat255
Author

jat255 commented Nov 12, 2024

@santisoler is there anything else needed for this PR to get a review? (Apologies if you're not the right person to ping, but I saw you had the most recent merge activity.)

@santisoler
Member

Hi @jat255. Thanks for opening this PR. And yes, it's ok to ping me here.

Sorry for the delay. These past weeks were too busy for me.

I'll take a look at this and come back later. But again, thanks for taking the time to open this PR.

cc @leouieda

@santisoler
Member

In the meantime, do you have a link to the API specification of the NIST data repository?

To make our lives easier while maintaining this feature, it's important to have a reference for the responses the API supports, so we can avoid relying on responses that might change in the future.

@jat255
Author

jat255 commented Nov 26, 2024

No worries. Thanks for the status update. The API information is here, though the response format does not appear to be thoroughly specified: https://data.nist.gov/rmm/#/operations/record

I'll open a request with the team in charge of this project to see if they can fix that and get a better "example" response shown on that page. (EDIT: see usnistgov/oar-pdr#353 to track this)

To see an example response, enter a valid ID (such as mds2-3408) in the box on the page.
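
Since the response format is not formally specified, code that consumes it is safest when it tolerates missing fields. The snippet below is a rough sketch of that idea, not the implementation in this PR: it turns a record into a `file name -> "algorithm:hash"` mapping of the kind pooch registries use. The field names (`components`, `filepath`, `checksum`, `hash`, `algorithm`, `tag`) and the whole `sample_record` are assumptions made for illustration and should be checked against a real response before relying on them.

```python
def build_registry(record):
    """Map each file path in a record to an 'algorithm:hash' string."""
    registry = {}
    for component in record.get("components", []):
        file_path = component.get("filepath")
        checksum = component.get("checksum") or {}
        algorithm = (checksum.get("algorithm") or {}).get("tag")
        digest = checksum.get("hash")
        # Skip entries that lack a file path or checksum instead of
        # failing on fields the API does not guarantee.
        if not (file_path and algorithm and digest):
            continue
        registry[file_path] = f"{algorithm}:{digest}"
    return registry


# Hand-written stand-in for an API response; every key and value is assumed.
sample_record = {
    "ediid": "mds2-3408",
    "components": [
        {
            "filepath": "data/measurements.csv",
            "downloadURL": "https://example.org/data/measurements.csv",
            "checksum": {"hash": "0123abcd", "algorithm": {"tag": "sha256"}},
        },
        {"title": "Landing page"},  # no file path or checksum, so it is skipped
    ],
}

print(build_registry(sample_record))
# {'data/measurements.csv': 'sha256:0123abcd'}
```

Reading every field with `.get()` keeps the failure mode mild: an unexpected response yields an empty or partial registry that a test can detect, rather than a `KeyError` deep inside a download.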

@jat255
Author

jat255 commented Jan 6, 2025

@santisoler @leouieda

Would it be possible to get this reviewed?

Member

@santisoler santisoler left a comment

Hi @jat255! Thanks again for this PR. I think it's looking good, and I think we can merge it shortly. I left a few comments below; please check them out and let me know what you think.

The other thing I think we should do before merging this PR is add the NIST repository to the list of supported DOI services in the documentation. For example, we should include it in the docstring of pooch.DOIDownloader (see https://www.fatiando.org/pooch/dev/api/generated/pooch.DOIDownloader.html#pooch.DOIDownloader).

If you'd like to include an example that downloads a file from NIST, that would be great! But it isn't required to merge this PR.

Let me know if you need any help with this. Looking forward to merging it!

Comment on lines +1257 to +1263

Notes
-----
After Zenodo migrated to InvenioRDM on Oct 2023, their API changed. The
checksums for each file listed in the API reference is now an md5 sum.

This method supports both the legacy and the new API.
Member

These notes were intended only for the Zenodo downloader, so we can remove them from here.

Suggested change (remove these lines):
Notes
-----
After Zenodo migrated to InvenioRDM on Oct 2023, their API changed. The
checksums for each file listed in the API reference is now an md5 sum.
This method supports both the legacy and the new API.

downloader(to_download, outfile, None)


def test_populate_registry_nist_pdr(tmp_path):
Member

The registry population requires us to ping the server, so let's decorate this test with @pytest.mark.network:

Suggested change
def test_populate_registry_nist_pdr(tmp_path):
@pytest.mark.network
def test_populate_registry_nist_pdr(tmp_path):

Successfully merging this pull request may close these issues.

Support for the NIST Public Data Repository in DOI downloader