@hkershaw-brown
Member

This pull request adds a module, data.py, that manages downloading the example data, so people can run the examples without cloning the repo or manually downloading the example files.
The example data is too large for PyPI, so it is stored on Zenodo: https://zenodo.org/records/18135062

The get_example_data function looks for the example data in this order:

  • Data directory (local installs)
  • Environment variable (can be used to point to existing example data)
  • Zenodo (downloads the example data to the user's home directory)
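
The lookup order above can be sketched roughly as follows. This is a minimal illustration, not the code in data.py: the helper name find_example_data, the environment variable name PYDARTDIAGS_DATA, and the directory arguments are all assumptions made for the example.

```python
import os
from pathlib import Path

# Hypothetical name; the variable that data.py actually reads is not shown here.
ENV_VAR = "PYDARTDIAGS_DATA"


def find_example_data(name, dev_dir, cache_dir, env=None):
    """Return the first existing copy of `name`, or None if a download is needed."""
    env = os.environ if env is None else env
    candidates = [Path(dev_dir)]            # 1. data directory of a local install
    if env.get(ENV_VAR):                    # 2. user-supplied location
        candidates.append(Path(env[ENV_VAR]))
    candidates.append(Path(cache_dir))      # 3. previously downloaded copy
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None  # the caller would fall back to the Zenodo download
```

Passing the environment as an argument keeps the sketch easy to test without touching os.environ.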

fixes #61
fixes #108

setup.py has been removed and pyproject.toml has been updated. The package can still be installed in editable mode (pip install -e .).
fixes #110

@codecov-commenter

codecov-commenter commented Jan 5, 2026

Codecov Report

❌ Patch coverage is 0% with 90 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.12%. Comparing base (c1056bf) to head (765d239).
⚠️ Report is 4 commits behind head on main.

Files with missing lines | Patch % | Lines
src/pydartdiags/data.py | 0.00% | 90 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #119      +/-   ##
==========================================
- Coverage   69.75%   64.12%   -5.64%     
==========================================
  Files           4        5       +1     
  Lines        1025     1115      +90     
==========================================
  Hits          715      715              
- Misses        310      400      +90     

@hkershaw-brown
Member Author

Test package is at https://test.pypi.org/project/pydartdiags/0.6.2/
@hkershaw-brown to test a fresh pip install.

@hkershaw-brown hkershaw-brown marked this pull request as draft January 6, 2026 14:15
@hkershaw-brown hkershaw-brown marked this pull request as ready for review January 6, 2026 14:30
Collaborator

@mjs2369 mjs2369 left a comment

I've read through all the code and everything looks correct; I've been testing the various methods of getting the data with a small tester program that simply calls get_example_data on a file and prints the result.

All of the methods to get the data if it's already downloaded are working great:
# 1. Check development location (for contributors/developers)
# 2. Check environment variable
# 3. Check cache directory

But the automatic download is failing for me with:
Error downloading data: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1081)>

It does not make it to the "Download complete" print in 158 so it has to be failing in the URL request line:
urllib.request.urlretrieve(ZENODO_RECORD_URL, archive_file)

I do have certifi installed in my py-dart env, though:

(py-dart) masmith@CISL-DUNCAN pydartdiags % python -m certifi
/Users/masmith/Desktop/py-dart/lib/python3.14/site-packages/certifi/cacert.pem

I'm using the packaging branch on my editable install, so I have all the code changes as well.
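
For readers who hit the same CERTIFICATE_VERIFY_FAILED error: the standard library does not consult certifi on its own, so having certifi installed does not by itself change which CA bundle urllib uses. One possible workaround, sketched below under assumptions, is to build an SSL context from certifi's bundle and pass it to urllib.request.urlopen. The helper names make_ssl_context and download are hypothetical, and this is not what data.py does.

```python
import shutil
import ssl
import urllib.request


def make_ssl_context():
    """Build a verifying SSL context, preferring certifi's CA bundle if available."""
    try:
        import certifi  # third-party; optional in this sketch
    except ImportError:
        return ssl.create_default_context()
    return ssl.create_default_context(cafile=certifi.where())


def download(url, dest, context=None):
    """Stream `url` to the file `dest` using an explicit SSL context."""
    context = context or make_ssl_context()
    with urllib.request.urlopen(url, context=context) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
```

On python.org macOS builds, running the Install Certificates.command script that ships with the installer may also resolve the error without any code changes.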

@hkershaw-brown
Member Author

Thanks for the review, Marlee. I am trying to reproduce the error but have been unable to.

What I tried:

  • VPN, no VPN, on Derecho
  • install pyDARTdiags
  • install the docs requirements.txt

When the error occurs, do you also get the message about a manual download? (I will break the download code to check that the message appears.)
except Exception as e:
    print(f"Error downloading data: {e}")
    print(f"\nManual download instructions:")
    print(f"1. Download from: {ZENODO_DOI}")
    print(f"2. Extract to: {cache_dir}")
    raise

@hkershaw-brown
Member Author

Just noticed you are on Python 3.14. I will check this.

@mjs2369
Collaborator

mjs2369 commented Jan 13, 2026

@hkershaw-brown I do get the message about a manual download.

Here is the full message:

(py-dart) masmith@CISL-DUNCAN pydartdiags % python3 data_tester.py                
package_dir: /Users/masmith/Desktop/pyDARTdiags
Data file 'obs_seq.final.1000' not found locally.
Downloading all example data from Zenodo...
archive_file:  /Users/masmith/.pydartdiags/18135062.zip
Downloading data from Zenodo (https://doi.org/10.5281/zenodo.18135062)...
This may take a few minutes (approx. 85 MB)...
Error downloading data: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1081)>

Manual download instructions:
1. Download from: https://doi.org/10.5281/zenodo.18135062
2. Extract to: /Users/masmith/.pydartdiags/data
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 1321, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              encode_chunked=req.has_header('Transfer-encoding'))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1358, in request
    self._send_request(method, url, body, headers, encode_chunked)
    ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1404, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1353, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1113, in _send_output
    self.send(msg)
    ~~~~~~~~~^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1057, in send
    self.connect()
    ~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/http/client.py", line 1499, in connect
    self.sock = self._context.wrap_socket(self.sock,
                ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
                                          server_hostname=server_hostname)
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/ssl.py", line 455, in wrap_socket
    return self.sslsocket_class._create(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        sock=sock,
        ^^^^^^^^^^
    ...<5 lines>...
        session=session
        ^^^^^^^^^^^^^^^
    )
    ^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/ssl.py", line 1076, in _create
    self.do_handshake()
    ~~~~~~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/ssl.py", line 1372, in do_handshake
    self._sslobj.do_handshake()
    ~~~~~~~~~~~~~~~~~~~~~~~~~^^
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1081)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/masmith/Desktop/pyDARTdiags/src/pydartdiags/data_tester.py", line 6, in <module>
    data_file = get_example_data("obs_seq.final.1000")
  File "/Users/masmith/Desktop/pyDARTdiags/src/pydartdiags/data.py", line 108, in get_example_data
    download_all_data()
    ~~~~~~~~~~~~~~~~~^^
  File "/Users/masmith/Desktop/pyDARTdiags/src/pydartdiags/data.py", line 158, in download_all_data
    urllib.request.urlretrieve(ZENODO_RECORD_URL, archive_file)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 212, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
                            ~~~~~~~^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 187, in urlopen
    return opener.open(url, data, timeout)
           ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 487, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 504, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
                              '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 464, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 1369, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                        context=self._context)
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/urllib/request.py", line 1324, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1081)>
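
The manual download instructions printed in the log above amount to unpacking the downloaded archive into the cache directory. A minimal sketch with the standard library follows, assuming the archive is a zip file (the log shows it saved as 18135062.zip). The helper name extract_example_data is hypothetical, and the layout of files inside the real archive is not shown in this conversation, so the sketch simply extracts everything and reports what arrived.

```python
import zipfile
from pathlib import Path


def extract_example_data(archive_file, cache_dir):
    """Unpack a manually downloaded archive into the cache directory."""
    cache_dir = Path(cache_dir).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_file) as archive:
        archive.extractall(cache_dir)
    # Return the extracted file paths so the caller can check what arrived.
    return sorted(p for p in cache_dir.rglob("*") if p.is_file())
```

Called with the path of the downloaded zip and "~/.pydartdiags/data", this would place the files where the message says to extract them; whether get_example_data then finds them depends on the archive's internal layout.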

@hkershaw-brown
Member Author

I cannot reproduce the error, do you think the message is sufficient for people having SSL problems?

@mjs2369
Collaborator

mjs2369 commented Jan 13, 2026

I cannot reproduce the error, do you think the message is sufficient for people having SSL problems?

Yes, I think that is fine. The manual download is very easy. I'll approve this.

data.py module for managing download of example data.
The example data is too big for pypi so is stored on Zenodo

Try data directory (local installs)
Environment variable (can be used to point to example data)
Zenodo (downloads example data to user's home directory)

fixes #108
@hkershaw-brown hkershaw-brown merged commit 4b69d48 into main Jan 14, 2026
2 checks passed
@hkershaw-brown hkershaw-brown deleted the packaging branch January 14, 2026 17:37
Development

Successfully merging this pull request may close these issues.

  • setup.py superseded by pyproject.toml
  • Example scripts require repository data
  • data to include in the pypi package

4 participants