sbarhin commented Oct 30, 2025

Fixes

Description

This PR automates data fetching from the Museums Victoria API, as discussed in issue #215. The implementation follows the established patterns of the existing fetch scripts.
The purpose of this file is to fetch all records from the Museums Victoria API and save the response fields needed for the next phase (processing).

  • Fetches data for all record types (article, item, specimen, species) from the Museums Victoria API
  • Prepares the relevant response fields and saves them to a CSV file under the data/2025Q4/1-fetch directory
  • Next steps: process and report on the data once reviewers approve the fetch script
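
For readers following along, here is a minimal sketch of the fetch-and-save flow described above. It is not the code in this PR. The `fetch_page` callable, the `FIELDS` list, and the helper names are assumptions made for illustration; only the four record types and the `data/2025Q4/1-fetch` directory come from the description.

```python
import csv
import os

# Record types listed in the PR description
RECORD_TYPES = ["article", "item", "specimen", "species"]
# Hypothetical subset of response fields to keep (assumption, not from the PR)
FIELDS = ["id", "recordType", "licence"]


def collect_rows(fetch_page, record_types=RECORD_TYPES):
    """Page through every record type and keep only the fields in FIELDS.

    fetch_page(record_type, page) is a caller-supplied function (assumption)
    that returns a list of dicts, or an empty list when no pages remain.
    """
    rows = []
    for record_type in record_types:
        page = 1
        while True:
            records = fetch_page(record_type, page)
            if not records:
                break
            for record in records:
                # Missing fields are written as empty strings
                rows.append({field: record.get(field, "") for field in FIELDS})
            page += 1
    return rows


def save_rows(rows, path):
    """Write the collected rows to a CSV file, creating the directory."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as file_obj:
        writer = csv.DictWriter(file_obj, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Passing the page fetcher in as an argument keeps the HTTP call separate from the paging and CSV logic, so the flow can be exercised with a stub instead of the live API.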

Checklist

  • I have read and understood the Developer Certificate of Origin (DCO), below, which covers the contents of this pull request (PR).
  • My pull request doesn't include code or content generated with AI.
  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main or master).
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

For the purposes of this DCO, "license" is equivalent to "license or public domain dedication," and "open source license" is equivalent to "open content license or public domain dedication."

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@sbarhin sbarhin requested review from a team as code owners October 30, 2025 15:49
@sbarhin sbarhin requested review from TimidRobot and possumbilities and removed request for a team October 30, 2025 15:49

sbarhin (Author) commented Oct 30, 2025

@oree-xx I have opened a new pull request. I guess that is much better

oree-xx (Contributor) commented Oct 30, 2025

@sbarhin ohh okay great.

@TimidRobot TimidRobot self-assigned this Oct 31, 2025
@TimidRobot TimidRobot changed the title from "Add musuems_fetch.py" to "Add Museum Victoria fetch" Oct 31, 2025

sbarhin (Author) commented Oct 31, 2025

@TimidRobot There are several file changes in my PR, this is due to pulling from the main branch where I believe you merged a certain PR. These changes have taken effect in my branch hence those file changes in my PR.

TimidRobot (Member) commented

> @TimidRobot There are several file changes in my PR, this is due to pulling from the main branch where I believe you merged a certain PR. These changes have taken effect in my branch hence those file changes in my PR.

Please revisit the documentation on keeping a branch/fork synchronized with upstream and on resolving merge conflicts. This PR won't be reviewed or merged while these issues are present.

sbarhin (Author) commented Oct 31, 2025

> > @TimidRobot There are several file changes in my PR, this is due to pulling from the main branch where I believe you merged a certain PR. These changes have taken effect in my branch hence those file changes in my PR.

> Please revisit the documentation on keeping a branch/fork synchronized with upstream and on resolving merge conflicts. This PR won't be reviewed or merged while these issues are present.

I will do that please. Thank you

sbarhin (Author) commented Oct 31, 2025

@TimidRobot I believe we are good now

@TimidRobot TimidRobot changed the title from "Add Museum Victoria fetch" to "Add Museums Victoria fetch" Nov 1, 2025

TimidRobot (Member) commented

Please synchronize your fork/branch and add your data source to sources.md

TimidRobot (Member) commented

The script takes too long to run (I canceled after 10 minutes).

Please add a --limit option so that it can be developed and tested without taking the full time.

sbarhin (Author) commented Nov 2, 2025

> The script takes too long to run (I canceled after 10 minutes).
>
> Please add a --limit option so that it can be developed and tested without taking the full time.

@TimidRobot Please, should the --limit apply to individual record types or to the total number of records altogether?

TimidRobot (Member) commented

> > The script takes too long to run (I canceled after 10 minutes).
> > Please add a --limit option so that it can be developed and tested without taking the full time.

> @TimidRobot Please, should the --limit apply to individual record types or to the total number of records altogether?

@sbarhin Which implementation will satisfy the stated goal?

TimidRobot (Member) commented

> @sbarhin force-pushed the museums branch from b7d41b6 to 1c644a4 5 days ago

> @sbarhin force-pushed the museums branch from 79e6ef5 to 6ee4793 3 days ago

> @sbarhin force-pushed the museums branch from aa8774c to 6ee4793

@sbarhin force pushes are generally a worst practice and should be avoided when unnecessary.

sbarhin (Author) commented Nov 5, 2025

> > > The script takes too long to run (I canceled after 10 minutes).
> > > Please add a --limit option so that it can be developed and tested without taking the full time.

> > @TimidRobot Please, should the --limit apply to individual record types or to the total number of records altogether?

> @sbarhin Which implementation will satisfy the stated goal?

I think applying the limit per individual record type will do.
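
To make the per-record-type option concrete, here is a minimal sketch of how a `--limit` flag could cap the number of records fetched for each type. It is an illustration, not the code in this PR: `parse_arguments`, `fetch_limited`, and the `iter_records` callable are hypothetical names, and treating an omitted `--limit` as "no limit" is an assumption.

```python
import argparse
from itertools import islice

# Record types listed in the PR description
RECORD_TYPES = ["article", "item", "specimen", "species"]


def parse_arguments(argv=None):
    """Parse command-line options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Fetch records (sketch)")
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="maximum number of records to fetch per record type"
        " (default: no limit)",
    )
    args = parser.parse_args(argv)
    if args.limit is not None and args.limit < 1:
        parser.error("--limit must be a positive integer")
    return args


def fetch_limited(iter_records, limit=None, record_types=RECORD_TYPES):
    """Collect at most `limit` records for each record type.

    iter_records(record_type) is a caller-supplied function (assumption)
    that returns an iterable of records. islice stops consuming it once
    the limit is reached; a limit of None consumes everything.
    """
    results = {}
    for record_type in record_types:
        results[record_type] = list(islice(iter_records(record_type), limit))
    return results
```

With a per-type limit, a short test run still exercises all four record types, and the total is bounded by four times the limit. If `iter_records` is a lazy generator that requests pages on demand, stopping early also avoids the remaining API calls.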

Labels: None yet

Projects: Status: In review

Development: Successfully merging this pull request may close this issue: Add Museum Victoria as data source

3 participants