Add button to DAGS to override values of today's logs #331

Closed
alexglasertpx opened this issue Jan 9, 2025 · 4 comments · Fixed by #332

@alexglasertpx
Contributor

alexglasertpx commented Jan 9, 2025

Overview
Following on from digital-land/technical-documentation#211, add a button to the DAGs so that DM can override the latest values in log.csv. A new endpoint may be added after the overnight runs; this option would add that endpoint's result to log.csv.

Tech Approach

  1. Add a button to the DAGs.
  2. If the button is set to True, download all of the json files associated with today's endpoints and replace today's rows in log.csv and today's json files.
  3. Output an error if the button is set to False and there are new endpoints. This needs refining: we don't want the overnight runs to error, so add a check that, if there is nothing in the log for today, runs the collection and updates the log as normal.
  4. Once happy, reset the regenerate-log-override button to False.
  5. If an endpoint that was previously fetched that day isn't fetched this time, generate an error.
  6. If no collection has been run today, process as normal.
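
The decision logic in the steps above could be sketched roughly as follows. This is only an illustration: `Action`, `decide_action`, `check_refetched`, and the endpoint sets are hypothetical names, not the actual collection code. The final branch (button False, today already logged, no new endpoints) is not covered by the steps, so treating it as "no change" is an assumption.

```python
from enum import Enum


class Action(Enum):
    RUN_NORMAL = "run collection and update log as normal"
    REGENERATE = "refetch today's json files and replace today's rows in log.csv"
    ERROR = "new endpoints found but override is False"
    NO_CHANGE = "nothing new to do (assumed behaviour)"


def decide_action(override: bool, logged_today: set, endpoints: set) -> Action:
    """Pick what to do, given the button value and today's state.

    logged_today: endpoints that already have a row in log.csv for today.
    endpoints:    endpoints currently configured for the collection.
    """
    if not logged_today:
        # Steps 3 and 6: nothing logged today, so the overnight run proceeds.
        return Action.RUN_NORMAL
    if override:
        # Step 2: replace today's rows and json files.
        return Action.REGENERATE
    if endpoints - logged_today:
        # Step 3: new endpoints, but the button is False.
        return Action.ERROR
    return Action.NO_CHANGE


def check_refetched(previously_fetched: set, fetched_now: set) -> None:
    """Step 5: error if an endpoint fetched earlier today is now missing."""
    missing = previously_fetched - fetched_now
    if missing:
        raise ValueError(f"endpoints not refetched: {sorted(missing)}")
```

For example, `decide_action(False, {"a"}, {"a", "b"})` gives `Action.ERROR`, while the same call with an empty `logged_today` gives `Action.RUN_NORMAL`, so the overnight run never errors.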

Workflow expected to be like this

Acceptance Criteria/Tests
Ensure that the log files are updated (even if the outputs are the same)
Add a new json file(s) / edit a current json file and see what happens to the log.csv file
Set button to False & add new json file(s), see if we get an error

@alexglasertpx
Contributor Author

alexglasertpx commented Jan 21, 2025

Checklist for this ticket:

  • Create Mural for workflow
  • Incorporate force-refetch parameter into fetch code to ensure that latest json files are downloaded
  • Incorporate overwrite-today parameter into load_csv code to ensure that today's existing log values aren't loaded
  • Add force-refetch parameter to makerules (make collect)
  • Add 'overwrite-today' parameter (make collection); same as above but in a different context
  • Add on new button to Airflow
  • Check in Dev that collections are working correctly (run once with 'fetch_todays_logs' set to False and once with it set to True).
  • Check what logs look like when REGENERATE_LOG_OVERRIDE is set to True or False and if there are differences
  • Check results when running on dev, will likely need to compare results from dev and results locally

@alexglasertpx
Contributor Author

Moving to 'Blocked': there are a few issues with dev that we are waiting to resolve, and we will need help from other members of the team to ensure that environment variables are passed from Airflow (and elsewhere) into the Collect class.
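
One thing to watch when passing a True/False button value through environment variables is that the value arrives as a string, and `bool("False")` is `True` in Python. A minimal sketch of a safer parse is below. `env_flag` and `TRUE_VALUES` are hypothetical helpers for illustration, and how Airflow actually hands the value to the Collect class is not shown here.

```python
import os

# Strings treated as "True"; anything else (including "False") is False.
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Read a boolean flag from the environment.

    environ can be passed explicitly to make the function easy to test;
    it falls back to os.environ.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUE_VALUES


# Variable name taken from the checklist above; unset means False.
regenerate_log_override = env_flag("REGENERATE_LOG_OVERRIDE")
```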

@alexglasertpx
Contributor Author

Looks like the dev issues have resolved themselves. Have split the REGENERATE_TODAYS_LOGS parameter into two components, force_refetch and overwrite_today. They take the same value but do different things, which makes the code easier for the user to read. The former refetches the json files needed to create today's logs, whereas the latter ensures we don't load in the log values relating to today (and potentially duplicate endpoints in the logs).
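
To illustrate why overwrite_today matters, here is a rough sketch of the duplication it avoids. This is not the real load_csv code: `load_log_rows`, the sample data, and the column names (`endpoint`, `entry-date`, `status`) are assumptions made for the example.

```python
import csv
import io


def load_log_rows(csv_text, today, overwrite_today):
    """Load existing log rows, dropping today's rows when overwrite_today is set.

    today is an ISO date prefix such as "2025-01-21".
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if overwrite_today:
        rows = [r for r in rows if not r["entry-date"].startswith(today)]
    return rows


# Existing log.csv content: endpoint "bbb" was already logged today.
EXISTING = """endpoint,entry-date,status
aaa,2025-01-20T01:00:00Z,200
bbb,2025-01-21T01:00:00Z,200
"""

# Rows produced by refetching today's json files (the force_refetch side).
REFETCHED = [
    {"endpoint": "bbb", "entry-date": "2025-01-21T15:00:00Z", "status": "200"},
]

with_overwrite = load_log_rows(EXISTING, "2025-01-21", True) + REFETCHED
without_overwrite = load_log_rows(EXISTING, "2025-01-21", False) + REFETCHED

print([r["endpoint"] for r in with_overwrite])     # ['aaa', 'bbb']
print([r["endpoint"] for r in without_overwrite])  # ['aaa', 'bbb', 'bbb']
```

Without overwrite_today, the refetched row for "bbb" is appended alongside the row loaded from the existing log, which is the duplicate-endpoint case described above.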

@CarlosCoelhoSL CarlosCoelhoSL transferred this issue from digital-land/technical-documentation Jan 29, 2025
@CarlosCoelhoSL CarlosCoelhoSL mentioned this issue Jan 29, 2025
8 tasks
@CarlosCoelhoSL CarlosCoelhoSL moved this from In Development to In Review / QA in Infrastructure Feb 3, 2025
@CarlosCoelhoSL
Contributor

Waiting on the collection issues to be fixed before merging the following PRs:
#332
digital-land/makerules#73
digital-land/airflow-dags#27

@CarlosCoelhoSL CarlosCoelhoSL moved this from In Review / QA to Blocked in Infrastructure Feb 3, 2025
@github-project-automation github-project-automation bot moved this from Blocked to Done - Consider for Weeknotes in Infrastructure Feb 4, 2025
@Ben-Hodgkiss Ben-Hodgkiss moved this from Done - Consider for Weeknotes to Done - This Period in Infrastructure Feb 5, 2025