WIP Issue #29 - Add batch capability for running analytics over historical results (e.g. after uploading data from SD cards) #30
base: main
Conversation
… over historical results (e.g. after uploading data from SD cards)
src/core_decision_flask_app.py
Outdated
```python
# Return 404 (Not Found) if the record IDs are invalid
if (date is None) :
    logger.error('Missing parameters : '+DATE_PARAMETER+' : '+str(date))
    abort(404)
```
Comment - @Gaurav-Ramakrishna @upkarlidder
Should this error be:
- 400 - Bad Request (as @Gaurav-Ramakrishna suggests)
- 422 - Unprocessable Entity
FYI @JSegrave-IBM
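For illustration, a minimal sketch of the 400 option (hypothetical route and constant names, not the PR's actual code):

```python
from flask import Flask, abort, request

app = Flask(__name__)
DATE_PARAMETER = 'date'  # assumed to match the constant used in the diff

@app.route('/batch_run_analytics_by_date')
def batch_run_analytics_by_date():
    date = request.args.get(DATE_PARAMETER)
    if date is None:
        # 400 signals a malformed request (missing required parameter);
        # 404 would wrongly imply the endpoint/resource does not exist,
        # while 422 is usually reserved for well-formed but semantically
        # invalid payloads.
        abort(400, description='Missing required parameter: ' + DATE_PARAMETER)
    return {'status': 'accepted', 'date': date}
```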
src/core_decision_flask_app.py
Outdated
```python
def batch_run_analytics_by_date():

    try:
        date = request.args.get(DATE_PARAMETER)
```
@JSegrave-IBM should we validate the date here? We just check for None.
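A hedged sketch of what that validation could look like (hypothetical helper name; the diff above only checks for None):

```python
from datetime import datetime

def parse_date_param(date_str):
    """Return a datetime.date for a valid 'YYYY-MM-DD' string, or None
    if the value is missing or malformed (so the caller can return 400)."""
    if date_str is None:
        return None
    try:
        return datetime.strptime(date_str, '%Y-%m-%d').date()
    except ValueError:
        return None
```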
upkarlidder left a comment
@JSegrave-IBM added some comments.
Can we also write a unit test or something to test it out before pushing it into master?
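A pytest-style sketch of such a test (the stub app below is a hypothetical stand-in so the example is self-contained; a real test would import the Flask app from src/core_decision_flask_app.py instead):

```python
from flask import Flask, abort, request

# Minimal stand-in for the real app (hypothetical; the actual endpoint
# lives in src/core_decision_flask_app.py).
app = Flask(__name__)

@app.route('/batch_run_analytics_by_date')
def batch_run_analytics_by_date():
    if request.args.get('date') is None:
        abort(400)  # missing required parameter
    return {'status': 'accepted'}

def test_missing_date_is_rejected():
    client = app.test_client()
    assert client.get('/batch_run_analytics_by_date').status_code == 400

def test_valid_date_is_accepted():
    client = app.test_client()
    assert client.get('/batch_run_analytics_by_date?date=2025-10-27').status_code == 200
```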
…e-checking and 400 return code.
… records in the batch window (also refactored batch_run_analytics to be a little clearer - it takes a start & end time now, rather than a schedule of individual minutes)
@copilot can you rebase this PR and see if it's compatible with the latest changes?
Pull Request Overview
This PR adds batch analytics capabilities to process historical sensor data, enabling analytics to be run on previously collected data (e.g., from SD card uploads). The implementation includes a new REST endpoint, batch processing methods in the analytics engine, and documentation updates.
Key changes:
- Added `/batch_run_analytics_by_date` REST endpoint for triggering batch processing by date
- Implemented `batch_run_analytics` and `batch_run_analytics_by_date` methods for processing historical data
- Added overwrite capability to prevent duplicate analytics results during batch processing
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/core_decision_flask_app.py | Added new REST endpoint to trigger batch analytics processing for a specific date |
| src/GasExposureAnalytics.py | Implemented batch processing methods with overwrite functionality and reduced logging frequency |
| README.md | Updated documentation to include test step before running the application |
```python
# Remove any pre-existing analytic results before writing new ones.
if overwrite :
    with self._db_engine.connect() as connection: # 'with' auto-closes the connection
        connection.execute("DELETE FROM " + ANALYTICS_TABLE + " where " + TIMESTAMP_COL + " = '" + timestamp_key.isoformat() + "'")
```
Copilot AI · Oct 27, 2025
SQL injection vulnerability: the DELETE statement builds SQL by string concatenation with timestamp_key.isoformat(). Although timestamp_key is generated internally, use a parameterized query for consistency and safety. Note that identifiers (table and column names) cannot be bound parameters, only values can, so keep the internal constants in the string and bind the timestamp: connection.execute(text('DELETE FROM ' + ANALYTICS_TABLE + ' WHERE ' + TIMESTAMP_COL + ' = :ts'), {'ts': timestamp_key.isoformat()})
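A self-contained sketch of that bound-parameter version (the constant values below are hypothetical stand-ins for the module's real internal constants):

```python
from sqlalchemy import create_engine, text

# Hypothetical stand-ins for the module's internal constants.
ANALYTICS_TABLE = 'firefighter_analytics'
TIMESTAMP_COL = 'timestamp_mins'

def delete_existing_results(engine, ts_iso):
    # Identifiers stay in the string (they are internal constants, and SQL
    # identifiers cannot be bound anyway); the timestamp value is bound.
    sql = text('DELETE FROM ' + ANALYTICS_TABLE +
               ' WHERE ' + TIMESTAMP_COL + ' = :ts')
    with engine.begin() as connection:  # transaction commits on exit
        connection.execute(sql, {'ts': ts_iso})
```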
```python
# Given a date, find the start and end times for the sensor records on that date
start_time, end_time = None, None
if self._from_db :
    sql = ("SELECT MIN("+TIMESTAMP_COL+") AS start_time, MAX("+TIMESTAMP_COL+") AS end_time FROM "+SENSOR_LOG_TABLE+" WHERE DATE("+TIMESTAMP_COL+") = '"+date+"'")
```
Copilot AI · Oct 27, 2025
SQL injection vulnerability: the user-supplied date parameter is concatenated directly into the SQL query without parameterization. Bind the value instead (the table and column names are internal constants and cannot be bound parameters anyway): sql = text('SELECT MIN(' + TIMESTAMP_COL + ') AS start_time, MAX(' + TIMESTAMP_COL + ') AS end_time FROM ' + SENSOR_LOG_TABLE + ' WHERE DATE(' + TIMESTAMP_COL + ') = :date') and pass {'date': date} when executing.
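A hedged sketch of that query with the date bound as a parameter (constant values below are hypothetical stand-ins):

```python
from sqlalchemy import create_engine, text

# Hypothetical stand-ins for the module's internal constants.
SENSOR_LOG_TABLE = 'sensor_log'
TIMESTAMP_COL = 'timestamp_mins'

def batch_window_for_date(engine, date_str):
    # Only the user-supplied date is bound; the table/column names are
    # internal constants and cannot be bound parameters in any case.
    sql = text('SELECT MIN(' + TIMESTAMP_COL + ') AS start_time, '
               'MAX(' + TIMESTAMP_COL + ') AS end_time '
               'FROM ' + SENSOR_LOG_TABLE +
               ' WHERE DATE(' + TIMESTAMP_COL + ') = :date')
    with engine.connect() as connection:
        row = connection.execute(sql, {'date': date_str}).one()
    return row.start_time, row.end_time
```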
```python
# Return 400 (Bad Request) if the supplied date is invalid.
if (date_str is None) :
    logger.error('Missing parameters : '+DATE_PARAMETER+' : '+date_str)
```
Copilot AI · Oct 27, 2025
Logging bug: when date_str is None, the concatenation 'Missing parameters : '+DATE_PARAMETER+' : '+date_str raises a TypeError (a str cannot be concatenated with None), so the error is never logged and the intended 400 response is never sent. The message should state that the parameter is missing without trying to display its value: logger.error('Missing required parameter: ' + DATE_PARAMETER)
Suggested change:
```diff
-logger.error('Missing parameters : '+DATE_PARAMETER+' : '+date_str)
+logger.error('Missing required parameter: ' + DATE_PARAMETER)
```
```python
# Setting commit=False prevents unit tests from writing to the database.
def batch_run_analytics (self, start_time=None, end_time=None, commit=True) :

    # Log that a batch procss is starting & what it will look at and what it will over-write.
```
Copilot AI · Oct 27, 2025
Corrected spelling of 'procss' to 'process' in the comment above this line.
Suggested change:
```diff
-# Log that a batch procss is starting & what it will look at and what it will over-write.
+# Log that a batch process is starting & what it will look at and what it will over-write.
```