[BUG] fix auto_format_on_load crash for RangeIndex time series (#413
Open
eyasir2047 wants to merge 1 commit intosktime:mainfrom
Open
[BUG] fix auto_format_on_load crash for RangeIndex time series (#413eyasir2047 wants to merge 1 commit intosktime:mainfrom
eyasir2047 wants to merge 1 commit intosktime:mainfrom
Conversation
…e#390) The format_data_handle method assumed all indexes are datetime-like, calling y.index.freq, pd.infer_freq(), and pd.date_range() unconditionally. This crashed with 'RangeIndex object has no attribute freq' for valid integer-indexed time series data. Fix: - Add guard to skip datetime-specific frequency inference for RangeIndex and non-datetime indexes, setting 'Integer' as the frequency label - Guard the explicit freq assignment (step 5) against integer indexes - Guard metadata frequency access against RangeIndex Tests: - test_format_data_handle_rangeindex_no_crash: direct unit test - test_format_data_handle_rangeindex_metadata: verify 'Integer' label - test_load_data_source_rangeindex_end_to_end: full pipeline test
Contributor
|
@eyasir2047 You made a duplicate PR to a already linked PR. Please check the issue if there are any attached linked PR's |
Author
|
@biru-codeastromer Thanks for letting me know! I'll take a look at the linked PR and close this one if it already covers the fix. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Reference Issues/PRs
Fixes #390
What does this implement/fix? Explain your changes.
I ran into this while testing load_data_source with a simple dict-based dataset (no datetime column). The auto-format step was crashing internally with 'RangeIndex' object has no attribute 'freq' — even though the request returned success=True, the formatting silently failed behind the scenes.
The root cause is in format_data_handle(): it calls y.index.freq, pd.infer_freq() and pd.date_range() without checking whether the index is actually datetime-like. A plain integer/RangeIndex doesn't have a .freq attribute, so it blows up.
The fix is a simple guard at the start of the frequency inference block — if the index is a RangeIndex, we skip all the datetime logic and just label the frequency as "Integer". Nothing else changes for datetime-indexed data.
Does your contribution introduce a new dependency? If yes, which one?
No, just uses pd.RangeIndex which is already available.
What should a reviewer concentrate their feedback on?
Mostly the guard condition around line 790 in executor.py— want to make sure it covers all the non-datetime index cases cleanly.
Any other comments?
Added 3 tests covering the unit-level fix, metadata output, and the full end-to-end load_data_sourcecall. Full test suite passes, ruff and format checks are clean.
PR checklist
For all contributions
[x] I've added unit tests and made sure they pass locally.