fix(pandas): use arrays for values and indices in time_features #143

Merged: 3 commits, Jan 16, 2025

@elephaint elephaint commented Jan 16, 2025

The following example fails without the fix:

import pandas as pd
from utilsforecast.feature_engineering import time_features

df_cv = pd.DataFrame({
    'DATE': {
        0: '2022-02-01',
        1: '2022-02-01',
        2: '2022-02-01',
        3: '2022-02-01',
        4: '2022-02-01',
    },
    'STORE_ID': {
        0: '51a91370-5e6e-11ea-bdc8-4f8d0b878ff3:Store',
        1: '8f4e06d0-6260-11eb-a46e-abb30e01a929:Store',
        2: '9c1da2e0-6020-11eb-8f38-59a0094a41b9:Store',
        3: 'c4df2260-60ef-11eb-a4b6-e18772606843:Store',
        4: '1433b940-9475-11ea-9faa-41f6926ddb7f:Store',
    },
    'NET_SALES': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0},
})

id_col = "STORE_ID"
time_col = "DATE"
target_col = "NET_SALES"

freq = "D"
h = 28  # Short Term Forecasting
df_cv[time_col] = pd.to_datetime(df_cv[time_col])

df, X_df = time_features(
    df=df_cv.head(),
    freq=freq,
    features=["week", "dayofweek"],  # gives error
    id_col=id_col,
    time_col=time_col,
)

which leads to the error:

ValueError: cannot reindex on an axis with duplicate labels

This seems to happen because the generated `week` feature is not an Index.
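
To illustrate the kind of failure described above, here is a minimal pandas-only sketch. It does not reproduce the internals of `time_features`; it only assumes that the `week` values come from `DatetimeIndex.isocalendar().week`, which returns a Series indexed by the (duplicated) dates, whereas `DatetimeIndex.dayofweek` returns a plain Index. The variable names are illustrative, and converting to a NumPy array is shown as one way to sidestep index alignment, in the spirit of the PR title, not as the exact change made in the PR.

```python
import pandas as pd

# Five rows sharing the same date, as in the example above.
dates = pd.DatetimeIndex(["2022-02-01"] * 5)
df = pd.DataFrame({"DATE": dates})

# dayofweek is returned as an Index, so assignment is positional.
df["dayofweek"] = dates.dayofweek

# isocalendar().week is a Series whose index is the duplicated dates.
week = dates.isocalendar().week

# Assigning a Series makes pandas align it to df.index, which requires
# reindexing from an index with duplicate labels and therefore fails.
try:
    df["week"] = week
    raised = False
except ValueError:
    raised = True

# Converting to an array drops the index, so no alignment is attempted.
df["week"] = week.to_numpy()
```

Assigning raw arrays avoids alignment entirely, so the result depends only on row order and not on whether the dates are unique.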

Versions:

  • utilsforecast 0.2.10
  • pandas 2.2.3

@elephaint elephaint added the "bug" (Something isn't working) label Jan 16, 2025
@elephaint elephaint requested a review from jmoralez January 16, 2025 18:01
@jmoralez jmoralez changed the title from "[FIX] Time feature weekly bug" to "fix: use arrays for values and indices in time_features" Jan 16, 2025
@jmoralez jmoralez added the "fix" (bug fix) label and removed the "bug" (Something isn't working) label Jan 16, 2025
@jmoralez jmoralez changed the title from "fix: use arrays for values and indices in time_features" to "fix(pandas): use arrays for values and indices in time_features" Jan 16, 2025
@jmoralez jmoralez enabled auto-merge (squash) January 16, 2025 18:19
@jmoralez jmoralez merged commit fd00d55 into main Jan 16, 2025
21 checks passed
@jmoralez jmoralez deleted the fix/time_feature_weekly branch January 16, 2025 18:20