fix: raise error for missing explicit exogenous columns #441

Open
Saloni-0465 wants to merge 1 commit into sktime:main from Saloni-0465:fix/validate-exog-columns

Saloni-0465 commented May 4, 2026

Reference Issues/PRs

N/A

What does this implement/fix? Explain your changes.

This PR fixes silent dropping of explicitly requested exogenous columns during data conversion.

Previously, if a config specified exog_columns=["promo", "holiday"] but only promo existed in the loaded data, to_sktime_format would silently use the available column and ignore the missing one. This could mislead users or LLM agents into thinking all requested covariates were used.

This PR updates DataSourceAdapter.to_sktime_format to validate explicit exog_columns before constructing X. If any requested exogenous column is missing, it raises a clear ValueError listing the missing columns and available columns. When all requested columns exist, their order is preserved exactly.
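The diff itself is not reproduced in this description, so the following is only a minimal sketch of the fail-fast check described above, not the code in this PR. The helper name `validate_exog_columns` and the exact message wording are hypothetical; the real logic lives inside `DataSourceAdapter.to_sktime_format` and may differ.

```python
from typing import Iterable, List, Optional


def validate_exog_columns(
    requested: Optional[Iterable[str]],
    available: Iterable[str],
) -> Optional[List[str]]:
    """Hypothetical helper: check explicit exog_columns against the loaded columns.

    Returns the requested columns in the order they were given, or raises
    ValueError naming every missing column and the columns that are available.
    """
    if requested is None:
        # No explicit request was made, so there is nothing to validate here.
        return None

    requested = list(requested)
    available = list(available)
    available_set = set(available)

    # Collect all missing names so the error reports them in one pass.
    missing = [col for col in requested if col not in available_set]
    if missing:
        raise ValueError(
            f"Requested exogenous columns not found in data: {missing}. "
            f"Available columns: {available}"
        )

    # Preserve the caller's ordering rather than the data's column order.
    return requested
```

With this sketch, requesting `["promo", "holiday"]` against data that only has `promo` raises a `ValueError` that names `holiday`, while requesting `["holiday", "promo"]` against data containing both returns the list in that same order.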

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

  • Whether missing explicit exog_columns should fail fast rather than warn
  • Clarity of the error message returned to users and agents
  • Whether the validation belongs in DataSourceAdapter.to_sktime_format

Any other comments?

This is a small data-contract safety fix for custom data workflows. It helps prevent agents from making forecasting decisions under the false assumption that missing exogenous variables were used.

Validation:

  • python -m compileall src/sktime_mcp/data/base.py tests/test_data_exog_column_validation.py
  • Manual check: missing holiday in exog_columns raises a clear ValueError
  • Manual check: Executor.load_data_source returns success=False with error_type="ValueError"
  • Manual check: valid explicit exogenous columns are preserved in order
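The second manual check relies on the `ValueError` being surfaced as a structured result rather than an uncaught exception. As an illustration only, the pattern could look like the sketch below. The function `run_with_structured_errors`, the `failing_loader` stand-in, and the `error` key are assumptions made for this example; only `success=False` and `error_type="ValueError"` come from the check above, and the actual `Executor.load_data_source` may be structured differently.

```python
from typing import Any, Callable, Dict


def run_with_structured_errors(loader: Callable[[], Any]) -> Dict[str, Any]:
    """Hypothetical wrapper: turn exceptions from a loader into a result dict."""
    try:
        data = loader()
    except Exception as exc:
        # Report the exception class name so a caller or agent can branch on it.
        return {
            "success": False,
            "error_type": type(exc).__name__,
            "error": str(exc),  # assumed key, not confirmed by the PR
        }
    return {"success": True, "data": data}


def failing_loader() -> None:
    """Stand-in for a load that requests a missing exogenous column."""
    raise ValueError(
        "Requested exogenous columns not found in data: ['holiday']. "
        "Available columns: ['y', 'promo']"
    )


result = run_with_structured_errors(failing_loader)
print(result["success"], result["error_type"])
```

Returning the error type alongside the message lets an agent distinguish a data-contract failure from, say, a file-not-found problem without parsing the message text.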

PR checklist

For all contributions

  • I've added myself to the list of contributors.
  • Optionally, I've updated CODEOWNERS.
  • I've added unit tests and made sure they pass locally where possible.

For new estimators

  • Not applicable
