error when HORIZON=1 #1040

bachng2017 · 2022-11-04T13:42:35Z

Your Environment

Python version: python 3.8.8
Operating system: oraclelinux:8 container
Lightwood version: 22.10.4.2 (compiled from code)

Describe your issue

when HORIZON=1 some predictor creation stopped with error

INFO:lightwood-490943:The block ICP is now running its analyze() method
Traceback (most recent call last):
  File "./mindsdb/integrations/handlers/lightwood_handler/lightwood_handler/functions.py", line 150, in run_learn
    run_fit(predictor_id, df, company_id)
  File "./mindsdb/utilities/functions.py", line 57, in wrapper
    return func(*args, **kwargs)
  File "./mindsdb/integrations/handlers/lightwood_handler/lightwood_handler/functions.py", line 118, in run_fit
    raise e
  File "./mindsdb/integrations/handlers/lightwood_handler/lightwood_handler/functions.py", line 86, in run_fit
    predictor.learn(df)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/helpers/log.py", line 30, in wrap
    result = f(predictor, *args, **kw)
  File "/tmp/9dfe41087bab7a240661225931a1369d5c36e65fc4183ac91667568302975776.py", line 315, in learn
    self.analyze_ensemble(enc_train_test)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/helpers/log.py", line 30, in wrap
    result = f(predictor, *args, **kw)
  File "/tmp/9dfe41087bab7a240661225931a1369d5c36e65fc4183ac91667568302975776.py", line 277, in analyze_ensemble
    self.model_analysis, self.runtime_analyzer = model_analyzer(data=encoded_test_data,train_data=encoded_train_data,stats_info=self.statistical_analysis,tss=self.problem_definition.timeseries_settings,accuracy_functions=self.accuracy_functions,predictor=self.ensemble,target=self.target,dtype_dict=self.dtype_dict,analysis_blocks=self.analysis_blocks,ts_analysis=self.ts_analysis)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/analysis/analyze.py", line 83, in model_analyzer
    runtime_analyzer = block.analyze(runtime_analyzer, **kwargs)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/analysis/nc/calibrate.py", line 150, in analyze
    output['icp']['__default'].calibrate(icp_df.values, y)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/analysis/nc/icp.py", line 102, in calibrate
    cal_scores = self.nc_function.score(self.cal_x, self.cal_y)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/analysis/nc/nc.py", line 409, in score
    err = self.err_func.apply(prediction, y)
  File "/home/7538911/.local/lib/python3.8/site-packages/lightwood/analysis/nc/nc.py", line 223, in apply
    return np.abs(prediction - y)
TypeError: unsupported operand type(s) for -: 'list' and 'float'

How can we replicate it?

run home-sale predict tutorial but using HORIZON=1

The text was updated successfully, but these errors were encountered:

bachng2017 · 2022-11-05T00:41:48Z

a little bit debug showed that, when HORIZON=1 the pd.abs try to compare [ list(double) ] with [ double ], which causes the error. And when HORIZON=2 for example, it correctly compares [ (double,double) ] to [ (double,double)]

paxcema · 2022-11-06T07:56:35Z

Hi @bachng2017, thanks for reporting

I've added an extra test to explicitly check this in #1041. However, tests are passing, which means the issue is in the handler upstream. I'll move this issue to the mindsdb repo and we can continue the conversation there.

bachng2017 · 2022-11-06T15:06:36Z

hello @paxcema , is this really an upstream issue ? the following test code (copied and modified from your commit) will reproduce the error

data = pd.read_csv('tests/data/house_sales.csv')
gby = ['bedrooms', 'type']
target = 'MA'
order_by = 'saledate'
window = 8
horizon = 1

train, _, test = stratify(data, pct_train=0.8, pct_dev=0, pct_test=0.2, stratify_on=gby, seed=1,
                          reshuffle=False)
jai = json_ai_from_problem(train,
                           ProblemDefinition.from_dict({'target': target,
                                                        'time_aim': 30,  # short time aim
                                                        'timeseries_settings': {
                                                            'group_by': gby,
                                                            'horizon': horizon,
                                                            'order_by': order_by,
                                                            'window': window
                                                        }}))
                                                        
# this will reproduce the error 
jai.model['args']['submodels'][0] = {
  "module": "SkTime"
} 
                                                             
code = code_from_json_ai(jai)
pred = predictor_from_code(code)

# Test with inferring mode, check timestamps are further into the future than test dates
test['__mdb_forecast_offset'] = 1
train_and_check_time_aim(pred, train, ignore_time_aim=True)
preds = pred.predict(test)

paxcema · 2022-11-06T15:32:19Z

Whoops, reopening 😅 - I can definitely replicate this. Let me get to the bottom of it tomorrow and we will get the fix deployed shortly. Thanks!

bachng2017 · 2022-11-07T02:00:52Z

FYI, this might be a fix. It solved the above test case but not sure about the other

diff --git a/lightwood/mixer/sktime.py b/lightwood/mixer/sktime.py
index 8f16380c..895aa49a 100644
--- a/lightwood/mixer/sktime.py
+++ b/lightwood/mixer/sktime.py
@@ -308,7 +308,11 @@ class SkTime(BaseMixer):
         for true_idx, (idx, _) in enumerate(series.items()):
             start_idx = max(0, true_idx)
             end_idx = start_idx + self.horizon
-            ydf['prediction'].loc[idx] = all_preds[start_idx:end_idx]
+            if self.horizon > 1 :
+                ydf['prediction'].loc[idx] = all_preds[start_idx:end_idx]
+            else:
+                ydf['prediction'].loc[idx] = all_preds[start_idx]
+
         return ydf

     def _call_default(self, ydf, data, idxs):

paxcema · 2022-11-07T15:22:18Z

Okay, it turns out the error is expected, but it is not obvious at all why.

The reason is that the sktime mixer does not support horizon=1 yet. You can check this is by design if you check the dispatch rules in the JsonAI generation phase, in particular this bit. However, it does raise two interesting points:

Can we add support for horizon=1 in sktime? I can't think of a reason why not, and evidently we are not impeding users to try it out right now (hence the error you got), so we should at least try. I will open a new issue for this, plus a branch that explores the proposed fix you've posted above (unless you want to work on it yourself, which if so please let us know and we will assign you).
Exactly how did you get the error in MindsDB by following the tutorial? You mentioned "run home-sale predict tutorial but using HORIZON=1", did you by any chance also force the sktime mixer via USING syntax? Can you post the complete SQL query?

bachng2017 · 2022-11-07T23:42:46Z

thanks for the feedback on this and mindsdb #3984

yes, we also somehow had observed that . And actually we tried another fix that introduce is_single_ts (comparing to the existed is_multi_ts ) but it was a little bit complicated so we replaced it with the mentioned patch.

The reason is that the sktime mixer does not support horizon=1 ye

anyway it's good that the issue is aware now

for (1), please go ahead :) Because i have only limited information about this issue only but not for the whole architecture so our proposed patch is very limited.
for (2), yes we might unconsciously forced sktime. But as far as I remember, we ran into the problem firstly with some other dataset and configuration including HORIZON=1 then we realized that sktime was related to the cause. For now, we could not reproduce the issue with only HORZION=1.

paxcema · 2024-01-22T21:13:53Z

Closing via #1202.

bachng2017 added the bug Something isn't working label Nov 4, 2022

paxcema self-assigned this Nov 4, 2022

paxcema mentioned this issue Nov 6, 2022

[test] Improve time series test #1041

Merged

paxcema closed this as completed Nov 6, 2022

paxcema mentioned this issue Nov 6, 2022

[bug] Lightwood handler errors for time series tasks with horizon=1 mindsdb/mindsdb#3983

Closed

paxcema reopened this Nov 6, 2022

paxcema mentioned this issue Jan 22, 2024

[fix] Check horizon in SkTime mixer #1202

Merged

paxcema closed this as completed Jan 22, 2024

paxcema added the wontfix Will not fix / very low priority label Jan 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

error when HORIZON=1 #1040

error when HORIZON=1 #1040

bachng2017 commented Nov 4, 2022

bachng2017 commented Nov 5, 2022

paxcema commented Nov 6, 2022

bachng2017 commented Nov 6, 2022 •

edited

Loading

paxcema commented Nov 6, 2022 •

edited

Loading

bachng2017 commented Nov 7, 2022 •

edited

Loading

paxcema commented Nov 7, 2022 •

edited

Loading

bachng2017 commented Nov 7, 2022 •

edited

Loading

paxcema commented Jan 22, 2024

error when HORIZON=1 #1040

error when HORIZON=1 #1040

Comments

bachng2017 commented Nov 4, 2022

Your Environment

Describe your issue

How can we replicate it?

bachng2017 commented Nov 5, 2022

paxcema commented Nov 6, 2022

bachng2017 commented Nov 6, 2022 • edited Loading

paxcema commented Nov 6, 2022 • edited Loading

bachng2017 commented Nov 7, 2022 • edited Loading

paxcema commented Nov 7, 2022 • edited Loading

bachng2017 commented Nov 7, 2022 • edited Loading

paxcema commented Jan 22, 2024

bachng2017 commented Nov 6, 2022 •

edited

Loading

paxcema commented Nov 6, 2022 •

edited

Loading

bachng2017 commented Nov 7, 2022 •

edited

Loading

paxcema commented Nov 7, 2022 •

edited

Loading

bachng2017 commented Nov 7, 2022 •

edited

Loading