
gbt dtrees always use float instead of the real data type, causing unexpected prediction results #1628

Open
cmsxbc opened this issue May 21, 2021 · 6 comments

@cmsxbc
Contributor

cmsxbc commented May 21, 2021

Describe the bug
I'm working on converting a LightGBM model to a daal gbt model, and the converted model's predictions differ noticeably from LightGBM's predictions.

I have found that the gbt dtrees model always uses float, as the code shows.

https://github.com/oneapi-src/oneDAL/blob/c6c5219c85e5bceb0392e54e653e00e8cc45e21f/cpp/daal/src/algorithms/dtrees/gbt/gbt_predict_dense_default_impl.i#L44

If I change the typedef to typedef double ModelFPType, the result is as expected.
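For intuition, here is a toy sketch (not oneDAL code; all values and the number of rounds are made up) of one way float storage can perturb a GBT prediction: summing a few hundred leaf contributions in float32 drifts slightly away from the double-precision sum.

```python
# Toy illustration only: accumulate a few hundred "leaf values" in float32 vs double.
import numpy as np

rng = np.random.default_rng(0)
leaf_values = rng.normal(scale=0.01, size=340)   # one leaf contribution per boosting round

sum_double = leaf_values.sum()                   # double-precision accumulation

sum_float = np.float32(0.0)
for v in leaf_values.astype(np.float32):         # leaf values truncated to float32
    sum_float += v

print(abs(sum_double - float(sum_float)))        # small but nonzero difference
```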

To Reproduce
Steps to reproduce the behavior:

  1. Use the example model from LightGBM's simple_example, and apply this change to generate a ~200-iteration model so the error is easier to observe.
index 9af83008..debb339c 100644
--- a/examples/python-guide/simple_example.py                                                                          
+++ b/examples/python-guide/simple_example.py
@@ -35,9 +35,10 @@ print('Starting training...')
 # train                                                   
 gbm = lgb.train(params,                                   
                 lgb_train,                                
-                num_boost_round=20,
+                num_boost_round=340,
                 valid_sets=lgb_eval,
-                early_stopping_rounds=5)
+                verbose_eval=False,
+                early_stopping_rounds=100)
  2. Convert the model and predict:
d4p_model = daal4py.get_gbt_model_from_lightgbm(gbm)
d4p_y_pred = daal4py.gbt_regression_prediction().compute(X_test, d4p_model).prediction.reshape(-1)
print("The rmse of daal4py's prediction is:", mean_squared_error(y_test, d4p_y_pred))

print('are preds of daal4py and lightGBM equal:', (d4p_y_pred == y_pred).all())
print('The rmse of daal4py vs lightGBM is:', mean_squared_error(d4p_y_pred, y_pred) ** 0.5)
  3. With the release package, the two prediction results are not equal.

  4. Apply the following patch:

index 6b4ffc3af..16aac9b18 100644
--- a/cpp/daal/src/algorithms/dtrees/gbt/gbt_predict_dense_default_impl.i
+++ b/cpp/daal/src/algorithms/dtrees/gbt/gbt_predict_dense_default_impl.i
@@ -41,7 +41,7 @@ namespace prediction
 {
 namespace internal
 {
-typedef float ModelFPType;
+typedef double ModelFPType;
 typedef uint32_t FeatureIndexType;
 const FeatureIndexType VECTOR_BLOCK_SIZE = 64;
  5. Everything works as expected (see the consolidated reproduction sketch after these steps).
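For convenience, a consolidated reproduction sketch assembled from the steps above. It assumes the regression.train/regression.test files from the LightGBM examples directory, abbreviated training parameters, and a LightGBM version that still accepts verbose_eval/early_stopping_rounds as lgb.train() arguments (newer releases use callbacks instead); adjust paths and parameters as needed.

```python
# Reproduction sketch; data paths, params and the LightGBM API version are assumptions.
import daal4py
import lightgbm as lgb
import pandas as pd
from sklearn.metrics import mean_squared_error

# same data loading as examples/python-guide/simple_example.py
df_train = pd.read_csv('regression.train', header=None, sep='\t')
df_test = pd.read_csv('regression.test', header=None, sep='\t')
y_train, X_train = df_train[0], df_train.drop(0, axis=1)
y_test, X_test = df_test[0], df_test.drop(0, axis=1)

lgb_train = lgb.Dataset(X_train, y_train)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train)

# abbreviated relative to simple_example.py
params = {'objective': 'regression', 'metric': 'l2', 'verbose': -1}

# more boosting rounds make the float/double discrepancy easier to observe
gbm = lgb.train(params,
                lgb_train,
                num_boost_round=340,
                valid_sets=lgb_eval,
                verbose_eval=False,
                early_stopping_rounds=100)

y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)

# convert to a daal4py gbt model and predict with oneDAL
d4p_model = daal4py.get_gbt_model_from_lightgbm(gbm)
d4p_y_pred = daal4py.gbt_regression_prediction().compute(X_test, d4p_model).prediction.reshape(-1)

print('are preds of daal4py and lightGBM equal:', (d4p_y_pred == y_pred).all())
print('The rmse of daal4py vs lightGBM is:', mean_squared_error(d4p_y_pred, y_pred) ** 0.5)
```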

Expected behavior
Use the real data type, or explain the reason why float was chosen.

Output/Screenshots
The top screenshot shows the unexpected result; the bottom one shows what I get after recompiling with typedef double ModelFPType.

2021-05-22-012202_1885x913_scrot

My patch:
2021-05-22-011158_1896x549_scrot

Environment:

  • OS: Arch Linux
  • Compiler: gcc 11.1.0
  • Version: 2021.2.2
@ShvetsKS
Contributor

@cmsxbc thanks for reporting this bug! Sorry for the long response!
Could you share the actual mean_squared_error difference between float and double usage?
We are currently checking the performance impact.

@cmsxbc
Contributor Author

cmsxbc commented May 28, 2021

@ShvetsKS Sorry for my poor English... I'm not sure what exactly is meant by "actual mean_squared_error difference for float vs double usage".

For the model I mentioned before, which comes from the lightGBM example, mean_squared_error(daal4py_prediction_float, lightGBM_prediction) is 3.367366706112064e-06 when using float. After changing to double, mean_squared_error(daal4py_prediction_double, lightGBM_prediction) is exactly 0.

mean_squared_error(daal4py_prediction_float, daal4py_prediction_double) is also 3.367366706112064e-06, i.e. the same as mean_squared_error(daal4py_prediction_float, lightGBM_prediction).

Actually, I originally found this error with a model built on thousands of features and thousands of iterations, but I'm afraid I can't provide the actual values because of my company's data security policy.

What I can provide is that sqrt(mean_squared_error(d4p_prediction_float, lightGBM_prediction)) / lightGBM_prediction is approximately 0.2, while d4p_prediction_double is exactly equal to lightGBM_prediction.
We can tolerate sqrt(mean_squared_error(d4p_prediction, lightGBM_prediction)) / lightGBM_prediction < 1e-9.
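For clarity, a hedged sketch of the comparison described above. The array names are hypothetical placeholders for the real prediction vectors, and scaling the RMSE by the mean magnitude of the LightGBM predictions is just one possible reading of the ratio used in this comment.

```python
# Sketch only: replace the placeholder arrays with the real prediction vectors.
import numpy as np
from sklearn.metrics import mean_squared_error

def relative_rmse(pred, reference):
    # RMSE of pred vs reference, scaled by the mean magnitude of reference
    return mean_squared_error(pred, reference) ** 0.5 / np.mean(np.abs(reference))

# placeholders standing in for the daal4py (float build), daal4py (double build)
# and LightGBM prediction arrays
lightGBM_prediction = np.array([0.5, 0.6, 0.7])
d4p_prediction_float = lightGBM_prediction + 1e-3
d4p_prediction_double = lightGBM_prediction.copy()

print(relative_rmse(d4p_prediction_float, lightGBM_prediction))
# acceptance criterion from this comment: relative error below 1e-9
assert relative_rmse(d4p_prediction_double, lightGBM_prediction) < 1e-9
```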

@cmsxbc
Contributor Author

cmsxbc commented Jul 15, 2021

@ShvetsKS Sorry to bother you, but is there any progress? T_T

@ShvetsKS
Contributor

@cmsxbc we are currently working on an appropriate solution that preserves support for "old" trained models with float fields. Sorry for the slow fix delivery.

@cmsxbc
Contributor Author

cmsxbc commented Oct 6, 2021

@ShvetsKS Sorry again, but it has taken almost another quarter... T_T

@napetrov added the bug label on Feb 2, 2023
@razdoburdin
Contributor

Hi all.

I reproduced the behavior with the latest version of the library. The accuracy difference has now decreased to approximately 2e-8 compared to the initially reported ~2e-3. We plan to close this issue, as the degradation no longer appears significant.
