[BUG]: scitype warning #405

Closed
Moelf opened this issue Feb 4, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@Moelf

Moelf commented Feb 4, 2025

What happened?

maybe related to #390

atlas_template = TemplateStructure{(:B, :E)}(
     ((; B, E), (x,)) -> B(x)^E(log(x))
)

sr_model2 = SRRegressor(
    niterations=200,
    binary_operators=[+, -, *],
    # unary_operators=[log],
    seed=20241206,
    expression_type=TemplateExpression,
    expression_options=(; structure=atlas_template),
    deterministic=true,
    parallelism=:serial
)

sr_mach2 = machine(sr_model2, X, ys)

fit!(sr_mach2)


┌ Warning: The number and/or types of data arguments do not match what the specified model
│ supports. Suppress this type check by specifying `scitype_check_level=0`.
│ 
│ Run `@doc SymbolicRegression.SRRegressor` to learn more about your model's requirements.
│ 
│ Commonly, but non exclusively, supervised models are constructed using the syntax
│ `machine(model, X, y)` or `machine(model, X, y, w)` while most other models are
│ constructed with `machine(model, X)`.  Here `X` are features, `y` a target, and `w`
│ sample or class weights.
│ 
│ In general, data in `machine(model, data...)` is expected to satisfy
│ 
│     scitype(data) <: MLJ.fit_data_scitype(model)
│ 
│ In the present case:
│ 
│ scitype(data) = Tuple{Table{AbstractVector{ScientificTypesBase.Continuous}}, AbstractVector{ScientificTypesBase.Continuous}}
│ 
│ fit_data_scitype(model) = Union{Tuple{Union{Table{<:Union{AbstractVector{<:ScientificTypesBase.Continuous}, AbstractVector{<:Count}}}, AbstractMatrix{<:ScientificTypesBase.Continuous}}, AbstractVector{<:Unknown}}, Tuple{Union{Table{<:Union{AbstractVector{<:ScientificTypesBase.Continuous}, AbstractVector{<:Count}}}, AbstractMatrix{<:ScientificTypesBase.Continuous}}, AbstractVector{<:Unknown}, AbstractVector{<:Union{ScientificTypesBase.Continuous, Count}}}}
└ @ MLJBase ~/.julia/packages/MLJBase/7nGJF/src/machines.jl:237
[ Info: Training machine(SRRegressor(defaults = nothing, ), ).

But the X and ys are simply:

X = (xs = [110.0, 111.94630872483222, 113.89261744966443, 115.83892617449665, 117.78523489932886, 119.73154362416108, 121.67785234899328, 123.6241610738255, 125.57046979865771, 127.51677852348993, 129.46308724832215, 131.40939597315437, 133.35570469798657, 135.3020134228188, 137.248322147651, 139.19463087248323, 141.14093959731542, 143.08724832214764, 145.03355704697987, 146.9798657718121, 148.9261744966443, 150.8724832214765, 152.81879194630872, 154.76510067114094, 156.71140939597316, 158.65771812080536, 160.60402684563758, 162.5503355704698, 164.49664429530202, 166.44295302013424, 168.38926174496643, 170.33557046979865, 172.28187919463087, 174.2281879194631, 176.1744966442953, 178.1208053691275, 180.06711409395973, 182.01342281879195, 183.95973154362417, 185.90604026845637, 187.8523489932886, 189.7986577181208, 191.74496644295303, 193.69127516778522, 195.63758389261744, 197.58389261744966, 199.53020134228188, 201.4765100671141, 203.4228187919463, 205.36912751677852, 207.31543624161074, 209.26174496644296, 211.20805369127515, 213.15436241610738, 215.1006711409396, 217.04697986577182, 218.99328859060404, 220.93959731543623, 222.88590604026845, 224.83221476510067, 226.7785234899329, 228.7248322147651, 230.6711409395973, 232.61744966442953, 234.56375838926175, 236.51006711409397, 238.45637583892616, 240.40268456375838, 242.3489932885906, 244.29530201342283, 246.24161073825502, 248.18791946308724, 250.13422818791946, 252.08053691275168, 254.0268456375839, 255.9731543624161, 257.91946308724835, 259.86577181208054, 261.81208053691273, 263.758389261745, 265.7046979865772, 267.6510067114094, 269.5973154362416, 271.5436241610738, 273.48993288590606, 275.43624161073825, 277.38255033557044, 279.3288590604027, 281.2751677852349, 283.22147651006713, 285.1677852348993, 287.1140939597315, 289.06040268456377, 291.00671140939596, 292.9530201342282, 294.8993288590604, 296.8456375838926, 298.79194630872485, 300.73825503355704, 302.6845637583893, 304.6308724832215, 306.5771812080537, 
308.5234899328859, 310.4697986577181, 312.4161073825503, 314.36241610738256, 316.30872483221475, 318.255033557047, 320.2013422818792, 322.1476510067114, 324.09395973154363, 326.0402684563758, 327.9865771812081, 329.93288590604027, 331.87919463087246, 333.8255033557047, 335.7718120805369, 337.71812080536915, 339.66442953020135, 341.61073825503354, 343.5570469798658, 345.503355704698, 347.4496644295302, 349.3959731543624, 351.3422818791946, 353.28859060402687, 355.23489932885906, 357.18120805369125, 359.1275167785235, 361.0738255033557, 363.02013422818794, 364.96644295302013, 366.9127516778523, 368.8590604026846, 370.80536912751677, 372.751677852349, 374.6979865771812, 376.6442953020134, 378.59060402684565, 380.53691275167785, 382.48322147651004, 384.4295302013423, 386.3758389261745, 388.32214765100673, 390.2684563758389, 392.2147651006711, 394.16107382550337, 396.10738255033556, 398.0536912751678, 400.0],)
ys = [235.0, 230.0, 233.0, 195.0, 209.0, 148.0, 169.0, 153.0, 157.0, 146.0, 147.0, 130.0, 131.0, 114.0, 118.0, 105.0, 120.0, 107.0, 92.0, 91.0, 89.0, 100.0, 95.0, 67.0, 77.0, 65.0, 70.0, 83.0, 66.0, 57.0, 58.0, 46.0, 53.0, 55.0, 55.0, 59.0, 48.0, 38.0, 50.0, 45.0, 36.0, 45.0, 31.0, 37.0, 38.0, 33.0, 32.0, 35.0, 31.0, 37.0, 20.0, 27.0, 40.0, 34.0, 26.0, 26.0, 21.0, 25.0, 29.0, 40.0, 26.0, 31.0, 21.0, 23.0, 25.0, 22.0, 20.0, 22.0, 23.0, 11.0, 19.0, 38.0, 18.0, 18.0, 23.0, 22.0, 19.0, 21.0, 14.0, 19.0, 13.0, 16.0, 13.0, 8.0, 11.0, 11.0, 11.0, 20.0, 14.0, 12.0, 9.0, 7.0, 8.0, 15.0, 14.0, 14.0, 14.0, 13.0, 7.0, 13.0, 9.0, 10.0, 7.0, 12.0, 5.0, 11.0, 10.0, 8.0, 8.0, 8.0, 7.0, 8.0, 6.0, 12.0, 10.0, 5.0, 7.0, 6.0, 6.0, 5.0, 6.0, 8.0, 6.0, 9.0, 5.0, 5.0, 6.0, 7.0, 3.0, 8.0, 6.0, 3.0, 7.0, 5.0, 6.0, 7.0, 7.0, 8.0, 6.0, 10.0, 6.0, 4.0, 6.0, 3.0, 4.0, 3.0, 4.0, 3.0, 4.0, 7.0]

This

Version

1.6.0

Operating System

Linux

Interface

Jupyter Notebook

Relevant log output

Extra Info

No response

@Moelf Moelf added the bug Something isn't working label Feb 4, 2025
@MilesCranmer
Owner

Yes it’s definitely related to #390. There’s no way in MLJ to have conditional scitype constraints at the moment. And if the default is too broad apparently it messes up the integration tests (cc @ablaom). Not sure if there’s a workaround, @ablaom?

@Moelf
Author

Moelf commented Feb 4, 2025

I guess right now it's just a warning and it doesn't stop the SR from conducting its business, so maybe a no-fix for now.

Given how simple my data is, I wonder if this warning will ALWAYS come up?
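
For reference, the warning text itself names a way to silence the check: `scitype_check_level=0`. A minimal sketch of passing it to `machine`, assuming the SymbolicRegression 1.6.0 keywords used in the report; the shortened data and the reduced `niterations` are stand-ins chosen only to keep the run short:

```julia
using MLJ
using SymbolicRegression

# Same template as in the report above.
atlas_template = TemplateStructure{(:B, :E)}(
    ((; B, E), (x,)) -> B(x)^E(log(x))
)

# Same model as in the report, with fewer iterations (an assumption for brevity).
model = SRRegressor(
    niterations=5,
    binary_operators=[+, -, *],
    expression_type=TemplateExpression,
    expression_options=(; structure=atlas_template),
)

# Stand-in data: the first five points of the report, rounded.
X = (xs = [110.0, 111.9, 113.9, 115.8, 117.8],)
ys = [235.0, 230.0, 233.0, 195.0, 209.0]

# scitype_check_level=0 skips the scitype check for this machine only.
mach = machine(model, X, ys; scitype_check_level=0)
fit!(mach)
```

This only hides the message; it does not change what the model declares about its supported data.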

@MilesCranmer
Owner

I'm not sure what's going on but if I run your example, I don't see any warning. Weird.

@ablaom

ablaom commented Feb 6, 2025

edited

When I run the example I do see the warning, as reported.

Here is an isolation of the underlying issue:

julia> target_scitype(sr_model2)
AbstractVector{<:Unknown} (alias for AbstractArray{<:Unknown, 1})

julia> target_scitype(SRRegressor)
AbstractVector{<:Continuous} (alias for AbstractArray{<:Continuous, 1})

The fact that these are different is atypical, and there is something in the design of MLJ that tacitly assumes these are the same. For some cases of Unknown scitype the checks are supposed to be ignored, which, if I remember correctly, was something we were leveraging in engineering a hack to fit some non-standard behaviour of SymbolicRegression (regression with non-numerical targets) into our mould. I'll have a look if I can make this Unknown detection more robust. But this is all looking like an awful hack...
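
The mismatch can be reproduced without the model at all. A small sketch, assuming `MLJ` is loaded so that `scitype`, `Continuous`, and `Unknown` are in scope, using a few of the reported target values:

```julia
using MLJ

# A few of the target values from the report.
ys = [235.0, 230.0, 233.0]

# Floating-point vectors have element scitype Continuous.
target = scitype(ys)

# The instance-level trait asks for AbstractVector{<:Unknown};
# Continuous is not a subtype of Unknown, so this check fails.
matches_instance_trait = target <: AbstractVector{<:Unknown}

# The type-level trait asks for AbstractVector{<:Continuous}, which matches.
matches_type_trait = target <: AbstractVector{<:Continuous}

@show target matches_instance_trait matches_type_trait
```

The first comparison is `false` and the second is `true`, which is consistent with the warning being triggered by the `Unknown` target scitype rather than by the data.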

@ablaom

ablaom commented Feb 8, 2025

@Moelf This issue appears resolved in the latest version of SymbolicRegression, 1.7.0, released an hour ago.

Status `/private/var/folders/4n/gvbmlhdc8xj973001s6vdyw00000gq/T/jl_sAb1Yb/Project.toml`
  [add582a8] MLJ v0.20.7
  [8254be44] SymbolicRegression v1.7.0

@Moelf Moelf closed this as completed Feb 8, 2025