ONNX wrapping a kv-cached model #382

@benraha

Describe the bug

I'm trying to export a KV-cached classifier to ONNX. (I managed to do it for the default fit_mode, just like the example.) I want the exported model to receive a single row as input and run inference on it in real time using the stored KV cache.

I get this annoying trace:

RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient

Coming from:

  File "TabPFN/src/tabpfn/model/transformer.py", line 543, in _forward
    embedded_y = self.y_encoder(
....
  File "TabPFN/src/tabpfn/model/encoders.py", line 459, in forward
    out = self._transform(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "TabPFN/src/tabpfn/model/encoders.py", line 400, in _transform
    return (self.layer(x),)
            ^^^^^^^^^^^^^
  File "tabpfn/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tabpfn/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tabpfn/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1741, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "tabpfn/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 125, in forward
    return F.linear(input, self.weight, self.bias)

I'm not very familiar with PyTorch, so I would appreciate some guidance. From what I found online, this error mostly occurs when a layer in the model does not inherit from nn.Module, but that doesn't seem to be the case here.
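
For reference, here is a minimal sketch of how this error can be triggered in plain PyTorch, independent of TabPFN. It assumes the failure mode is the common one: the tracer meets a weight that requires grad but is not registered as a parameter of the module being traced, which can happen when a wrapper stores a non-nn.Module object as a plain attribute. `Holder`, `Wrapper`, and `try_trace` are hypothetical names, and torch.jit.trace stands in for the tracing step of torch.onnx.export. Whether the same thing happens inside TabPFN's cached path is an assumption, not something verified here.

```python
import torch
from torch import nn


class Holder:
    """Plain Python object (NOT an nn.Module) that owns a layer."""

    def __init__(self):
        self.layer = nn.Linear(4, 2)

    def forward(self, x):
        return self.layer(x)


class Wrapper(nn.Module):
    def __init__(self, holder):
        super().__init__()
        # `holder` is not an nn.Module, so the weights of holder.layer are
        # not registered as parameters of this wrapper.
        self.holder = holder

    def forward(self, x):
        return self.holder.forward(x)


def try_trace(module, x):
    """Return "ok" if tracing succeeds, otherwise the error message."""
    try:
        with torch.no_grad():
            torch.jit.trace(module, (x,))
        return "ok"
    except RuntimeError as err:
        return str(err)


x = torch.randn(1, 4)
holder = Holder()
wrapper = Wrapper(holder).eval()

# The unregistered weights still have requires_grad=True, so the tracer
# refuses to bake them into the graph as constants.
first = try_trace(wrapper, x)
print(first.splitlines()[0])

# Freezing the parameters lets the tracer treat them as constants.
for p in holder.layer.parameters():
    p.requires_grad_(False)
second = try_trace(wrapper, x)
print(second)
```

In this sketch the first trace fails with the same message even under torch.no_grad(), and the second succeeds once the parameters no longer require grad. If the same cause applies to the cached classifier, freezing the underlying model's parameters (and detaching any cached tensors) before calling torch.onnx.export may be worth trying.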

Steps/Code to Reproduce

If there's interest, I can supply an exact piece of code. Roughly, it goes along these lines (I know I'm missing the preprocessing stage here, but this should work):


import torch
from torch import nn
from tabpfn import TabPFNClassifier


class TabPFNModelWrapperWithTrainData(nn.Module):
    def __init__(self, classifier):
        super().__init__()
        # The fitted classifier (with its KV cache) is kept as a plain attribute.
        self.classifier = classifier

    def forward(self, X):
        return self.classifier.forward(X, use_inference_mode=True)


# X_df, Y_df (training data) and file_name (output path) are defined elsewhere.
classifier = TabPFNClassifier(
    model_path="tabpfn-v2-classifier.ckpt",
    n_estimators=1,
    device="cpu",
    random_state=42,
    fit_mode="fit_with_cache",
    memory_saving_mode=False,
)
classifier.fit(X_df, Y_df)

with torch.no_grad():
    # A single random row with the same number of features as the training data.
    X = torch.randn(
        (1, X_df.shape[1]),
        generator=torch.Generator().manual_seed(42),
    )
    X.requires_grad = False

    torch.onnx.export(
        TabPFNModelWrapperWithTrainData(classifier).eval(),
        (X,),  # the wrapper's forward takes only X
        f=file_name,
        input_names=["X"],
        output_names=["output"],
        opset_version=17,  # using 17 since we use torch>=2.1
    )


Expected Results

I want the ONNX export to succeed and the exported model to work :)

Actual Results

The export fails while wrapping the model:

RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient

Versions

Building with the main branch

Labels

bug 💣Something isn't working
