[Bug] Torchvision convnext Core Dump and the engine file larger than 3GB #17546
Labels
needs-triage
type: bug
Expected behavior
This model should run successfully and the generated engine file should be close to the actual size of the model.
Actual behavior
When the model is loaded through the Torch frontend, the engine file generated by compilation is far too large, and loading and executing it from local disk causes a core dump. The ONNX frontend does not have this problem.
Environment
OS: "Ubuntu 20.04.6 LTS"
CUDA SDK version: 12.2
TVM version: 7ae7ea8
GPU: NVIDIA A10 24GB
Driver Version: 535.129.03
CUDA Version: 12.2
Torch Version: 2.2.1
Torchvision Version: 0.17.1
Onnx Version: 1.15.0
Steps to reproduce
I use the following code to get the engine:
And when I try to load this engine with:
Core dump!
I noticed that the engine file is 3.8G, but the original model is only 339MB. I also remember that there used to be a limitation that models larger than 3GB could not be loaded. I guess this might be the reason.
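To put numbers on that comparison, here is a minimal sketch that computes the engine-to-model size ratio and checks it against the recalled limit. The helper names and `ASSUMED_LIMIT_BYTES` are hypothetical. Reading "3GB" as 3 GiB and the quoted sizes as GiB/MiB is an assumption, and the limit itself is only what the report remembers, not a confirmed TVM constraint.

```python
import os

# Assumption: the "3GB" limit recalled in the report is taken as 3 GiB here.
ASSUMED_LIMIT_BYTES = 3 * 1024**3


def size_report(engine_bytes, model_bytes, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return (engine/model size ratio, whether the engine exceeds the limit)."""
    if model_bytes <= 0:
        raise ValueError("model_bytes must be positive")
    return engine_bytes / model_bytes, engine_bytes > limit_bytes


def size_report_for_files(engine_path, model_path, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Same report, reading the two sizes from files on disk."""
    return size_report(
        os.path.getsize(engine_path), os.path.getsize(model_path), limit_bytes
    )


# Sizes quoted in the report, read as GiB and MiB (an assumption about units).
ratio, over_limit = size_report(int(3.8 * 1024**3), 339 * 1024**2)
print(f"engine/model ratio: {ratio:.1f}x, over assumed limit: {over_limit}")
```

With the sizes quoted above, the engine comes out at roughly 11 times the size of the original model and is above the assumed limit.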
Strangely, when I export the model to ONNX, I don't encounter similar issues, and the generated engine file size is 340MB:
So I suspect that the Torch frontend might be incorrectly duplicating some constant weights when parsing certain Ops.
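One way to test this suspicion, sketched under assumptions: hash the raw bytes of every constant in the compiled module's parameters and look for identical blobs stored under more than one name. `find_duplicate_constants` and its `constants` argument (a mapping from constant name to raw bytes) are hypothetical. How to extract those bytes from the TVM parameters is not shown here.

```python
import hashlib
from collections import defaultdict


def find_duplicate_constants(constants):
    """Find constants with byte-identical contents.

    `constants` is a hypothetical mapping of name -> raw bytes.
    Returns (groups, wasted_bytes): `groups` is a sorted list of name lists
    that share the same contents, and `wasted_bytes` counts every copy
    beyond the first in each group.
    """
    names_by_digest = defaultdict(list)
    size_by_digest = {}
    for name, blob in constants.items():
        digest = hashlib.sha256(blob).hexdigest()
        names_by_digest[digest].append(name)
        size_by_digest[digest] = len(blob)

    duplicated = {d: n for d, n in names_by_digest.items() if len(n) > 1}
    wasted_bytes = sum(
        size_by_digest[d] * (len(n) - 1) for d, n in duplicated.items()
    )
    return sorted(duplicated.values()), wasted_bytes


# Toy data only: two identical 16-byte blobs and one distinct blob.
example = {"w0": b"\x00" * 16, "w1": b"\x00" * 16, "b0": b"\x01" * 4}
groups, wasted = find_duplicate_constants(example)
print(groups, wasted)
```

If the wasted byte count from the Torch-frontend build accounts for most of the gap between 3.8G and 339MB, that would support the duplication hypothesis. A small count would point elsewhere.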
Triage