Request: ONNX or PyTorch model versions for GPU inference #4

@generalMG

Summary

I'm trying to use eDOCr2 for GD&T detection but am running into problems loading the Keras models on a modern GPU. Would it be possible to provide ONNX or PyTorch versions of the pre-trained models?

Current Issue

The .keras model files (recognizer_gdts.keras and recognizer_dimensions_2.keras) cannot be loaded or converted on systems with:

  • NVIDIA RTX 5090 (compute capability 12.0)
  • TensorFlow 2.20.0
  • Keras 3.12.0

Errors encountered:

  1. CUDA compatibility: The prebuilt TensorFlow binaries don't include CUDA kernels for compute capability 12.0, so the kernels are JIT-compiled from PTX. This takes 30+ minutes and then fails with:

    CUDA_ERROR_INVALID_PTX
    CUDA_ERROR_INVALID_HANDLE
    
  2. Lambda layer deserialization: Models saved in the Keras 3.x format contain Lambda layers whose serialized bytecode fails to load (possibly because it was marshalled under a different Python version):

    ValueError: bad marshal data (unknown type code)
    
  3. Memory corruption: Double free errors when attempting to load models:

    free(): double free detected in tcache 2
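For context on error 2: as far as I understand, a Python lambda inside a Lambda layer is stored as base64-encoded `marshal` bytecode, and the `marshal` format is tied to the Python version that wrote it. The sketch below uses only the standard library to show the round trip and how unreadable bytes surface as a load error. The helper names `dump_function_code` and `try_load_code` are hypothetical and are not Keras APIs.

```python
import base64
import marshal
import sys
import types


def dump_function_code(func):
    """Encode a function's bytecode as base64 text (assumed to mirror a marshal-based saver)."""
    return base64.b64encode(marshal.dumps(func.__code__)).decode("ascii")


def try_load_code(encoded):
    """Return (code_object, None) on success or (None, error_message) on failure."""
    try:
        raw = base64.b64decode(encoded)
        code = marshal.loads(raw)
    except (ValueError, EOFError, TypeError) as exc:
        # marshal.loads is documented to raise one of these on unreadable data.
        return None, f"{type(exc).__name__}: {exc}"
    if not isinstance(code, types.CodeType):
        return None, f"unexpected object of type {type(code).__name__}"
    return code, None


if __name__ == "__main__":
    encoded = dump_function_code(lambda x: x * 2)
    code, err = try_load_code(encoded)
    print("python", sys.version_info[:2], "round trip ok:", err is None)

    # Rebuild a callable from the code object and call it.
    print(types.FunctionType(code, {})(21))

    # Bytes that are not valid marshal data fail to load.
    _, err = try_load_code(base64.b64encode(b"\x00not-bytecode").decode("ascii"))
    print(err)
```

If this is the cause, loading the files under the same Python minor version that saved them may be worth trying, although I have not been able to confirm which version that was.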
    

Attempted Solutions

I've tried:

  • Converting to ONNX using tf2onnx
  • Converting to PyTorch via ONNX
  • Using TensorFlow 2.15/2.16 with Python 3.11
  • Forcing CPU execution with CUDA_VISIBLE_DEVICES=-1
  • Registering custom objects (_transform, _meshgrid, _repeat)
  • Enabling unsafe deserialization with keras.config.enable_unsafe_deserialization()

All attempts fail due to the issues above.
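To narrow down which layers are involved without importing TensorFlow at all, the model files can be inspected directly. This is a minimal sketch, assuming the files use the Keras 3 zip layout (`config.json`, `metadata.json`, `model.weights.h5`); a legacy HDF5 file with a `.keras` extension would simply be reported as not a zip. The function names `find_lambda_layers` and `inspect_keras_archive` are made up for illustration.

```python
import json
import zipfile


def find_lambda_layers(node, found=None):
    """Recursively collect the names of layers whose class_name is 'Lambda'."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("class_name") == "Lambda":
            cfg = node.get("config")
            name = cfg.get("name") if isinstance(cfg, dict) else None
            found.append(name or "<unnamed>")
        for value in node.values():
            find_lambda_layers(value, found)
    elif isinstance(node, list):
        for item in node:
            find_lambda_layers(item, found)
    return found


def inspect_keras_archive(path):
    """Summarize a .keras file using only the standard library."""
    if not zipfile.is_zipfile(path):
        return {"is_zip": False}
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        report = {"is_zip": True, "members": names}
        if "metadata.json" in names:
            # Typically records the Keras version that saved the file.
            report["metadata"] = json.loads(archive.read("metadata.json"))
        if "config.json" in names:
            config = json.loads(archive.read("config.json"))
            report["lambda_layers"] = find_lambda_layers(config)
    return report


if __name__ == "__main__":
    import sys

    for model_path in sys.argv[1:]:
        print(model_path, inspect_keras_archive(model_path))
```

Running it as `python inspect_keras.py recognizer_gdts.keras recognizer_dimensions_2.keras` (the script name is arbitrary) should list the archive members, the saved metadata, and any Lambda layers found in the config.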

Request

Could you provide:

  1. ONNX versions of the models (.onnx files)
  2. PyTorch versions of the models (.pt or .pth files)
  3. Alternatively, instructions for rebuilding or converting the models from scratch, or the training script used to create them
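For option 3, one route that might avoid Lambda deserialization entirely is to rebuild the architecture in code and load only the weights. The sketch below covers just the first half: copying the weights file out of a `.keras` archive with the standard library. It assumes the Keras 3 zip layout with a member named `model.weights.h5`; the function name `extract_weights` is hypothetical.

```python
import shutil
import zipfile
from pathlib import Path

# Member name used inside Keras 3 archives (assumption about these files).
WEIGHTS_MEMBER = "model.weights.h5"


def extract_weights(keras_path, out_dir):
    """Copy the weights file out of a .keras archive and return its path."""
    keras_path = Path(keras_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Keep the ".weights.h5" suffix so the file name matches what Keras expects.
    target = out_dir / f"{keras_path.stem}.weights.h5"
    with zipfile.ZipFile(keras_path) as archive:
        if WEIGHTS_MEMBER not in archive.namelist():
            raise FileNotFoundError(f"{WEIGHTS_MEMBER} not found in {keras_path}")
        with archive.open(WEIGHTS_MEMBER) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

The extracted file could then, in principle, be passed to `model.load_weights(...)` on a model rebuilt in Python with ordinary functions in place of the serialized lambdas. That step depends on having the original architecture code, which is why the training or model-building script would help.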

Having ONNX/PyTorch versions would enable:

  • GPU inference on modern GPUs (RTX 40xx/50xx series)
  • Better compatibility across frameworks
  • Easier deployment in production environments
  • Use with ONNX Runtime (supports more hardware accelerators)
  • Fewer TensorFlow dependency issues

Environment

  • GPU: NVIDIA RTX 5090 (compute capability 12.0)
  • Python: 3.11.13 / 3.12.11
  • TensorFlow: 2.20.0
  • Keras: 3.12.0
  • OS: Linux (WSL2)

Workaround Needed

Currently, the only options are to run on CPU (extremely slow) or to use an older GPU with compute capability < 12.0, which limits the usability of this excellent project.

Thank you for your consideration and for maintaining this valuable tool for engineering drawing OCR!
