This guide provides detailed instructions on how to export models for ExecuTorch and execute them on the OpenVINO backend. The examples demonstrate how to export a model, load a model, prepare input tensors, execute inference, and save the output results.
Below is the layout of the `examples/openvino` directory, which includes the necessary files for the example applications:
```
examples/openvino
├── README.md                  # Documentation for examples (this file)
└── aot_optimize_and_infer.py  # Example script to export and execute models
```
Follow the Prerequisites and Setup instructions in `backends/openvino/README.md` to set up the OpenVINO backend.
The Python script `aot_optimize_and_infer.py` allows users to export deep learning models from various model suites (TIMM, Torchvision, Hugging Face) to the OpenVINO backend using ExecuTorch. Users can dynamically specify the model, input shape, and target device. The script accepts the following arguments:
- `--suite` (required): Specifies the model suite to use. Supported values:
  - `timm` (e.g., VGG16, ResNet50)
  - `torchvision` (e.g., resnet18, mobilenet_v2)
  - `huggingface` (e.g., bert-base-uncased). NB: Quantization and validation are not supported yet.
- `--model` (required): Name of the model to export. Examples:
  - For `timm`: `vgg16`, `resnet50`
  - For `torchvision`: `resnet18`, `mobilenet_v2`
  - For `huggingface`: `bert-base-uncased`, `distilbert-base-uncased`
- `--input_shape` (optional): Input shape for the model. Provide this as a list or tuple, e.g. `[1, 3, 224, 224]` (Zsh users: wrap in quotes) or `(1, 3, 224, 224)`.
- `--export` (optional): Save the exported model as a `.pte` file (see the export sketch after this list).
- `--model_file_name` (optional): Specify a custom file name to save the exported model.
- `--batch_size`: Batch size for the validation. Default batch size is 1. The dataset length must be evenly divisible by the batch size.
- `--quantize` (optional): Enable model quantization. The `--dataset` argument is required for quantization. The `huggingface` suite is not supported yet.
- `--validate` (optional): Enable model validation. The `--dataset` argument is required for validation. The `huggingface` suite is not supported yet.
- `--dataset` (optional): Path to the ImageNet-like calibration dataset.
- `--infer` (optional): Execute inference with the compiled model and report average inference timing.
- `--num_iter` (optional): Number of iterations to execute inference. Default is `1`.
- `--warmup_iter` (optional): Number of warmup iterations to execute before timing begins. Default is `0`.
- `--input_tensor_path` (optional): Path to the raw tensor file to be used as input for inference. If this argument is not provided, a random input tensor will be generated.
- `--output_tensor_path` (optional): Path to the raw tensor file where the inference output will be saved.
- `--device` (optional): Target device for the compiled model. Default is `CPU`. Examples: `CPU`, `GPU`
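For orientation, the ahead-of-time export path that `--export` drives is conceptually similar to the sketch below. The partitioner import path and constructor arguments are assumptions for illustration only; `aot_optimize_and_infer.py` is the authoritative reference.

```python
import torch
import torchvision.models as models

from executorch.exir import to_edge_transform_and_lower
# Assumed import path for the OpenVINO partitioner; verify against the backend sources.
from executorch.backends.openvino.partitioner import OpenvinoPartitioner

# Build an eval-mode model and an example input matching --input_shape.
model = models.resnet18(weights=None).eval()
example_inputs = (torch.randn(1, 3, 224, 224),)

# Capture the model with torch.export, lower OpenVINO-supported subgraphs to the
# OpenVINO backend, and serialize the result as a .pte file.
exported_program = torch.export.export(model, example_inputs)
executorch_program = to_edge_transform_and_lower(
    exported_program,
    partitioner=[OpenvinoPartitioner()],  # constructor arguments (e.g. target device) omitted; an assumption
).to_executorch()

with open("resnet18.pte", "wb") as f:
    f.write(executorch_program.buffer)
```

Example invocations: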
```bash
python aot_optimize_and_infer.py --export --suite timm --model vgg16 --input_shape "[1, 3, 224, 224]" --device CPU

python aot_optimize_and_infer.py --export --suite torchvision --model resnet50 --input_shape "(1, 3, 256, 256)" --device GPU

python aot_optimize_and_infer.py --export --suite huggingface --model bert-base-uncased --input_shape "(1, 512)" --device CPU

python aot_optimize_and_infer.py --export --suite timm --model vgg16 --input_shape "[1, 3, 224, 224]" --device CPU --validate --dataset /path/to/dataset

python aot_optimize_and_infer.py --export --suite timm --model vgg16 --input_shape "[1, 3, 224, 224]" --device CPU --validate --dataset /path/to/dataset --quantize

python aot_optimize_and_infer.py --suite torchvision --model inception_v3 --infer --warmup_iter 10 --num_iter 100 --input_shape "(1, 3, 256, 256)" --device CPU
```
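The last command above times inference with warmup. A minimal sketch of the same warmup-then-measure loop is shown below, assuming the ExecuTorch Python runtime bindings (`executorch.runtime.Runtime`) are available in your installation; the script itself may use a different execution path.

```python
import time

import torch
from executorch.runtime import Runtime  # availability depends on how ExecuTorch was built/installed

# Assumption: model.pte was exported for a 1x3x224x224 input (e.g. with --export above).
runtime = Runtime.get()
program = runtime.load_program("model.pte")
method = program.load_method("forward")

x = torch.randn(1, 3, 224, 224)

# Warmup iterations (analogous to --warmup_iter) are executed but not timed.
for _ in range(10):
    method.execute([x])

# Timed iterations (analogous to --num_iter); report the average latency.
num_iter = 100
start = time.perf_counter()
for _ in range(num_iter):
    method.execute([x])
avg_ms = (time.perf_counter() - start) / num_iter * 1000
print(f"Average inference time: {avg_ms:.2f} ms")
```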
Notes:

- Input Shape in Zsh: If you are using Zsh, wrap `--input_shape` in quotes or use a tuple: `--input_shape '[1, 3, 224, 224]'` or `--input_shape "(1, 3, 224, 224)"`.
- Model Compatibility: Ensure the specified `model_name` exists in the selected `suite`. Use the corresponding library's documentation to verify model availability.
- Output File: The exported model will be saved as `<MODEL_NAME>.pte` in the current directory.
- Dependencies:
  - Python 3.8+
  - PyTorch
  - ExecuTorch
  - TIMM (`pip install timm`)
  - Torchvision
  - Transformers (`pip install transformers`)
- Model Not Found: If the script raises an error such as `ValueError: Model <MODEL_NAME> not found`, verify that the model name is correct for the chosen suite.
- Unsupported Input Shape: Ensure `--input_shape` is provided as a valid list or tuple.
Build the backend libraries and executor runner by executing the script below in the `<executorch_root>/backends/openvino/scripts` folder:

```bash
./openvino_build.sh
```

The executable is saved in `<executorch_root>/cmake-out/backends/openvino/`.
Now, run the example using the executable generated in the step above. The executable requires a model file (the `.pte` file generated in the AOT step) and, optionally, the number of inference executions.

```bash
cd ../../cmake-out/backends/openvino
./openvino_executor_runner \
    --model_path=<path_to_model> \
    --num_executions=<iterations>
```

- `--model_path`: (Required) Path to the model serialized in `.pte` format.
- `--num_executions`: (Optional) Number of times to run inference (default: 1).
Run inference with a given model for 10 iterations:

```bash
./openvino_executor_runner \
    --model_path=model.pte \
    --num_executions=10
```