VLM edge inference on iOS for monitoring & surveillance
Clone the repository, then run the bootstrap script:
```sh
git clone git@github.com:massimopiazza/veesto.git
cd veesto
./scripts/bootstrap-model-assets.sh
```

If you already have the repository cloned, run `./scripts/bootstrap-model-assets.sh` again after `git pull`. The script is safe to re-run.
It only downloads missing files and verifies the expected layout.
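A typical update cycle is therefore just a pull followed by a re-run of the bootstrap script:

```sh
# Refresh model assets after pulling new commits; already-present files are skipped
git pull
./scripts/bootstrap-model-assets.sh
```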
The script does all of the data hydration needed for a local build:
- It downloads the missing SmolVLM ONNX files from Hugging Face into `veesto/SmolVLM-onnx/`.
- It downloads the Qwen3-VL 2B Instruct 8-bit bundle from Google Drive into `veesto/Qwen3VL-CoreML/`.
- It downloads the LiquidAI GGUF bundles into `veesto/LiquidAI-GGUF/`.
- It clones and builds the bundled `llama.cpp` + `libmtmd` iOS runtime into `Vendor/llama.cpp/build-apple/llama.xcframework`.
- It verifies that all required model assets are present in the correct folders.
If you do not want to use the bootstrap script, you can manage the downloads manually.
SmolVLM ONNX assets
The SmolVLM ONNX model files are not included in this repository due to their size. Download them from Hugging Face and place all three files in `veesto/SmolVLM-onnx/`.
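If you want to script this step like the LiquidAI downloads below, an `hf download` call works as well. The repo ID and include pattern here are placeholders, since the exact SmolVLM repository is not named in this guide:

```sh
# <smolvlm-onnx-repo> is a placeholder; substitute the actual Hugging Face repo
# and adjust --include so it matches the three required files
hf download <smolvlm-onnx-repo> \
  --local-dir veesto/SmolVLM-onnx \
  --include "*.onnx"
```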
Qwen3-VL 2B Instruct 8-bit assets
Place them under veesto/Qwen3VL-CoreML/ with this exact layout:
```
veesto/Qwen3VL-CoreML/
  qwen3vl_2b_instruct_decoder_model_merged.int8.onnx
  qwen3vl_2b_instruct_decoder_model_merged.int8.onnx.data
  qwen3vl_2b_instruct_embed_tokens.onnx
  qwen3vl_2b_instruct_tokenizer.json
  qwen3vl_shared_vision_448_pal16.mlpackage/
    Manifest.json
    Data/com.apple.CoreML/model.mlmodel
    Data/com.apple.CoreML/weights/weight.bin
```
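A quick way to confirm the layout before building is to loop over the expected paths. This check is only a convenience, not one of the repo's scripts:

```sh
# Report any Qwen3-VL asset missing from the expected layout
for f in \
  qwen3vl_2b_instruct_decoder_model_merged.int8.onnx \
  qwen3vl_2b_instruct_decoder_model_merged.int8.onnx.data \
  qwen3vl_2b_instruct_embed_tokens.onnx \
  qwen3vl_2b_instruct_tokenizer.json \
  qwen3vl_shared_vision_448_pal16.mlpackage/Manifest.json \
  qwen3vl_shared_vision_448_pal16.mlpackage/Data/com.apple.CoreML/model.mlmodel \
  qwen3vl_shared_vision_448_pal16.mlpackage/Data/com.apple.CoreML/weights/weight.bin
do
  [ -f "veesto/Qwen3VL-CoreML/$f" ] || echo "missing: $f"
done
```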
The LiquidAI models are stored outside Git and are fetched by the bootstrap script through the Hugging Face CLI:
```sh
hf download LiquidAI/LFM2-VL-450M-GGUF \
  --local-dir veesto/LiquidAI-GGUF \
  --include "LFM2-VL-450M-Q8_0.gguf" \
  --include "mmproj-LFM2-VL-450M-Q8_0.gguf"

hf download LiquidAI/LFM2.5-VL-1.6B-GGUF \
  --local-dir veesto/LiquidAI-GGUF \
  --include "LFM2.5-VL-1.6B-Q8_0.gguf" \
  --include "mmproj-LFM2.5-VL-1.6b-Q8_0.gguf"
```

This produces the following layout:
```
veesto/LiquidAI-GGUF/
  LFM2-VL-450M-Q8_0.gguf
  mmproj-LFM2-VL-450M-Q8_0.gguf
  LFM2.5-VL-1.6B-Q8_0.gguf
  mmproj-LFM2.5-VL-1.6b-Q8_0.gguf
```
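Each model comes as a pair: the main GGUF weights plus an `mmproj-*` projector file that llama.cpp's multimodal layer (libmtmd) loads alongside it. Assuming a current upstream llama.cpp checkout, a desktop smoke test of a downloaded pair could look like the following; this is not part of the repo's tooling:

```sh
# Sanity-check a GGUF + mmproj pair with upstream llama.cpp's multimodal CLI
llama-mtmd-cli \
  -m veesto/LiquidAI-GGUF/LFM2-VL-450M-Q8_0.gguf \
  --mmproj veesto/LiquidAI-GGUF/mmproj-LFM2-VL-450M-Q8_0.gguf \
  --image test.jpg \
  -p "Describe this image."
```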
The LiquidAI GGUF family uses a locally built llama.cpp runtime with libmtmd enabled for multimodal image support. The bootstrap script will clone the pinned upstream commit and build:

```
Vendor/llama.cpp/build-apple/llama.xcframework
```
If you want to build it manually, run:
```sh
./scripts/build-llama-runtime.sh
```
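Under the hood this amounts to checking out llama.cpp at the pinned commit and producing the Apple xcframework. The sketch below assumes the script follows upstream llama.cpp's `build-xcframework.sh` flow; `<PINNED_COMMIT>` is a placeholder, and the repo's script remains the source of truth:

```sh
# Approximate manual equivalent of scripts/build-llama-runtime.sh (illustrative)
git clone https://github.com/ggml-org/llama.cpp Vendor/llama.cpp
cd Vendor/llama.cpp
git checkout <PINNED_COMMIT>   # placeholder for the pinned commit hash
./build-xcframework.sh         # emits build-apple/llama.xcframework
```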