Image-text-to-text pipeline for transformers.js (JavaScript) #1295

zlelik · 2025-04-24T14:07:00Z

Pipeline description

Please implement the image-text-to-text pipeline, or share the estimated release date if it is already in progress. The Python library already implements it (see https://huggingface.co/docs/transformers/en/tasks/image_text_to_text), but when I try to create this pipeline in JavaScript with transformers.js v3.5.0, I get an "unsupported pipeline" error. I am especially interested in video processing.

Prerequisites

The pipeline is supported in Transformers (i.e., listed here)
The task is listed here

Additional information

If it already exists in JavaScript, please give me a link to a video captioning example.

Your contribution

No, I cannot contribute much. Sorry.

xenova · 2025-04-25T17:45:50Z

Hi there 👋 Although we don't currently support the image-text-to-text pipeline, we have been adding many new models which support image+text input.

Maybe a community member would be interested in porting over the logic from the Python library here! 🤗

zlelik added the "new pipeline" (Request a new pipeline) label · Apr 24, 2025
