Image-text-to-text pipeline for transformers.js (JavaScript) #1295

Open
2 tasks done
zlelik opened this issue Apr 24, 2025 · 1 comment
Labels
new pipeline Request a new pipeline

Comments

@zlelik

zlelik commented Apr 24, 2025

Pipeline description

Please implement the image-text-to-text pipeline, or share the estimated release date if work on it is already in progress. According to https://huggingface.co/docs/transformers/en/tasks/image_text_to_text, it is already implemented for Python, but when I try to create this pipeline in JavaScript with transformers.js v3.5.0, I get an "unsupported pipeline" error. I am especially interested in video processing.
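
For anyone who wants to reproduce this, a minimal sketch follows. The helper name `tryCreatePipeline` is hypothetical and not part of transformers.js. It only assumes that the library's `pipeline` factory rejects when it is given a task it does not recognize, which matches the error described above. The factory is passed in as an argument so the check can be run against any version of the library.

```javascript
// Hypothetical helper (not part of transformers.js): attempts to build a
// pipeline for `task` and reports whether the installed library accepts it.
// `pipelineFactory` is expected to behave like the `pipeline` export of
// "@huggingface/transformers": an async function that throws or rejects
// for a task name it does not support.
async function tryCreatePipeline(pipelineFactory, task, model) {
  try {
    const pipe = await pipelineFactory(task, model);
    return { supported: true, pipe };
  } catch (err) {
    // Keep the original message so it can be pasted into a bug report.
    return { supported: false, message: String(err?.message ?? err) };
  }
}

// Assumed usage with the real library (left commented out so that the
// snippet runs without any packages installed):
//
//   import { pipeline } from "@huggingface/transformers";
//   const result = await tryCreatePipeline(pipeline, "image-text-to-text");
//   if (!result.supported) console.log(result.message);
```

With v3.5.0, the call for `"image-text-to-text"` should come back with `supported: false`, since that task is what triggers the error reported here.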

Prerequisites

  • The pipeline is supported in Transformers (i.e., listed here)
  • The task is listed here

Additional information

If it already exists in JavaScript, please give me a link to a video captioning example.

Your contribution

No, I cannot contribute much. Sorry.

@zlelik zlelik added the new pipeline Request a new pipeline label Apr 24, 2025
@xenova
Collaborator

xenova commented Apr 25, 2025

Hi there 👋 Although we don't currently support the image-text-to-text pipeline, we have been adding many new models which support image+text input.

Maybe a community member would be interested in porting over the logic from the Python library here! 🤗
