Q&A: Alt text capabilities project idea #120
Replies: 11 comments 19 replies
-
Hi, my name is Malik Akbar Hashemi Rafsanjani, a final-year computer science student from Indonesia (Bandung Institute of Technology) 👋👋 I am very excited to contribute to this project for Google Summer of Code (GSoC) 2024. I have experience in machine learning engineering and web development from internships, competitions, and freelance projects.
I have explored and researched models that we could use to generate a text description from an image. This text description can be used as the alt text in Wagtail. Some notable models are Google's T2I models and several image-to-text models on Hugging Face.
The one model I have explored in depth is BLIP, a model for conditional and unconditional image captioning. I have tried it on my local machine and got relatively good results. The model also accepts a text prompt (conditional image captioning), which gives the user more flexibility and functionality. It is open source and quite simple to use. You can also try it directly on the Hugging Face website I provide.
There are many other models that we could explore as well. Are there any constraints on model selection?
Thank you so much! Here is my profile
-
@NXPY123 asks on Slack:
-
Iqra asks on Slack:
-
Iqra asks on Slack:
-
Hello @thibaudcolas, I have a few questions.
1 - Is there a finalized plan for contextual alt text in StreamField? The RFC mentions a plan to add a specific field for alt text and a checkbox to mark decorative images in the Image model. However, I'm unsure about the plan for adding a contextual alt text field, because the image chooser and image form in StreamField use the same form as the images section. So where is the contextual alt text field going to be? And would it be a good idea to let users add contextual alt text after they choose an image (similar to what Google Docs offers)? Something like this:
2 - Which would be better for implementing automatic alt text generation: directly generating and saving alt text for the image, or showing a suggestion first and saving it only if the user confirms (as Drupal does)?
Thanks 😊
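The two options in question 2 can be sketched roughly as follows. This is only an illustration of the difference in control flow, not a proposed implementation: `ImageStub`, `generate_and_save`, `suggest_then_confirm`, and `fake_describe` are hypothetical names, and nothing here uses Wagtail's actual Image model or any real captioning service.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ImageStub:
    """Hypothetical stand-in for an image record; not Wagtail's Image model."""
    title: str
    alt_text: str = ""


# A describer is any callable that turns an image into a caption.
Describer = Callable[[ImageStub], str]


def generate_and_save(image: ImageStub, describe: Describer) -> str:
    """Option A: store the generated text immediately, with no review step."""
    image.alt_text = describe(image)
    return image.alt_text


def suggest_then_confirm(
    image: ImageStub,
    describe: Describer,
    review: Callable[[str], Optional[str]],
) -> bool:
    """Option B: show a suggestion and save only what the editor approves.

    `review` returns the accepted (possibly edited) text, or None to reject.
    """
    accepted = review(describe(image))
    if accepted is None:
        return False
    image.alt_text = accepted
    return True


def fake_describe(image: ImageStub) -> str:
    # Placeholder for a real captioning model or service (assumption).
    return "A snow-capped mountain in the distance"
```

With option B the stored alt text changes only when the `review` callable returns a string, so a rejected suggestion leaves the existing value untouched.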
-
Hello interested contributors 👋 I'm Storm, the lead mentor for this project, together with support mentor Saptak (@SaptakS).
A little about me: I hail from the Netherlands, Europe, and have been developing sites with Django and Wagtail since 2018. In 2021 I joined the Wagtail Core team and the accessibility subteam. All I do, I do as a volunteer. I'm self-employed and don't receive any monetary compensation for my volunteer work.
A little background
My accessibility team member pledge for 2024 is to improve the default text alternatives for images in Wagtail, because the current defaults aren't so great. I started the accessible image model working group to discuss what an ideal text alternative solution would look like when integrated into Wagtail. This ideal solution has since expanded in scope to include AI-generated text alternatives and changes to the editor experience. There is more development work than we volunteers can handle, which is why we are very grateful (and excited!) to have support from Google in the form of Google Summer of Code ❤️
Where we are at
The accessible image model working group has had a couple of meetings, and we've made good progress discussing this topic. If you are interested in all the details, you can find us in #a11y-image-model-working-group, and our meeting notes can be found here: meeting notes on Google Docs
Here is a condensed version of what we discussed; you could consider this our wishlist:
The ideal candidate
Please note: we are unlikely to choose a candidate with little Django/Wagtail experience. You'll be working with Wagtail internals and Django, which makes this a very important factor.
Your proposal
I hope the above bullet points give you some guidance as to what we would like to see in your proposal. The optional proposal template also has some great pointers on what a good proposal should include: https://wagtail.org/gsoc-template/
As Thibaud mentioned in the opening post, it is at our discretion whether we review your draft proposal. We are volunteers with day jobs, so reviewing all drafts sent our way is difficult. Thank you for your understanding!
Good luck! We are excited to see what you come up with.
-
Karthik asked on Slack:
Q: As I understand it so far, the project idea targets two Wagtail projects: wagtail/wagtail (addition of an alt text field, support for contextual alt text, and support for decorative images) and, for the second part of the idea (the use of AI), wagtail/wagtail-ai. (Correct me if I'm wrong.)
A: Yes, adding support for contextual alt text is in the scope of wagtail/wagtail. We are hoping to add support for generating alt text in order to offer alt text suggestions. Using the surrounding context of the page where the image is used is not necessarily in scope; just a factual description of the image (e.g. 'A snow-capped mountain in the distance') is enough for us right now. As mentioned in the 'wishlist' above, this should probably be implemented in the form of a backend that can provide Wagtail with AI-generated responses. The actual implementation for querying a specific AI service / model should probably be in the form of an extra package that can interface with Wagtail. This makes AI-generated alt text an optional feature in Wagtail. We would love to see a reference implementation (that, for example, queries a popular online service) developed as part of the project, but only if it fits the timeline.
Q: Per my research and Gasman's comments on RFC: Contextual alt text. We are looking for something that is a single field at the model level and translates to multiple values at the database level (sort of a one-to-many relation) (extra context: this is discussing Images linked through a
A: We are quite unsure what an acceptable implementation would look like. Because of this, we'd rather focus on the StreamField block implementation, which is more straightforward. After that is implemented (and at that point GSoC has likely ended) we will review and focus our attention again on the
Q: How will we update the existing architecture, which uses the title as the default alt text?
A: The short answer is: we are not sure! Maybe the title field should be renamed? Maybe it should be removed? We'll leave it up to you to come up with something acceptable. You might want to consider different solutions. We are likely to choose an option that has minimal breaking changes and/or a clear upgrade path for site implementors.
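To make the backend idea in the first answer a bit more concrete, here is a minimal sketch of how an optional, pluggable alt text backend could look, together with one possible way to pick the alt text to render. Everything in it is an assumption for illustration: `AltTextBackend`, `CannedBackend`, `suggest_alt_text`, `ImageInfo`, and `resolve_alt_text` are hypothetical names, not part of wagtail/wagtail or wagtail/wagtail-ai, and the fallback order (decorative, then contextual, then image default, then title) is just one candidate, not a decision.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


class AltTextBackend(ABC):
    """Hypothetical interface that an optional add-on package would implement."""

    @abstractmethod
    def describe(self, image_bytes: bytes) -> str:
        """Return a factual description of the image."""


class CannedBackend(AltTextBackend):
    """Placeholder backend; a real one would query a model or online service."""

    def __init__(self, caption: str) -> None:
        self.caption = caption

    def describe(self, image_bytes: bytes) -> str:
        return self.caption


def suggest_alt_text(
    image_bytes: bytes, backend: Optional[AltTextBackend]
) -> Optional[str]:
    """Return a suggestion, or None if no backend is installed or it is empty."""
    if backend is None:
        return None  # the feature stays optional: no backend, no suggestion
    suggestion = backend.describe(image_bytes).strip()
    return suggestion or None


@dataclass
class ImageInfo:
    """Hypothetical stand-in for an image record; not Wagtail's Image model."""
    title: str
    default_alt: str = ""
    decorative: bool = False


def resolve_alt_text(image: ImageInfo, contextual_alt: str = "") -> str:
    """Pick the alt text to render (this order is an assumption)."""
    if image.decorative:
        return ""  # decorative images get an empty alt attribute
    return contextual_alt or image.default_alt or image.title
```

Keeping the title as the last fallback in `resolve_alt_text` mirrors the current behaviour mentioned in the last question, which is one way to limit breaking changes; renaming or removing the title field would change that last step.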
-
@thibaudcolas where can I find the thread and Google form for the Low-carbon accessible project templates?
-
Oh, here I go. I have been bothering you on Slack with my questions, but from now on I will ask here.
-
Is there any discussion about the Changing page type idea, or any news about it?
-
Just wanted to say thank you to everyone who submitted a proposal for this project! We received 16 in total, by far the most of any project idea this year. Lots of different approaches in there, some focused on the AI aspects, some more or less leaving that out, and lots in-between. From here – we’ll be firming up our line-up of mentors, and reviewing all proposals. Final results will be announced by Google on May 1st at 18:00 UTC.
-
👋 Please use this discussion thread to ask questions about the Alt text capabilities project idea.
Asking questions
Please use this discussion thread.
Proposals
It’s at mentors’ discretion whether they review any draft proposals or only final ones. To send your proposal for draft review (no promises), use: https://wagtail.org/gsoc-proposal/. We don’t mandate any specific template but we do provide one optionally as a way to get started: https://wagtail.org/gsoc-template/
For further information about the project, see: