[Feature Request] Amazon Bedrock Knowledge Bases now processes multimodal data #626
Labels: roadmap (Determined to be implemented)
Describe the solution you'd like
Support the following new capability:
Amazon Bedrock Knowledge Bases now processes multimodal data
https://aws.amazon.com/jp/about-aws/whats-new/2024/12/amazon-bedrock-knowledge-bases-processes-multimodal-data/
Why the solution is needed
This will make it possible to generate highly accurate answers from PDFs containing tables and other content that the current text-only flow has struggled with.
When a PDF is vectorized, an image of each page is stored in S3, and the S3 URI of the text extracted from that image and the S3 URI of the image itself are stored in the vector DB as a pair.
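On the ingestion side, this roughly corresponds to enabling multimodal parsing on the Knowledge Bases data source. A minimal sketch with boto3 (the knowledge base ID and bucket ARN are placeholders, and the exact field names should be verified against the current bedrock-agent API; the knowledge base itself also needs a supplemental data storage location in S3 for the extracted page images):

```python
# Sketch only: configure a Knowledge Bases data source so that PDFs are parsed
# multimodally and page images are kept for retrieval. IDs/ARNs are placeholders.
import boto3

bedrock_agent = boto3.client("bedrock-agent")

bedrock_agent.create_data_source(
    knowledgeBaseId="KB_ID",  # placeholder
    name="multimodal-pdf-source",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::my-document-bucket"},
    },
    vectorIngestionConfiguration={
        "parsingConfiguration": {
            # Multimodal parsing: text is extracted from page images, and the
            # images themselves are stored so they can be returned at query time.
            "parsingStrategy": "BEDROCK_DATA_AUTOMATION",
            "bedrockDataAutomationConfiguration": {"parsingModality": "MULTIMODAL"},
        }
    },
)
```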
If a search hit corresponds to text extracted from a page image, both the text and the image are passed to the LLM to generate the answer.
By handing not only the text but also the image to the LLM, it can answer accurately even for complicated tables.
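On the answer-generation side, RetrieveAndGenerate can handle this internally, but it could also be done explicitly: retrieve chunks, fetch the page image from S3 when a hit points at one, and send both to the model via the Converse API. A minimal sketch (how the image URI appears in the retrieval result, and the model ID, are assumptions here):

```python
# Sketch only: retrieve from the knowledge base, download the page image a hit
# refers to, and pass text + image to the model with the Converse API.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")
bedrock_runtime = boto3.client("bedrock-runtime")
s3 = boto3.client("s3")

query = "What is the total in the Q3 cost table?"
resp = agent_runtime.retrieve(
    knowledgeBaseId="KB_ID",  # placeholder
    retrievalQuery={"text": query},
)

content = [{"text": query}]
for result in resp["retrievalResults"]:
    if result["content"].get("text"):
        content.append({"text": result["content"]["text"]})
    # Assumption: a multimodal hit exposes the S3 URI of the page image here.
    uri = result.get("location", {}).get("s3Location", {}).get("uri", "")
    if uri.endswith(".png"):
        bucket, key = uri.removeprefix("s3://").split("/", 1)
        image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        content.append({"image": {"format": "png", "source": {"bytes": image_bytes}}})

answer = bedrock_runtime.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder
    messages=[{"role": "user", "content": content}],
)
print(answer["output"]["message"]["content"][0]["text"])
```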
Additional context
Add any other context or screenshots about the feature request here.
Implementation feasibility
Are you willing to collaborate with us to discuss the solution, decide on the approach, and assist with the implementation?