Description
Describe the solution you'd like
Support the multimodal data processing capability that Amazon Bedrock Knowledge Bases now provides:
https://aws.amazon.com/jp/about-aws/whats-new/2024/12/amazon-bedrock-knowledge-bases-processes-multimodal-data/
Why the solution is needed
It would enable generating highly accurate answers from PDFs containing tables and similar content, which has been handled poorly until now.
When a PDF is vectorized, an image of each PDF page is stored in S3, and the S3 URI of the text extracted from the image and the S3 URI of the image itself are stored in the vector DB as a pair.
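On the ingestion side, a minimal sketch of how this could be wired up with boto3 is below. It uses the existing foundation-model parsing option of `create_data_source`; the December 2024 launch also added a Bedrock Data Automation parser and supplemental S3 storage for the extracted page images, whose exact parameter names should be verified against the docs. All IDs and ARNs are placeholders.

```python
"""Sketch: enabling multimodal (image-aware) parsing on a KB data source."""
import boto3

bedrock_agent = boto3.client("bedrock-agent")

bedrock_agent.create_data_source(
    knowledgeBaseId="KB_ID",  # placeholder: your knowledge base ID
    name="multimodal-pdf-source",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::DOC_BUCKET"},  # placeholder
    },
    vectorIngestionConfiguration={
        # Parse each page (including tables and figures) with a multimodal
        # foundation model instead of plain text extraction.
        "parsingConfiguration": {
            "parsingStrategy": "BEDROCK_FOUNDATION_MODEL",
            "bedrockFoundationModelConfiguration": {
                "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
            },
        },
    },
)
```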
If a search hit is text that was extracted from a page image, the text and the image are passed to the LLM together to generate the answer.
By handing the LLM the image as well as the text, it can respond with high accuracy even for complicated tables; a retrieval-side sketch follows below.
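The sketch below shows the retrieval-and-answer flow using APIs that already exist (`retrieve` on bedrock-agent-runtime, the Converse API with image content blocks). The metadata key used to find the page-image URI (`IMAGE_URI_KEY`) is a hypothetical name; the actual response shape for image-backed chunks needs to be confirmed.

```python
"""Sketch: answer a query with both the retrieved text and the source page image."""
import boto3

KB_ID = "KB_ID"  # placeholder: your knowledge base ID
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"
IMAGE_URI_KEY = "x-amz-bedrock-kb-source-uri"  # hypothetical metadata key

agent_rt = boto3.client("bedrock-agent-runtime")
bedrock_rt = boto3.client("bedrock-runtime")
s3 = boto3.client("s3")

query = "What is the total in the table on page 3?"

# 1. Retrieve chunks (text extracted from page images) from the KB.
resp = agent_rt.retrieve(
    knowledgeBaseId=KB_ID,
    retrievalQuery={"text": query},
    retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
)

# 2. Build Converse content blocks: the extracted text plus the page image.
content = [{"text": query}]
for result in resp["retrievalResults"]:
    content.append({"text": result["content"]["text"]})
    image_uri = result.get("metadata", {}).get(IMAGE_URI_KEY)
    if image_uri and image_uri.startswith("s3://"):
        bucket, key = image_uri[len("s3://"):].split("/", 1)
        image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        content.append({"image": {"format": "png", "source": {"bytes": image_bytes}}})

# 3. Ask the model, grounding it in both modalities.
answer = bedrock_rt.converse(
    modelId=MODEL_ID,
    messages=[{"role": "user", "content": content}],
)
print(answer["output"]["message"]["content"][0]["text"])
```

Passing the original page image alongside the extracted text is what lets the model recover table structure that the text extraction alone may flatten.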
Implementation feasibility
Are you willing to collaborate with us to discuss the solution, decide on the approach, and assist with the implementation?
- Yes, I am able to implement the feature and create a pull request.
- No, I am unable to implement the feature, but I am open to discussing the solution.