
[Feature Request] Amazon Bedrock Knowledge Bases now processes multimodal data #626

Open
@fsatsuki

Description

Describe the solution you'd like

Support the new multimodal data processing capability of Amazon Bedrock Knowledge Bases:
https://aws.amazon.com/jp/about-aws/whats-new/2024/12/amazon-bedrock-knowledge-bases-processes-multimodal-data/

Why the solution is needed

This would enable highly accurate answers from PDFs containing tables and other content that the current text-only pipeline has not handled well.
When a PDF is vectorized, an image of each PDF page is stored in S3, and the S3 URI of the text extracted from the image and the S3 URI of the image itself are stored in the vector DB as a pair.
If a search hit contains text that was extracted from an image, the text and the image are passed together to the LLM to generate the answer.
By handing the LLM images as well as text, it can respond accurately even to complex tables.
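For reference, here is a minimal sketch of how the retrieve-then-generate flow could look with boto3. This is not this repository's actual implementation: the knowledge base ID, model ID, and especially the `image_uri` metadata key are placeholder assumptions, since the exact field under which the multimodal KB exposes each page image's S3 URI depends on how the data source is configured.

```python
import boto3

# Clients for retrieval (Knowledge Bases) and generation (Converse API).
agent_runtime = boto3.client("bedrock-agent-runtime")
bedrock_runtime = boto3.client("bedrock-runtime")
s3 = boto3.client("s3")

KB_ID = "YOUR_KB_ID"  # placeholder
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"


def answer(query: str) -> str:
    # 1. Retrieve chunks from the knowledge base.
    resp = agent_runtime.retrieve(
        knowledgeBaseId=KB_ID,
        retrievalQuery={"text": query},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 5}},
    )

    content = [{"text": f"Question: {query}\n\nContext:"}]
    for result in resp["retrievalResults"]:
        content.append({"text": result["content"]["text"]})
        # ASSUMPTION: the page-image S3 URI is exposed in the result metadata
        # under a key like "image_uri"; the real key depends on the KB setup.
        image_uri = result.get("metadata", {}).get("image_uri")
        if image_uri:
            bucket, key = image_uri.removeprefix("s3://").split("/", 1)
            image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            content.append(
                {"image": {"format": "png", "source": {"bytes": image_bytes}}}
            )

    # 2. Pass the extracted text and the page images together to the LLM.
    out = bedrock_runtime.converse(
        modelId=MODEL_ID,
        messages=[{"role": "user", "content": content}],
    )
    return out["output"]["message"]["content"][0]["text"]
```

The Converse API accepts mixed text and image content blocks in a single user message, so the page image can accompany its extracted text in one request.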


Implementation feasibility

Are you willing to collaborate with us to discuss the solution, decide on the approach, and assist with the implementation?

  • Yes, I am able to implement the feature and create a pull request.
  • No, I am unable to implement the feature, but I am open to discussing the solution.

Metadata

Labels

roadmap (Determined to be implemented)
