@hatMatch hatMatch commented Sep 1, 2025

Add document creation for ChatBedrockConverse

Implements a helper that creates document blocks to add to message content. It does moderate validation of input types, though a pydantic model for the various parameters, or for the document itself, might be preferable.

Usage

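from langchain_aws import ChatBedrockConverse
from langchain_core.messages import HumanMessage

# Instantiate the chat model
llm = ChatBedrockConverse(
    model="anthropic.claude-3-sonnet-20240229-v1:0",
)

# Create one document without citations and one with citations enabled
document = ChatBedrockConverse.create_document(
    name="NoCite", source={"text": "we don't talk about me :("}
)
document_to_cite = ChatBedrockConverse.create_document(
    name="BigCite", source={"text": "People love talking about me"}, enable_citations=True
)

# Add both documents to the message content
messages = [HumanMessage(content=[
    "How are my documents?",
    document,
    document_to_cite,
])]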

Collaborator

@michaelnchin michaelnchin left a comment

Thank you for your contribution, @hatMatch!

" must have a dictionary with a valid s3 uri as a dict."
)

if source.get("content") and not isinstance(source.get("content", list)):
Collaborator

Passing a content source currently fails because the closing parenthesis is misplaced: list ends up as the default argument to source.get, so isinstance is missing its second (type) argument here.

Author

Whoops, good catch, thanks!
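The isinstance problem raised in this thread can be reproduced in isolation. The sketch below is not the PR's code: the source dictionary and the content_is_invalid helper are hypothetical names used only to contrast the misplaced parenthesis with the corrected check.

```python
# Hypothetical source dictionary, for illustration only
source = {"content": "not a list"}

# Buggy form: `list` is passed as the default to dict.get(), so isinstance()
# receives a single argument and raises TypeError.
try:
    isinstance(source.get("content", list))
    buggy_raises = False
except TypeError:
    buggy_raises = True


# Corrected form: the parenthesis closes after "content", and `list` becomes
# the type argument to isinstance().
def content_is_invalid(source: dict) -> bool:
    """Return True when a "content" entry is present but is not a list."""
    return bool(source.get("content")) and not isinstance(source.get("content"), list)
```

With the corrected check, a missing or empty "content" entry is not flagged, and only a non-list value is treated as invalid.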

Comment on lines 527 to 531
Args:
    name: The name of the document.
    source: The source of the document.
    context: Info for the model to understand the document for citations.
    format: The format of the document, or its extension.
Collaborator

Add a docstring entry for enable_citations, for completeness.

Author

will do!

Author

fixed in 4812a05
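For reference, a docstring that covers every parameter, including enable_citations, might look like the sketch below. This is not the PR's implementation: the function name create_document_sketch, the default values, the wording of the enable_citations entry, and the shape of the returned dictionary are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def create_document_sketch(
    name: str,
    source: Dict[str, Any],
    format: str = "txt",
    context: Optional[str] = None,
    enable_citations: bool = False,
) -> Dict[str, Any]:
    """Build a document block to add to message content (illustrative only).

    Args:
        name: The name of the document.
        source: The source of the document.
        format: The format of the document, or its extension.
        context: Info for the model to understand the document for citations.
        enable_citations: Whether the model may cite this document.

    Returns:
        Dictionary containing a document block to add to message content.
    """
    document: Dict[str, Any] = {"name": name, "format": format, "source": source}
    if context is not None:
        document["context"] = context
    # Assumed output shape; the PR's actual return value is not shown in the thread.
    document["citations"] = {"enabled": enable_citations}
    return {"document": document}
```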

    format: The format of the document, or its extension.
Returns:
    Dictionary containing a properly formatted document to add to message content."""
if re.match(r"[^\w\[\]\(\)-]|[\s]{2,}", name):
Collaborator

re.match won't work for all cases here, as it only looks for an invalid character or sequence at the very start of the string (for example, No Cite won't be caught).

You should use re.search instead (+ simplify a bit):

Suggested change
- if re.match(r"[^\w\[\]\(\)-]|[\s]{2,}", name):
+ if not re.search(r"[^A-Za-z0-9 \[\]()\-]|\s{2,}", name):

Author

I think the not was accidentally added at the front of the condition. Resolved in d757851.
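The difference between re.match and re.search discussed in this thread can be checked directly. The sketch below uses the pattern from the suggested change, without the leading not, so a hit means the name is invalid. The helper names and sample document names are hypothetical.

```python
import re

# Pattern from the suggested change: any character outside the allowed set,
# or a run of two or more whitespace characters.
INVALID_NAME = re.compile(r"[^A-Za-z0-9 \[\]()\-]|\s{2,}")


def is_invalid_with_match(name: str) -> bool:
    # re.match only tries the pattern at position 0 of the string.
    return INVALID_NAME.match(name) is not None


def is_invalid_with_search(name: str) -> bool:
    # re.search scans the whole string for the first hit.
    return INVALID_NAME.search(name) is not None


# Hypothetical document names: valid, double space, slash inside, slash first.
samples = ["NoCite", "No  Cite", "Big/Cite", "/BigCite"]
results = {
    name: (is_invalid_with_match(name), is_invalid_with_search(name))
    for name in samples
}
```

Only the name that starts with an invalid character is flagged by both functions. The double space and the embedded slash are caught by the search-based check alone.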

Author

hatMatch commented Sep 5, 2025

Looking to follow up on this sometime next week. I will be out for a few days. Cheers, and thanks for looking through this!

Comment on lines 660 to 662
# Skip creating new client if passed in constructor
if self.client is None:
    self.client = create_aws_client(
Collaborator

Looks like this was removed unintentionally in the 40241bd merge. Let's put it back to fix the tests.

Author

Added it back in, but I did change my setup, so I'm hoping nothing else broke.
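The guard restored in this thread follows a common pattern: build a default client only when the caller did not supply one, which is what lets tests inject a mock. A minimal sketch of that pattern is below. ClientHolder and fake_factory are hypothetical stand-ins, not the library's class or its create_aws_client helper.

```python
from typing import Any, Callable, List, Optional


class ClientHolder:
    """Hypothetical stand-in for a model class that accepts an optional client."""

    def __init__(
        self, client_factory: Callable[[], Any], client: Optional[Any] = None
    ) -> None:
        self.client = client
        # Skip creating a new client if one was passed to the constructor
        if self.client is None:
            self.client = client_factory()


created: List[str] = []


def fake_factory() -> str:
    # Records each call so it is visible how often a client is built.
    created.append("new")
    return "factory-client"


injected = ClientHolder(fake_factory, client="injected-client")
default = ClientHolder(fake_factory)
```

Without the guard, the injected client would be overwritten on construction, which would explain tests failing when they rely on a mocked client.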
