Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(alloydb): Add EmbeddingStore and Loader for Google AlloyDB #102

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

averikitsch
Copy link

Issue

Closes #

Change

Mirror of LangChain Python implementation for Google AlloyDB (https://github.com/googleapis/langchain-google-alloydb-pg-python). Includes EmbeddingStore and Loader integrations.

General checklist

  • There are no breaking changes
  • I have added unit and integration tests for my change
  • I have manually run all the unit tests in all modules, and they are all green
  • I have manually run all integration tests in the module I have added/changed, and they are all green

Checklist for adding new maven module

  • I have added my new module in the root pom.xml and langchain4j-community-bom/pom.xml

Checklist for adding new embedding store integration

  • I have added a {NameOfIntegration}EmbeddingStoreIT that extends from either EmbeddingStoreIT or EmbeddingStoreWithFilteringIT
  • I have added a {NameOfIntegration}EmbeddingStoreRemovalIT that extends from EmbeddingStoreWithRemovalIT

Checklist for changing existing embedding store integration

  • I have manually verified that the {NameOfIntegration}EmbeddingStore works correctly with the data persisted using the latest released version of LangChain4j

@Martin7-1 Martin7-1 added enhancement New feature or request P3 Medium priority theme: embedding store Issues/PRs related to embedding store labels Mar 19, 2025
Copy link
Collaborator

@Martin7-1 Martin7-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@averikitsch Thank you! Could you please change the module name and dir name?

There are some dependencies conflict, could you please check it?

Also, if the filtering funcationality is already done, maybe the AlloyDBEmbeddingStoreIT should extend EmbeddingStoreWithFilteringIT.


private double calculateRelevanceScore(double distance) {
switch (distanceStrategy.name()) {
case "EUCLIDEAN" -> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LangChain4j only support COSINE_DISTANCE for now actually...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have created a DistanceStrategy class to enable different distances. This is important to our customers in order to tune their RAG results since different embedding models work better with specific distance calculations.

Do you have concerns about us extending this support?

Note: We will also be adding indexing methods that will allow vector indexes to be added using different distance strategies.

Copy link
Collaborator

@Martin7-1 Martin7-1 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice to support different distances strategy, but the score field in EmbeddingMatch is based on cosine similarity, so perhaps we need a clearer javadoc description on the search method. WDYT?

Copy link
Collaborator

@Martin7-1 Martin7-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@averikitsch Thank you! Could you please run make format after solving all comments?

Copy link
Collaborator

@Martin7-1 Martin7-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@averikitsch Thank you!


private static final String USER_AGENT = "langchain4j-alloydb-pg";
private static final Logger log = LoggerFactory.getLogger(AlloyDBEngine.class.getName());
private static ConnectorConfig namedConnectorConfig;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why this field is static, could you please elaborate?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to not be static.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request P3 Medium priority theme: embedding store Issues/PRs related to embedding store
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants