-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate neo4j retriever #109
Conversation
d2cfa4f
to
4d2a7b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vga91 Thank you!
<relativePath>../../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>langchain4j-community-neo4j-retriever</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about naming it langchain4j-community-content-retriever-neo4j
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that the naming pattern? otherwise, what other kinds of retrievers are there? structure retrievers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually no naming pattern about it... Just to tell users that what component in this module
import dev.langchain4j.model.chat.ChatLanguageModel; | ||
import dev.langchain4j.model.input.PromptTemplate; | ||
|
||
public class Neo4jContentRetrieverBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about moving it to Neo4jContentRetriever
and make it a inner static class?
void shouldRetrieveContentWhenQueryIsValidAndOpenAiChatModelIsUsed() { | ||
|
||
// With | ||
ChatLanguageModel openAiChatModel = OpenAiChatModel.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why you need to test with OpenAiChatModel
in particular, is it different from mock's ChatLanguageModel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean for an IT you want to use a real model/infra not just a mock.
I don't know if LC4J has also local models as part of the IT infrastructure that could be run on the GH action server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LC4J only have embedding local model for now.
Maybe this test should move to a new Neo4jContentRetrieverIT
to do integration test and add @EnabledIfEnvironmentVariable
. And this old one should rename to Neo4jContentRetrieverTest
and use mock model. WDYT?
Some format violations:
|
import org.neo4j.driver.types.Type; | ||
import org.neo4j.driver.types.TypeSystem; | ||
|
||
public class Neo4jContentRetriever implements ContentRetriever { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably name this something like Neo4jText2CypherRetriever ?
} | ||
} | ||
|
||
private static final String NODE_PROPERTIES_QUERY = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could probably do this in one query, but that's an improvement for the future
|
||
public List<Record> executeRead(String queryString) { | ||
|
||
try (Session session = this.driver.session()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can move this to driver.executeQuery which automatically does retries
List<Record> records = graph.executeRead(cypherQuery); | ||
return records.stream() | ||
.flatMap(r -> r.values().stream()) | ||
.map(value -> NODE.isTypeOf(value) ? value.asMap().toString() : value.toString()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably also handle RELATIONSHIP and PATH?
|
||
public void refreshSchema() { | ||
|
||
List<String> nodeProperties = formatNodeProperties(executeRead(NODE_PROPERTIES_QUERY)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the future we should also add the enhanced schema with sample property values
} | ||
|
||
@Test | ||
void shouldReturnEmptyListWhenQueryIsInvalid() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the query invalid or does it just return empty list?
we should probably also add a test for a truly invalid query generated that fails during execution
|
||
String question = query.text(); | ||
String schema = graph.getSchema(); | ||
String cypherQuery = generateCypherQuery(schema, question); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the future we should also add the capability to re-generate and re-run failed queries, best with the question, previous query and error message from the database, for the model to fix (up to N retries)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM most of my comments are future improvements
cd690dd
to
3e12612
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vga91 Thank you!
import org.neo4j.driver.types.Type; | ||
import org.neo4j.driver.types.TypeSystem; | ||
|
||
public class Neo4jText2CypherRetriever implements ContentRetriever { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change from Neo4jContentRetriever
to Neo4jText2CypherRetriever
... Any idea to keep compatibility (marked Neo4jContentRetriever
as @Deprecated
and removed it in the future?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the @Deprecated
class, lemme know if it's ok this way
<relativePath>../../pom.xml</relativePath> | ||
</parent> | ||
|
||
<artifactId>langchain4j-community-neo4j-retriever</artifactId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually no naming pattern about it... Just to tell users that what component in this module
Maybe the IT should split into two parts:
@vga91 WDYT? |
Yes, great idea. I just split them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue
Closes #
Change
Migration of Neo4jContentRetriever
General checklist
Checklist for adding new maven module
pom.xml
andlangchain4j-community-bom/pom.xml
Checklist for adding new embedding store integration
{NameOfIntegration}EmbeddingStoreIT
that extends from eitherEmbeddingStoreIT
orEmbeddingStoreWithFilteringIT
{NameOfIntegration}EmbeddingStoreRemovalIT
that extends fromEmbeddingStoreWithRemovalIT
Checklist for changing existing embedding store integration
{NameOfIntegration}EmbeddingStore
works correctly with the data persisted using the latest released version of LangChain4j