Node Similarity OOM #675

@RaoulBrueckler

I tried to create a new similarity edge with a score property on a graph with ~2M nodes and ~8M edges, but it hits the memory limit at 32 GB.

I tried it with this query:

MATCH (b:Broadcast)
WITH b
MATCH p=(b)-[r:IS_TAGGED_WITH]->(t:Tag)
WITH project(p) AS subgraph
CALL node_similarity.jaccard(subgraph) YIELD node1, node2, similarity
WITH node1, node2, similarity
WHERE similarity > 0
MERGE (node1)-[:IS_SIMILAR {score: similarity}]->(node2)
RETURN node1, node2, similarity;

I also tried it with just 2 nodes, like this. That works and is fast (2 ms), but if I remove the similarity > 0 check, I see that it compares everything, not just the 2 Broadcast nodes: it also calculates the similarity for the tags and genres.

MATCH (b:Broadcast)
WHERE b.id IN [36195333 , 36195268]
WITH b
MATCH p=(b)-[r:IS_TAGGED_WITH]->(t:Tag)
WITH project(p) AS subgraph
CALL node_similarity.jaccard(subgraph) YIELD node1, node2, similarity
WITH node1, node2, similarity
WHERE similarity > 0
MERGE (node1)-[:IS_SIMILAR {score: similarity}]->(node2)
RETURN node1, node2, similarity;
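
For reference, here is a minimal Python sketch of the all-pairs behaviour described above. It assumes the procedure scores every pair of nodes in the projected subgraph by the Jaccard index of their neighbor sets, ignoring edge direction. That is an assumption based on the observed output, not taken from the MAGE source, and the names jaccard, all_pairs_jaccard and pair_count are hypothetical helpers.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def all_pairs_jaccard(edges):
    """Score every unordered pair of nodes that appears in `edges`.

    Assumption: neighbor sets ignore edge direction, so Tag nodes get
    neighbor sets (their Broadcasts) and are scored against each other too.
    """
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    return {
        (u, v): jaccard(neighbors[u], neighbors[v])
        for u, v in combinations(sorted(neighbors), 2)
    }


def pair_count(n):
    """Number of unordered node pairs among n nodes: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Toy stand-in for the projected subgraph: 2 broadcasts, 3 tags.
edges = [("b1", "t1"), ("b1", "t2"), ("b2", "t2"), ("b2", "t3")]
scores = all_pairs_jaccard(edges)

for pair, score in scores.items():
    if score > 0:
        print(pair, round(score, 3))
```

In this toy graph the tag pairs (t1, t2) and (t2, t3) get a non-zero score alongside (b1, b2), which matches the tag-to-tag results seen in the 2-node test. If the full projection held about 2 million nodes, pair_count(2_000_000) gives roughly 2 × 10^12 candidate pairs, so the size of the result set alone could plausibly account for the memory use.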

Here is a profile export

profile_summary.json

Is there a problem in my Cypher, do I need more RAM, or am I missing something else?
