Node Similarity OOM

I tried to create a new similarity edge with a score property on a graph with ~2M nodes and ~8M edges, but the query hits the memory limit of 32 GB.

I tried it with this query:

```
MATCH (b:Broadcast)
WITH b
MATCH p=(b)-[r:IS_TAGGED_WITH]->(t:Tag)
WITH project(p) AS subgraph
CALL node_similarity.jaccard(subgraph) YIELD node1, node2, similarity
WITH node1, node2, similarity
WHERE similarity > 0
MERGE (node1)-[:IS_SIMILAR {score: similarity}]->(node2)
RETURN node1, node2, similarity;
```

I also tried it with just 2 nodes, using the query below. That works and is fast (2 ms). But if I remove the `similarity > 0` check, I see that it compares everything, not just the 2 Broadcast nodes: it also calculates the similarity for the tags and genres.

```
MATCH (b:Broadcast)
WHERE b.id IN [36195333 , 36195268]
WITH b
MATCH p=(b)-[r:IS_TAGGED_WITH]->(t:Tag)
WITH project(p) AS subgraph
CALL node_similarity.jaccard(subgraph) YIELD node1, node2, similarity
WITH node1, node2, similarity
WHERE similarity > 0
MERGE (node1)-[:IS_SIMILAR {score: similarity}]->(node2)
RETURN node1, node2, similarity;
```
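For reference, restricting the output to Broadcast pairs could be expressed as a label filter on the procedure results. This is only a sketch: it assumes `node1` and `node2` come back as nodes whose labels can be tested in `WHERE`, and because the filter runs after `node_similarity.jaccard` has finished, it would not by itself reduce the memory the procedure needs.

```
MATCH (b:Broadcast)
WHERE b.id IN [36195333, 36195268]
WITH b
MATCH p=(b)-[r:IS_TAGGED_WITH]->(t:Tag)
WITH project(p) AS subgraph
CALL node_similarity.jaccard(subgraph) YIELD node1, node2, similarity
WITH node1, node2, similarity
// Assumption: keep only pairs where both sides are Broadcast nodes,
// so Tag-to-Tag and Broadcast-to-Tag pairs are dropped before the MERGE.
WHERE node1:Broadcast AND node2:Broadcast AND similarity > 0
MERGE (node1)-[:IS_SIMILAR {score: similarity}]->(node2)
RETURN node1, node2, similarity;
```

This only changes which pairs get an `IS_SIMILAR` edge. Whether the procedure itself can be limited to Broadcast nodes is the open question.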

Here is a profile export

[profile_summary.json](https://github.com/user-attachments/files/22538500/profile_summary.json)


Is there a problem in my Cypher, do I need more RAM, or am I missing something else?




Node Similarity OOM #675
