@epinzur
Contributor

@epinzur epinzur commented Sep 23, 2024

This makes CassandraGraphStore compatible with existing cassio.

Note:

  • only traversal_search() has been updated to use the new link storage.

  # adjacent nodes.
  #
- # TODO: For a big performance win, we should track which tags we've
+ # TODO: For a big performance win, we should track which links we've
Contributor

It seems like this may be a stale comment. Specifically, the difference_update on line 487 seems to be doing this.
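For reference, a minimal sketch of how a `difference_update` call can track already-searched links. The function and variable names are hypothetical, not the ones used in the PR:

```python
from typing import Set


def unvisited_links(visited: Set[str], candidate_links: Set[str]) -> Set[str]:
    """Return the candidate links not searched yet and record them as visited.

    Hypothetical helper for illustration; not the PR's code.
    """
    new_links = set(candidate_links)
    # Drop every link that an earlier traversal step already searched.
    new_links.difference_update(visited)
    # Remember the remaining links so later steps skip them too.
    visited.update(new_links)
    return new_links


visited: Set[str] = set()
first = unvisited_links(visited, {"a", "b"})
second = unvisited_links(visited, {"b", "c"})
```

If line 487 does something along these lines, each link is searched at most once per traversal, which is what the TODO asks for.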

else:
    # don't add link search to original metadata dict
    metadata = metadata.copy()
metadata[_metadata_s_link_key(link=outgoing_link)] = _metadata_s_link_value()
Contributor

How does this work? It looks like it is doing equality. But, there may be multiple outgoing links, and we need to find the nodes with one of those as an incoming link. It seems like this is maybe going the wrong direction, and also likely missing the set-equality. Are we sure this works the same?

Contributor Author

@epinzur epinzur Sep 24, 2024

When we call add_nodes(), we store all the incoming_links for each chunk as dictionary keys in the metadata_s column. Each key gets an arbitrary, static value so that the entry can be stored in the MAP<text,text> type.

Here, when we search, we are finding all the chunks that have a matching outgoing-link key.

We are essentially doing a hybrid query on metadata keys... the values do not matter.

The code does a separate query for each outgoing link. This is the same way the code in main operates.
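Roughly, the idea looks like the sketch below. It uses an in-memory dict in place of the Cassandra table, and the `link:` prefix, the static value, and the helper names are assumptions for illustration only; the real ones come from `_metadata_s_link_key()` and `_metadata_s_link_value()`.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed placeholders; the PR's actual key format and value may differ.
LINK_PREFIX = "link:"
LINK_VALUE = "link"

Link = Tuple[str, str]  # (kind, tag), a simplified stand-in for a link object


def link_key(link: Link) -> str:
    """Encode a link as a metadata key."""
    kind, tag = link
    return f"{LINK_PREFIX}{kind}:{tag}"


def store_node(
    table: Dict[str, Dict[str, str]],
    node_id: str,
    metadata: Dict[str, str],
    incoming_links: Iterable[Link],
) -> None:
    """Store user metadata plus one static-valued entry per incoming link."""
    row = dict(metadata)  # copy so the caller's dict is not modified
    for link in incoming_links:
        row[link_key(link)] = LINK_VALUE
    table[node_id] = row


def find_targets(table: Dict[str, Dict[str, str]], outgoing_link: Link) -> List[str]:
    """Return ids of nodes whose metadata holds the entry for this link."""
    key = link_key(outgoing_link)
    # The value is constant, so matching the entry is really a key check.
    return sorted(n for n, row in table.items() if row.get(key) == LINK_VALUE)


table: Dict[str, Dict[str, str]] = {}
store_node(table, "n1", {"source": "a.html"}, [("href", "https://x")])
store_node(table, "n2", {"source": "b.html"}, [("href", "https://y"), ("kw", "cassandra")])
store_node(table, "n3", {}, [("kw", "cassandra")])

# One query per outgoing link; the union of the results is the adjacent set.
targets = set()
for link in [("kw", "cassandra"), ("href", "https://x")]:
    targets.update(find_targets(table, link))
```

The equality in the query is on a single map entry, not on the whole set of links, so a node matches if any one of its incoming links equals the outgoing link being searched.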

Contributor

Ah. That's the part I missed. Could you add a comment to that effect where we store / query? It also seems like there is then a risk that a generated key collides with a user-defined key. Is there a prefix or something we should tell people not to use in their metadata?
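One possible guard, sketched under the assumption that generated link keys share a fixed prefix. The `link:` prefix and the function name are hypothetical, not taken from the PR:

```python
from typing import Dict

# Assumed reserved prefix for generated link keys (hypothetical value).
RESERVED_PREFIX = "link:"


def check_user_metadata(metadata: Dict[str, str]) -> Dict[str, str]:
    """Reject user metadata whose keys would collide with generated link keys."""
    clashes = sorted(k for k in metadata if k.startswith(RESERVED_PREFIX))
    if clashes:
        raise ValueError(
            f"metadata keys must not start with {RESERVED_PREFIX!r}: {clashes}"
        )
    return metadata


ok = check_user_metadata({"source": "a.html"})

try:
    check_user_metadata({"link:kw:cassandra": "oops"})
    collided = False
except ValueError:
    collided = True
```

Calling something like this at the start of add_nodes() would turn a silent collision into an explicit error; documenting the reserved prefix would be the lighter-weight alternative.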

2 participants