This project builds knowledge networks by integrating multimodal knowledge from the FishBase dataset. The knowledge is associated with the world map.
pip install -r requirements.txt
python create_knowlege_graph.py
This project is migrating from a property graph (Neo4j/Cypher) to ScyllaDB using CQL (Cassandra-compatible).
The design preserves graph-like semantics using wide-column tables.
- Remove `neo4j` driver from `requirements.txt`
- Add `cassandra-driver` (DataStax Python driver)
- Add environment variables:
  - `SCYLLA_CONTACT_POINTS` (comma-separated list)
  - `SCYLLA_KEYSPACE`
  - `SCYLLA_CONSISTENCY` (e.g., `LOCAL_QUORUM`)
- Add a `.env.example` file with these keys
- Create keyspace and tables (see CQL schema below)
- Optionally add `edges_by_dst` or materialized views for reverse lookups
- Export Neo4j nodes and edges to CSV
- Transform CSVs to use UUIDs and `map<text,text>` for properties
- Bulk import using `cqlsh COPY` or a Python loader
- Replace Cypher with CQL:
  - `MERGE`/`CREATE` → `INSERT` (`IF NOT EXISTS` when needed)
  - `MATCH` (one hop) → `SELECT ... FROM edges WHERE src=?`
  - Multi-hop → iterative traversal in application logic
- Batch writes in `create_knowlege_graph.py` using prepared statements
- Prefer partition-key queries over secondary indexes
- Add lookup tables (e.g., `species_by_name`) for common filters
- Write unit tests for CRUD operations on nodes and edges
- Validate counts and sample paths against Neo4j output
- Add Docker Compose for local Scylla
- Document compaction, replication, and backup strategy
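
The `SCYLLA_*` environment variables listed above could be read with a small helper like the one below. This is only a sketch: the `ScyllaSettings` class, the `load_scylla_settings` function, and the fallback values (`127.0.0.1`, `LOCAL_QUORUM`) are assumptions for illustration, not part of the project.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class ScyllaSettings:
    """Connection settings parsed from the SCYLLA_* variables (hypothetical helper)."""
    contact_points: List[str]
    keyspace: str
    consistency: str


def load_scylla_settings(env: Optional[Mapping[str, str]] = None) -> ScyllaSettings:
    """Read settings from a mapping; defaults to the process environment."""
    env = os.environ if env is None else env

    # SCYLLA_CONTACT_POINTS is a comma-separated list; drop blanks and whitespace.
    raw_points = env.get("SCYLLA_CONTACT_POINTS", "127.0.0.1")  # assumed default
    contact_points = [p.strip() for p in raw_points.split(",") if p.strip()]
    if not contact_points:
        raise ValueError("SCYLLA_CONTACT_POINTS must list at least one host")

    keyspace = env.get("SCYLLA_KEYSPACE")
    if not keyspace:
        raise ValueError("SCYLLA_KEYSPACE is required")

    # Normalise the consistency name, e.g. "local_quorum" becomes "LOCAL_QUORUM".
    consistency = env.get("SCYLLA_CONSISTENCY", "LOCAL_QUORUM").upper()  # assumed default
    return ScyllaSettings(contact_points, keyspace, consistency)
```

With `cassandra-driver`, the result would typically be passed on as `Cluster(settings.contact_points)` and `cluster.connect(settings.keyspace)`, and the consistency name resolved with `getattr(ConsistencyLevel, settings.consistency)`.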

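The CSV transformation step (UUIDs plus a `map<text,text>` of properties) might look roughly like this. The column names `id` and `label`, the output keys `node_id`, `label`, and `props`, and the namespace string are all assumptions; the actual Neo4j export may be laid out differently. Deterministic `uuid5` values are used here so that edge rows can later derive the same UUID from the same Neo4j id.

```python
import csv
import io
import uuid

# Hypothetical namespace: the same Neo4j id always maps to the same UUID.
NODE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "fishbase-knowledge-graph/nodes")


def neo4j_id_to_uuid(neo4j_id) -> uuid.UUID:
    """Derive a stable UUID from a Neo4j node id."""
    return uuid.uuid5(NODE_NAMESPACE, str(neo4j_id))


def transform_node_rows(csv_text, id_column="id", label_column="label"):
    """Turn exported node rows into dicts with a UUID key and a text-to-text property map."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        # Every remaining non-empty column becomes an entry in the property map.
        props = {
            key: value
            for key, value in record.items()
            if key is not None
            and key not in (id_column, label_column)
            and value not in (None, "")
        }
        rows.append({
            "node_id": neo4j_id_to_uuid(record[id_column]),
            "label": record[label_column],
            "props": props,
        })
    return rows
```

A Python loader can bind each `props` dict directly to a `map<text,text>` column, since `cassandra-driver` maps Python dicts to CQL maps.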

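Multi-hop queries that Cypher expressed as a single pattern become an iterative traversal in application code. The sketch below is one possible shape for it: a breadth-first expansion that calls a `neighbors` function once per visited node. The function name `traverse` and the `dst` column mentioned in the comments are assumptions. In the real loader, `neighbors` would run a prepared one-hop query against the `edges` table; here it is a plain callable so the logic can be tried without a cluster.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable


def traverse(
    start: Hashable,
    neighbors: Callable[[Hashable], Iterable[Hashable]],
    max_hops: int,
) -> Dict[Hashable, int]:
    """Return {node: hop_count} for every node reachable within max_hops of start."""
    hops = {start: 0}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        # One partition-key lookup per node, e.g. SELECT dst FROM edges WHERE src=?
        # (the dst column name is an assumption).
        for dst in neighbors(node):
            if dst not in hops:  # skip nodes already reached by a shorter path
                hops[dst] = hops[node] + 1
                frontier.append(dst)
    return hops


# In-memory stand-in for the edges table, for local experimentation only.
sample_edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a", "e"]}
two_hop = traverse("a", lambda n: sample_edges.get(n, []), max_hops=2)
```

Tracking visited nodes in `hops` keeps cycles from causing repeated lookups, which matters because each expansion costs one round trip to the database.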