diff --git a/changelog.md b/changelog.md index d00de45..982d84a 100644 --- a/changelog.md +++ b/changelog.md @@ -4,12 +4,12 @@ - Do not automatically derive size and caption for `from_neo4j` and `from_gql_create`. Use the `size_property` and `node_caption` parameters to explicitly configure them. - Change API of integrations to only provide basic parameters. Any further configuration should happen ons the Visualization Graph object: - - `from_gds` - - Drop parameters size_property, node_radius_min_max. `Use VG.resize_nodes(property=...)` instead - - rename additional_node_properties to node_properties - - Don't derive fields from properties. Use `VG.map_properties_to_fields` instead - `from_pandas` - Drop `node_radius_min_max` parameter. `VG.resize_nodes(...)` instead + - `from_neo4j`, `from_gds`, `from_gql_create` + - Drop parameters `size_property`, `node_radius_min_max`. Use `VG.resize_nodes(property=...)` instead + - rename additional_node_properties to node_properties + - Don't derive fields from properties. Use `VG.map_properties_to_fields` instead ## New features @@ -25,7 +25,7 @@ - Validate fields of a node and relationship not only at construction but also on assignment. - Allow resizing per node property such as `VG.resize_nodes(property="score")`. -- Color nodes by label in `from_gds`. +- Color nodes by label in `from_gds` and `from_gql_create`. - Add `table` property to nodes and relationships created by `from_snowflake`. This is used as a default caption. ## Other changes diff --git a/docs/source/integration.rst b/docs/source/integration.rst index d719885..0a5284d 100644 --- a/docs/source/integration.rst +++ b/docs/source/integration.rst @@ -164,22 +164,9 @@ The ``from_neo4j`` method takes one mandatory positional parameter: A ``data`` argument representing either a query result in the shape of a ``neo4j.graph.Graph`` or ``neo4j.Result``, or a ``neo4j.Driver`` in which case a simple default query will be executed internally to retrieve the graph data. 
-We can also provide an optional ``size_property`` parameter, which should refer to a node property, -and will be used to determine the sizes of the nodes in the visualization. - -The ``node_caption`` and ``relationship_caption`` parameters are also optional, and indicate the node and relationship -properties to use for the captions of each element in the visualization. -By default, the captions will be set to the node labels relationship types, but you can specify any property that -exists on these entities. - -The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for -the visualization. -It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in -the visualization. -The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node -will have the size of the second value. -The other nodes will be scaled linearly between these two values according to their relative size. -This can be useful if node sizes vary a lot, or are all very small or very big. +The optional ``row_limit`` parameter can be used to limit the number of relationships shown in the visualization. +By default, it is set to 10.000, meaning that if the database has more than 10.000 rows, a warning will be raised. +Note, this only applies if the ``data`` parameter is a ``neo4j.Driver``. Example ~~~~~~~ @@ -222,20 +209,6 @@ The ``from_gql_create`` method takes one mandatory positional parameter: * A valid ``query`` representing a GQL ``CREATE`` query as a string. -We can also provide an optional ``size_property`` parameter, which should refer to a node property, -and will be used to determine the sizes of the nodes in the visualization. - -The ``node_caption`` and ``relationship_caption`` parameters are also optional, and indicate the node and relationship properties to use for the captions of each element in the visualization. 
- -The last optional property, ``node_radius_min_max``, can be used (and is used by default) to scale the node sizes for -the visualization. -It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in -the visualization. -The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node -will have the size of the second value. -The other nodes will be scaled linearly between these two values according to their relative size. -This can be useful if node sizes vary a lot, or are all very small or very big. - Example ~~~~~~~ @@ -283,39 +256,14 @@ The ``from_snowflake`` method takes two mandatory positional parameters: * A `project configuration `_ as a dictionary, that specifies how you want your tables to be projected as a graph. This configuration is the same as the project configuration of the `Neo4j Snowflake Graph Analytics application `_. -``from_snowflake`` also takes an optional property, ``node_radius_min_max``, that can be used (and is used by default) to -scale the node sizes for the visualization. -It is a tuple of two numbers, representing the radii (sizes) in pixels of the smallest and largest nodes respectively in -the visualization. -The node sizes will be scaled such that the smallest node will have the size of the first value, and the largest node -will have the size of the second value. -The other nodes will be scaled linearly between these two values according to their relative size. -This can be useful if node sizes vary a lot, or are all very small or very big. - - -Special columns -~~~~~~~~~~~~~~~ - -It is possible to modify the visualization directly by including columns of certain specific names in the node and relationship tables. - -All such special columns can be found :doc:`here <./api-reference/node>` for nodes and :doc:`here <./api-reference/relationship>` for relationships. 
-Though listed in ``snake_case`` here, ``SCREAMING_SNAKE_CASE`` and ``camelCase`` are also supported. -Some of the most commonly used special columns are: - -* **Node sizes**: The sizes of nodes can be controlled by including a column named "SIZE" in node tables. - The values in these columns should be of a numeric type. This can be useful for visualizing the relative importance or size of nodes in the graph, for example using a computed centrality score. - -* **Captions**: The caption text of nodes and relationships can be controlled by including a column named "CAPTION" in the tables. - The values in these columns should be of a string type. This can be useful for displaying additional information about the nodes, such as their names or labels. If no "CAPTION" column is provided, the default captions in the visualization will be the names of the corresponding node and relationship tables. - -Please also note that you can further customize the visualization after the `VisualizationGraph` has been created, by using the methods described in the :doc:`Customizing the visualization <./customizing>` section. +You can further customize the visualization after the `VisualizationGraph` has been created, by using the methods described in the :doc:`Customizing the visualization <./customizing>` section. Default behavior ~~~~~~~~~~~~~~~~ -Unless there are "CAPTION" columns in the tables, the node and relationship captions will be set to the names of the corresponding tables. -Similarly, if there are are no "COLOR" node table columns, the nodes will be colored be colored so that nodes from the same table have the same color, and different tables have different colors. +The node and relationship captions will be set to the names of the corresponding tables. +The nodes will be colored so that nodes from the same table have the same color, and different tables have different colors. 
Example diff --git a/python-wrapper/src/neo4j_viz/gds.py b/python-wrapper/src/neo4j_viz/gds.py index 10c1c8b..013f62f 100644 --- a/python-wrapper/src/neo4j_viz/gds.py +++ b/python-wrapper/src/neo4j_viz/gds.py @@ -167,7 +167,7 @@ def from_gds( VG = _from_dfs(node_df, rel_dfs, dropna=True) for node in VG.nodes: - node.caption = str(node.properties.get("labels")) + node.caption = ":".join([label for label in node.properties["labels"]]) for rel in VG.relationships: rel.caption = rel.properties.get("relationshipType") diff --git a/python-wrapper/src/neo4j_viz/gql_create.py b/python-wrapper/src/neo4j_viz/gql_create.py index e584d9a..6272194 100644 --- a/python-wrapper/src/neo4j_viz/gql_create.py +++ b/python-wrapper/src/neo4j_viz/gql_create.py @@ -5,6 +5,7 @@ from pydantic import BaseModel, ValidationError from neo4j_viz import Node, Relationship, VisualizationGraph +from neo4j_viz.colors import NEO4J_COLORS_DISCRETE, ColorSpace def _parse_value(value_str: str) -> Any: @@ -91,10 +92,7 @@ def _parse_value(value_str: str) -> Any: return value_str.strip("'\"") -def _parse_prop_str( - query: str, prop_str: str, prop_start: int, top_level_keys: set[str] -) -> tuple[dict[str, Any], dict[str, Any]]: - top_level: dict[str, Any] = {} +def _parse_prop_str(query: str, prop_str: str, prop_start: int) -> dict[str, Any]: props: dict[str, Any] = {} depth = 0 in_string = None @@ -115,10 +113,7 @@ def _parse_prop_str( k, v = pair.split(":", 1) k = k.strip().strip("'\"") - if k in top_level_keys: - top_level[k] = _parse_value(v) - else: - props[k] = _parse_value(v) + props[k] = _parse_value(v) start_idx = i + 1 else: @@ -133,17 +128,12 @@ def _parse_prop_str( k, v = pair.split(":", 1) k = k.strip().strip("'\"") - if k in top_level_keys: - top_level[k] = _parse_value(v) - else: - props[k] = _parse_value(v) + props[k] = _parse_value(v) - return top_level, props + return props -def _parse_labels_and_props( - query: str, s: str, top_level_keys: set[str] -) -> tuple[Optional[str], dict[str, 
Any], dict[str, Any]]: +def _parse_labels_and_props(query: str, s: str) -> tuple[Optional[str], dict[str, Any]]: prop_match = re.search(r"\{(.*)\}", s) prop_str = "" if prop_match: @@ -155,9 +145,8 @@ def _parse_labels_and_props( final_alias = raw_alias if raw_alias else None if prop_str: - top_level, props = _parse_prop_str(query, prop_str, prop_start, top_level_keys) + props = _parse_prop_str(query, prop_str, prop_start) else: - top_level = {} props = {} label_list = [lbl.strip() for lbl in alias_labels[1:]] @@ -165,7 +154,7 @@ def _parse_labels_and_props( props["__labels"] = props["labels"] props["labels"] = sorted(label_list) - return final_alias, top_level, props + return final_alias, props def _get_snippet(q: str, idx: int, context: int = 15) -> str: @@ -175,21 +164,20 @@ def _get_snippet(q: str, idx: int, context: int = 15) -> str: return q[start:end].replace("\n", " ") -def from_gql_create( - query: str, - size_property: Optional[str] = None, - node_caption: Optional[str] = "labels", - relationship_caption: Optional[str] = "type", - node_radius_min_max: Optional[tuple[float, float]] = (3, 60), -) -> VisualizationGraph: +def from_gql_create(query: str) -> VisualizationGraph: """ Parse a GQL CREATE query and return a VisualizationGraph object representing the graph it creates. All node and relationship properties will be included in the visualization graph. - If the properties are named as the fields of the `Node` or `Relationship` classes, they will be included as - top level fields of the respective objects. Otherwise, they will be included in the `properties` dictionary. + All properties of nodes and relationships will be included in the `properties` dictionary of the respective objects. Additionally, a "labels" property will be added for nodes and a "type" property for relationships. + By default: + + * the caption of a node will be based on its `labels`. + * the caption of a relationship will be based on its `type`. 
+ * the color of nodes will be set based on their label, unless there are more than 12 unique labels. + Please note that this function is not a full GQL parser, it only handles CREATE queries that do not contain other clauses like MATCH, WHERE, RETURN, etc, or any Cypher function calls. It also does not handle all possible GQL syntax, but it should work for most common cases. @@ -199,15 +187,6 @@ def from_gql_create( ---------- query : str The GQL CREATE query to parse - size_property : str, optional - Property to use for node size, by default None. - node_caption : str, optional - Property to use as the node caption, by default the node labels will be used. - relationship_caption : str, optional - Property to use as the relationship caption, by default the relationship type will be used. - node_radius_min_max : tuple[float, float], optional - Minimum and maximum node radius, by default (3, 60). - To avoid tiny or huge nodes in the visualization, the node sizes are scaled to fit in the given range. """ query = query.strip() @@ -251,19 +230,9 @@ def from_gql_create( node_pattern = re.compile(r"^\(([^)]*)\)$") rel_pattern = re.compile(r"^\(([^)]*)\)-\s*\[\s*:(\w+)\s*(\{[^}]*\})?\s*\]->\(([^)]*)\)$") - node_top_level_keys = Node.all_validation_aliases(exempted_fields=["id", "size", "caption"]) - rel_top_level_keys = Relationship.all_validation_aliases(exempted_fields=["id", "source", "target", "caption"]) - def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> None: for err in e.errors(): loc = err["loc"][0] - if (loc == "size") and size_property is not None: - loc = size_property - if loc == "caption": - if (entity_type == Node) and (node_caption is not None): - loc = node_caption - elif (entity_type == Relationship) and (relationship_caption is not None): - loc = relationship_caption raise ValueError( f"Error for {entity_type.__name__.lower()} property '{loc}' with provided input '{err['input']}'. 
Reason: {err['msg']}" ) @@ -277,14 +246,14 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> node_m = node_pattern.match(part) if node_m: alias_labels_props = node_m.group(1).strip() - alias, top_level, props = _parse_labels_and_props(query, alias_labels_props, node_top_level_keys) + alias, props = _parse_labels_and_props(query, alias_labels_props) if not alias: alias = f"_anon_{anonymous_count}" anonymous_count += 1 if alias not in alias_to_id: alias_to_id[alias] = str(uuid.uuid4()) try: - nodes.append(Node(id=alias_to_id[alias], **top_level, properties=props)) + nodes.append(Node(id=alias_to_id[alias], properties=props)) except ValidationError as e: _parse_validation_error(e, Node) @@ -296,14 +265,14 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> right_node = rel_m.group(4).strip() # Parse left node pattern - left_alias, left_top_level, left_props = _parse_labels_and_props(query, left_node, node_top_level_keys) + left_alias, left_props = _parse_labels_and_props(query, left_node) if not left_alias: left_alias = f"_anon_{anonymous_count}" anonymous_count += 1 if left_alias not in alias_to_id: alias_to_id[left_alias] = str(uuid.uuid4()) try: - nodes.append(Node(id=alias_to_id[left_alias], **left_top_level, properties=left_props)) + nodes.append(Node(id=alias_to_id[left_alias], properties=left_props)) except ValidationError as e: _parse_validation_error(e, Node) elif left_alias not in alias_to_id: @@ -311,14 +280,14 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> raise ValueError(f"Relationship references unknown node alias: '{left_alias}' near: `{snippet}`.") # Parse right node pattern - right_alias, right_top_level, right_props = _parse_labels_and_props(query, right_node, node_top_level_keys) + right_alias, right_props = _parse_labels_and_props(query, right_node) if not right_alias: right_alias = f"_anon_{anonymous_count}" anonymous_count += 1 if right_alias 
not in alias_to_id: alias_to_id[right_alias] = str(uuid.uuid4()) try: - nodes.append(Node(id=alias_to_id[right_alias], **right_top_level, properties=right_props)) + nodes.append(Node(id=alias_to_id[right_alias], properties=right_props)) except ValidationError as e: _parse_validation_error(e, Node) elif right_alias not in alias_to_id: @@ -331,9 +300,8 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> if rel_props_str: inner_str = rel_props_str.strip("{}").strip() prop_start = query.index(inner_str, query.index(inner_str)) - top_level, props = _parse_prop_str(query, inner_str, prop_start, rel_top_level_keys) + props = _parse_prop_str(query, inner_str, prop_start) else: - top_level = {} props = {} if "type" in props: props["__type"] = props["type"] @@ -345,7 +313,6 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> id=rel_id, source=alias_to_id[left_alias], target=alias_to_id[right_alias], - **top_level, properties=props, ) ) @@ -357,28 +324,15 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> snippet = part[:30] raise ValueError(f"Invalid element in CREATE near: `{snippet}`.") - if size_property is not None: - try: - for node in nodes: - node.size = node.properties.get(size_property) - except ValidationError as e: - _parse_validation_error(e, Node) - if node_caption is not None: - for node in nodes: - if node_caption == "labels": - if len(node.properties["labels"]) > 0: - node.caption = ":".join([label for label in node.properties["labels"]]) - else: - node.caption = str(node.properties.get(node_caption)) - if relationship_caption is not None: - for rel in relationships: - if relationship_caption == "type": - rel.caption = rel.properties["type"] - else: - rel.caption = str(rel.properties.get(relationship_caption)) - VG = VisualizationGraph(nodes=nodes, relationships=relationships) - if (node_radius_min_max is not None) and (size_property is not None): - 
VG.resize_nodes(node_radius_min_max=node_radius_min_max) + + for node in VG.nodes: + node.caption = ":".join([label for label in node.properties["labels"]]) + for rel in VG.relationships: + rel.caption = rel.properties.get("type") + + number_of_colors = len({str(n.properties.get("labels")) for n in VG.nodes}) + if number_of_colors <= len(NEO4J_COLORS_DISCRETE): + VG.color_nodes(property="labels", color_space=ColorSpace.DISCRETE) return VG diff --git a/python-wrapper/src/neo4j_viz/neo4j.py b/python-wrapper/src/neo4j_viz/neo4j.py index 5b202b8..3a301ca 100644 --- a/python-wrapper/src/neo4j_viz/neo4j.py +++ b/python-wrapper/src/neo4j_viz/neo4j.py @@ -7,6 +7,7 @@ from neo4j import Driver, Result, RoutingControl from pydantic import BaseModel, ValidationError +from neo4j_viz.colors import NEO4J_COLORS_DISCRETE, ColorSpace from neo4j_viz.node import Node from neo4j_viz.relationship import Relationship from neo4j_viz.visualization_graph import VisualizationGraph @@ -22,18 +23,18 @@ def _parse_validation_error(e: ValidationError, entity_type: type[BaseModel]) -> def from_neo4j( data: Union[neo4j.graph.Graph, Result, Driver], - size_property: Optional[str] = None, - node_caption: Optional[str] = "labels", - relationship_caption: Optional[str] = "type", - node_radius_min_max: Optional[tuple[float, float]] = (3, 60), row_limit: int = 10_000, ) -> VisualizationGraph: """ Create a VisualizationGraph from a Neo4j `Graph`, Neo4j `Result` or Neo4j `Driver`. - All node and relationship properties will be included in the visualization graph. - If the properties are named as the fields of the `Node` or `Relationship` classes, they will be included as - top level fields of the respective objects. Otherwise, they will be included in the `properties` dictionary. + By default: + + * the caption of a node will be based on its `labels`. + * the caption of a relationship will be based on its `type`. 
+ * the color of nodes will be set based on their label, unless there are more than 12 unique labels. + + All node and relationship properties will be included in the visualization graph under the `properties` field. Additionally, a "labels" property will be added for nodes and a "type" property for relationships. Parameters @@ -41,15 +42,6 @@ def from_neo4j( data : Union[neo4j.graph.Graph, neo4j.Result, neo4j.Driver] Either a query result in the shape of a `neo4j.graph.Graph` or `neo4j.Result`, or a `neo4j.Driver` in which case a simple default query will be executed internally to retrieve the graph data. - size_property : str, optional - Property to use for node size, by default None. - node_caption : str, optional - Property to use as the node caption, by default the node labels will be used. - relationship_caption : str, optional - Property to use as the relationship caption, by default the relationship type will be used. - node_radius_min_max : tuple[float, float], optional - Minimum and maximum node radius, by default (3, 60). - To avoid tiny or huge nodes in the visualization, the node sizes are scaled to fit in the given range. row_limit : int, optional Maximum number of rows to return from the query, by default 10_000. This is only used if a `neo4j.Driver` is passed as `result` argument, otherwise the limit is ignored. @@ -77,117 +69,62 @@ def from_neo4j( else: raise ValueError(f"Invalid input type `{type(data)}`. 
Expected `neo4j.Graph`, `neo4j.Result` or `neo4j.Driver`") - all_node_field_aliases = Node.all_validation_aliases(exempted_fields=["size", "caption"]) - all_rel_field_aliases = Relationship.all_validation_aliases(exempted_fields=["caption"]) - - try: - nodes = [ - _map_node(node, all_node_field_aliases, size_property, caption_property=node_caption) - for node in graph.nodes - ] - except ValueError as e: - err_msg = str(e) - if ("'size'" in err_msg) and (size_property is not None): - err_msg = err_msg.replace("'size'", f"'{size_property}'") - elif ("'caption'" in err_msg) and (node_caption is not None): - err_msg = err_msg.replace("'caption'", f"'{node_caption}'") - raise ValueError(err_msg) + nodes = [_map_node(node) for node in graph.nodes] relationships = [] - try: - for rel in graph.relationships: - mapped_rel = _map_relationship(rel, all_rel_field_aliases, caption_property=relationship_caption) - if mapped_rel: - relationships.append(mapped_rel) - except ValueError as e: - err_msg = str(e) - if ("'caption'" in err_msg) and (relationship_caption is not None): - err_msg = err_msg.replace("'caption'", f"'{relationship_caption}'") - raise ValueError(err_msg) + + for rel in graph.relationships: + mapped_rel = _map_relationship(rel) + if mapped_rel: + relationships.append(mapped_rel) VG = VisualizationGraph(nodes, relationships) - if (node_radius_min_max is not None) and (size_property is not None): - VG.resize_nodes(node_radius_min_max=node_radius_min_max) + for node in VG.nodes: + node.caption = ":".join(node.properties["labels"]) + for r in VG.relationships: + r.caption = r.properties["type"] + + number_of_colors = len({n.caption for n in VG.nodes}) + if number_of_colors <= len(NEO4J_COLORS_DISCRETE): + VG.color_nodes(field="caption", color_space=ColorSpace.DISCRETE, colors=NEO4J_COLORS_DISCRETE) return VG def _map_node( node: neo4j.graph.Node, - all_node_field_aliases: set[str], - size_property: Optional[str], - caption_property: Optional[str], ) -> Node: - 
top_level_fields = {"id": node.element_id} - - if size_property: - top_level_fields["size"] = node.get(size_property) - labels = sorted([label for label in node.labels]) - if caption_property: - if caption_property == "labels": - if len(labels) > 0: - top_level_fields["caption"] = ":".join([label for label in labels]) - else: - top_level_fields["caption"] = str(node.get(caption_property)) - - properties = {} - for prop, value in node.items(): - if prop not in all_node_field_aliases: - properties[prop] = value - continue - if prop in top_level_fields: - properties[prop] = value - continue - - top_level_fields[prop] = value + properties = {prop: value for prop, value in node.items()} if "labels" in properties: properties["__labels"] = properties["labels"] properties["labels"] = labels try: - viz_node = Node(**top_level_fields, properties=properties) + viz_node = Node(id=node.element_id, properties=properties) except ValidationError as e: _parse_validation_error(e, Node) return viz_node -def _map_relationship( - rel: neo4j.graph.Relationship, all_rel_field_aliases: set[str], caption_property: Optional[str] -) -> Optional[Relationship]: +def _map_relationship(rel: neo4j.graph.Relationship) -> Optional[Relationship]: if rel.start_node is None or rel.end_node is None: return None - top_level_fields = {"id": rel.element_id, "source": rel.start_node.element_id, "target": rel.end_node.element_id} - - if caption_property: - if caption_property == "type": - top_level_fields["caption"] = rel.type - else: - top_level_fields["caption"] = str(rel.get(caption_property)) - - properties = {} - for prop, value in rel.items(): - if prop not in all_rel_field_aliases: - properties[prop] = value - continue - - if prop in top_level_fields: - properties[prop] = value - continue - - top_level_fields[prop] = value + properties = {prop: value for prop, value in rel.items()} if "type" in properties: properties["__type"] = properties["type"] properties["type"] = rel.type try: - viz_rel = 
Relationship(**top_level_fields, properties=properties) + viz_rel = Relationship( + id=rel.element_id, source=rel.start_node.element_id, target=rel.end_node.element_id, properties=properties + ) except ValidationError as e: _parse_validation_error(e, Relationship) diff --git a/python-wrapper/src/neo4j_viz/node.py b/python-wrapper/src/neo4j_viz/node.py index db0ad5f..a8bdca3 100644 --- a/python-wrapper/src/neo4j_viz/node.py +++ b/python-wrapper/src/neo4j_viz/node.py @@ -90,15 +90,6 @@ def cast_color(cls, color: ColorType) -> Color: def to_dict(self) -> dict[str, Any]: return self.model_dump(exclude_none=True, by_alias=True) - @staticmethod - def all_validation_aliases(exempted_fields: Optional[list[str]] = None) -> set[str]: - if exempted_fields is None: - exempted_fields = [] - - by_field = [v.validation_alias.choices for k, v in Node.model_fields.items() if k not in exempted_fields] # type: ignore - - return {str(alias) for aliases in by_field for alias in aliases} - @staticmethod def basic_fields_validation_aliases() -> set[str]: mandatory_fields = ["id"] diff --git a/python-wrapper/src/neo4j_viz/relationship.py b/python-wrapper/src/neo4j_viz/relationship.py index 0498e09..b92ee8a 100644 --- a/python-wrapper/src/neo4j_viz/relationship.py +++ b/python-wrapper/src/neo4j_viz/relationship.py @@ -97,19 +97,6 @@ def cast_color(cls, color: ColorType) -> Color: def to_dict(self) -> dict[str, Any]: return self.model_dump(exclude_none=True, by_alias=True) - @staticmethod - def all_validation_aliases(exempted_fields: Optional[list[str]] = None) -> set[str]: - if exempted_fields is None: - exempted_fields = [] - - by_field = [ - v.validation_alias.choices # type: ignore - for k, v in Relationship.model_fields.items() - if k not in exempted_fields - ] - - return {str(alias) for aliases in by_field for alias in aliases} - @staticmethod def basic_fields_validation_aliases() -> set[str]: basic_fields = ["id", "source", "target"] diff --git 
a/python-wrapper/src/neo4j_viz/snowflake.py b/python-wrapper/src/neo4j_viz/snowflake.py index 4332601..75b43c2 100644 --- a/python-wrapper/src/neo4j_viz/snowflake.py +++ b/python-wrapper/src/neo4j_viz/snowflake.py @@ -317,9 +317,11 @@ def from_snowflake( Create a VisualizationGraph from Snowflake tables based on a project configuration. By default: + * The caption of the nodes will be set to the table name. * The caption of the relationships will be set to the table name. * The color of the nodes will be set based on the caption, unless there are more than 12 node tables used. + Otherwise, columns will be included as properties on the nodes and relationships. Args: diff --git a/python-wrapper/tests/test_gds.py b/python-wrapper/tests/test_gds.py index 7542f28..fb078aa 100644 --- a/python-wrapper/tests/test_gds.py +++ b/python-wrapper/tests/test_gds.py @@ -62,19 +62,19 @@ def test_from_gds_integration_all_properties(gds: Any) -> None: assert sorted(VG.nodes, key=lambda x: x.id) == [ Node( id=0, - caption="['A']", + caption="A", color="#ffdf81", properties=dict(size=0.1, labels=["A"], component=float(1), score=1337.0), ), Node( id=1, - caption="['C']", + caption="C", color="#f79767", properties=dict(size=0.2, labels=["C"], component=float(4), score=42.0), ), Node( id=2, - caption="['A', 'B']", + caption="A:B", color="#c990c0", properties=dict(size=0.3, labels=["A", "B"], component=float(2), score=3.14), ), @@ -167,10 +167,10 @@ def test_from_gds_hetero(gds: Any) -> None: assert len(VG.nodes) == 4 assert sorted(VG.nodes, key=lambda x: x.id) == [ - Node(id=0, caption="['A']", color="#ffdf81", properties=dict(labels=["A"], component=float(1))), - Node(id=1, caption="['A']", color="#ffdf81", properties=dict(labels=["A"], component=float(2))), - Node(id=2, caption="['B']", color="#c990c0", properties=dict(labels=["B"])), - Node(id=3, caption="['B']", color="#c990c0", properties=dict(labels=["B"])), + Node(id=0, caption="A", color="#ffdf81", properties=dict(labels=["A"], 
component=float(1))),
+        Node(id=1, caption="A", color="#ffdf81", properties=dict(labels=["A"], component=float(2))),
+        Node(id=2, caption="B", color="#c990c0", properties=dict(labels=["B"])),
+        Node(id=3, caption="B", color="#c990c0", properties=dict(labels=["B"])),
     ]
 
     assert len(VG.relationships) == 2
diff --git a/python-wrapper/tests/test_gql_create.py b/python-wrapper/tests/test_gql_create.py
index 6792319..75b91e8 100644
--- a/python-wrapper/tests/test_gql_create.py
+++ b/python-wrapper/tests/test_gql_create.py
@@ -25,55 +25,84 @@ def test_from_gql_create_syntax() -> None:
         ()-[:LINK]->({name: 'Florentin'});
     """
     expected_node_dicts: list[dict[str, dict[str, Any]]] = [
+        # node a
         {
-            "top_level": {},
+            "top_level": {"caption": "User", "color": "#ffdf81"},
             "properties": {"name": "Alice", "age": 23, "labels": ["User"], "__labels": ["Happy"], "id": 42},
         },
+        # node b
         {
-            "top_level": {},
+            "top_level": {"caption": "User:person", "color": "#c990c0"},
             "properties": {"name": "Bridget", "caption": "Bridget", "age": 34, "labels": ["User", "person"]},
         },
+        # node wizardMan
         {
-            "top_level": {},
+            "top_level": {"caption": "User", "color": "#ffdf81"},
             "properties": {"name": "Charles: The wizard, man", "hello": True, "height": None, "labels": ["User"]},
         },
-        {"top_level": {}, "properties": {"labels": ["User"]}},
+        # node d
+        {"top_level": {"caption": "User", "color": "#ffdf81"}, "properties": {"labels": ["User"]}},
+        # node e
         {
-            "top_level": {},
+            "top_level": {"caption": "User", "color": "#ffdf81"},
             "properties": {
                 "age": 67,
                 "my_map": {"key": "value", "key2": 3.14, "key3": [1, 2, 3], "key4": {"a": 1, "b": None}},
                 "labels": ["User"],
             },
         },
-        {"top_level": {}, "properties": {"age": 42, "pets": ["cat", False, "dog"], "labels": ["User"]}},
-        {"top_level": {}, "properties": {"labels": []}},
-        {"top_level": {}, "properties": {"name": "Fawad", "age": 78, "labels": ["Person", "User"]}},
-        {"top_level": {}, "properties": {"age": 29, "labels": []}},
-        {"top_level": {}, "properties": {"labels": []}},
-        {"top_level": {}, "properties": {"name": "Florentin", "labels": []}},
+        # node without alias
+        {
+            "top_level": {"caption": "User", "color": "#ffdf81"},
+            "properties": {"age": 42, "pets": ["cat", False, "dog"], "labels": ["User"]},
+        },
+        # empty node
+        {"top_level": {"caption": "", "color": "#f79767"}, "properties": {"labels": []}},
+        # node f
+        {
+            "top_level": {"caption": "Person:User", "color": "#56c7e4"},
+            "properties": {"name": "Fawad", "age": 78, "labels": ["Person", "User"]},
+        },
+        # node without alias 2
+        {"top_level": {"caption": "", "color": "#f79767"}, "properties": {"age": 29, "labels": []}},
+        # anonymous node at source of rel to florentin
+        {"top_level": {"caption": "", "color": "#f79767"}, "properties": {"labels": []}},
+        # anonymous node at target rel
+        {"top_level": {"caption": "", "color": "#f79767"}, "properties": {"name": "Florentin", "labels": []}},
     ]
 
-    VG = from_gql_create(query, node_caption=None, relationship_caption=None)
+    VG = from_gql_create(query)
 
     assert len(VG.nodes) == len(expected_node_dicts)
     for i, exp_node in enumerate(expected_node_dicts):
         created_node = VG.nodes[i]
 
-        assert created_node.model_dump(exclude_none=True, exclude={"properties", "id"}) == exp_node["top_level"]
-        assert created_node.properties == exp_node["properties"]
+        assert created_node.model_dump(exclude_none=True, exclude={"properties", "id"}) == exp_node["top_level"], (
+            f"Failed at node {created_node}"
+        )
+        assert created_node.properties == exp_node["properties"], f"Failed at node {created_node}"
 
     expected_relationships_dicts: list[dict[str, Any]] = [
-        {"source_idx": 0, "target_idx": 1, "top_level": {}, "properties": {"weight": 0.5, "type": "LINK"}},
-        {"source_idx": 0, "target_idx": 2, "top_level": {}, "properties": {"weight": 4, "type": "LINK"}},
-        {"source_idx": 4, "target_idx": 3, "top_level": {}, "properties": {"type": "LINK"}},
+        {
+            "source_idx": 0,
+            "target_idx": 1,
+            "top_level": {"caption": "LINK"},
+            "properties": {"weight": 0.5, "type": "LINK"},
+        },
+        {
+            "source_idx": 0,
+            "target_idx": 2,
+            "top_level": {"caption": "LINK"},
+            "properties": {"weight": 4, "type": "LINK"},
+        },
+        {"source_idx": 4, "target_idx": 3, "top_level": {"caption": "LINK"}, "properties": {"type": "LINK"}},
         {
             "source_idx": 4,
             "target_idx": 7,
-            "top_level": {},
+            "top_level": {"caption": "OTHER_LINK"},
             "properties": {"weight": -2, "caption": "Balloon", "type": "OTHER_LINK", "__type": 1, "source": 1337},
         },
-        {"source_idx": 9, "target_idx": 10, "top_level": {}, "properties": {"type": "LINK"}},
+        {"source_idx": 9, "target_idx": 10, "top_level": {"caption": "LINK"}, "properties": {"type": "LINK"}},
     ]
 
     assert len(VG.relationships) == len(expected_relationships_dicts)
@@ -84,8 +113,8 @@ def test_from_gql_create_syntax() -> None:
         assert (
             created_rel.model_dump(exclude_none=True, exclude={"properties", "id", "source", "target"})
             == exp_rel["top_level"]
-        )
-        assert created_rel.properties == exp_rel["properties"]
+        ), f"Failed at relationship {created_rel}"
+        assert created_rel.properties == exp_rel["properties"], f"Failed at relationship {created_rel}"
 
 
 def test_from_gql_create_captions() -> None:
@@ -97,11 +126,11 @@ def test_from_gql_create_captions() -> None:
     """
     expected_node_dicts: list[dict[str, dict[str, Any]]] = [
         {
-            "top_level": {"caption": "User"},
+            "top_level": {"caption": "User", "color": "#ffdf81"},
             "properties": {"name": "Alice", "age": 23, "labels": ["User"]},
         },
         {
-            "top_level": {"caption": "User:person"},
+            "top_level": {"caption": "User:person", "color": "#c990c0"},
             "properties": {"name": "Bridget", "caption": "Bridget", "age": 34, "labels": ["User", "person"]},
         },
     ]
@@ -136,33 +165,6 @@ def test_from_gql_create_captions() -> None:
         assert created_rel.properties == exp_rel["properties"]
 
 
-def test_from_gql_create_sizes() -> None:
-    query = """
-    CREATE
-        (a:User {name: 'Alice', age: 23}),
-        (b:User:person {name: "Bridget", age: 34, "caption": "Bridget"});
-    """
-    expected_node_dicts: list[dict[str, dict[str, Any]]] = [
-        {
-            "top_level": {"size": 3.0},
-            "properties": {"name": "Alice", "age": 23, "labels": ["User"]},
-        },
-        {
-            "top_level": {"size": 60.0},
-            "properties": {"name": "Bridget", "caption": "Bridget", "age": 34, "labels": ["User", "person"]},
-        },
-    ]
-
-    VG = from_gql_create(query, size_property="age", node_caption=None, relationship_caption=None)
-
-    assert len(VG.nodes) == len(expected_node_dicts)
-    for i, exp_node in enumerate(expected_node_dicts):
-        created_node = VG.nodes[i]
-
-        assert created_node.model_dump(exclude_none=True, exclude={"properties", "id"}) == exp_node["top_level"]
-        assert created_node.properties == exp_node["properties"]
-
-
 def test_unbalanced_parentheses_snippet() -> None:
     query = "CREATE (a:User, (b:User })"
     with pytest.raises(ValueError, match=r"Unbalanced parentheses near: `.*\(b:User.*"):
@@ -217,30 +219,3 @@ def test_no_create_keyword() -> None:
     query = "(a:User {y:4})"
     with pytest.raises(ValueError, match=r"Query must begin with 'CREATE' \(case insensitive\)."):
         from_gql_create(query)
-
-
-def test_illegal_node_x() -> None:
-    query = "CREATE (a:User {x:'tennis'})"
-    with pytest.raises(
-        ValueError,
-        match="Error for node property 'x' with provided input 'tennis'. Reason: Input should be a valid integer, unable to parse string as an integer",
-    ):
-        from_gql_create(query)
-
-
-def test_illegal_node_size() -> None:
-    query = "CREATE (a:User {hello: 'tennis'})"
-    with pytest.raises(
-        ValueError,
-        match="Error for node property 'hello' with provided input 'tennis'",
-    ):
-        from_gql_create(query, size_property="hello")
-
-
-def test_illegal_rel_caption_size() -> None:
-    query = "CREATE ()-[:LINK {caption_size: -42}]->()"
-    with pytest.raises(
-        ValueError,
-        match="Error for relationship property 'caption_size' with provided input '-42'. Reason: Input should be greater than 0",
-    ):
-        from_gql_create(query)
diff --git a/python-wrapper/tests/test_neo4j.py b/python-wrapper/tests/test_neo4j.py
index b4de935..43fe8bc 100644
--- a/python-wrapper/tests/test_neo4j.py
+++ b/python-wrapper/tests/test_neo4j.py
@@ -5,6 +5,7 @@
 import pytest
 from neo4j import Driver, Session
 
+from neo4j_viz.colors import NEO4J_COLORS_DISCRETE
 from neo4j_viz.neo4j import from_neo4j
 from neo4j_viz.node import Node
 
@@ -35,6 +36,7 @@ def test_from_neo4j_graph_basic(neo4j_session: Session) -> None:
         Node(
             id=node_ids[0],
             caption="_CI_A",
+            color=NEO4J_COLORS_DISCRETE[0],
             properties=dict(
                 labels=["_CI_A"],
                 name="Alice",
@@ -47,7 +49,7 @@ def test_from_neo4j_graph_basic(neo4j_session: Session) -> None:
         Node(
             id=node_ids[1],
             caption="_CI_A:_CI_B",
-            size=None,
+            color=NEO4J_COLORS_DISCRETE[1],
             properties=dict(
                 size=11,
                 labels=["_CI_A", "_CI_B"],
@@ -70,41 +72,6 @@ def test_from_neo4j_graph_basic(neo4j_session: Session) -> None:
     ]
 
 
-@pytest.mark.requires_neo4j_and_gds
-def test_from_neo4j_graph_size_property(neo4j_session: Session) -> None:
-    # set a non parsable size property, by default it should not be picked up
-    neo4j_session.run("MATCH (n) SET n.size = 'banana'")
-
-    graph = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a").graph()
-
-    VG = from_neo4j(graph, size_property="height", node_radius_min_max=None)
-
-    assert {n.properties["name"]: n.size for n in VG.nodes} == {"Alice": 20, "Bob": 10}
-
-    VG = from_neo4j(graph, size_property=None, node_radius_min_max=None)
-
-    assert {n.properties["name"]: n.size for n in VG.nodes} == {"Alice": None, "Bob": None}
-
-
-@pytest.mark.requires_neo4j_and_gds
-def test_from_neo4j_graph_default_caption(neo4j_session: Session) -> None:
-    neo4j_session.run("MATCH (n) SET n.caption = 'my_caption' SET n.other_caption = 'other_caption'")
-
-    graph = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a").graph()
-
-    VG = from_neo4j(graph, node_caption=None, node_radius_min_max=None)
-
-    assert [n.caption for n in VG.nodes] == [None, None]
-
-    VG = from_neo4j(graph, node_caption="other_caption", node_radius_min_max=None)
-
-    assert [n.caption for n in VG.nodes] == ["other_caption", "other_caption"]
-
-    VG = from_neo4j(graph, relationship_caption="year")
-
-    assert {e.properties["type"]: e.caption for e in VG.relationships} == {"KNOWS": "2025", "RELATED": "2015"}
-
-
 @pytest.mark.requires_neo4j_and_gds
 def test_from_neo4j_result(neo4j_session: Session) -> None:
     result = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a")
@@ -120,6 +87,7 @@ def test_from_neo4j_result(neo4j_session: Session) -> None:
         Node(
             id=node_ids[0],
             caption="_CI_A",
+            color=NEO4J_COLORS_DISCRETE[0],
             properties=dict(
                 labels=["_CI_A"],
                 name="Alice",
@@ -132,6 +100,7 @@ def test_from_neo4j_result(neo4j_session: Session) -> None:
         Node(
             id=node_ids[1],
             caption="_CI_A:_CI_B",
+            color=NEO4J_COLORS_DISCRETE[1],
             properties=dict(
                 size=11,
                 labels=["_CI_A", "_CI_B"],
@@ -154,95 +123,6 @@ def test_from_neo4j_result(neo4j_session: Session) -> None:
     ]
 
 
-@pytest.mark.requires_neo4j_and_gds
-def test_from_neo4j_graph_full(neo4j_session: Session) -> None:
-    graph = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a").graph()
-
-    VG = from_neo4j(graph, node_caption="name", relationship_caption="year", size_property="height")
-
-    sorted_nodes: list[neo4j.graph.Node] = sorted(graph.nodes, key=lambda x: dict(x.items())["name"])
-    node_ids: list[str] = [node.element_id for node in sorted_nodes]
-
-    expected_nodes = [
-        Node(
-            id=node_ids[0],
-            caption="Alice",
-            size=60.0,
-            properties=dict(
-                labels=["_CI_A"],
-                name="Alice",
-                height=20,
-                id=42,
-                _id=1337,
-                caption="hello",
-            ),
-        ),
-        Node(
-            id=node_ids[1],
-            caption="Bob",
-            size=3.0,
-            properties=dict(
-                labels=["_CI_A", "_CI_B"],
-                name="Bob",
-                size=11,
-                height=10,
-                id=84,
-                __labels=[1, 2],
-            ),
-        ),
-    ]
-
-    assert len(VG.nodes) == 2
-    assert sorted(VG.nodes, key=lambda x: x.properties["name"]) == expected_nodes
-
-    assert len(VG.relationships) == 2
-    vg_rels = sorted([(e.source, e.target, e.caption) for e in VG.relationships], key=lambda x: x[2] if x[2] else "foo")
-    assert vg_rels == [
-        (node_ids[1], node_ids[0], "2015"),
-        (node_ids[0], node_ids[1], "2025"),
-    ]
-
-
-@pytest.mark.requires_neo4j_and_gds
-def test_from_neo4j_node_error(neo4j_session: Session) -> None:
-    neo4j_session.run("MATCH (n:_CI_A|_CI_B) DETACH DELETE n")
-    neo4j_session.run(
-        "CREATE (a:_CI_A {name:'Alice', height:20, id:42, _id: 1337, caption: 'hello', caption_size: -5})"
-    )
-    graph = neo4j_session.run("MATCH (a:_CI_A) RETURN a").graph()
-
-    with pytest.raises(
-        ValueError,
-        match="Error for node property 'caption_size' with provided input '-5'. Reason: Input should be greater than or equal to 1",
-    ):
-        from_neo4j(graph)
-
-    neo4j_session.run("MATCH (n:_CI_A|_CI_B) DETACH DELETE n")
-    neo4j_session.run("CREATE (a:_CI_A {name:'Alice', height:20, id:42, _id: 1337, hello: -5})")
-    graph = neo4j_session.run("MATCH (a:_CI_A) RETURN a").graph()
-    with pytest.raises(
-        ValueError,
-        match="Error for node property 'hello' with provided input '-5'. Reason: Input should be greater than or equal to 0",
-    ):
-        from_neo4j(graph, size_property="hello")
-
-
-@pytest.mark.requires_neo4j_and_gds
-def test_from_neo4j_rel_error(neo4j_session: Session) -> None:
-    neo4j_session.run("MATCH (n:_CI_A|_CI_B) DETACH DELETE n")
-    neo4j_session.run(
-        "CREATE (a:_CI_A {name:'Alice', height:20, id:42, _id: 1337, caption: 'hello'})-[:KNOWS {year: 2025, id: 41, source: 1, target: 2, caption_align: 'banana'}]->"
-        "(b:_CI_A:_CI_B {name:'Bob', height:10, id: 84, size: 11, labels: [1,2]})"
-    )
-    graph = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a").graph()
-
-    with pytest.raises(
-        ValueError,
-        match="Error for relationship property 'caption_align' with provided input 'banana'. Reason: Input should be 'top', 'center' or 'bottom'",
-    ):
-        from_neo4j(graph)
-
-
 @pytest.mark.requires_neo4j_and_gds
 def test_from_neo4j_graph_driver(neo4j_session: Session, neo4j_driver: Driver) -> None:
     graph = neo4j_session.run("MATCH (a:_CI_A|_CI_B)-[r]->(b) RETURN a, b, r ORDER BY a").graph()
@@ -257,6 +137,7 @@ def test_from_neo4j_graph_driver(neo4j_session: Session, neo4j_driver: Driver) -
         Node(
             id=node_ids[0],
             caption="_CI_A",
+            color=NEO4J_COLORS_DISCRETE[0],
             properties=dict(
                 labels=["_CI_A"],
                 name="Alice",
@@ -269,6 +150,7 @@ def test_from_neo4j_graph_driver(neo4j_session: Session, neo4j_driver: Driver) -
         Node(
             id=node_ids[1],
             caption="_CI_A:_CI_B",
+            color=NEO4J_COLORS_DISCRETE[1],
             properties=dict(
                 labels=["_CI_A", "_CI_B"],
                 size=11,
diff --git a/python-wrapper/tests/test_node.py b/python-wrapper/tests/test_node.py
index f1cf5e3..a4079d4 100644
--- a/python-wrapper/tests/test_node.py
+++ b/python-wrapper/tests/test_node.py
@@ -96,12 +96,7 @@ def test_node_casing() -> None:
 
 
 def test_all_validation_aliases() -> None:
-    all_aliases = Node.all_validation_aliases()
-    assert "CAPTION_ALIGN" in all_aliases
-    assert "captionAlign" in all_aliases
-    assert "caption_align" in all_aliases
-
-    all_aliases = Node.all_validation_aliases(exempted_fields=["caption_align"])
-    assert "CAPTION_ALIGN" not in all_aliases
-    assert "captionAlign" not in all_aliases
-    assert "caption_align" not in all_aliases
+    all_aliases = Node.basic_fields_validation_aliases()
+    assert "id" in all_aliases
+    assert "ID" in all_aliases
+    assert "NODE_ID" in all_aliases
diff --git a/python-wrapper/tests/test_relationship.py b/python-wrapper/tests/test_relationship.py
index a4e10ab..1b4cced 100644
--- a/python-wrapper/tests/test_relationship.py
+++ b/python-wrapper/tests/test_relationship.py
@@ -104,12 +104,7 @@ def test_rel_casing() -> None:
 
 
 def test_all_validation_aliases() -> None:
-    all_aliases = Relationship.all_validation_aliases()
-    assert "CAPTION_ALIGN" in all_aliases
-    assert "captionAlign" in all_aliases
-    assert "caption_align" in all_aliases
-
-    all_aliases = Relationship.all_validation_aliases(exempted_fields=["caption_align"])
-    assert "CAPTION_ALIGN" not in all_aliases
-    assert "captionAlign" not in all_aliases
-    assert "caption_align" not in all_aliases
+    all_aliases = Relationship.basic_fields_validation_aliases()
+    assert "SOURCE_NODE_ID" in all_aliases
+    assert "targetNodeId" in all_aliases
+    assert "source_node_id" in all_aliases