Commit 5944106

[Term Entry] Sankey (#6268)
* feat: add initial empty sankey entry
* docs: add metadata of sankey entry
* docs: add syntax definition
* docs: add syntax example
* docs: correct syntax example
* docs: add output image of syntax example
* docs: correct style
* fix: prettier code styling
* fix: PR issues
* minor fixes
1 parent 7ccc690 commit 5944106

2 files changed: +93 -0 lines changed

@@ -0,0 +1,93 @@
---
Title: '.Sankey()'
Description: 'Creates Sankey diagrams, which visualize flow between categories using nodes and links.'
Subjects:
  - 'Data Science'
  - 'Data Visualization'
Tags:
  - 'Charts'
  - 'Graphics'
  - 'Plotly'
  - 'Python'
CatalogContent:
  - 'learn-python-3'
  - 'paths/data-science'
---

**`.Sankey()`** is a trace class in Plotly's [`graph_objects`](https://www.codecademy.com/resources/docs/plotly/graph-objects) module that creates Sankey diagrams, which visualize flow between categories. The connected elements are called nodes, and the connections between them are called links. The width of each link is proportional to the quantity of flow it carries.

## Syntax

```pseudo
plotly.graph_objects.Sankey(node=None, link=None, arrangement='snap', orientation='h', valueformat=None, ...)
```

- `node`: A dictionary that defines the properties of the nodes in the Sankey diagram. It has the following keys:
  - `label`: Names of the nodes, where the order corresponds to their indices used in `link`.
  - `color`: A string or an array of strings specifying the color of each node. If a single string is provided, all nodes will have the same color. If an array is provided, each element defines the color of the node with the same index.
- `link`: A dictionary that defines the links (connections) between nodes and their flow values. It contains the following keys:
  - `source`: An array of numerical indices. Each index specifies the source node of a link. The numerical index refers to the position of the node's name in the `label` array.
  - `target`: An array of numerical indices. Each index specifies the target node of a link. The numerical index refers to the position of the node's name in the `label` array.
  - `value`: An array of numerical values. Each value represents the flow quantity associated with a specific link. The order of values corresponds to the order of the source-target pairs.

> **Note:** The `source`, `target`, and `value` arrays must have the same length.

- `arrangement`: Sets the arrangement of the nodes in the Sankey diagram. The possible values are: `snap`, `perpendicular`, `freeform`, and `fixed`. The default value is `snap`.
- `orientation`: Determines whether the Sankey diagram is displayed horizontally or vertically. The possible values are: `h` for horizontal, and `v` for vertical. The default value is `h`.
- `valueformat`: Sets the numerical format for flow values, using d3-format syntax.

> **Note:** There are many additional, optional parameters that are not listed here, as indicated by the ellipsis (`...`) in the syntax.

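The relationship between `label` and the index arrays in `link` can be sketched without Plotly itself. The helper below, `build_sankey_trace()`, is a hypothetical name introduced only for illustration and is not part of Plotly. It takes `(source, target, value)` triples and returns a plain dictionary that uses the parameter names described above. Because every triple contributes exactly one entry to each array, `source`, `target`, and `value` end up with the same length by construction.

```python
def build_sankey_trace(flows, orientation="h", valueformat=None):
    """Build a Sankey trace as a plain dict from (source, target, value) triples.

    Hypothetical helper for illustration; not part of Plotly.
    """
    labels = []    # node names in first-seen order; position = node index
    index_of = {}  # node name -> index in labels
    source, target, value = [], [], []

    for src, dst, amount in flows:
        # Register each node name the first time it appears.
        for name in (src, dst):
            if name not in index_of:
                index_of[name] = len(labels)
                labels.append(name)
        source.append(index_of[src])
        target.append(index_of[dst])
        value.append(amount)

    trace = {
        "type": "sankey",
        "node": {"label": labels},
        "link": {"source": source, "target": target, "value": value},
        "orientation": orientation,
    }
    if valueformat is not None:
        trace["valueformat"] = valueformat  # d3-format string, e.g. ".0f"
    return trace


flows = [("A", "B", 8), ("A", "C", 2), ("B", "C", 5)]
trace = build_sankey_trace(flows, valueformat=".0f")
print(trace["node"]["label"])   # ['A', 'B', 'C']
print(trace["link"]["source"])  # [0, 0, 1]
print(trace["link"]["target"])  # [1, 2, 2]
```

Assuming Plotly is installed, a dictionary of this shape can be handed to a figure with `go.Figure(data=[trace])`, since Plotly accepts traces as dictionaries with a `type` key.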
## Example

This code displays a Sankey diagram that illustrates advertising cash flow through its nodes and links:

```py
import plotly.graph_objects as go

# Define the data for the Sankey diagram (Advertising Cash Flow).
data = {
  'source': ['Ad Campaign', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals'],
  'target': ['Clicks', 'Clicks', 'Clicks', 'Clicks', 'Leads', 'Leads', 'Leads', 'Conversions', 'Conversions', 'Conversions'],
  'value': [500, 300, 200, 100, 150, 80, 40, 60, 30, 10]
}

# Create a list of unique nodes, keeping the order in which they first appear.
all_nodes = list(dict.fromkeys(data['source'] + data['target']))

# Create a dictionary that links the name of the node to its index.
node_to_index = {node: i for i, node in enumerate(all_nodes)}

# Convert source and target names to indices.
source_indices = [node_to_index[source] for source in data['source']]
target_indices = [node_to_index[target] for target in data['target']]

# Create the Sankey diagram.
fig = go.Figure(data=[go.Sankey(
  node=dict(
    label=all_nodes,
    pad=20,  # Add padding between nodes.
    thickness=10,  # Define the thickness of the nodes.
    line=dict(color='black', width=0.5)  # Add a border to the nodes.
  ),
  link=dict(
    source=source_indices,
    target=target_indices,
    value=data['value'],
    color=['lightblue', 'lightgreen', 'lightcoral', 'indigo', 'turquoise', 'mediumvioletred', 'darkorange', 'yellowgreen', 'dodgerblue', 'lightblue'],  # Define the color of the links.
    line=dict(color='black', width=0.2)  # Define the border of the links.
  ),
  arrangement='snap',  # Set the arrangement of the nodes.
  orientation='h'  # Set the orientation of the diagram.
)])

# Update layout to add a title.
fig.update_layout(title_text='Advertising Cash Flow', font_size=10)

# Display the plot.
fig.show()
```

This example results in the following output:

![The output will be a Sankey diagram.](https://raw.githubusercontent.com/Codecademy/docs/main/media/sankey-cash-flow.png)
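
Since each link's width reflects its entry in `value`, one way to sanity-check a diagram like this is to total the flow leaving and entering each node by name. The sketch below reuses the example's `data` dictionary and needs only the standard library. The `outflow` and `inflow` names are introduced here for illustration.

```python
from collections import defaultdict

# Same data as in the example above.
data = {
    'source': ['Ad Campaign', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals'],
    'target': ['Clicks', 'Clicks', 'Clicks', 'Clicks', 'Leads', 'Leads', 'Leads', 'Conversions', 'Conversions', 'Conversions'],
    'value': [500, 300, 200, 100, 150, 80, 40, 60, 30, 10]
}

outflow = defaultdict(int)  # total value leaving each source node
inflow = defaultdict(int)   # total value arriving at each target node

# Each position across the three lists describes one link.
for src, dst, amount in zip(data['source'], data['target'], data['value']):
    outflow[src] += amount
    inflow[dst] += amount

print(dict(outflow))
# {'Ad Campaign': 500, 'Social Media': 510, 'Search Engines': 310, 'Referrals': 150}
print(dict(inflow))
# {'Clicks': 1100, 'Leads': 270, 'Conversions': 100}
```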

media/sankey-cash-flow.png (56.5 KB)