[Term Entry] Sankey #6268

Merged
merged 15 commits into from
Mar 12, 2025
117 changes: 117 additions & 0 deletions content/plotly/concepts/graph-objects/terms/sankey/sankey.md
@@ -0,0 +1,117 @@
---
Title: 'Sankey'
Description: 'Creates a Sankey diagram in Plotly using the graph_objects module.'
Subjects:
- 'Data Science'
- 'Data Visualization'
Tags:
- 'Graphics'
- 'Charts'
- 'Plotly'
- 'Python'
CatalogContent:
- 'learn-python-3'
- 'paths/data-science'
- 'paths/data-science-foundations'
---

**`.Sankey()`** is a class in Plotly's [`graph_objects`](https://www.codecademy.com/resources/docs/plotly/graph-objects) module that creates Sankey diagrams, which are visualizations that illustrate the flow between different values. The connected elements are referred to as nodes, and the connections between nodes are called links. The width of each link represents the quantity of flow.

## Syntax

```pseudo
plotly.graph_objects.Sankey(node=None, link=None, arrangement='snap', orientation='h', valueformat=None, ...)
```

- `node`: A dictionary that defines the properties of the nodes in the Sankey diagram. It has the following keys:

  - `label`: An array of strings, each representing the name of a node. The order of the labels in this array corresponds to the node indices used in the `link` parameter.
  - `color`: A string or an array of strings specifying the color of each node. If a single string is provided, all nodes will have the same color. If an array is provided, each element defines the color of the node with the same index.

- `link`: A dictionary that defines the links (connections) between nodes and their flow values. It contains the following keys:

  - `source`: An array of numerical indices. Each index specifies the source node of a link. The numerical index refers to the position of the node's name in the `label` array.
  - `target`: An array of numerical indices. Each index specifies the target node of a link. The numerical index refers to the position of the node's name in the `label` array.
  - `value`: An array of numerical values. Each value represents the flow quantity associated with a specific link. The order of values corresponds to the order of the source-target pairs.

> **Note:** The `source`, `target`, and `value` arrays must have the same length.

- `arrangement`: Sets the arrangement of the nodes in the Sankey diagram. The
possible values are: `snap`, `perpendicular`, `freeform`, and
`fixed`. The default value is `snap`.

- `orientation`: Determines whether the Sankey diagram is displayed horizontally or vertically. The
possible values are: `h` for horizontal, and `v` for vertical. The default value is `h`.

- `valueformat`: Sets the format of the numerical values displayed on the
links, using d3-format's syntax.

> **Note:** There are many additional, optional parameters that are not listed here, as indicated by the ellipsis (`...`) in the syntax.
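
The relationship between the `label` array and the `source`, `target`, and `value` arrays can be checked before a figure is built. The following is a minimal sketch in plain Python. The helper name `check_sankey_inputs()` and the sample labels are hypothetical and are not part of Plotly:

```py
def check_sankey_inputs(node, link):
    """Validate link arrays against node labels and return readable rows."""
    n_links = len(link['source'])

    # The three link arrays must have the same length.
    if not (len(link['target']) == len(link['value']) == n_links):
        raise ValueError('source, target, and value must have the same length')

    # Every index must point at an existing entry in the label array.
    n_nodes = len(node['label'])
    for index in list(link['source']) + list(link['target']):
        if not 0 <= index < n_nodes:
            raise ValueError(f'node index {index} is outside the label array')

    # Translate each index pair back into node names.
    return [
        (node['label'][s], node['label'][t], v)
        for s, t, v in zip(link['source'], link['target'], link['value'])
    ]

# Hypothetical data: one source node feeding two target nodes.
node = dict(label=['Budget', 'Online Ads', 'Print Ads'])
link = dict(source=[0, 0], target=[1, 2], value=[70, 30])

rows = check_sankey_inputs(node, link)
print(rows)
```

If the check passes, the same `node` and `link` dictionaries can be passed to `go.Sankey(node=node, link=link)`.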

## Example

This code displays a Sankey diagram that illustrates an advertising cash flow through its nodes and links.

```py
import plotly.graph_objects as go

# Define the data for the Sankey diagram (Advertising Cash Flow).
data = {
    'source': ['Ad Campaign', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals', 'Social Media', 'Search Engines', 'Referrals'],
    'target': ['Clicks', 'Clicks', 'Clicks', 'Clicks', 'Leads', 'Leads', 'Leads', 'Conversions', 'Conversions', 'Conversions'],
    'value': [500, 300, 200, 100, 150, 80, 40, 60, 30, 10]
}

# Create a list of unique nodes, keeping the order in which they first appear.
all_nodes = list(dict.fromkeys(data['source'] + data['target']))

# Create a dictionary that links the name of the node to its index.
node_to_index = {node: i for i, node in enumerate(all_nodes)}

# Convert source and target names to indices.
source_indices = [node_to_index[source] for source in data['source']]
target_indices = [node_to_index[target] for target in data['target']]

# Create the Sankey diagram.
fig = go.Figure(data=[go.Sankey(
    node=dict(
        label=all_nodes,
        pad=20,  # Add padding between nodes.
        thickness=10,  # Define the thickness of the nodes.
        line=dict(color="black", width=0.5)  # Add a border to the nodes.
    ),
    link=dict(
        source=source_indices,
        target=target_indices,
        value=data['value'],
        color=['lightblue', 'lightgreen', 'lightcoral', 'indigo', 'turquoise', 'mediumvioletred', 'darkorange', 'yellowgreen', 'dodgerblue', 'lightblue'],  # Define the color of the links.
        line=dict(color='black', width=0.2)  # Define the border of the links.
    ),
    arrangement='snap',  # Set the arrangement of the nodes.
    orientation='h'  # Set the orientation of the diagram.
)])

# Update layout to add a title.
fig.update_layout(title_text="Advertising Cash Flow", font_size=10)

# Display the plot.
fig.show()
```

This example results in the following output:

![The output will be a Sankey diagram.](https://raw.githubusercontent.com/Codecademy/docs/main/media/sankey-cash-flow.png)