@sueoglu (Collaborator) commented Nov 26, 2025

Fixes #232.
Adds two new visualization functions, plot_sankey and plot_sankey_time, for creating Sankey diagrams to analyze patient flows and state transitions in EHR data. Also adds comprehensive tests.

plot_sankey : relationships across observation columns
plot_sankey_time : patient state transitions over time

@sueoglu marked this pull request as ready for review December 12, 2025 08:52
@sueoglu requested review from Zethson and eroell December 12, 2025 10:11
@Zethson (Member) left a comment

Thank you very much!

  1. Don't forget to add the functions to our docs, please.
  2. Could you please add previews to the PR description?
  3. We should have a consistent plotting user interface where all plotting functions share the same parameters and style options, such as height and width. We should avoid kwargs where possible.
  4. Great tests! We might need to use try -> finally because you're changing the plotting backend. If the test fails and the plotting backend is not reset, it could cause other tests to fail.

edata: EHRData,
*,
columns: list[str],
show: bool = False,
Member

I am not using a show parameter for holoviews plots anymore because I think they're shown by default. Do we really still need this?

*,
columns: list[str],
show: bool = False,
**kwargs,
Member

Could you please look at the parameters of the survival analysis plots? (I'll make another PR very soon but one of them is already updated). We should have parameters like height, width etc. Consistency is very very important.
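
A rough sketch of what explicit, keyword-only parameters could look like in place of `**kwargs`. The helper name `sankey_opts`, the example column name, and the default values for `width` and `height` are assumptions made for illustration; `label_position`, `show_values` and the title template are taken from the defaults already in this PR. How these map onto backend-specific options is left open.

```python
from typing import Optional


def sankey_opts(
    *,
    columns: list[str],
    title: Optional[str] = None,
    width: int = 600,   # assumed default, not taken from the PR
    height: int = 400,  # assumed default, not taken from the PR
    label_position: str = "right",
    show_values: bool = True,
) -> dict:
    """Collect plot options from explicit parameters (hypothetical helper)."""
    if not columns:
        raise ValueError("columns must contain at least one column name")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if title is None:
        # Same title template as the current default_opts in this PR.
        title = f"Patient flows: {columns[0]} over time"
    return {
        "title": title,
        "width": width,
        "height": height,
        "label_position": label_position,
        "show_values": show_values,
    }


# "disease_stage" is a made-up column name used only for this example.
opts = sankey_opts(columns=["disease_stage"], width=800)
print(opts["title"], opts["width"], opts["height"])
```

With this shape a misspelled option fails immediately with a `TypeError` instead of being passed through silently, and every plotting function can expose the same `width` and `height` names.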

) -> hv.Sankey:
"""Create a Sankey diagram showing relationships across observation columns.
Please call :func:`holoviews.extension` with ``"matplotlib"`` or ``"bokeh"`` before using this function to select the backend.
Member

Suggested change
Please call :func:`holoviews.extension` with ``"matplotlib"`` or ``"bokeh"`` before using this function to select the backend.

I think we should set a default backend; otherwise these functions will error if none is set.

>>> import pandas as pd
>>> import ehrdata as ed
>>>
>>> layer = np.array(
Member

This example is too complex. Can we make this work with blobs? If not, can we adapt the blobs function so that you can use it here, please?


default_opts = {"label_position": "right", "show_values": True, "title": f"Patient flows: {columns[0]} over time"}

default_opts.update(kwargs)
Member

Worried about this - see above.



@pytest.fixture
def ehr_3d_mini():
Member

Ideally with blobs as well - see above.

Development

Successfully merging this pull request may close these issues.

Add time based sankey diagrams
