Add Command Line Interface Wrapper #1256

Closed
wants to merge 5 commits into from

@ahuang11 ahuang11 commented Jan 20, 2024

Addresses #1150

This command-line tool allows for quick and easy visualization of various
data file formats using HoloViews and hvPlot, including CSV, JSON, HTML, Excel,
Parquet, NetCDF, HDF5, and more.

The tool will automatically determine the appropriate reader for the file
extension and use it to load the data. Ensure the correct dependencies for
file format support are installed (e.g., pandas for CSV, xarray for NetCDF).

All of the keywords supported by hvPlot can be passed as arguments to the
command-line tool. See the hvPlot customization docs for available kwargs:
https://hvplot.holoviz.org/user_guide/Customization.html

Examples:
    hvplot iris.csv
    hvplot air.nc x=lon y=lat groupby=time,level geo=true
    hvplot detrend.nino34.ascii.txt x=YR y=ANOM -r pandas.read_csv -rk delimiter='\s+'

positional arguments:
  file_path             Path to the data file
  hvplot_kwargs         HoloViews options in 'key=value' format

options:
  -h, --help            show this help message and exit
  --opts [OPTS ...]     HoloViews plot options in 'key=value' format
  -r READER, --reader READER
                        Fully-qualified name of the reader function to use, e.g. pandas.read_csv
  -rk READER_KWARGS, --reader_kwargs READER_KWARGS
                        Reader options in 'key=value' format
  --port PORT           Port to use for displaying the plot
  -o OUTPUT_PATH, --output_path OUTPUT_PATH
                        Path to save the output file
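
To make the description above concrete, here is a minimal sketch of how reader selection by file extension and 'key=value' parsing could work. It is not the code from this PR: the READERS table and the helper names guess_reader_name, resolve, and parse_key_values are hypothetical, and the rule that comma-separated values become lists is an assumption based on the groupby=time,level example.

```python
import importlib
from pathlib import Path

# Hypothetical extension-to-reader table; the PR's real mapping may differ.
READERS = {
    ".csv": "pandas.read_csv",
    ".json": "pandas.read_json",
    ".parquet": "pandas.read_parquet",
    ".nc": "xarray.open_dataset",
}


def guess_reader_name(file_path):
    """Return the fully-qualified reader name for a file's extension."""
    suffix = Path(file_path).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"No reader known for {suffix!r}; pass one with -r") from None


def resolve(qualified_name):
    """Import 'package.func' lazily and return the function object."""
    module_name, _, attr = qualified_name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def parse_key_values(tokens):
    """Turn ['x=lon', 'groupby=time,level', 'geo=true'] into a kwargs dict."""
    kwargs = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"Expected key=value, got {token!r}")
        if value.lower() in ("true", "false"):
            kwargs[key] = value.lower() == "true"  # assumed boolean handling
        elif "," in value:
            kwargs[key] = value.split(",")  # assumed list handling
        else:
            kwargs[key] = value
    return kwargs
```

Resolving the reader lazily means pandas or xarray only needs to be installed when a file of the matching type is actually opened, which fits the note above about installing the right dependencies per format.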

Demo:

demo.mp4

@ahuang11
Collaborator Author

To make it easier to change the file name while reusing the same options, I think I should make the path last, e.g.:
hvplot x=lon y=lat groupby=time,level geo=true --path air.nc
hvplot x=lon y=lat groupby=time,level geo=true --path airx2.nc

@maximlt maximlt requested a review from jbednar January 20, 2024 17:27
Member

@jbednar jbednar left a comment

This looks just like what I had in mind! Lightweight, clear, and useful!

Is it too much to ask for the Explorer "code" tab to show both Python code and CLI code to generate the selected plot? :-)

"PR with an MVP of the proposed CLI along with a list of desired but unimplemented features and a list of non-features (things explicitly not considered in scope)."

Can you add such a list to this PR or as a separate issue? I.e. one list of desired future features (with checkboxes indicating they are not implemented), plus another list of things you do not feel belong in an hvPlot CLI (e.g. input filtering and transformation beyond what Explorer does).

Separately from this, I've been thinking that the Explorer should initially usually start with just a hamburger button, without the various selectors visible, and those would show up only when the user clicks on the hamburger. That's independent of this PR, but it would help avoid the user having to make a strict choice between a "plot" and an "explorer"; it would just be a plot that then can be reconfigured if desired. I think that approach works well from the cli, and also in other contexts where the Explorer is used, e.g. Jupyter.

break


def parse_inputs(inputs):

Please add docstrings to this and other functions in this file, stating the intent of the function and the assumptions it makes about its arguments.

@ahuang11
Collaborator Author

Is it too much to ask for the Explorer "code" tab to show both Python code and CLI code to generate the selected plot? :-)

Super easy I think.

Separately from this, I've been thinking that the Explorer should initially usually start with just a hamburger button, without the various selectors visible,

I think Panel needs a Sidebar card layout that does this.

@maximlt
Member

maximlt commented Jan 21, 2024

The issue addressed by this PR started with:

With hvplot explorer soon supporting #1141, I think the next logical step for hvplot explorer is a CLI (in addition to ideas from #1149). From my experience, scientists call ncview or panoply in the terminal to do a quick validation on their datasets.

So I did something very simple, I downloaded the first netCDF file I found on https://www.unidata.ucar.edu/software/netcdf/examples/files.html and threw it at the CLI (had to run python -m hvplot.cli sresa1b_ncar_ccsm3-example.nc). Unfortunately, it didn't work. I got a not so informative error page and a long traceback.

image
Launching server at http://localhost:5006
WARNING:param.Image02223: Image dimension lat is  not evenly sampled to relative tolerance of 0.001. Please use the QuadMesh element for irregularly sampled data or set a higher tolerance on hv.config.image_rtol or the rtol parameter in the Image constructor.
WARNING:param.Image02223: Image dimension lat is  not evenly sampled to relative tolerance of 0.001. Please use the QuadMesh element for irregularly sampled data or set a higher tolerance on hv.config.image_rtol or the rtol parameter in the Image constructor.
Uncaught exception GET / (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:5006', method='GET', uri='/', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/tornado/web.py", line 1786, in _execute
    result = await result
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/io/server.py", line 490, in get
    session = await self.get_session()
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/io/server.py", line 474, in get_session
    session = await super().get_session()
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/bokeh/server/views/session_handler.py", line 145, in get_session
    session = await self.application_context.create_session_if_needed(session_id, self.request, token)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/bokeh/server/contexts.py", line 240, in create_session_if_needed
    self._application.initialize_document(doc)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/io/server.py", line 410, in initialize_document
    super().initialize_document(doc)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/bokeh/application/application.py", line 190, in initialize_document
    h.modify_document(doc)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/bokeh/application/handlers/function.py", line 140, in modify_document
    self._func(doc)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/io/server.py", line 153, in _eval_panel
    doc = as_panel(panel)._modify_doc(server_id, title, doc, location)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/viewable.py", line 315, in _modify_doc
    return self.server_doc(doc, title, location) # type: ignore
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/viewable.py", line 1025, in server_doc
    model = self.get_root(doc)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 306, in get_root
    root = super().get_root(doc, comm, preprocess)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/viewable.py", line 667, in get_root
    root = self._get_model(doc, comm=comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 174, in _get_model
    objects, _ = self._get_objects(model, [], doc, root, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 156, in _get_objects
    child = pane._get_model(doc, root, model, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 174, in _get_model
    objects, _ = self._get_objects(model, [], doc, root, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 156, in _get_objects
    child = pane._get_model(doc, root, model, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 174, in _get_model
    objects, _ = self._get_objects(model, [], doc, root, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/tabs.py", line 211, in _get_objects
    rendered[pref] = child = pane._get_model(doc, root, model, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 174, in _get_model
    objects, _ = self._get_objects(model, [], doc, root, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 156, in _get_objects
    child = pane._get_model(doc, root, model, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 174, in _get_model
    objects, _ = self._get_objects(model, [], doc, root, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/layout/base.py", line 156, in _get_objects
    child = pane._get_model(doc, root, model, comm)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/pane/holoviews.py", line 413, in _get_model
    plot = self._render(doc, comm, root)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/panel/pane/holoviews.py", line 508, in _render
    return renderer.get_plot(self.object, **kwargs)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/bokeh/renderer.py", line 70, in get_plot
    plot = super().get_plot(obj, doc, renderer, **kwargs)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/renderer.py", line 241, in get_plot
    plot.update(init_key)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/plot.py", line 943, in update
    return self.initialize_plot()
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/bokeh/element.py", line 1859, in initialize_plot
    self._init_glyphs(plot, element, ranges, source)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/bokeh/element.py", line 1778, in _init_glyphs
    data, mapping, style = self.get_data(element, ranges, style)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/plotting/bokeh/raster.py", line 110, in get_data
    img = element.dimension_values(i, flat=False)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/core/data/__init__.py", line 195, in pipelined_fn
    result = method_fn(*args, **kwargs)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/core/data/__init__.py", line 1089, in dimension_values
    values = self.interface.values(self, dim, expanded, flat)
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/core/data/xarray.py", line 424, in values
    data = cls.canonicalize(dataset, data, data_coords=data_coords,
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/holoviews/core/data/grid.py", line 328, in canonicalize
    data = np.squeeze(data, axis=tuple(dropped))
  File "<__array_function__ internals>", line 200, in squeeze
  File "/Users/mliquet/miniconda3/envs/hvplot310/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 1571, in squeeze
    return squeeze(axis=axis)
ValueError: cannot select an axis to squeeze out which has size not equal to one
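
The last frames of the traceback show np.squeeze being asked to drop an axis whose length is not 1. A minimal sketch of that failure in isolation, with a made-up array shape (the real shapes in the file are not shown here):

```python
import numpy as np

# Hypothetical shape: one length-1 axis followed by two longer axes.
data = np.zeros((1, 128, 256))

# Squeezing the length-1 axis works and drops it.
squeezed = np.squeeze(data, axis=(0,))
print(squeezed.shape)

# Squeezing an axis whose length is not 1 raises the ValueError seen above.
try:
    np.squeeze(data, axis=(1,))
    message = None
except ValueError as err:
    message = str(err)
print(message)
```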

I think this demonstrates that building "an application scientists can easily use to explore datasets" is not as simple as exposing some features of hvPlot in a CLI. I'm overall 0- on this feature, though I won't block it if others want to push for it. (Generally, I'm going to push strongly for fixing hvPlot/HoloViews and documenting them better instead of adding new features, in particular when there is funding to do so).

If this CLI makes it in the code base in the end, would it be possible in the future to add another subcommand if we have this need? I'm not sure how argparse works.
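
On the argparse question: subcommands are supported through add_subparsers. The sketch below is only an illustration; the plot and info subcommand names are hypothetical, and the default port of 5006 is an assumption taken from the log line above, not from the PR's code.

```python
import argparse


def build_parser():
    """Build a parser with two subcommands (both names are hypothetical)."""
    parser = argparse.ArgumentParser(prog="hvplot")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Hypothetical "plot" subcommand mirroring the options listed in this PR.
    plot = subparsers.add_parser("plot", help="Plot a data file")
    plot.add_argument("file_path", help="Path to the data file")
    plot.add_argument("hvplot_kwargs", nargs="*", help="'key=value' options")
    plot.add_argument("--port", type=int, default=5006)  # assumed default

    # Hypothetical second subcommand that could be added later.
    info = subparsers.add_parser("info", help="Describe a data file")
    info.add_argument("file_path", help="Path to the data file")
    return parser


args = build_parser().parse_args(["plot", "air.nc", "x=lon", "--port", "5007"])
print(args.command, args.file_path, args.hvplot_kwargs, args.port)
```

One caveat, as far as I can tell: if the first positional argument is already a file path, as in hvplot iris.csv, adding subcommands later would change how that first argument is interpreted, so it may be easier to reserve a subcommand from the start.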

@jbednar
Member

jbednar commented Jan 22, 2024

@maximlt , would your error message have been any different when invoking hvPlot from within Python on the same file?

My guess is that the CLI interface makes it very easy to invoke issues that were already present in hvPlot and Explorer itself, making it a useful way to improve testing and usability in general. If a user opens Jupyter, loads that file into xarray, launches the Explorer and gets the same error, that's a usability issue we should be addressing anyway; we just found it much more easily.

Of course, some issues will come from the file-loading handling itself, which I argue in #1150 (comment) is a genuine tension but something larger than this specific PR. Making file loading easy and robust is important for HoloViz users and currently not properly localized into a specific library.

@TomNicholas

This seems extremely useful, especially if it can be run simply over ssh. The number of co-workers who know about pangeo but still use ncview sometimes is high. One might argue that it's an alternative path to lowering the difficulty of using hvplot for simple visualizations. I have no strong opinions about which repo it lives in.

@philippjfr
Member

Over ssh you'd have to tunnel the port I suppose, but that does seem workable.

@ahuang11
Collaborator Author

Yes tunneling is quite straightforward and common; that's how I used Jupyter notebooks before. https://bluewaters.ncsa.illinois.edu/pythonnotebooks
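
For reference, a small sketch of what the tunnel could look like, written as a Python helper that builds the ssh command. The helper name tunnel_command and the host are hypothetical, and port 5006 is assumed from the server log earlier in this thread; -N (no remote command) and -L (local port forwarding) are standard OpenSSH options.

```python
import shlex


def tunnel_command(host, port=5006):
    """Build an ssh command forwarding a local port to the same port on host."""
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", host]


# Hypothetical host; run the printed command locally, then open
# http://localhost:5006 in a local browser.
command = tunnel_command("user@cluster.example.org")
print(shlex.join(command))
```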

@maximlt
Member

maximlt commented Jan 25, 2024

@maximlt , would your error message have been any different when invoking hvPlot from within Python on the same file?

No, I can reproduce it with:

import hvplot.xarray
import xarray as xr

ds = xr.open_dataset("/Users/mliquet/dev/hvplot/sresa1b_ncar_ccsm3-example.nc")
ds.hvplot.explorer()

My guess is that the CLI interface makes it very easy to invoke issues that were already present in hvPlot and Explorer itself, making it a useful way to improve testing and usability in general. If a user opens Jupyter, loads that file into xarray, launches the Explorer and gets the same error, that's a usability issue we should be addressing anyway; we just found it much more easily.

The more interfaces the more likely bugs and edge cases are going to be surfaced; I'm not sure that's a good argument to motivate adding more interfaces to a piece of functionality vs. doing better testing. Talking about usability, at the moment the explorer doesn't allow selecting between the variables of an xarray Dataset.

image

I'll open an issue for each of these problems.

@maximlt
Member

maximlt commented Feb 2, 2024

We haven't managed to reach a consensus on whether this feature belongs to hvPlot or not. @droumis did an amazing job when setting up the HoloViz and per-project governance models, giving us a decision model for this kind of case. Any maintainer can call a vote, let's do that then.

@philippjfr @hoxbro @ahuang11 please reply with your vote, either +1 or -1. If you want to add more motivation than what you wrote in this PR or #1150, please do so.

-1 for me:

  • I doubt the CLI would in fact replace the more dedicated applications scientists use to explore their files; it's hard to compete with specialized tools.
  • My wish for hvPlot, as explained in its new roadmap, would be to clean up what already exists (docs, developer experience, bug fixes, tests) to make the library more robust and pleasant to use. Adding a CLI goes in the opposite direction in my opinion, requiring us to add another entry point in the docs, write new kinds of tests, manage more dependencies on the CI, take care of more bug reports and feature requests, etc.

To clarify my vote: it is a temporary -1, and I would be open to revisiting this idea in the future. If we vote no, I'd suggest keeping #1150 open to let users interested in this feature chime in/upvote, and updating the OP with some more context and the scope of the feature (the issue focused on xarray but the implementation in this PR interfaces with other libraries/file types).

@ahuang11
Collaborator Author

ahuang11 commented Feb 5, 2024

+1:

  • other tools require sudo apt-get install, conda, or manual build; hvplot is just a pip install away
  • additionally, hvPlot is extremely interactive: you can zoom, hover over values, and adjust variable selection
  • the CLI is a thin wrapper; it adds little besides a file ingestion engine, so there is not much more to maintain; we can do a feature freeze to work out other issues. Understandably this is a bit of scope creep, but I imagine it covers 90% of use cases

However, I'm okay if this lives elsewhere, and I'm actually starting to lean towards Lumen, e.g. a lumen explore test.nc command that would format a template Lumen spec with the necessary inputs. This could help bring attention to Lumen and expose it to the gridded dataset world too. Otherwise, I'd still like to release this somewhere, such as under a new holoviz-incubator or even my personal account, so others can take advantage of it.

@hoxbro
Member

hoxbro commented Feb 8, 2024

-1

@philippjfr
Member

On balance I will abstain from the vote and record my vote as -0. I am not the day to day maintainer for hvPlot so the maintenance concerns are not front of mind for me, so I primarily want to respect @maximlt's concerns as lead maintainer. I do see the utility for the CLI particularly for the climate community working with gridded data but would like to work towards supporting that at the Lumen level.

@maximlt
Member

maximlt commented Mar 15, 2024

Oops, I forgot to come back to this! With 2 -1, 1 +1 and 1 -0, we're not going to move forward with this feature. I'm going to close this PR. However, as I already wrote, I suggest keeping the issue open and updating its initial message to better reflect the scope of the intended CLI, and encourage people to upvote the issue/chime in.

@maximlt maximlt closed this Mar 15, 2024
6 participants