Add fastplot plot function #131

Open
wants to merge 2 commits into
base: main

jerabaul29
Collaborator

When using trajan I regularly find myself waiting quite a while for plots, and I miss a "fastplot" function that does simple, basic, but fast plotting without too many options to tune. The default plotting functions are very nice, but they can be slow, especially with large datasets or when land is in view.

The idea of "fastplot" is to provide a rudimentary plot that draws very fast and exposes few options; for advanced plotting, the existing functions should be used.

A couple of points:

  • it uses PlateCarree for the projection and no transform, as I have found when working with large datasets that the transform argument makes plotting very slow
  • it optionally sets central_longitude using scipy.stats.circmean; this way, if a dataset is heavily centered on a given region of the world, PlateCarree is centered on that region
  • it does not plot lines, only markers; this way, we avoid "crossing lines" on global datasets
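To make the central_longitude point concrete, here is a minimal NumPy-only sketch, not the code from this PR. The function names `central_longitude` and `to_projection_x` are hypothetical. The circular mean is computed by hand and should match `scipy.stats.circmean(lon, high=180, low=-180)` up to wrapping. The manual shift is an assumption about what plotting without `transform` requires once the projection is no longer centered on 0.

```python
import numpy as np


def central_longitude(lon_deg):
    """Circular mean of longitudes in degrees, wrapped to [-180, 180)."""
    lon = np.asarray(lon_deg, dtype=float)
    lon = lon[np.isfinite(lon)]  # drop NaN padding before averaging
    rad = np.deg2rad(lon)
    # Average the unit vectors, then take the angle of the mean vector.
    mean = np.rad2deg(np.arctan2(np.sin(rad).mean(), np.cos(rad).mean()))
    return (mean + 180.0) % 360.0 - 180.0


def to_projection_x(lon_deg, center):
    """Shift longitudes so that `center` lands at x = 0, wrapped to [-180, 180).

    Assumption: these are the x coordinates a PlateCarree axis centered on
    `center` expects when points are drawn without a `transform` argument.
    """
    lon = np.asarray(lon_deg, dtype=float)
    return (lon - center + 180.0) % 360.0 - 180.0


# Two points on either side of the dateline.
lon = np.array([170.0, -150.0])
center = central_longitude(lon)
x = to_projection_x(lon, center)
```

With these two points the center comes out at -170 and the shifted x values at -20 and 20, so the points sit next to each other instead of at opposite edges of the map. An arithmetic mean of the same longitudes would give 10, which is on the wrong side of the globe.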

I agree that it overlaps quite a bit with the existing functions, but I think that for large datasets it is convenient to have a quick and efficient plot function to try in a single call during early data exploration. If you do not want to merge this, e.g. because it is "redundant ad hoc functionality for a specific use case", maybe we should think about how the existing functions could get a flag or similar to optionally enable this kind of behavior? :)

@jerabaul29
Collaborator Author

(test coming soon)

@gauteh
Member

gauteh commented Oct 25, 2024

We have the land='mask' option for plotting, and you can also plot in Cartesian coordinates if you remove the CRS. I'm positive to this, but I am not sure how it differs from scatter with no map?

@gauteh
Member

gauteh commented Oct 25, 2024

You can also plot without land

@jerabaul29
Collaborator Author

Maybe I just missed some of the options to turn stuff off and get scatter to go faster ^^ .

One other thing with scatter is that I think I was getting grey-only plots, while it may be a bit easier to visualize with different colors? I can take another look at the different options in the coming days and summarize it here :) .

@gauteh
Member

gauteh commented Oct 25, 2024

Scatter is not so well tested, so it has less functionality than lines. But the idea is to not use different colors when there are many trajectories (> 100). Maybe we can improve scatter a bit, or split things up so that you get this functionality.

@jerabaul29
Collaborator Author

I have had a look at changing the scatter method directly by adding a color_by_trajectory_rank argument defaulting to False, plus some changes to the method body that look like:

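        if color_by_trajectory_rank:  # test the argument, not a string literal (always truthy)
            n_traj = self.ds.sizes['trajectory']
            colormap_mapper = ColormapMapper(plt.get_cmap("viridis"), 0, n_traj)
            # one RGBA row per trajectory, shape (n_traj, 4)
            colors = np.transpose(np.vectorize(colormap_mapper.get_rgb)(np.arange(0, n_traj, 1.0)))
            # repeat each row once per observation; axis=0 keeps the (N, 4) shape
            # (assumes the points are flattened in (trajectory, obs) order)
            colors = np.repeat(colors, len(self.ds.obs), axis=0)
            kwargs['c'] = colors
            # scatter does not accept both 'c' and 'color'; pop avoids a KeyError if absent
            kwargs.pop('color', None)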

where



import matplotlib as mpl
from matplotlib import cm


class ColormapMapper:
    """A mapper from values to RGBA colors using built in colormaps
    and scaling these."""

    def __init__(self, cmap, vmin, vmax, warn_saturated=False):
        """cmap: the matplotlib colormap to use, vmin: the min value to be plotted,
        vmax: the max value to be plotted."""
        self.vmin = vmin
        self.vmax = vmax
        self.warn_saturated = warn_saturated
        norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
        self.normalized_colormap = cm.ScalarMappable(norm=norm, cmap=cmap)

    def get_rgb(self, val):
        """Get the RGB value associated with val given the normalized colormap
        settings."""
        if self.warn_saturated:
            if val < self.vmin:
                print("ColormapMapper warning: saturated low value")
            if val > self.vmax:
                print("ColormapMapper warning: saturated high value")

        return self.normalized_colormap.to_rgba(val)

If you want us to look further in this direction, the attempt (for now not fully working and partially broken) is available at:

https://github.com/jerabaul29/trajan/tree/feat/fastplot_on_scatter

However 1) this seems a bit slow compared to the fastplot method, 2) I hit some issues with different kinds of datasets and dimension sizes, for example when using ragged arrays (I am not sure that the auto-expansion is performed).

I wonder if it may be simplest to have a fastplot method as above that we can tune for speed without covering every corner case, rather than integrating complex logic into the existing plotting utilities?
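As a possible way around both the speed and the dimension issues mentioned above, here is a hedged sketch, not code from the linked branch. It builds one integer rank per point with NumPy and drops NaN padding, so the colormap lookup can be left to Matplotlib. The function name `flatten_with_rank` is hypothetical, and the sketch assumes the data are already 2D arrays of shape (trajectory, obs) padded with NaN.

```python
import numpy as np


def flatten_with_rank(lon, lat):
    """Return 1D lon, lat and trajectory rank for every valid point.

    Assumes `lon` and `lat` are 2D arrays of shape (trajectory, obs),
    padded with NaN where a trajectory has no observation.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    # Rank i is broadcast along row i without copying the data.
    rank = np.broadcast_to(np.arange(lon.shape[0])[:, None], lon.shape)
    # Keep only points where both coordinates are finite.
    valid = np.isfinite(lon) & np.isfinite(lat)
    return lon[valid], lat[valid], rank[valid]


# Two trajectories of different lengths, padded with NaN.
lon = np.array([[0.0, 1.0, np.nan], [2.0, np.nan, np.nan]])
lat = np.array([[10.0, 11.0, np.nan], [12.0, np.nan, np.nan]])
x, y, rank = flatten_with_rank(lon, lat)
```

The result can then be passed to a single call such as `ax.scatter(x, y, c=rank, cmap="viridis", s=1)`, which maps the ranks through the colormap internally. That avoids both the `np.vectorize` loop and the explicit (N, 4) color array.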


2 participants