Releases: histogrammar/histogrammar-python

v1.0.27

20 May 21:31

Version 1.0.27, May 2022

  • Multiple performance updates, thanks to Simon Brugman!
  • Turn off the slow, unnecessary specialize function for all Count objects.
  • Use pandas functions to infer datatypes and return numpy arrays.

v1.0.26

09 Apr 07:22

What's Changed

  • FIX: show edge of heatmap for categorical histograms by @mbaak in #47

v1.0.25

05 Apr 08:18
75eeaf2

Version 1.0.25, Apr 2021

  • Improve null handling in pandas dataframes by inferring the datatype with pandas' infer_dtype function.
  • NaNs in bool columns are converted to "NaN", so the column keeps True and False values in Categorize instead of "1" and "0".
  • Columns of type object are converted to strings using to_string(); columns of type string use only_str().
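
The bool-column behavior described above can be illustrated with a minimal pure-Python sketch. It is not histogrammar's implementation: the helper name to_category_label is hypothetical, and the sketch assumes that category labels are plain strings.

```python
import math
from collections import Counter


def to_category_label(value):
    """Hypothetical helper: map a raw cell value to a string category label."""
    # None and float NaN both end up in a single "NaN" category.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "NaN"
    # str(True) is "True", so boolean labels are kept instead of becoming "1"/"0".
    return str(value)


# A bool column that contains missing values (illustrative data only).
column = [True, False, None, True, float("nan")]
counts = Counter(to_category_label(v) for v in column)
print(counts)
```

Mapping missing values to their own label keeps them visible in the resulting counts rather than silently dropping them or coercing the whole column to numbers.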

v1.0.24

03 Apr 14:53
2207fc9

Version 1.0.24, Apr 2021

  • Categorize histograms now handle Nones and NaNs in a friendlier way: they are converted to "NaN".
  • make_histogram() now casts Spark nulls to NaN for numeric columns (Scala interprets null as 0).
  • SparselyBin histograms did not sum their nanflow when added together. Now fixed.
  • Added a unit test that checks the conversion of nulls to NaNs.
  • Use new histogrammar-scala jar files, v1.0.20
  • Added new histogrammar-scala v1.0.20 jar files to tests/jars/
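
As a rough illustration of the nanflow fix listed above, the following sketch uses a toy class, SparseHist, which is a hypothetical stand-in and not the library's SparselyBin. It assumes that missing values are counted in a separate nanflow counter, and it shows that adding two histograms should sum that counter along with the bins.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SparseHist:
    """Toy stand-in for a sparsely binned histogram (not the library class)."""
    bin_width: float
    origin: float = 0.0
    bins: dict = field(default_factory=dict)
    nanflow: float = 0.0

    def fill(self, x):
        # Missing values are counted separately instead of landing in a bin.
        if x is None or math.isnan(x):
            self.nanflow += 1.0
            return
        index = math.floor((x - self.origin) / self.bin_width)
        self.bins[index] = self.bins.get(index, 0.0) + 1.0

    def __add__(self, other):
        if (self.bin_width, self.origin) != (other.bin_width, other.origin):
            raise ValueError("histograms must share bin_width and origin")
        merged = dict(self.bins)
        for index, count in other.bins.items():
            merged[index] = merged.get(index, 0.0) + count
        # The nanflow counters are summed together with the bin contents.
        return SparseHist(self.bin_width, self.origin, merged,
                          self.nanflow + other.nanflow)


a = SparseHist(bin_width=1.0)
b = SparseHist(bin_width=1.0)
for x in [0.5, 1.5, float("nan")]:
    a.fill(x)
for x in [1.2, None, float("nan")]:
    b.fill(x)
total = a + b
print(total.bins, total.nanflow)
```

If the nanflow term were left out of the sum, the merged histogram would under-report the number of missing entries, which is the kind of discrepancy the fix addresses.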

v1.0.20

05 Feb 21:14

Added make_histograms functionality from popmon to dataframes

- Add n_dim and datatype properties to all histogram types
- Added unit test of new datatype property of histograms
- Moved sparksql.py to new dfinterface directory
- Added addmethods functions for spark and pandas
- Set default identity function for quantity for each histogram type
- Added make_histograms functionality from popmon to spark and pandas dataframes
- force sparselybin origin and binwidth to be float always
- Fix spark filling of multiple histograms
- make_histograms now includes all available histograms
- Updated docs strings for histogram makers
- Accept list and tuple as numpy array
- 1d plotting for irregularlybin and centrallybin histograms
- Added default values for time_width and time_offset
- working unit tests for pandas_histogrammar and spark_histogrammar
- handy monkey patch functions for pandas and spark dataframes
- plot function for 2d irregularlybin
- in categorize plot sort labels alphabetically
- implemented matplotlib function for most 2d count histograms
- autopep8 of primitives
- fix flake8 errors of all code
- Bump version to 1.0.20
- added json dumper util function
- added bin_centers function for categorize histograms
- popmon reference in histogram_filler_base.py transform function
- fix small assignment bug in Sum histogram
- make convenience definitions work
- add basic tutorial notebook for histogrammar
- added advanced histogrammar notebook
- unit tests for notebooks
- added manifest and notice file
- update of readme.rst file
- remove python 2.7 and 3.4, 3.5 from travis config

1.0.9

15 Aug 19:47

Fixes two bugs in Categorize and SparselyBin when used with SparkSQL.

1.0.8

30 Mar 16:34

Fixes a bug in SparselyBin and adds an ASCII method to Bin.

1.0.7

22 Mar 14:41

Max Baak (mbaak) added visualization methods for Categorize in Matplotlib and PyROOT, and fixed some Python 3 bugs.

1.0.6

29 Nov 17:49

Until this version, the Categorize aggregator didn't have a correct Numpy implementation (one that permits sub-aggregators other than Count). This version allows nested sub-aggregators in a Categorize with Numpy and is also 10X faster.

1.0.5

11 Nov 00:15

Recognizes PySpark Columns as a kind of user function. Wraps PySpark DataFrames with Histogrammar methods, and the filling is dispatched to Histogrammar-Scala on the JVM.