Nice and simple histogram plotting with matplotlib, including easy importing of ROOT histograms. See the following sections for examples.
import plotter
%matplotlib inline
# %matplotlib notebook
import numpy as np
np.random.seed(42)
from utils import Hist1D, MET_LATEX
import matplotlib as mplThe Hist1D object wraps a numpy array or a ROOT histogram into a common histogram object, so ROOT histograms can be easily swapped with numpy array histograms. A histogram object is simply a set of bin counts, errors, and edges.
Instantiate a Hist1D object with a numpy array and a binning scheme. Then print it. The __repr__ shows bin counts and errors, which are Poissonian by default. Note that other kwargs to Hist1D get passed to a np.histogram() call, so you retain the full functionality of numpy. Bin counts, edges, etc. are all stored in numpy arrays, so operations on these are optimized.
h1 = Hist1D(np.random.normal(0,1,300),bins=np.linspace(-3.5,3.5,10))
print h1<Hist1D:
[ 1.00 ± 1.00 5.00 ± 2.24 32.00 ± 5.66 69.00 ± 8.31
95.00 ± 9.75 70.00 ± 8.37 20.00 ± 4.47 7.00 ± 2.65
0.00 ± 0.00]
>
Take a histogram from ROOT and turn it into a Hist1D. Bin contents, edges, and errors are carried over and now you have a nice Hist1D object.
import ROOT as r
hroot = r.TH1F("h1","h1",10,0,10)
hroot.FillRandom("gaus")
h1 = Hist1D(hroot)
print h1<Hist1D:
[3462.00 ± 58.84 1290.00 ± 35.92 235.00 ± 15.33 12.00 ± 3.46
1.00 ± 1.00 0.00 ± 0.00 0.00 ± 0.00 0.00 ± 0.00
0.00 ± 0.00 0.00 ± 0.00]
>
Warning in <TROOT::Append>: Replacing existing TH1: h1 (Potential memory leak).
Now that we've made these histogram objects, what can we do? Math. Errors are propagated with gaussian rules.
h1 = Hist1D(np.random.normal(0,1,300),bins=np.linspace(-3.5,3.5,10))
h2 = Hist1D(np.random.normal(0,1,300),bins=np.linspace(-3.5,3.5,10))
h3 = Hist1D(np.random.normal(0,1,300),bins=np.linspace(-3.5,3.5,10))
print 0.5*(h1+h2-0.2*h2)/h3<Hist1D:
[ 0.00 ± 0.00 0.85 ± 0.44 0.91 ± 0.23 1.23 ± 0.20
0.87 ± 0.12 0.76 ± 0.12 0.82 ± 0.19 0.74 ± 0.33
nan ± nan]
>
What about asymmetric binomial errors for efficiencies, for example?
hratio = h2.divide(h3,binomial=True)
print hratio.get_errors_up()
print hratio.get_errors_down()[0.60169107 0.21193764 nan nan 0.02047241 0.03334018
nan 0.09209998 nan]
[ nan 0.28202753 nan nan 0.03270268 0.04608427
nan 0.21156139 nan]
Now that we have these histogram objects, we can start making plots. Start off with a very simple 3 line example. First, make a couple of Hist1D objects, and use the label kwarg to store extra information in the object (which will become a legend label eventually).
Next, just call plot_stack() with a list of histograms to stack. Easy.
plot_stack() returns the Figure object and a list of Axes objects.
h1 = Hist1D(np.random.normal(0,1,300),label="h1",bins=np.linspace(-3.5,3.5,30))
h2 = Hist1D(np.random.normal(0,1,300),label="h2",bins=np.linspace(-3.5,3.5,30))
plotter.plot_stack([h1,h2])(<matplotlib.figure.Figure at 0x1215a0610>,
[<matplotlib.axes._subplots.AxesSubplot at 0x1214b6210>])
You can also attach a color to Hist1D objects. Any kind of matplotlib-acceptable color works, as you see from the different types below. The plot_stack call has more options here for labels. mpl_hist_params is a dictionary of kwargs that get passed to the hist draw call in matplotlib, so customization is straightforward (and not complicated, because options can be found in the matplotlib documentation).
Additionally, there is a ratio option. Using the math capabilities of the Hist1D objects, you can specify an arbitrary ratio to show in a panel below the main pad. This pad shows up if you specify data in the plot_stack call, or explicitly give a ratio parameter.
bgs = [
Hist1D(np.random.normal(0,1,300),label="x2",bins=np.linspace(-3.5,3.5,30),color=(0.4,0.4,1)),
Hist1D(np.random.normal(0,1,300),label="x3",bins=np.linspace(-3.5,3.5,30),color="lightblue"),
Hist1D(np.random.normal(0,1,400),label="x4",bins=np.linspace(-3.5,3.5,30),color="darkblue"),
Hist1D(np.random.normal(0,1,300),label="x5",bins=np.linspace(-3.5,3.5,30),color=mpl.cm.get_cmap('Spectral')(0.9)),
]
plotter.plot_stack(
bgs=bgs,
title="test plot",
xlabel="xaxis",
ylabel="Events",
mpl_hist_params={"edgecolor":"k","linewidth":0.2},
ratio=(bgs[1]+bgs[2])/bgs[0],
)(<matplotlib.figure.Figure at 0x1219b5f10>,
[<matplotlib.axes._subplots.AxesSubplot at 0x12189b850>,
<matplotlib.axes._subplots.AxesSubplot at 0x12114af90>])
Here is an example of two unstacked histograms (one is a subset of the other) with an "efficiency" ratio (with asymmetric errors) below.
gaus1 = np.random.normal(0,1,1000)
bins = np.linspace(-3,3,36)
gaus2 = gaus1[gaus1 > np.random.normal(0,1,1000)]
h1 = Hist1D(gaus1,bins=bins,label="h1")
h2 = Hist1D(gaus2,bins=bins,label="h2")
hratio = h2.divide(h1,binomial=True)
plotter.plot_stack(
bgs=[h2,h1],
title="test title",
xlabel=r"$\mathrm{p}_\mathrm{T}$",
ylabel="Entries",
mpl_hist_params={"edgecolor":"k","linewidth":0.2,"stacked":False,"alpha":0.7},
ratio=hratio
)(<matplotlib.figure.Figure at 0x1217aec10>,
[<matplotlib.axes._subplots.AxesSubplot at 0x1215fde10>,
<matplotlib.axes._subplots.AxesSubplot at 0x120c37f50>])
Here's a more "realistic" (but still fake) plot. This time, a signal histogram is included. Note the mpl_figure_params dictionary. These are kwargs to the plt.figure() call. It's used here to specify the figure size since Jupyter hijacks it and creates small figures, which would cause the legend to overlap with many things.
bginfo = [
("Charge misid." , [0.4, 0.4, 0.4]),
("Rare" , [1.0, 0.4, 1.0]),
("$X\\gamma$" , [0.4, 0.0, 0.8]),
("$t\\bar{t}VV$" , [0.4, 0.6, 1.0]),
("$t\\bar{t}Z$" , [0.4, 0.8, 0.4]),
("Nonprompt lep.", [0.9, 0.9, 0.9]),
("$t\\bar{t}H$" , [0.4, 0.4, 0.6]),
("$t\\bar{t}W$" , [0.0, 0.4, 0.0]),
]
bgs = []
for label, color in bginfo:
bg = Hist1D(np.random.normal(5.8,2,30), bins=np.linspace(0,15,15), label=label, color=color)
bgs.append(bg)
# fake data
data = Hist1D(np.random.normal(5.8,2,240), bins=bgs[0].get_edges(), color="k", label="Data")
# get sig and scale by 5
label_sig = "$t\\bar{t}t\\bar{t} \\times 5$"
sig = Hist1D(np.random.normal(10.5,1.5,500), weights=0.02*np.ones(500), bins=np.linspace(0,15,15), label=label_sig,color="r")
sig *= 5
# labels
title = "Discriminator"
xlabel = "{} [GeV]".format(MET_LATEX)
ylabel = "Events"
plotter.plot_stack(
bgs=bgs,
data=data,
sigs=[sig],
title=title,
xlabel=xlabel,
ylabel=ylabel,
mpl_figure_params={"figsize":(8,6)},
cms_type="Fakeplot",
lumi="75.0",
)(<matplotlib.figure.Figure at 0x121994ad0>,
[<matplotlib.axes._subplots.AxesSubplot at 0x120becf10>,
<matplotlib.axes._subplots.AxesSubplot at 0x120e3e490>])
Various matplotlib styles can be used. Can't guarantee that they all look pretty out of the box, but the plotter works with matplotlib's style contexts. You can also write out the figure to a file via the filename argument.
with mpl.pyplot.style.context("ggplot"):
# with mpl.pyplot.style.context('fivethirtyeight'):
plotter.plot_stack(
bgs=bgs,
data=data,
sigs=[sig],
title=title,
xlabel=xlabel,
ylabel=ylabel,
mpl_figure_params={"figsize":(8,6)},
filename="test.png",
)from IPython.display import Image
Image(filename='test.png') Two dimensional histograms are supported as well, with some nice plotting goodies. Construction of a Hist2D parallels Hist1D. Thus, ROOT histograms can be passed in to make a Hist2D object, or a two-column numpy array can be used (for the x and y values). weights and mathematical operations are also handled correctly, as for Hist1D.
from utils import Hist2D
import numpy as np
N = 200
vals = np.c_[ np.random.normal(0,1.5,N), np.random.uniform(-4.0,4.0,N) ]
bins = [ np.linspace(-5,5,5), np.linspace(-5,5,6) ]
hist = Hist2D(vals,bins=bins)
print hist<Hist2D:
[[ 0.00 ± 0.00 9.00 ± 3.00 11.00 ± 3.32 1.00 ± 1.00]
[ 2.00 ± 1.41 28.00 ± 5.29 22.00 ± 4.69 1.00 ± 1.00]
[ 0.00 ± 0.00 24.00 ± 4.90 17.00 ± 4.12 5.00 ± 2.24]
[ 2.00 ± 1.41 29.00 ± 5.39 26.00 ± 5.10 1.00 ± 1.00]
[ 1.00 ± 1.00 12.00 ± 3.46 9.00 ± 3.00 0.00 ± 0.00]]
>
2D histograms can be converted into 1D projections (i.e., sum up bins along x/y) or 1D profiles (get means of slices along x/y).
print hist.get_x_profile()
print hist.get_y_profile()
print hist.get_x_projection()
print hist.get_y_projection()<Hist1D:
[ 1.00 ± 0.45 20.40 ± 2.02 17.00 ± 1.84 1.60 ± 0.57]
>
<Hist1D:
[ 5.25 ± 1.02 13.25 ± 1.63 11.50 ± 1.52 14.50 ± 1.70
5.50 ± 1.05]
>
<Hist1D:
[ 5.00 ± 2.24 102.00 ± 10.10 85.00 ± 9.22 8.00 ± 2.83]
>
<Hist1D:
[21.00 ± 4.58 53.00 ± 7.28 46.00 ± 6.78 58.00 ± 7.62
22.00 ± 4.69]
>
Let's make a larger histogram and plot it.
N = 10000
vals = np.c_[ np.random.normal(0,1.5,N), np.random.uniform(-4.0,4.0,N) ]
bins = [ np.linspace(-5,5,50), np.linspace(-5,5,50) ]
hist = Hist2D(vals,bins=bins)
_ = plotter.plot_2d(hist)Marginal 1D histograms can be drawn as well. Use do_projection for projections (i.e., sum up bins along x/y) and do_profile for profiles (get means of slices along x/y). The matplotlib colorscheme name can also be specified via cmap. (Note that putting _r at the end of a colorscheme name makes matplotlib reverse the color scale.)
_ = plotter.plot_2d(hist, do_projection=True, xlabel="x label", ylabel="ylabel", cms_type="Fake", cmap="inferno")






