Histograms
histbook has only one histogram class, which can have arbitrarily many independent dimensions (binned axes) and dependent dimensions (profiles). Plottable one- and two-dimensional histograms are derived by projection.
-
class histbook.hist.Hist(*axis, **opts)
Parameters:
- *axis (Axis) – axis or axes that define the independent and dependent variables of the histogram
Keyword Arguments:
- weight (None, algebraic expression (string), or number) – if None (default), data will be filled with weight 1; if an expression, weights are computed from the expression; if a number, weights are constant
- filter (None or algebraic expression (string)) – if not None, data will be filtered through this logical expression (equivalent to multiplying weight by where(filter, 1, 0))
- defs (None or dict of str → algebraic expression (string) or Expr) – if not None, definitions to use when computing expressions
- fill (None, single Numpy array, or dict of str → Numpy arrays) – if not None, data to fill immediately after constructing the histogram; a single Numpy array is permitted only if there is exactly one field
- attachment (None or dict of str → any JSON) – histogram metadata, such as fit results or other context
- systematic (None or tuple of numbers) – the systematic error vector this histogram represents; a special case of attachment (and stored in attachment)
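The stated equivalence between filter and a weight of where(filter, 1, 0) can be illustrated with a toy model. This is a minimal sketch in plain Python, not histbook's implementation; the helper weighted_counts and the sample data are hypothetical.

```python
def weighted_counts(values, weights, edges, keep=None):
    """Toy 1D fill: sum the weights landing in each [edges[i], edges[i+1]) bin."""
    counts = [0.0] * (len(edges) - 1)
    for i, (x, w) in enumerate(zip(values, weights)):
        if keep is not None and not keep[i]:
            continue  # entry rejected by the filter
        for b in range(len(counts)):
            if edges[b] <= x < edges[b + 1]:
                counts[b] += w
                break
    return counts


values = [0.1, 0.4, 0.6, 0.9, 1.5]
weights = [1.0, 2.0, 0.5, 1.0, 3.0]
edges = [0.0, 0.5, 1.0, 2.0]
keep = [x > 0.3 for x in values]  # stand-in for a logical filter expression

# Option 1: drop rejected entries before filling.
filtered = weighted_counts(values, weights, edges, keep=keep)

# Option 2: keep every entry but multiply its weight by 1 or 0.
masked_weights = [w * (1 if k else 0) for w, k in zip(weights, keep)]
masked = weighted_counts(values, masked_weights, edges)
```

Both options give the same bin contents, which is why a filter can be treated as a special kind of weight.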
-
COUNTTYPE
Alias of numpy.float64.
-
area(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)
Display bins in axis (if not the only axis) as areas on the horizontal axis.
Parameters:
- axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- error (bool) – if True, overlay error bars
- normalized (bool) – if True, normalize the histogram
- width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
-
attach(key, value)
Add an attachment to the histogram (changing it in place and returning it).
-
attachment
Python dict of attachment metadata (linked, not a copy).
-
axis
The axes that define a histogram's binning of space.
-
bar(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)
Display bins in axis (if not the only axis) as bars on the horizontal axis.
Parameters:
- axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- error (bool) – if True, overlay error bars
- normalized (bool) – if True, normalize the histogram
- width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
-
below(axis)
Display bins in axis next to each other vertically.
Parameters:
- axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay
Return type: PlottingChain
-
beside(axis)
Display bins in axis next to each other horizontally.
Parameters:
- axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay
Return type: PlottingChain
-
clear()
Effectively reset all bins to zero.
-
cleared()
Return a copy with all bins set to zero.
-
compatible(other)
Returns True if the histograms have the same non-profile axis types and binning, regardless of the expressions used to compute them.
-
copy()
Return an immediate copy of the histogram.
-
copyonfill()
Return a copy of the histogram whose content is copied if filled.
-
defs
Definitions used by axis expressions.
-
detach(key)
Remove an attachment from the histogram (changing it in place and returning it).
-
drop(*profile)
Remove one or more profile axes.
Parameters:
- *profile (profile, algebraic expression (string), or index position (integer)) – the axis or axes to drop
Returns: a histogram without the selected profiles.
Return type: Hist
-
fields
Names of fields that must be provided in the fill method.
-
fill(arrays=None, **more)
Fill the histogram: identify bins for independent variables, increase their counts by 1 or weight, and increment any profile (dependent variable) means and errors in the means.
All arrays must have the same length (one-dimensional shape). Numbers are treated as one-element arrays.
Parameters:
- arrays (dict → Numpy array or number; Spark DataFrame; Pandas DataFrame) – field values to use in the calculation of independent and dependent variables (axes)
- **more (Numpy arrays or numbers) – more field values
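The filling procedure described above can be pictured with a toy model that finds the bin for each independent value, adds the weight to that bin, and tracks a weighted mean of a dependent value. This is a sketch under stated assumptions: fill_toy is a hypothetical helper in plain Python, it handles one binned axis and one profile only, and it does not compute errors on the mean.

```python
def fill_toy(xs, ys, edges, weights=None):
    """Accumulate per-bin weighted counts and weighted means of ys (toy model)."""
    nbins = len(edges) - 1
    if weights is None:
        weights = [1.0] * len(xs)  # default weight of 1 per entry
    if not (len(xs) == len(ys) == len(weights)):
        raise ValueError("all arrays must have the same length")
    sumw = [0.0] * nbins   # weighted count per bin
    sumwy = [0.0] * nbins  # weighted sum of the dependent variable per bin
    for x, y, w in zip(xs, ys, weights):
        for b in range(nbins):
            if edges[b] <= x < edges[b + 1]:
                sumw[b] += w
                sumwy[b] += w * y
                break
    means = [sumwy[b] / sumw[b] if sumw[b] else None for b in range(nbins)]
    return sumw, means


counts, means = fill_toy(
    xs=[0.2, 0.7, 0.8, 1.4],
    ys=[1.0, 2.0, 4.0, 10.0],
    edges=[0.0, 1.0, 2.0],
)
```

Here three entries land in the first bin and one in the second, so the profile mean of the first bin is the average of 1.0, 2.0 and 4.0.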
-
filter(expr)
Returns a copy of this histogram with expr as its filter (for fluent construction).
-
fraction(*cut, **opts)
Return a table of the fraction of entries that pass a set of cuts in each bin.
Parameters:
- *cut (profile) – the cut axis or axes to include in the table
Keyword Arguments:
- count (bool) – if True (default), include the (possibly weighted) count of entries in each bin (denominator of the fraction)
- error (string or None) – if not None, include "errors" on all parameters (uncertainty in the mean of the distribution the count or fraction represents); options are "clopper-pearson", "normal" (default), "wilson", "agresti-coull", "feldman-cousins", "jeffrey", "bayesian-uniform"
- level (number or iterable of numbers) – confidence level or levels at which to evaluate error; default is erf(sqrt(0.5)) or 0.6827, otherwise known as "one sigma"
- recarray (bool) – if True (default), return results as a Numpy record array, which is rank-2 with named columns; if False, return a plain Numpy array, which is rank-N for N axes and has no column labels
- columns (bool) – if True (not default), return a 2-tuple in which the second argument is a list of column labels
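The default level quoted above, erf(sqrt(0.5)), can be checked with the standard library. The interval function below is the textbook normal (Wald) approximation, shown only to make "one sigma" concrete; it is an assumption that this resembles what the "normal" option computes, and normal_interval is a hypothetical helper, not part of histbook.

```python
import math

# Probability that a Gaussian variable lies within one standard deviation.
one_sigma = math.erf(math.sqrt(0.5))


def normal_interval(passed, total, z=1.0):
    """Textbook Wald interval for a binomial fraction at z standard deviations."""
    p = passed / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p - half_width, p + half_width


low, high = normal_interval(passed=30, total=100)
print(round(one_sigma, 4), round(low, 4), round(high, 4))
```

The other listed options are designed to behave better than this approximation when the fraction is close to 0 or 1 or the count is small.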
-
get(key, *default)
Get an item of attachment metadata.
If key isn't found and no default is specified, raise a KeyError. If key isn't found and a default is provided, return the default instead.
Only one default is allowed.
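The lookup rules above can be sketched with a plain dict standing in for the attachment metadata. get_toy is a hypothetical stand-alone function; in particular, raising TypeError when more than one default is passed is an assumption, since the text only says that a single default is allowed.

```python
def get_toy(attachment, key, *default):
    """Mimic the documented lookup rules on a plain dict."""
    if len(default) > 1:
        # Assumption: the source does not say which exception is raised here.
        raise TypeError("only one default is allowed")
    if key in attachment:
        return attachment[key]
    if default:
        return default[0]  # a default was provided, even if it is None
    raise KeyError(key)


meta = {"fit": {"chi2": 12.3}}
found = get_toy(meta, "fit")
fallback = get_toy(meta, "missing", None)
try:
    get_toy(meta, "missing")
    raised = False
except KeyError:
    raised = True
```

Using *default rather than default=None lets a caller pass None as a legitimate fallback value while still getting a KeyError when no fallback is given.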
-
classmethod group(by='source', **hists)
Combine histograms, maintaining their distinctiveness by adding a new categorical axis to each.
To combine histograms by adding bins, just use the + operator.
Parameters:
- by (string) – name of the new axis (must not already exist)
- **hists (Hist) – histograms to combine (must have the same axes)
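The difference between grouping and adding can be shown with lists of bin counts standing in for histograms. This is only a conceptual sketch: group_toy and add_toy are hypothetical helpers, and the dict layout is not how histbook stores a grouped histogram.

```python
def group_toy(by="source", **hists):
    """Keep each input separate under a new categorical axis named `by`."""
    if len({len(counts) for counts in hists.values()}) > 1:
        raise ValueError("histograms must have the same binning")
    return {"axis": by, "bins": {name: list(c) for name, c in hists.items()}}


def add_toy(a, b):
    """Merge two inputs bin by bin, losing track of where entries came from."""
    return [x + y for x, y in zip(a, b)]


signal = [1, 4, 2]
background = [5, 5, 5]

grouped = group_toy(by="source", signal=signal, background=background)
summed = add_toy(signal, background)
```

Grouping preserves one set of bins per category, so the categories can later be overlaid or stacked, whereas addition yields a single set of bins.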
-
groupkeys(axis)
Return all categorical keys associated with a groupby axis or non-zero bins associated with a groupbin axis.
Parameters:
- axis (Axis, algebraic expression (string), or index position (integer)) – the groupby or groupbin axis
Returns: all keys for this axis, even if that is a union over other group axes
Return type: set
-
has(key)
Returns True if key exists in the attachment metadata.
-
heatmap(xaxis=None, yaxis=None, profile=None, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None)
Display bins in xaxis and yaxis (if not the only two axes) as a heatmap.
Parameters:
- xaxis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the horizontal axis to overlay; if None (default), use the first of the only two axes in this Hist
- yaxis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the vertical axis to overlay; if None (default), use the second of the only two axes in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- width, height, title, config, xscale, yscale, colorscale – graphical directives to pass to Vega-Lite
-
line(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)
Display bins in axis (if not the only axis) as lines on the horizontal axis.
Parameters:
- axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- error (bool) – if True, overlay error bars
- normalized (bool) – if True, normalize the histogram
- width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
-
marker(axis=None, profile=None, error=True, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)
Display bins in axis (if not the only axis) as markers on the horizontal axis.
Parameters:
- axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- error (bool) – if True, overlay error bars
- normalized (bool) – if True, normalize the histogram
- width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
-
overlay(axis)
Display bins in axis overlaid on each other in different colors.
Parameters:
- axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay
Return type: PlottingChain
-
pandas(*axis, **opts)
Exports the data in the histogram to a Pandas DataFrame.
Parameters:
- *axis (Axis, algebraic expression (lambda or string), or index position (integer)) – axis or axes to include in the table; if no axes or all profile axes, this function calls Hist.table; if all cut axes, this function calls Hist.fraction
Keyword Arguments:
- **opts (any) – passed to Hist.table or Hist.fraction; see these methods for options
-
project(*axis)
Project onto a given set of axes.
Parameters:
- *axis (Axis, algebraic expression (string), or index position (integer)) – the axis or axes to keep (all profile axes are kept)
Returns: a histogram projected onto the selected axis or axes.
Return type: Hist
-
rebin(axis, edges)
Reduce the number of bins by combining existing bins at a specified set of edges.
Parameters:
- axis (Axis, algebraic expression (string), or index position (integer)) – the axis to rebin
- edges (iterable of numbers) – new bin edges; must be a subset of existing bin edges
Returns: a rebinned histogram.
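The requirement that the new edges be a subset of the existing ones follows from how bins are merged: each new bin is the sum of the old bins it spans. A minimal plain-Python sketch, where rebin_toy is a hypothetical helper and contents outside the new edges are simply dropped (an assumption made for brevity; the source does not say where such contents go):

```python
def rebin_toy(counts, edges, new_edges):
    """Merge adjacent bins so that only `new_edges` remain."""
    old = list(edges)
    for edge in new_edges:
        if edge not in old:
            raise ValueError("%r is not an existing bin edge" % (edge,))
    # Position of each new edge within the old edges.
    positions = [old.index(edge) for edge in new_edges]
    return [sum(counts[lo:hi]) for lo, hi in zip(positions[:-1], positions[1:])]


counts = [1, 2, 3, 4]
edges = [0, 1, 2, 3, 4]
merged = rebin_toy(counts, edges, new_edges=[0, 1, 4])

try:
    rebin_toy(counts, edges, new_edges=[0, 2.5, 4])
    rejected = False
except ValueError:
    rejected = True
```

An edge such as 2.5 would split an existing bin, and the counts inside that bin cannot be divided after the fact, so it has to be rejected.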
-
rebinby(axis, factor)
Reduce the number of bins by an approximate factor by combining existing bins.
Parameters:
- axis (Axis, algebraic expression (string), or index position (integer)) – the axis to rebin
- factor (positive integer) – number of bins to combine into a single bin (inexact if the number of bins in axis is not an exact multiple of factor)
Returns: a rebinned histogram.
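The word "approximate" matters when the number of bins is not a multiple of factor. The sketch below shows one possible convention, in which the final bin collects whatever is left over; this is an assumption for illustration, and rebinby_toy is a hypothetical helper that may not match how histbook treats the remainder.

```python
def rebinby_toy(counts, factor):
    """Sum consecutive groups of `factor` bins; the last group may be smaller."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


exact = rebinby_toy([1] * 12, 3)    # 12 bins divide evenly into 4 groups
inexact = rebinby_toy([1] * 10, 3)  # 10 bins leave a remainder of 1
```

With ten bins and a factor of three, the result has four bins rather than 10/3, and the last one covers a narrower range than the others.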
-
root(*axis, **opts)
Exports the data in the histogram to a ROOT histogram.
The histogram may need to be selected (Hist.select) or projected (Hist.project) to make it useful in a ROOT histogram.
Parameters:
- *axis (Axis, algebraic expression (lambda or string), or index position (integer)) – axis or axes to include in the output; if no axes, this function returns a histogram of counts; if one profile, this function returns a profile; …
Keyword Arguments:
- name (string) – name to give to the ROOT object (default is the empty string)
- title (string) – title to give to the ROOT object (default is the empty string)
- cache (dict-like object) – if supplied, the return value is inserted into cache keyed by its name; this is for convenience (ROOT objects must be kept in scope to be drawn)
-
select(expr, tolerance=1e-12)
Eliminate bins by selecting data with a boolean expr.
Parameters:
- expr (algebraic expression (string)) – boolean expression of data to keep; selection thresholds must align with bin edges with the right inequality (e.g. < vs <=)
- tolerance (small positive number) – absolute difference between selection threshold and bin edge to qualify as a match
Returns: a histogram with data removed (fewer bins)
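The tolerance parameter exists because bin edges computed in floating point rarely equal the literal written in a selection. A minimal sketch of the matching rule as described, where matches_edge is a hypothetical helper and the use of <= rather than < is an assumption:

```python
def matches_edge(threshold, edges, tolerance=1e-12):
    """True if `threshold` is within `tolerance` of any bin edge."""
    return any(abs(threshold - edge) <= tolerance for edge in edges)


# Ten bins from 0 to 1 with edges computed by multiplication.
edges = [i * 0.1 for i in range(11)]

exact_match = 0.3 in edges              # exact comparison fails: 3 * 0.1 != 0.3
close_match = matches_edge(0.3, edges)  # comparison within tolerance succeeds
off_edge = matches_edge(0.35, edges)    # mid-bin thresholds never match
```

A threshold such as 0.35 falls inside a bin, so no tolerance setting of reasonable size makes it a valid selection boundary.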
-
stack(axis, order=None)
Display bins in axis stacked on one another in an area plot.
Parameters:
- axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay
- order (iterable of strings) – stacking order of bins
-
step(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)
Display bins in axis (if not the only axis) as steps on the horizontal axis.
Parameters:
- axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
- profile (None, profile, algebraic expression (lambda or string), or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
- error (bool) – if True, overlay error bars
- normalized (bool) – if True, normalize the histogram
- width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
-
systematic(vector)
Returns a copy of this histogram with vector as its systematic (for fluent construction).
-
table(*profile, **opts)
Return histogram data as a table of counts and, optionally, dependent variables (profiles).
Parameters:
- *profile (profile) – the dependent variables to include in the table
Keyword Arguments:
- count (bool) – if True (default), include the (possibly weighted) count of entries in each bin
- effcount (bool) – if True (not default), include the effective count, which is used to convert between weighted profile errors and weighted profile spreads (equal to count for unweighted data)
- error (bool) – if True (default), include "errors" on all parameters (uncertainty in the mean of the distribution the count or profile average represents)
- normalized (bool) – if True (not default), scale each count and err(count) such that the sum over counts times bin widths is 1; does not affect profiles
- recarray (bool) – if True (default), return results as a Numpy record array, which is rank-2 with named columns; if False, return a plain Numpy array, which is rank-N for N axes and has no column labels
- columns (bool) – if True (not default), return a 2-tuple in which the second argument is a list of column labels
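The normalization condition, that the sum over counts times bin widths equals 1, can be made concrete as follows. This sketch takes one reading of that sentence, dividing each count by the total and by its own bin width; it is an assumption that histbook scales this way for unequal bin widths, and normalize_toy is a hypothetical helper.

```python
def normalize_toy(counts, edges):
    """Scale counts so that sum(count * width) == 1 (density normalization)."""
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    total = float(sum(counts))
    density = [c / (total * w) for c, w in zip(counts, widths)]
    return density, widths


counts = [2.0, 6.0, 2.0]
edges = [0.0, 1.0, 3.0, 4.0]  # the middle bin is twice as wide as the others

density, widths = normalize_toy(counts, edges)
integral = sum(d * w for d, w in zip(density, widths))
```

For equal-width bins this reduces to dividing every count by the same constant, so the shape of the distribution is unchanged.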
-
togroup()
Add histograms to the groupby that is the first axis.
Histograms created with Hist.group have a first axis that is a groupby.
Keyword Arguments:
- **hists (dict of str → Hist) – histograms to add to the existing group
-
weight(expr)
Returns a copy of this histogram with expr as its weights (for fluent construction).