Histograms

histbook has a single histogram class, which can have arbitrarily many independent dimensions (binned axes) and dependent dimensions (profiles). Plottable one- and two-dimensional histograms are derived by projection.

class histbook.hist.Hist(*axis, **opts)
Parameters:

*axis (Axis) – axis or axes that define the independent and dependent variables of the histogram

Keyword Arguments:
 
  • weight (None, algebraic expression (string) or number) – if None (default), data will be filled with weight 1; if an expression, weights are computed from the expression; if a number, weights are constant
  • filter (None, algebraic expression (string)) – if not None, data will be filtered through this logical expression (equivalent to multiplying weight by where(filter, 1, 0))
  • defs (None or dict of str → algebraic expression (string) or Expr) – if not None, definitions to use when computing expressions
  • fill (None, single Numpy array or dict of str → Numpy arrays) – if not None, data to immediately fill after constructing the histogram; single Numpy array is only permitted if there’s only one field
  • attachment (None or dict of str → any JSON) – histogram metadata, such as fit results or other context
  • systematic (None, tuple of numbers) – the systematic error vector this histogram represents; a special case of attachment (and stored in attachment)
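The stated equivalence between filter and weight can be sketched in plain Python. This is an illustration of the rule only, not histbook code: effective_weights is a hypothetical helper, and the boolean list stands in for an evaluated filter expression.

```python
def effective_weights(weights, passes):
    """Apply a filter by multiplying each weight by 1 (kept) or 0 (dropped)."""
    return [w * (1 if keep else 0) for w, keep in zip(weights, passes)]

x = [0.5, 1.5, 2.5, 3.5]
weights = [2.0, 2.0, 2.0, 2.0]          # a constant weight of 2 for every entry
passes = [value > 1.0 for value in x]   # stands in for the filter expression "x > 1"

filtered = effective_weights(weights, passes)
total = sum(filtered)                   # only the three passing entries contribute
```
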
COUNTTYPE

alias of numpy.float64

area(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)

Display bins in axis (if not the only axis) as areas on the horizontal axis.

Parameters:
  • axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • error (bool) – if True, overlay error bars
  • normalized (bool) – if True, normalize the histogram
  • width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
Return type: Plotable1d

attach(key, value)

Add an attachment to the histogram (changing it in-place and returning it).

attachment

Python dict of attachment metadata (linked, not a copy).

axis

The axes that define a histogram’s binning of space.

bar(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)

Display bins in axis (if not the only axis) as bars on the horizontal axis.

Parameters:
  • axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • error (bool) – if True, overlay error bars
  • normalized (bool) – if True, normalize the histogram
  • width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
Return type: Plotable1d

below(axis)

Display bins in axis next to each other vertically.

Parameters: axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis whose bins are displayed one below another
Return type: PlottingChain

beside(axis)

Display bins in axis next to each other horizontally.

Parameters: axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis whose bins are displayed side by side
Return type: PlottingChain

clear()

Effectively reset all bins to zero.

cleared()

Return a copy with all bins set to zero.

compatible(other)

Returns True if the histograms have the same non-profile axis types and binning, regardless of the expressions used to compute them.

copy()

Return an immediate copy of the histogram.

copyonfill()

Return a copy of the histogram whose content is copied if filled.

defs

Definitions used by axis expressions.

detach(key)

Remove an attachment from the histogram (changing it in-place and returning it).

drop(*profile)

Remove one or more profile axes.

Parameters: *profile (profile, algebraic expression (string), or index position (integer)) – the axis or axes to drop
Returns: a histogram without the selected profiles.
Return type: Hist

fields

Names of fields that must be provided in the fill method.

fill(arrays=None, **more)

Fill the histogram: identify bins for independent variables, increase their counts by 1 or weight, and increment any profile (dependent variable) means and errors in the means.

All arrays must have the same length (one-dimensional shape). Numbers are treated as one-element arrays.

Parameters:
  • arrays (dict of str → Numpy array or number; Spark DataFrame; Pandas DataFrame) – field values to use in the calculation of independent and dependent variables (axes)
  • **more (Numpy arrays or numbers) – more field values
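The fill procedure described above can be sketched for a single binned axis and one profile. This is not the histbook implementation: fill_1d is a hypothetical helper, bins are assumed to be half-open intervals [low, high), out-of-range entries are simply skipped, and the errors on the profile means are omitted.

```python
import bisect
import math

def fill_1d(edges, xs, ys, weights=None):
    """Fill weighted counts and per-bin weighted means of y.

    edges must be sorted; bin i covers [edges[i], edges[i + 1]).
    Returns (counts, means); empty bins have a mean of nan.
    """
    nbins = len(edges) - 1
    if weights is None:
        weights = [1.0] * len(xs)        # default: every entry has weight 1
    counts = [0.0] * nbins
    sums = [0.0] * nbins
    for x, y, w in zip(xs, ys, weights):
        if not (edges[0] <= x < edges[-1]):
            continue                     # simplification: drop out-of-range entries
        index = bisect.bisect_right(edges, x) - 1
        counts[index] += w               # independent variable: increase the count
        sums[index] += w * y             # dependent variable: accumulate for the mean
    means = [s / c if c > 0 else math.nan for s, c in zip(sums, counts)]
    return counts, means

counts, means = fill_1d(
    edges=[0, 1, 2, 3],
    xs=[0.5, 1.5, 1.7, 5.0],
    ys=[10.0, 20.0, 40.0, 99.0],
    weights=[1.0, 1.0, 3.0, 1.0],
)
```

All input sequences have the same length, matching the requirement stated for fill.
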
filter(expr)

Returns a copy of this histogram with expr as filter (for fluent construction).

fraction(*cut, **opts)

Return a table of the fraction of entries that pass a set of cuts in each bin.

Parameters:

*cut (profile) – the cut axis or axes to include in the table

Keyword Arguments:
 
  • count (bool) – if True (default), include the (possibly weighted) count of entries in each bin (denominator of the fraction)
  • error (string or None) – if not None, include “errors” on all parameters (uncertainty in the mean of the distribution the count or fraction represents); options are "clopper-pearson", "normal" (default), "wilson", "agresti-coull", "feldman-cousins", "jeffrey", "bayesian-uniform"
  • level (number or iterable of numbers) – confidence level or levels at which to evaluate error; default is erf(sqrt(0.5)) or 0.6827, otherwise known as “one sigma”
  • recarray (bool) – if True (default), return results as a Numpy record array, which is rank-2 with named columns; if False, return a plain Numpy array, which is rank-N for N axes and has no column labels.
  • columns (bool) – if True (not default), return a 2-tuple in which the second argument is a list of column labels.
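The default level and one of the listed interval types can be checked with the standard library. The sketch below computes erf(sqrt(0.5)) and a textbook two-sided Wilson score interval for an unweighted fraction; wilson_interval is a hypothetical helper, and whether histbook's "wilson" option uses exactly this form (particularly for weighted counts) is an assumption.

```python
import math
from statistics import NormalDist

# Default confidence level quoted above: erf(sqrt(0.5)), i.e. "one sigma".
level = math.erf(math.sqrt(0.5))

def wilson_interval(passed, total, level):
    """Two-sided Wilson score interval for the binomial fraction passed/total."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # number of standard deviations
    p = passed / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half

# 30 of 40 entries in a bin pass the cut.
low, high = wilson_interval(passed=30, total=40, level=level)
```
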
get(key, *default)

Get an item of attachment metadata.

If key isn’t found and no default is specified, raise a KeyError. If key isn’t found and a default is provided, return the default instead.

Only one default is allowed.
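The lookup rules above can be sketched as a free function over a plain dict. This is a hypothetical stand-in for the method, and raising TypeError when more than one default is passed is an assumption about the error type.

```python
def get(attachment, key, *default):
    """Look up key in an attachment dict, with at most one optional default."""
    if len(default) > 1:
        # Assumed error type; the text only says a single default is allowed.
        raise TypeError("get expected at most one default, got {0}".format(len(default)))
    if key in attachment:
        return attachment[key]
    if default:
        return default[0]
    raise KeyError(key)

attachment = {"fit": {"chi2": 1.2}}
found = get(attachment, "fit")               # key present: its value is returned
fallback = get(attachment, "missing", None)  # key absent: the default is returned
```

Using *default rather than default=None lets None itself be a valid default while still raising KeyError when no default is given.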

classmethod group(by='source', **hists)

Combine histograms, maintaining their distinctiveness by adding a new categorical axis to each.

To combine histograms by adding bins, just use the + operator.

Parameters:
  • by (string) – name of the new axis (must not already exist)
  • **hists (Hist) – histograms to combine (must have the same axes)
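The difference between grouping and adding can be illustrated with plain lists of bin counts. This is a conceptual sketch only; the dict below stands in for the new categorical axis and is not how histbook stores its content.

```python
signal = [1.0, 4.0, 2.0]
background = [3.0, 3.0, 3.0]

# Adding bins (the + operator): the two sources can no longer be told apart.
summed = [s + b for s, b in zip(signal, background)]

# Grouping: a new categorical axis (named "source" by default) keeps each input distinct.
grouped = {"signal": list(signal), "background": list(background)}

# In this sketch, summing over the categorical axis recovers the bin-by-bin sum.
recovered = [sum(column) for column in zip(*grouped.values())]
```
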
groupkeys(axis)

Return all categorical keys associated with a groupby axis or non-zero bins associated with a groupbin axis.

Parameters: axis (Axis, algebraic expression (string), or index position (integer)) – the groupby or groupbin axis
Returns: all keys for this axis, even if that is a union over other group axes
Return type: set

has(key)

Returns True if key exists in the attachment metadata.

heatmap(xaxis=None, yaxis=None, profile=None, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None)

Display bins in xaxis and yaxis (if not the only two axes) as a heatmap.

Parameters:
  • xaxis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the horizontal axis to overlay; if None (default), use the first of the only two axes in this Hist
  • yaxis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the vertical axis to overlay; if None (default), use the second of the only two axes in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • width, height, title, config, xscale, yscale, colorscale – graphical directives to pass to Vega-Lite
Return type: Plotable2d

line(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)

Display bins in axis (if not the only axis) as lines on the horizontal axis.

Parameters:
  • axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • error (bool) – if True, overlay error bars
  • normalized (bool) – if True, normalize the histogram
  • width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
Return type: Plotable1d

marker(axis=None, profile=None, error=True, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)

Display bins in axis (if not the only axis) as markers on the horizontal axis.

Parameters:
  • axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • error (bool) – if True, overlay error bars
  • normalized (bool) – if True, normalize the histogram
  • width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
Return type: Plotable1d

overlay(axis)

Display bins in axis overlaid on each other in different colors.

Parameters: axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay
Return type: PlottingChain

pandas(*axis, **opts)

Exports the data in the histogram to a Pandas DataFrame.

Parameters: *axis (Axis, algebraic expression (lambda or string), or index position (integer)) – axis or axes to include in the table; if no axes or all profile axes, this function calls Hist.table; if all cut axes, this function calls Hist.fraction
Keyword Arguments: **opts (any) – passed to Hist.table or Hist.fraction; see these methods for options

project(*axis)

Project onto a given set of axes.

Parameters: *axis (Axis, algebraic expression (string), or index position (integer)) – the axis or axes to keep (all profile axes are kept)
Returns: a histogram projected onto the selected axis or axes.
Return type: Hist

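Projection amounts to summing the bin contents over every axis that is not kept. A minimal sketch with a two-axis table of counts held in nested lists (an illustrative layout, not histbook's internal storage):

```python
# counts[i][j]: bin i of the x axis, bin j of the y axis
counts = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Keep x: sum over the y bins within each x bin.
onto_x = [sum(row) for row in counts]

# Keep y: sum over the x bins within each y bin.
onto_y = [sum(column) for column in zip(*counts)]
```

Both projections preserve the total number of entries.
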
rebin(axis, edges)

Reduce the number of bins by combining existing bins at a specified set of edges.

Parameters:
  • axis (Axis, algebraic expression (string), or index position (integer)) – the axis to rebin
  • edges (iterable of numbers) – new bin edges; must be a subset of existing bin edges
Returns: a rebinned histogram.
Return type: Hist
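Rebinning at a subset of the existing edges can be sketched as summing the counts between consecutive new edges. The function below is a hypothetical free-standing helper over plain lists, not the Hist method, and it simply drops any bins outside the new range.

```python
def rebin(edges, counts, new_edges):
    """Merge adjacent bins so that the result has bin edges new_edges.

    new_edges must be an increasing subset of edges; bins outside
    [new_edges[0], new_edges[-1]) are dropped in this sketch.
    """
    missing = [e for e in new_edges if e not in edges]
    if missing:
        raise ValueError("not existing bin edges: {0}".format(missing))
    positions = [edges.index(e) for e in new_edges]
    return [sum(counts[lo:hi]) for lo, hi in zip(positions[:-1], positions[1:])]

edges = [0, 1, 2, 3, 4, 5, 6]
counts = [5.0, 1.0, 2.0, 3.0, 4.0, 6.0]
merged = rebin(edges, counts, [0, 2, 3, 6])   # bins [0, 2), [2, 3), [3, 6)
```
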

rebinby(axis, factor)

Reduce the number of bins by an approximate factor by combining existing bins.

Parameters:
  • axis (Axis, algebraic expression (string), or index position (integer)) – the axis to rebin
  • factor (positive integer) – number of bins to combine into a single bin (inexact if the number of bins in axis is not an exact multiple of factor)
Returns: a rebinned histogram.
Return type: Hist
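Combining bins by a factor can be sketched as summing consecutive groups of counts. How histbook treats leftover bins when the bin count is not a multiple of factor is not stated here; this hypothetical helper assumes the final group is simply smaller.

```python
def rebinby(counts, factor):
    """Combine every `factor` adjacent bins; the last group may be smaller."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

# 6 bins combined in pairs: an exact division into 3 bins.
exact = rebinby([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2)

# 7 bins combined in pairs: the last output bin holds a single input bin (assumed behavior).
inexact = rebinby([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], 2)
```
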

root(*axis, **opts)

Exports the data in the histogram to a ROOT histogram.

The histogram may need to be selected (Hist.select) or projected (Hist.project) to make it useful in a ROOT histogram.

Parameters:

*axis (Axis, algebraic expression (lambda or string), or index position (integer)) – axis or axes to include in the output; if no axes, this function returns a histogram of counts; if one profile, this function returns a profile; …

Keyword Arguments:
 
  • name (string) – name to give to the ROOT object (default is the empty string)
  • title (string) – title to give to the ROOT object (default is the empty string)
  • cache (dict-like object) – if supplied, the return value is inserted into cache keyed by its name; this is for convenience (ROOT objects must be kept in scope to be drawn)
select(expr, tolerance=1e-12)

Eliminate bins by selecting data with a boolean expr.

Parameters:
  • expr (algebraic expression (string)) – boolean expression of data to keep; selection thresholds must align with bin edges with the right inequality (e.g. < vs <=)
  • tolerance (small positive number) – absolute difference between selection threshold and bin edge to qualify as a match
Returns: a histogram with data removed (fewer bins)
Return type: Hist
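The role of tolerance can be sketched for a cut of the form "x < threshold". The helper below is hypothetical and assumes bins that include their low edge and exclude their high edge, so a strict < at a bin edge removes whole bins without splitting any.

```python
def select_below(edges, counts, threshold, tolerance=1e-12):
    """Keep the bins entirely below threshold (a cut like "x < threshold").

    The threshold must coincide with an existing bin edge to within
    tolerance (absolute difference); otherwise the cut would split a bin.
    """
    for position, edge in enumerate(edges):
        if abs(edge - threshold) <= tolerance:
            return edges[:position + 1], counts[:position]
    raise ValueError("threshold {0} does not match any bin edge".format(threshold))

edges = [0.0, 0.1, 0.2, 0.3, 0.4]
counts = [7.0, 8.0, 9.0, 10.0]

# 0.1 + 0.2 is not exactly 0.3 in floating point, but it matches within tolerance.
kept_edges, kept_counts = select_below(edges, counts, 0.1 + 0.2)
```
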

shape

Shape of the Numpy array defining the content of the fixed-memory axes (FixedAxis) only.

stack(axis, order=None)

Display bins in axis stacked on one another in an area plot.

Parameters:
  • axis (Axis, algebraic expression (lambda or string), or index position (integer)) – the axis whose bins are stacked
  • order (iterable of strings) – stacking order of bins
Return type: PlottingChain

step(axis=None, profile=None, error=False, normalized=False, width=None, height=None, title=None, config=None, xscale=None, yscale=None, colorscale=None, shapescale=None)

Display bins in axis (if not the only axis) as steps on the horizontal axis.

Parameters:
  • axis (None, Axis, algebraic expression (lambda or string), or index position (integer)) – the axis to overlay; if None (default), use the only axis in this Hist
  • profile (None, profile, algebraic expression (lambda or string) or index position (integer)) – if None (default), display bin counts; otherwise, display profile means (and errors on the mean)
  • error (bool) – if True, overlay error bars
  • normalized (bool) – if True, normalize the histogram
  • width, height, title, config, xscale, yscale, colorscale, shapescale – graphical directives to pass to Vega-Lite
Return type: Plotable1d

systematic(vector)

Returns a copy of this histogram with vector as systematic (for fluent construction).

table(*profile, **opts)

Return histogram data as a table of counts and, optionally, dependent variables (profiles).

Parameters:

*profile (profile) – the dependent variables to include in the table

Keyword Arguments:
 
  • count (bool) – if True (default), include the (possibly weighted) count of entries in each bin
  • effcount (bool) – if True (not default), include the effective count, which is used to convert between weighted profile errors and weighted profile spreads (equal to count for unweighted data)
  • error (bool) – if True (default), include “errors” on all parameters (uncertainty in the mean of the distribution the count or profile average represents)
  • normalized (bool) – if True (not default), scale each count and err(count) such that the sum over counts times bin widths is 1; does not affect profiles
  • recarray (bool) – if True (default), return results as a Numpy record array, which is rank-2 with named columns; if False, return a plain Numpy array, which is rank-N for N axes and has no column labels.
  • columns (bool) – if True (not default), return a 2-tuple in which the second argument is a list of column labels.
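Two of the options above reduce to short formulas. The sketch below normalizes counts so that the sum of counts times bin widths is 1, and computes an effective count using the conventional definition (sum of weights)² / (sum of squared weights). That this is the exact definition histbook uses is an assumption, though it does reproduce the stated property of equaling the count for unweighted data.

```python
edges = [0.0, 1.0, 2.0, 4.0]            # bin widths 1, 1 and 2
counts = [2.0, 6.0, 4.0]

widths = [high - low for low, high in zip(edges[:-1], edges[1:])]
integral = sum(c * w for c, w in zip(counts, widths))
normalized = [c / integral for c in counts]

def effective_count(weights):
    """(sum of weights)^2 / (sum of squared weights); 0 for no entries."""
    sumw = sum(weights)
    sumw2 = sum(w * w for w in weights)
    return sumw * sumw / sumw2 if sumw2 > 0 else 0.0

unweighted = effective_count([1.0] * 5)       # equals the plain count
weighted = effective_count([1.0, 1.0, 2.0])   # smaller than the number of entries
```
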
togroup()

Add histograms to the groupby that is the first axis.

Histograms created with Hist.group have a first axis that is a groupby.

Keyword Arguments:
 **hists (dict of str → Hist) – histograms to add to the existing group
weight(expr)

Returns a copy of this histogram with expr as weights (for fluent construction).