Books of histograms

Histograms can be collected into “books,” both as a user convenience (fill all histograms in a book with a single call to fill) and for performance (avoid multiple passes over the data or repeated calculations).

Books of histograms behave like dicts, with access to individual histograms through square brackets (__getitem__ and __setitem__).

class histbook.book.Book(hists1={}, *hists2, **hists3)

A collection of histograms (Hist) or other Books that can be filled with a single fill call.

Behaves like a dict (item assignment, keys, values).

Positional arguments may be a dict of str → Hist or GenericBook, or they may be Hist or GenericBook objects passed as unnamed varargs.

In either case, keyword arguments of the form name → Hist or Book are also accepted.

allitems(onlyhist=False)

Return a recursive list of path, book-or-histogram pairs.

Parameters:
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books
allkeys(onlyhist=False)

Return a recursive list of paths.

Parameters:
  • onlyhist (bool) – if True (not default), only return names of histograms (type Hist), not books
allvalues(onlyhist=False)

Return a recursive list of books and histograms.

Parameters:
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books
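The recursive listing and the onlyhist filter can be illustrated with a small sketch. It uses plain nested dicts in place of books and a hypothetical ToyHist class in place of Hist. Joining nested names with "/" is an assumption made for this illustration, not something this page states.

```python
class ToyHist:
    """Hypothetical stand-in for a histogram (not histbook's Hist)."""


def allitems(book, onlyhist=False, prefix=""):
    """Return a recursive list of (path, book-or-histogram) pairs."""
    pairs = []
    for name, value in book.items():
        path = prefix + name
        if isinstance(value, dict):
            # A nested dict plays the role of a book inside a book.
            if not onlyhist:
                pairs.append((path, value))
            # The "/" separator is an assumption of this sketch.
            pairs.extend(allitems(value, onlyhist, path + "/"))
        else:
            pairs.append((path, value))
    return pairs


outer = {"pt": ToyHist(), "muons": {"eta": ToyHist(), "phi": ToyHist()}}

paths_all = [path for path, _ in allitems(outer)]
paths_hist = [path for path, _ in allitems(outer, onlyhist=True)]
```

In this sketch, allkeys and allvalues would simply be the first and second elements of each pair.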
assertcompatible()

Raises ValueError unless all books have the same set of histogram names and the histograms with matching names are compatible.

attach(key, value)

Add an attachment to the book (changing it in-place and returning it).

attachment

Python dict of attachment metadata (linked, not a copy).

clear()

Effectively reset all bins of all histograms to zero.

cleared()

Return a copy with all bins of all histograms set to zero.

compatible(other)

Returns True if the books have the same set of histogram names and those histograms with matching names are compatible.
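A minimal sketch of the rule described by compatible and assertcompatible, using plain dicts as toy books. What makes two histograms "compatible" is not defined on this page; the sketch assumes it means having identical axes, and the dict layout is hypothetical.

```python
def compatible(book, other):
    """True if both toy books have the same names and matching axes per name."""
    if set(book) != set(other):
        return False
    # Assumption: two toy histograms are compatible when their axes match.
    return all(book[name]["axes"] == other[name]["axes"] for name in book)


def assertcompatible(*books):
    """Raise ValueError unless every toy book is compatible with the first."""
    for other in books[1:]:
        if not compatible(books[0], other):
            raise ValueError("books are not compatible")


run1 = {"pt": {"axes": ("x",)}, "eta": {"axes": ("y",)}}
run2 = {"pt": {"axes": ("x",)}, "eta": {"axes": ("y",)}}
run3 = {"pt": {"axes": ("x", "z")}, "eta": {"axes": ("y",)}}

same = compatible(run1, run2)
different = compatible(run1, run3)
```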

copy()

Return an immediate copy of the book of histograms.

copyonfill()

Return a copy of the book of histograms whose content is copied if filled.
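The difference between copy and copyonfill can be pictured as copy-on-write: the copy shares bin contents with the original until one of them is filled. The sketch below shows that general idea with a hypothetical LazyCounts class. It is not histbook's implementation.

```python
class LazyCounts:
    """Hypothetical bin storage that shares its list until first written."""

    def __init__(self, counts, shared=False):
        self._counts = counts
        self._shared = shared

    def copyonfill(self):
        # Original and copy now share one list, so both are marked shared.
        self._shared = True
        return LazyCounts(self._counts, shared=True)

    def fill(self, index, weight=1):
        if self._shared:
            # The real copy is deferred until this first write.
            self._counts = list(self._counts)
            self._shared = False
        self._counts[index] += weight

    def counts(self):
        return list(self._counts)


original = LazyCounts([0, 0, 0])
lazy = original.copyonfill()
shared_before_fill = lazy._counts is original._counts

lazy.fill(1)
shared_after_fill = lazy._counts is original._counts
```

The benefit of this design is that making the copy costs almost nothing when the copy is never filled.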

detach(key)

Remove an attachment from the book (changing it in-place and returning it).
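Because attach and detach change the book in place and return it, calls can be chained. The following sketch mimics that behavior with a hypothetical ToyBook class. Only the method names and the attachment dict come from this page; the bodies are assumptions.

```python
class ToyBook:
    """Hypothetical stand-in that models only the attachment metadata."""

    def __init__(self):
        self.attachment = {}

    def attach(self, key, value):
        self.attachment[key] = value
        return self  # returning self is what allows chaining

    def detach(self, key):
        del self.attachment[key]
        return self

    def has(self, key):
        return key in self.attachment


toy = ToyBook()
result = toy.attach("run", 7).attach("note", "test beam").detach("note")
```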

fields

Names of fields that must be provided in the fill method.

fill(arrays=None, **more)

Fill the histograms: identify bins for independent variables, increase their counts by 1 or weight, and increment any profile (dependent variable) means and errors in the means.

All arrays must have the same length (one-dimensional shape). Numbers are treated as one-element arrays.

All histograms in the book are filled with the same inputs.

Parameters:
  • arrays (dict → Numpy array or number; Spark DataFrame; Pandas DataFrame) – field values to use in the calculation of independent and dependent variables (axes)
  • **more (Numpy arrays or numbers) – more field values
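The fill contract (one set of inputs, every histogram filled, equal-length arrays, bare numbers treated as one-element arrays) can be sketched with plain lists. The make_hist and fill_all helpers and the dict layout are hypothetical. Real histograms also handle weights and profiles, which this sketch omits.

```python
def make_hist(field, nbins, low, high):
    """Hypothetical toy histogram: regular bins over [low, high)."""
    return {"field": field, "low": low, "high": high, "counts": [0] * nbins}


def fill_all(hists, arrays=None, **more):
    """Fill every toy histogram from the same named input columns."""
    fields = dict(arrays or {})
    fields.update(more)

    columns = {}
    for name, values in fields.items():
        # A bare number is treated as a one-element array.
        columns[name] = list(values) if isinstance(values, (list, tuple)) else [values]

    if len({len(column) for column in columns.values()}) > 1:
        raise ValueError("all arrays must have the same length")

    for hist in hists.values():
        nbins = len(hist["counts"])
        width = (hist["high"] - hist["low"]) / nbins
        for x in columns[hist["field"]]:
            index = int((x - hist["low"]) // width)
            if 0 <= index < nbins:  # out-of-range values are dropped here
                hist["counts"][index] += 1


toy_book = {
    "coarse": make_hist("x", 2, 0.0, 4.0),
    "fine": make_hist("x", 4, 0.0, 4.0),
    "other": make_hist("y", 2, 0.0, 10.0),
}
fill_all(toy_book, {"x": [0.5, 1.5, 3.5]}, y=[1.0, 6.0, 9.0])
```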
classmethod fromdicts(content, attachment)

Construct a book from its content and attachment dicts.

get(key, *default)

Get an item of attachment metadata.

If key isn’t found and no default is specified, raise a KeyError. If key isn’t found and a default is provided, return the default instead.

Only one default is allowed.
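The *default signature lets the method tell "no default given" apart from "default is None". A minimal sketch of that pattern as a free function over a plain dict follows. Raising TypeError for more than one default is an assumption, since this page does not name the exception.

```python
def get(attachment, key, *default):
    """Look up key, falling back to at most one optional default."""
    if len(default) > 1:
        # Assumption: the exception type is not specified on this page.
        raise TypeError("only one default is allowed")
    if key in attachment:
        return attachment[key]
    if default:
        return default[0]
    raise KeyError(key)


metadata = {"run": 7}
found = get(metadata, "run")
fallback = get(metadata, "missing", None)
```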

classmethod group(by='source', **books)

Combine histograms, maintaining their distinctiveness by adding a new categorical axis to each.

To combine histograms by adding bins, just use the + operator.

Parameters:
  • by (string) – name of the new axis (must not already exist)
  • **books (Book) – books to combine (histograms with the same names must have the same axes)
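Grouping can be pictured as tagging every bin with the keyword name of the book it came from, so that contents stay separate instead of being summed. The sketch below uses hypothetical toy histograms (a tuple of axis names plus a dict from bin keys to counts). It illustrates the idea only and is not histbook's implementation.

```python
def group(by="source", **books):
    """Combine toy books, adding a categorical axis named `by`."""
    combined = {}
    for source, book in books.items():
        for name, hist in book.items():
            if by in hist["axes"]:
                raise ValueError("axis %r already exists in %r" % (by, name))
            new_axes = hist["axes"] + (by,)
            target = combined.setdefault(name, {"axes": new_axes, "counts": {}})
            if target["axes"] != new_axes:
                raise ValueError("histograms named %r have different axes" % name)
            for bin_key, count in hist["counts"].items():
                # The source name becomes the coordinate on the new axis.
                target["counts"][bin_key + (source,)] = count
    return combined


data = {"pt": {"axes": ("x",), "counts": {(0,): 3, (1,): 5}}}
mc = {"pt": {"axes": ("x",), "counts": {(0,): 2, (1,): 7}}}

grouped = group(by="source", data=data, mc=mc)
```

Adding the two toy histograms bin by bin would instead give counts of 5 and 12 with no record of where they came from, which is the distinction between group and the + operator.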
has(key)

Returns True if key exists in the attachment metadata.

items(recursive=False, onlyhist=False)

Return a list of path, book-or-histogram pairs.

Parameters:
  • recursive (bool) – if True (not default), descend into books of books
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books
iteritems(recursive=False, onlyhist=False)

Iterate through path, book-or-histogram pairs.

Parameters:
  • recursive (bool) – if True (not default), descend into books of books
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books
iterkeys(recursive=False, onlyhist=False)

Iterate through paths.

Parameters:
  • recursive (bool) – if True (not default), descend into books of books
  • onlyhist (bool) – if True (not default), only return names of histograms (type Hist), not books
itervalues(recursive=False, onlyhist=False)

Iterate through books and histograms.

Parameters:
  • recursive (bool) – if True (not default), descend into books of books
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books
keys(recursive=False, onlyhist=False)

Return a list of paths.

Parameters:
  • recursive (bool) – if True (not default), descend into books of books
  • onlyhist (bool) – if True (not default), only return names of histograms (type Hist), not books
pop(k[, d]) → v

Remove the specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v)

Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

setdefault(k[, d]) → D.get(k, d)

Also set D[k] = d if k not in D.

update([E, ]**F) → None

Update D from mapping/iterable E and F.

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v
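The pop, popitem, setdefault, and update descriptions above follow Python's standard mapping semantics. The sketch below demonstrates them on a plain dict standing in for D. Integer values replace histograms to keep the example short.

```python
d = {"a": 1}

d.update({"b": 2})    # E has a .keys() method: copied key by key
d.update([("c", 3)])  # E lacks .keys(): treated as (key, value) pairs
d.update(e=5)         # keyword arguments F are applied last

popped = d.pop("a")             # removes "a" and returns its value
missing = d.pop("zzz", None)    # key absent, so the default is returned
existing = d.setdefault("b", 99)  # "b" exists, so its value is kept
created = d.setdefault("f", 6)    # "f" is absent, so it is set to 6

last_key, last_value = d.popitem()  # removes and returns one pair
```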

values(recursive=False, onlyhist=False)

Return a list of books and histograms.

Parameters:
  • recursive (bool) – if True (default), descend into books of books
  • onlyhist (bool) – if True (not default), only return histograms (type Hist), not books