covasim.analysis module

Additional analysis functions that are not part of the core Covasim workflow, but which are useful for particular investigations. The module contains the analyzers (snapshot and age_histogram), the Fit class for comparing the model to data, and the transmission tree (TransTree).

class covasim.analysis.Analyzer(label=None)

Bases: sciris.sc_utils.prettyobj

Base class for analyzers. Based on the Intervention class.

Parameters

label (str) – a label for the Analyzer (used for ease of identification)

initialize(sim)

Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters

sim – the Sim instance
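The initialize/apply pattern described above can be illustrated without Covasim itself. The following is a minimal sketch, not Covasim's implementation: the Analyzer class here is a stand-in for the real base class, and InfectiousCounter, ToySim, and the n_infectious attribute are hypothetical names invented for the illustration.

```python
class Analyzer:
    """Stand-in for the base class described above (not Covasim's own code)."""

    def __init__(self, label=None):
        self.label = label
        self.initialized = False

    def initialize(self, sim):
        # One-off setup before the run, e.g. converting dates to day indices
        self.initialized = True

    def apply(self, sim):
        # Called once per time step; subclasses override this
        raise NotImplementedError


class InfectiousCounter(Analyzer):
    """Hypothetical analyzer that records one number per time step."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.counts = []

    def apply(self, sim):
        # The analyzer reads from the sim and stores results in itself
        self.counts.append(sim.n_infectious)


class ToySim:
    """Hypothetical sim whose infectious count doubles at every step."""

    def __init__(self, n_days, analyzers):
        self.t = 0
        self.n_days = n_days
        self.n_infectious = 1
        self.analyzers = analyzers

    def run(self):
        for analyzer in self.analyzers:
            analyzer.initialize(self)
        for t in range(self.n_days):
            self.t = t
            for analyzer in self.analyzers:
                analyzer.apply(self)
            self.n_infectious *= 2


counter = InfectiousCounter(label='infectious')
ToySim(n_days=4, analyzers=[counter]).run()
print(counter.label, counter.counts)
```

In real use, the subclass would inherit from covasim.analysis.Analyzer and be passed to the sim through its analyzers argument, as in the snapshot example in this module.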

class covasim.analysis.snapshot(days, *args, die=True, **kwargs)

Bases: covasim.analysis.Analyzer

Analyzer that takes a “snapshot” of the sim.people array at specified points in time, and saves them to itself. To retrieve them, you can either access the dictionary directly, or use the get() method.

Parameters
  • days (list) – list of ints/strings/date objects, the days on which to take the snapshot

  • args (list) – additional day(s)

  • die (bool) – whether or not to raise an exception if a date is not found (default true)

  • kwargs (dict) – passed to Analyzer()

Example:

sim = cv.Sim(analyzers=cv.snapshot('2020-04-04', '2020-04-14'))
sim.run()
snapshot = sim['analyzers'][0]
people = snapshot.snapshots[0]            # Option 1
people = snapshot.snapshots['2020-04-04'] # Option 2
people = snapshot.get('2020-04-14')       # Option 3
people = snapshot.get(34)                 # Option 4
people = snapshot.get()                   # Option 5

initialize(sim)

Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters

sim – the Sim instance

get(key=None)

Retrieve a snapshot from the given key (int, str, or date)
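Because get() accepts an int, a string, or a date, the key has to be normalized before lookup. The sketch below shows one way this could work using only the standard library. It is an illustration rather than Covasim's code: to_date_key is a hypothetical helper, the start date of 2020-03-01 is an assumption (chosen so that day 34 matches '2020-04-04' as in the example above), and returning the first snapshot when no key is given is also an assumption.

```python
import datetime as dt

START_DAY = dt.date(2020, 3, 1)  # assumed simulation start date


def to_date_key(key, start_day=START_DAY):
    """Map an int day, ISO date string, or date object to a 'YYYY-MM-DD' string."""
    if isinstance(key, int):
        return (start_day + dt.timedelta(days=key)).isoformat()
    if isinstance(key, dt.date):
        return key.isoformat()
    return dt.date.fromisoformat(key).isoformat()  # also validates the string


# Placeholder snapshots keyed by date string
snapshots = {
    '2020-04-04': 'people on 4 April',
    '2020-04-14': 'people on 14 April',
}


def get(key=None):
    """Return the snapshot for the key, or the first one if no key is given."""
    if key is None:
        return next(iter(snapshots.values()))
    return snapshots[to_date_key(key)]


print(to_date_key(34))
print(get('2020-04-14'))
```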

class covasim.analysis.age_histogram(days=None, states=None, edges=None, datafile=None, sim=None, die=True, **kwargs)

Bases: covasim.analysis.Analyzer

Analyzer that calculates histograms of people by age bin, for each of the requested states, at specified points in time, and saves them to itself. To retrieve them, you can either access the dictionary directly, or use the get() method. You can also apply this analyzer directly to a sim object that has already been run.

Parameters
  • days (list) – list of ints/strings/date objects, the days on which to calculate the histograms (default: last day)

  • states (list) – which states of people to record (default: exposed, tested, diagnosed, dead)

  • edges (list) – edges of age bins to use (default: 10 year bins from 0 to 100)

  • datafile (str) – the name of the data file to load in for comparison, or a dataframe of data (optional)

  • sim (Sim) – only used if the analyzer is being used after a sim has already been run

  • die (bool) – whether to raise an exception if dates are not found (default true)

  • kwargs (dict) – passed to Analyzer()

Examples:

sim = cv.Sim(analyzers=cv.age_histogram())
sim.run()
agehist = sim['analyzers'][0].get()

agehist = cv.age_histogram(sim=sim)

from_sim(sim)

Create an age histogram from an already run sim

initialize(sim)

Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters

sim – the Sim instance

get(key=None)

Retrieve a specific histogram from the given key (int, str, or date)
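To make the binning concrete, here is a small standard-library sketch of counting people in one state by age bin. It is not Covasim's implementation: the people list and the age_hist helper are hypothetical, and the bin convention (half-open bins, with the last bin also including its upper edge, as numpy.histogram does) is an assumption.

```python
from bisect import bisect_right


def age_hist(ages, edges):
    """Count ages per bin: edges[i] <= age < edges[i+1]; the last bin also includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for age in ages:
        if age < edges[0] or age > edges[-1]:
            continue  # outside the binned range
        idx = min(bisect_right(edges, age) - 1, len(counts) - 1)
        counts[idx] += 1
    return counts


# Hypothetical population: an age plus one boolean state per person
people = [
    {'age': 3, 'diagnosed': False},
    {'age': 9, 'diagnosed': True},
    {'age': 10, 'diagnosed': True},
    {'age': 45, 'diagnosed': True},
    {'age': 45.5, 'diagnosed': False},
    {'age': 99, 'diagnosed': True},
    {'age': 100, 'diagnosed': True},
]

edges = list(range(0, 101, 10))  # 10-year bins from 0 to 100
diagnosed_ages = [p['age'] for p in people if p['diagnosed']]
hist = age_hist(diagnosed_ages, edges)
print(hist)
```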

compute_windows()

Convert cumulative histograms to windows
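One plausible reading of "windows" is the change in each age bin between consecutive histogram dates. The sketch below assumes that reading; to_windows is a hypothetical helper and the counts are made up for the illustration.

```python
def to_windows(cumulative):
    """Difference consecutive cumulative histograms.

    cumulative: dict mapping date -> list of cumulative counts per age bin,
    with dates in chronological order.
    """
    windows = {}
    previous = None
    for date, hist in cumulative.items():
        if previous is None:
            windows[date] = list(hist)  # first window starts from zero
        else:
            windows[date] = [now - before for now, before in zip(hist, previous)]
        previous = hist
    return windows


cumulative = {
    '2020-04-04': [1, 2, 0],
    '2020-04-14': [4, 2, 3],
}
windows = to_windows(cumulative)
print(windows)
```

Under this assumption, at least two dates are needed for the windows to differ from the cumulative histograms.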

plot(windows=False, width=0.8, color='#F8A493', font_size=18, fig_args=None, axis_args=None, data_args=None)

Simple method for plotting the histograms.

Parameters
  • windows (bool) – whether to plot windows instead of cumulative counts

  • width (float) – width of bars

  • color (hex or rgb) – the color of the bars

  • font_size (float) – size of font

  • fig_args (dict) – passed to pl.figure()

  • axis_args (dict) – passed to pl.subplots_adjust()

  • data_args (dict) – ‘width’, ‘color’, and ‘offset’ arguments for the data

class covasim.analysis.Fit(sim, weights=None, keys=None, custom=None, compute=True, verbose=False, **kwargs)

Bases: sciris.sc_utils.prettyobj

A class for calculating the fit between the model and the data. Note the following terminology is used here:

  • fit: nonspecific term for how well the model matches the data

  • difference: the absolute numerical differences between the model and the data (one time series per result)

  • goodness-of-fit: the result of passing the difference through a statistical function, such as mean squared error

  • loss: the goodness-of-fit for each result multiplied by user-specified weights (one time series per result)

  • mismatches: the sum of all the losses (a single scalar value per time series)

  • mismatch: the sum of the mismatches – this is the value to be minimized during calibration
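The chain of terms above can be traced with a small worked example. This is a sketch under stated assumptions, not the Fit class itself: the data and model values are made up, the differences are taken as model minus data, and normalized_abs_error is a hypothetical goodness-of-fit function standing in for whatever cv.compute_gof() is configured to use.

```python
# Made-up time series for two results
data = {'cum_deaths': [1, 2, 4], 'cum_diagnoses': [10, 20, 40]}
model = {'cum_deaths': [1, 3, 4], 'cum_diagnoses': [10, 15, 50]}
weights = {'cum_deaths': 10, 'cum_diagnoses': 5}


def normalized_abs_error(actual, predicted):
    """Assumed goodness-of-fit: absolute error scaled by the largest data value."""
    scale = max(actual)
    return [abs(a - p) / scale for a, p in zip(actual, predicted)]


# difference: one time series per result (model minus data)
diffs = {k: [m - d for m, d in zip(model[k], data[k])] for k in data}

# goodness-of-fit: the differences passed through a statistical function
gofs = {k: normalized_abs_error(data[k], model[k]) for k in data}

# loss: goodness-of-fit multiplied by the weight for that result
losses = {k: [weights[k] * g for g in gofs[k]] for k in data}

# mismatches: one scalar per result
mismatches = {k: sum(losses[k]) for k in data}

# mismatch: the single value to minimize during calibration
mismatch = sum(mismatches.values())

print(mismatches, mismatch)
```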

Parameters
  • sim (Sim) – the sim object

  • weights (dict) – the relative weight to place on each result (by default: 10 for deaths, 5 for diagnoses, 1 for everything else)

  • keys (list) – the keys to use in the calculation

  • custom (dict) – a custom dictionary of additional data to fit; format is e.g. {‘my_output’:{‘data’:[1,2,3], ‘sim’:[1,2,4], ‘weights’:2.0}}

  • compute (bool) – whether to compute the mismatch immediately

  • verbose (bool) – whether to print detailed output

  • kwargs (dict) – passed to cv.compute_gof() – see this function for more detail on goodness-of-fit calculation options

Example:

sim = cv.Sim()
sim.run()
fit = sim.compute_fit()
fit.plot()

compute()

Perform all required computations

reconcile_inputs()

Find matching keys and indices between the model and the data

compute_diffs(absolute=False)

Find the differences between the sim and the data

compute_gofs(**kwargs)

Compute the goodness-of-fit

compute_losses()

Compute the weighted goodness-of-fit

compute_mismatch(use_median=False)

Compute the final mismatch

plot(keys=None, width=0.8, font_size=18, fig_args=None, axis_args=None, plot_args=None)

Plot the fit of the model to the data. For each result, plot the data and the model; the difference; and the loss (weighted difference). Also plots the loss as a function of time.

Parameters
  • keys (list) – which keys to plot (default: all)

  • width (float) – bar width

  • font_size (float) – size of font

  • fig_args (dict) – passed to pl.figure()

  • axis_args (dict) – passed to pl.subplots_adjust()

  • plot_args (dict) – passed to pl.plot()

class covasim.analysis.TransTree(sim, to_networkx=False)

Bases: sciris.sc_utils.prettyobj

A class for holding a transmission tree. There are several different representations of the transmission tree: “infection_log” is copied from the people object and is the simplest representation. “detailed” includes additional attributes about the source and target. If NetworkX is installed (required for most methods), “graph” includes a NetworkX representation of the transmission tree.

Parameters
  • sim (Sim) – the sim object

  • to_networkx (bool) – whether to convert the graph to a NetworkX object

property transmissions

Iterable over edges corresponding to transmission events

This excludes edges corresponding to seeded infections without a source

day(day=None, which=None)

Convenience function for converting an input to an integer day

count_targets(start_day=None, end_day=None)

Count the number of targets each infected person has. If start and/or end days are given, it will only count the targets of people who got infected between those dates (it does not, however, filter on the date the target got infected).

Parameters
  • start_day (int/str) – the day on which to start counting people who got infected

  • end_day (int/str) – the day on which to stop counting people who got infected
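The filtering rule described above (filter on when the source was infected, not on when the target was) can be sketched on a toy infection log. This is an illustration, not the TransTree implementation: the log entries are made up, the 'source', 'target', and 'date' keys are assumed, and treating end_day as inclusive is an assumption.

```python
# Made-up infection log; a source of None marks a seed infection
infection_log = [
    {'source': None, 'target': 0, 'date': 0},
    {'source': 0, 'target': 1, 'date': 3},
    {'source': 0, 'target': 2, 'date': 5},
    {'source': 1, 'target': 3, 'date': 8},
    {'source': 2, 'target': 4, 'date': 12},
]


def count_targets(log, start_day=None, end_day=None):
    """Count targets per infected person, filtering on the person's own infection date."""
    infected_on = {entry['target']: entry['date'] for entry in log}
    counts = {}
    for person, day in infected_on.items():
        if start_day is not None and day < start_day:
            continue
        if end_day is not None and day > end_day:
            continue
        counts[person] = 0
    for entry in log:
        # The target's infection date is deliberately not checked here
        if entry['source'] in counts:
            counts[entry['source']] += 1
    return counts


print(count_targets(infection_log))
print(count_targets(infection_log, start_day=3, end_day=5))
```

In the second call, person 2 is still credited with infecting person 4 on day 12, because only person 2's own infection date (day 5) is compared against the window.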

make_detailed(people, reset=False)

Construct a detailed transmission tree, with additional information for each person

r0(recovered_only=False)

Return average number of transmissions per person

This doesn’t include seed transmissions. By default, it also doesn’t adjust for length of infection: people infected towards the end of the simulation will have fewer recorded transmissions, because their infection may extend past the end of the simulation. If ‘recovered_only=True’, downstream transmissions are only included for people who recover before the end of the simulation, thus ensuring they all had the same amount of time to transmit.
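The effect of recovered_only can be seen on a toy infection log. The sketch below is an assumption-laden illustration rather than the TransTree method: mean_transmissions is a hypothetical helper, the log is made up, and the set of recovered people is supplied by hand instead of being derived from recovery dates.

```python
# Made-up infection log; a source of None marks a seed infection
infection_log = [
    {'source': None, 'target': 0, 'date': 0},
    {'source': 0, 'target': 1, 'date': 3},
    {'source': 0, 'target': 2, 'date': 5},
    {'source': 1, 'target': 3, 'date': 8},
    {'source': 2, 'target': 4, 'date': 12},
]


def mean_transmissions(log, recovered=None):
    """Average number of onward transmissions per infected person.

    recovered: optional set of people who recovered before the end of the
    simulation; if given, only these people are counted.
    """
    people = {entry['target'] for entry in log}
    if recovered is not None:
        people &= set(recovered)
    if not people:
        raise ValueError('No infected people to average over')
    # Seed infections have source None, so they never count as transmissions
    n_transmissions = sum(1 for entry in log if entry['source'] in people)
    return n_transmissions / len(people)


print(mean_transmissions(infection_log))
print(mean_transmissions(infection_log, recovered={0, 1}))
```

Restricting to recovered people raises the average in this example because the people infected last, who had no time to transmit, are left out.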

plot(*args, **kwargs)

Plot the transmission tree

animate(*args, **kwargs)

Animate the transmission tree.

Parameters
  • animate (bool) – whether to animate the plot (otherwise, show when finished)

  • verbose (bool) – print out progress of each frame

  • markersize (int) – size of the markers

  • sus_color (list) – color for susceptibles

  • fig_args (dict) – arguments passed to pl.figure()

  • axis_args (dict) – arguments passed to pl.subplots_adjust()

  • plot_args (dict) – arguments passed to pl.plot()

  • delay (float) – delay between frames in seconds

  • font_size (int) – size of the font

  • colors (list) – color of each person

  • cmap (str) – colormap for each person (if colors is not supplied)

Returns

the figure object

Return type

fig

plot_histograms(start_day=None, end_day=None, bins=None, width=0.8, fig_args=None, font_size=18)

Plots a histogram of the number of transmissions.

Parameters
  • start_day (int/str) – the day on which to start counting people who got infected

  • end_day (int/str) – the day on which to stop counting people who got infected

  • bins (list) – bin edges to use for the histogram

  • width (float) – width of bars

  • fig_args (dict) – passed to pl.figure()

  • font_size (float) – size of font