covasim.analysis module¶

Additional analysis functions that are not part of the core Covasim workflow, but which are useful for particular investigations. This includes analyzers (such as snapshots and age histograms), the Fit class for comparing the model to data, and the transmission tree.

class covasim.analysis.Analyzer(label=None)¶

Bases: sciris.sc_utils.prettyobj

Base class for analyzers. Based on the Intervention class.

Parameters: label (str) – a label for the Analyzer (used for ease of identification)

initialize(sim)¶: Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)¶

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters: sim – the Sim instance
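The initialize/apply pattern described above can be sketched without Covasim installed. In the following, StubAnalyzer, StubSim, count_infectious, and run are hypothetical stand-ins written purely for illustration; they are not part of Covasim, and a real analyzer would subclass cv.Analyzer and receive a real Sim.

```python
class StubAnalyzer:
    """Hypothetical stand-in for the Analyzer base class (not Covasim code)."""
    def __init__(self, label=None):
        self.label = label
        self.initialized = False

    def initialize(self, sim):
        self.initialized = True

    def apply(self, sim):
        raise NotImplementedError


class StubSim:
    """Hypothetical minimal sim: a time index plus a per-day infectious count."""
    def __init__(self, n_infectious_by_day):
        self.n_infectious_by_day = n_infectious_by_day
        self.t = 0


class count_infectious(StubAnalyzer):
    """Store the number of infectious people at each time point."""
    def initialize(self, sim):
        super().initialize(sim)
        self.counts = []  # results are kept on the analyzer itself

    def apply(self, sim):
        self.counts.append(sim.n_infectious_by_day[sim.t])


def run(sim, analyzer):
    """Toy run loop: initialize once, then apply at every time point."""
    analyzer.initialize(sim)
    for t in range(len(sim.n_infectious_by_day)):
        sim.t = t
        analyzer.apply(sim)
    return analyzer


analyzer = run(StubSim([1, 3, 7, 12]), count_infectious(label='infectious'))
print(analyzer.label, analyzer.counts)
```

The point of the sketch is the division of labour: initialize() does one-off setup, apply() is called once per time point, and the collected data lives on the analyzer rather than on the sim.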

class covasim.analysis.snapshot(days, *args, die=True, **kwargs)¶

Bases: covasim.analysis.Analyzer

Analyzer that takes a “snapshot” of the sim.people array at specified points in time, and saves them to itself. To retrieve them, you can either access the dictionary directly, or use the get() method.

Parameters

days (list) – list of ints/strings/date objects, the days on which to take the snapshot
args (list) – additional day(s)
die (bool) – whether or not to raise an exception if a date is not found (default true)
kwargs (dict) – passed to Analyzer()

Example:

import covasim as cv
sim = cv.Sim(analyzers=cv.snapshot('2020-04-04', '2020-04-14'))
sim.run()
snapshot = sim['analyzers'][0]
people = snapshot.snapshots[0]            # Option 1
people = snapshot.snapshots['2020-04-04'] # Option 2
people = snapshot.get('2020-04-14')       # Option 3
people = snapshot.get(34)                 # Option 4
people = snapshot.get()                   # Option 5

initialize(sim)¶: Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)¶

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters: sim – the Sim instance

get(key=None)¶: Retrieve a snapshot from the given key (int, str, or date)
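The int, str, and date keys accepted by get() can all refer to the same simulation day. As a rough illustration of that equivalence (not Covasim's own code), the sketch below converts each key type to an integer day offset. The helper to_day and the start date of 2020-03-01 are assumptions made for this example; under that assumption, day 34 (Option 4 in the example above) corresponds to '2020-04-04'.

```python
from datetime import date, datetime

START_DAY = date(2020, 3, 1)  # assumed simulation start date for this sketch


def to_day(key, start_day=START_DAY):
    """Convert an int, 'YYYY-MM-DD' string, or date to an integer day offset."""
    if isinstance(key, int):
        return key  # already a day index
    if isinstance(key, str):
        key = datetime.strptime(key, '%Y-%m-%d')
    if isinstance(key, datetime):
        key = key.date()  # drop the time component before subtracting
    return (key - start_day).days


days = [to_day(34), to_day('2020-04-04'), to_day(date(2020, 4, 4))]
print(days)
```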

class covasim.analysis.age_histogram(days=None, states=None, edges=None, datafile=None, sim=None, die=True, **kwargs)¶

Bases: covasim.analysis.Analyzer

Analyzer that calculates histograms of people in the specified states across age bins at specified points in time, and saves them to itself. To retrieve them, you can either access the dictionary directly, or use the get() method. You can also apply this analyzer directly to a sim object.

Parameters

days (list) – list of ints/strings/date objects, the days on which to calculate the histograms (default: last day)
states (list) – which states of people to record (default: exposed, tested, diagnosed, dead)
edges (list) – edges of age bins to use (default: 10 year bins from 0 to 100)
datafile (str) – the name of the data file to load in for comparison, or a dataframe of data (optional)
sim (Sim) – only used if the analyzer is being used after a sim has already been run
die (bool) – whether to raise an exception if dates are not found (default true)
kwargs (dict) – passed to Analyzer()

Examples:

import covasim as cv
sim = cv.Sim(analyzers=cv.age_histogram())
sim.run()
agehist = sim['analyzers'][0].get()

agehist = cv.age_histogram(sim=sim)

from_sim(sim)¶: Create an age histogram from an already run sim

initialize(sim)¶: Initialize the analyzer, e.g. convert date strings to integers.

apply(sim)¶

Apply analyzer at each time point. The analyzer has full access to the sim object, and typically stores data/results in itself.

Parameters: sim – the Sim instance

get(key=None)¶: Retrieve a specific histogram from the given key (int, str, or date)

compute_windows()¶: Convert cumulative histograms to windows
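One way to read "convert cumulative histograms to windows" is as differencing: each window holds the counts accrued since the previous recorded day. The sketch below illustrates that idea on plain lists. The function name cumulative_to_windows and the sample counts are invented for illustration, and the actual method may differ in detail.

```python
def cumulative_to_windows(cumulative):
    """Given cumulative per-bin counts ordered by day, return per-window counts.

    The first window is the first cumulative histogram itself; each later
    window is the bin-wise difference from the previous histogram.
    """
    windows = []
    previous = None
    for hist in cumulative:
        if previous is None:
            windows.append(list(hist))
        else:
            windows.append([now - before for now, before in zip(hist, previous)])
        previous = hist
    return windows


# Made-up cumulative counts for three age bins on three recorded days
cumulative = [
    [2, 5, 1],
    [4, 9, 1],
    [7, 9, 3],
]
windows = cumulative_to_windows(cumulative)
print(windows)
```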

plot(windows=False, width=0.8, color='#F8A493', font_size=18, fig_args=None, axis_args=None, data_args=None)¶

Simple method for plotting the histograms.

Parameters

windows (bool) – whether to plot windows instead of cumulative counts
width (float) – width of bars
color (hex or rgb) – the color of the bars
font_size (float) – size of font
fig_args (dict) – passed to pl.figure()
axis_args (dict) – passed to pl.subplots_adjust()
data_args (dict) – ‘width’, ‘color’, and ‘offset’ arguments for the data

class covasim.analysis.Fit(sim, weights=None, keys=None, custom=None, compute=True, verbose=False, **kwargs)¶

Bases: sciris.sc_utils.prettyobj

A class for calculating the fit between the model and the data. Note the following terminology is used here:

fit: nonspecific term for how well the model matches the data

difference: the absolute numerical differences between the model and the data (one time series per result)

goodness-of-fit: the result of passing the difference through a statistical function, such as mean squared error

loss: the goodness-of-fit for each result multiplied by user-specified weights (one time series per result)

mismatches: the sum of all the losses (a single scalar value per time series)

mismatch: the sum of the mismatches – this is the value to be minimized during calibration
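The chain of terms above (difference, goodness-of-fit, loss, mismatches, mismatch) can be traced with a small worked example. This is a sketch under stated assumptions, not the Fit class itself: the goodness-of-fit statistic is taken to be a pointwise squared error, the result names and numbers are made up, and the dictionary layout is hypothetical.

```python
def squared_error(diffs):
    """Assumed goodness-of-fit statistic for this sketch: pointwise squared error."""
    return [d ** 2 for d in diffs]


# Made-up data and model output for two results, with user-specified weights
results = {
    'my_output': {'data': [1, 2, 3], 'sim': [1, 2, 4], 'weight': 2.0},
    'deaths':    {'data': [0, 1, 2], 'sim': [0, 2, 2], 'weight': 10.0},
}

diffs, gofs, losses, mismatches = {}, {}, {}, {}
for key, res in results.items():
    # difference: one time series per result
    diffs[key] = [s - d for s, d in zip(res['sim'], res['data'])]
    # goodness-of-fit: the difference passed through a statistical function
    gofs[key] = squared_error(diffs[key])
    # loss: goodness-of-fit multiplied by the weight (still a time series)
    losses[key] = [res['weight'] * g for g in gofs[key]]
    # mismatches: one scalar per result
    mismatches[key] = sum(losses[key])

# mismatch: the single value to be minimized during calibration
mismatch = sum(mismatches.values())
print(mismatches, mismatch)
```

With these numbers, the heavier weight on the second result means a single one-unit error there contributes five times as much to the final mismatch as the same error in the first result.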

Parameters

sim (Sim) – the sim object
weights (dict) – the relative weight to place on each result (by default: 10 for deaths, 5 for diagnoses, 1 for everything else)
keys (list) – the keys to use in the calculation
custom (dict) – a custom dictionary of additional data to fit; format is e.g. {'my_output':{'data':[1,2,3], 'sim':[1,2,4], 'weights':2.0}}
compute (bool) – whether to compute the mismatch immediately
verbose (bool) – whether to print detailed output
kwargs (dict) – passed to cv.compute_gof() – see this function for more detail on goodness-of-fit calculation options

Example:

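import covasim as cv
sim = cv.Sim(datafile='data.csv') # placeholder file name; a fit needs data to compare against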
sim.run()
fit = sim.compute_fit()
fit.plot()

compute()¶: Perform all required computations

reconcile_inputs()¶: Find matching keys and indices between the model and the data

compute_diffs(absolute=False)¶: Find the differences between the sim and the data

compute_gofs(**kwargs)¶: Compute the goodness-of-fit

compute_losses()¶: Compute the weighted goodness-of-fit

compute_mismatch(use_median=False)¶: Compute the final mismatch

plot(keys=None, width=0.8, font_size=18, fig_args=None, axis_args=None, plot_args=None)¶

Plot the fit of the model to the data. For each result, plot the data and the model; the difference; and the loss (weighted difference). Also plots the loss as a function of time.

Parameters

keys (list) – which keys to plot (default: all)
width (float) – bar width
font_size (float) – size of font
fig_args (dict) – passed to pl.figure()
axis_args (dict) – passed to pl.subplots_adjust()
plot_args (dict) – passed to pl.plot()

class covasim.analysis.TransTree(sim, to_networkx=False)¶

Bases: sciris.sc_utils.prettyobj

A class for holding a transmission tree. There are several different representations of the transmission tree: “infection_log” is copied from the people object and is the simplest representation; “detailed” includes additional attributes about the source and target. If NetworkX is installed (required for most methods), “graph” includes a NetworkX representation of the transmission tree.

Parameters

sim (Sim) – the sim object
to_networkx (bool) – whether to convert the graph to a NetworkX object

property transmissions¶

Iterable over edges corresponding to transmission events. This excludes edges corresponding to seeded infections without a source.

day(day=None, which=None)¶: Convenience function for converting an input to an integer day

count_targets(start_day=None, end_day=None)¶

Count the number of targets each infected person has. If start and/or end days are given, it will only count the targets of people who got infected between those dates (it does not, however, filter on the date the target got infected).

Parameters

start_day (int/str) – the day on which to start counting people who got infected
end_day (int/str) – the day on which to stop counting people who got infected
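The filtering rule described above (filter on when the source was infected, not on when the target was) is easy to get backwards, so a small sketch may help. The log format, the standalone count_targets function, and the inclusive treatment of end_day are all assumptions for illustration; this is not the TransTree implementation.

```python
# Made-up infection log: a source of None marks a seeded infection
infection_log = [
    {'source': None, 'target': 0, 'date': 0},
    {'source': 0, 'target': 1, 'date': 2},
    {'source': 0, 'target': 2, 'date': 3},
    {'source': 1, 'target': 3, 'date': 5},
    {'source': 2, 'target': 4, 'date': 9},
]


def count_targets(log, start_day=None, end_day=None):
    """Count targets per infected person, filtering on the person's own infection date."""
    date_infected = {entry['target']: entry['date'] for entry in log}
    counts = {}
    for person, day in date_infected.items():
        if start_day is not None and day < start_day:
            continue
        if end_day is not None and day > end_day:  # end_day assumed inclusive here
            continue
        # Every target of this person is counted, whenever the target was infected
        counts[person] = sum(1 for entry in log if entry['source'] == person)
    return counts


all_counts = count_targets(infection_log)
early_counts = count_targets(infection_log, start_day=0, end_day=2)
print(all_counts, early_counts)
```

In the filtered call, person 1 was infected on day 2 and is therefore kept, and their one target is still counted even though that target was infected on day 5, outside the window.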

make_detailed(people, reset=False)¶: Construct a detailed transmission tree, with additional information for each person

r0(recovered_only=False)¶

Return the average number of transmissions per person.

This doesn’t include seed transmissions. By default, it also doesn’t adjust for length of infection: people infected towards the end of the simulation will have fewer transmissions, because their infection may extend past the end of the simulation and any later transmissions are not included. If recovered_only=True, downstream transmissions are only included for people who recover before the end of the simulation, thus ensuring they all had the same amount of time to transmit.
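The effect of recovered_only can be seen on a toy example. The sketch below is an illustration only: the people dictionary, the transmissions list, the final day of 10, and the standalone r0 function are assumptions, not the TransTree data structures.

```python
from statistics import mean

N_DAYS = 10  # assumed final day of the simulation

# Made-up infected people with exposure and recovery dates
people = {
    0: {'date_exposed': 0, 'date_recovered': 6},
    1: {'date_exposed': 2, 'date_recovered': 8},
    2: {'date_exposed': 3, 'date_recovered': 9},
    3: {'date_exposed': 8, 'date_recovered': 14},  # recovers after the sim ends
    4: {'date_exposed': 9, 'date_recovered': 15},  # recovers after the sim ends
}

# (source, target) pairs; seed infections have no source and are left out
transmissions = [(0, 1), (0, 2), (1, 3), (2, 4)]


def r0(people, transmissions, n_days, recovered_only=False):
    """Average number of onward transmissions per infected person."""
    counts = []
    for person, info in people.items():
        if recovered_only and info['date_recovered'] > n_days:
            continue  # skip people whose infection runs past the end
        counts.append(sum(1 for source, _ in transmissions if source == person))
    return mean(counts)


r0_all = r0(people, transmissions, N_DAYS)
r0_recovered = r0(people, transmissions, N_DAYS, recovered_only=True)
print(r0_all, r0_recovered)
```

Here the two people infected late have had no time to transmit, so including them pulls the default average down; restricting to people who recovered in time gives a higher value.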

plot(*args, **kwargs)¶: Plot the transmission tree

animate(*args, **kwargs)¶

Animate the transmission tree.

Parameters

animate (bool) – whether to animate the plot (otherwise, show when finished)
verbose (bool) – print out progress of each frame
markersize (int) – size of the markers
sus_color (list) – color for susceptibles
fig_args (dict) – arguments passed to pl.figure()
axis_args (dict) – arguments passed to pl.subplots_adjust()
plot_args (dict) – arguments passed to pl.plot()
delay (float) – delay between frames in seconds
font_size (int) – size of the font
colors (list) – color of each person
cmap (str) – colormap for each person (if colors is not supplied)

Returns

the figure object

Return type

fig

plot_histograms(start_day=None, end_day=None, bins=None, width=0.8, fig_args=None, font_size=18)¶

Plots a histogram of the number of transmissions.

Parameters

start_day (int/str) – the day on which to start counting people who got infected
end_day (int/str) – the day on which to stop counting people who got infected
bins (list) – bin edges to use for the histogram
width (float) – width of bars
fig_args (dict) – passed to pl.figure()
font_size (float) – size of font