covasim.misc module

Miscellaneous functions that do not belong anywhere else

covasim.misc.load_data(datafile, columns=None, calculate=True, check_date=True, verbose=True, **kwargs)

Load data for comparing to the model output, either from file or from a dataframe.

Parameters
  • datafile (str or df) – if a string, the name of the file to load (either Excel or CSV); if a dataframe, use directly

  • columns (list) – list of column names (otherwise, load all)

  • calculate (bool) – whether to calculate cumulative values from daily counts

  • check_date (bool) – whether to check that a ‘date’ column is present

  • kwargs (dict) – passed to pd.read_excel()

Returns

pandas dataframe of the loaded data

Return type

data (dataframe)
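As a rough illustration of what calculate=True implies, the sketch below derives cumulative series from daily counts using only the standard library. The helper name add_cumulative and the 'new_'/'cum_' column-name prefixes are assumptions made for this example; they are not stated in the description above.

```python
from itertools import accumulate

def add_cumulative(data, daily_prefix='new_', cum_prefix='cum_'):
    """Return a copy of `data` (dict of column -> list) with cumulative columns added.

    Hypothetical helper: the prefixes are illustrative assumptions.
    """
    out = dict(data)
    for col, values in data.items():
        if col.startswith(daily_prefix):
            cum_col = cum_prefix + col[len(daily_prefix):]
            if cum_col not in out:  # Do not overwrite cumulative data that was supplied
                out[cum_col] = list(accumulate(values))
    return out

data = {'date': ['2020-03-01', '2020-03-02', '2020-03-03'], 'new_diagnoses': [2, 0, 5]}
result = add_cumulative(data)
print(result['cum_diagnoses'])  # [2, 2, 7]
```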

covasim.misc.date(obj, *args, start_date=None, dateformat=None, as_date=True)

Convert a string or a datetime object to a date object. To convert to an integer from the start day, it is recommended you supply a start date, or use sim.date() instead; otherwise, it will calculate the date counting days from 2020-01-01. This means that the output of cv.date() will not necessarily match the output of sim.date() for an integer input.

Parameters
  • obj (str, date, datetime, list, array) – the object to convert

  • args (str, date, datetime) – additional objects to convert

  • start_date (str, date, datetime) – the starting date, if an integer is supplied

  • dateformat (str) – the format to return the date in

  • as_date (bool) – whether to return as a datetime date instead of a string

Returns

either a single date object, or a list of them

Return type

dates (date or list)

Examples:

cv.date('2020-04-05') # Returns datetime.date(2020, 4, 5)
cv.date('2020-04-14', start_date='2020-04-04', as_date=False) # Returns 10
cv.date([35,36,37], as_date=False) # Returns ['2020-02-05', '2020-02-06', '2020-02-07']
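The dependence of integer inputs on the start date can be illustrated with a standard-library sketch. The helper days_to_dates is hypothetical and only mimics the counting rule described above; it is not Covasim's implementation.

```python
import datetime as dt

def days_to_dates(days, start_date='2020-01-01', as_date=True):
    """Convert integer day offsets to dates, counting from start_date (hypothetical helper)."""
    start = dt.date.fromisoformat(start_date)
    dates = [start + dt.timedelta(days=int(d)) for d in days]
    return dates if as_date else [d.isoformat() for d in dates]

# Counting from the default reference date of 2020-01-01
default = days_to_dates([35, 36, 37], as_date=False)

# The same integer maps to a different date when the start date differs
shifted = days_to_dates([35], start_date='2020-03-01', as_date=False)

print(default)  # ['2020-02-05', '2020-02-06', '2020-02-07']
print(shifted)  # ['2020-04-05']
```

This is why the same integer can give different results from cv.date() and sim.date() when the simulation does not start on 2020-01-01.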
covasim.misc.day(obj, *args, start_day=None)

Convert a string, date/datetime object, or int to a day (int), the number of days since the start day. See also date() and daydiff(). Used primarily via sim.day() rather than directly.

Parameters
  • obj (str, date, int, or list) – convert any of these objects to a day relative to the start day

  • args (list) – additional days

  • start_day (str or date) – the start day; if none is supplied, return days since 2020-01-01.

Returns

the day(s) in simulation time

Return type

days (int or list)

Example:

sim.day('2020-04-05') # Returns 35
covasim.misc.daydiff(*args)

Convenience function to find the difference between two or more days. With only one argument, calculate days since 2020-01-01.

Example:

since_ny = cv.daydiff('2020-03-20') # Returns 79 days since Jan. 1st
diff     = cv.daydiff('2020-03-20', '2020-04-05') # Returns 16
diffs    = cv.daydiff('2020-03-20', '2020-04-05', '2020-05-01') # Returns [16, 26]
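The arithmetic behind these examples can be checked with a standard-library sketch. The helper day_diffs is hypothetical and assumes ISO-formatted date strings; it mirrors the behavior described above rather than reproducing Covasim's code.

```python
import datetime as dt

def day_diffs(*dates, reference='2020-01-01'):
    """Differences in days between consecutive dates (hypothetical helper)."""
    parsed = [dt.date.fromisoformat(d) for d in dates]
    if len(parsed) == 1:
        # A single date is measured against the reference date
        return (parsed[0] - dt.date.fromisoformat(reference)).days
    diffs = [(later - earlier).days for earlier, later in zip(parsed[:-1], parsed[1:])]
    return diffs[0] if len(diffs) == 1 else diffs

print(day_diffs('2020-03-20'))                              # 79
print(day_diffs('2020-03-20', '2020-04-05'))                # 16
print(day_diffs('2020-03-20', '2020-04-05', '2020-05-01'))  # [16, 26]
```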
covasim.misc.date_range(start_date, end_date, inclusive=True, as_date=False, dateformat=None)

Return a list of dates from the start date to the end date. To convert a list of days (as integers) to dates, use cv.date() instead.

Parameters
  • start_date (int/str/date) – the starting date, in any format

  • end_date (int/str/date) – the end date, in any format

  • inclusive (bool) – if True (default), return to end_date inclusive; otherwise, stop the day before

  • as_date (bool) – if True, return a list of datetime.date objects instead of strings

  • dateformat (str) – passed to date()

Example:

dates = cv.date_range('2020-03-01', '2020-04-04')
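The effect of the inclusive flag can be sketched with the standard library. The helper date_range_sketch is hypothetical, accepts only ISO-formatted strings, and is meant as an illustration of the semantics described above.

```python
import datetime as dt

def date_range_sketch(start_date, end_date, inclusive=True, as_date=False):
    """List the dates from start_date to end_date (hypothetical helper)."""
    start = dt.date.fromisoformat(start_date)
    end = dt.date.fromisoformat(end_date)
    n_days = (end - start).days + (1 if inclusive else 0)
    dates = [start + dt.timedelta(days=i) for i in range(max(n_days, 0))]
    return dates if as_date else [d.isoformat() for d in dates]

dates = date_range_sketch('2020-03-01', '2020-04-04')
exclusive = date_range_sketch('2020-03-01', '2020-04-04', inclusive=False)

print(len(dates), dates[0], dates[-1])  # 35 2020-03-01 2020-04-04
print(len(exclusive), exclusive[-1])    # 34 2020-04-03
```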
covasim.misc.load(*args, **kwargs)

Convenience method for sc.loadobj() and equivalent to cv.Sim.load() or cv.Scenarios.load().

Examples:

sim = cv.load('calib.sim')
scens = cv.load(filename='school-closures.scens', folder='schools')
covasim.misc.save(*args, **kwargs)

Convenience method for sc.saveobj() and equivalent to cv.Sim.save() or cv.Scenarios.save().

Examples:

cv.save('calib.sim', sim)
cv.save(filename='school-closures.scens', folder='schools', obj=scens)
covasim.misc.savefig(filename=None, comments=None, **kwargs)

Wrapper for Matplotlib’s savefig() function that automatically stores Covasim metadata in the figure. For PNG files, the stored metadata can be read back with cv.get_png_metadata().

Parameters
  • filename (str) – name of the file to save to (default, timestamp)

  • comments (str) – additional metadata to save to the figure

  • kwargs (dict) – passed to savefig()

Example:

cv.Sim().run(do_plot=True)
filename = cv.savefig()
covasim.misc.get_png_metadata(filename, output=False)

Read metadata from a PNG file. For use with images saved with cv.savefig(). Requires pillow, an optional dependency. Metadata retrieval for PDF and SVG is not currently supported.

Parameters

filename (str) – the name of the file to load the data from

Example:

cv.Sim().run(do_plot=True)
cv.savefig('covasim.png')
cv.get_png_metadata('covasim.png')
covasim.misc.git_info(filename=None, check=False, comments=None, old_info=None, die=False, indent=2, verbose=True, frame=2, **kwargs)

Get current git information and optionally write it to disk. The simplest usage is cv.git_info(__file__).

Parameters
  • filename (str) – name of the file to write to or read from

  • check (bool) – whether or not to compare two git versions

  • comments (dict) – additional comments to include in the file

  • old_info (dict) – dictionary of information to check against

  • die (bool) – whether or not to raise an exception if the check fails

  • indent (int) – how many indents to use when writing the file to disk

  • verbose (bool) – detail to print

  • frame (int) – how many frames back to look for caller info

  • kwargs (dict) – passed to loadjson (if check=True) or savejson (if check=False)

Examples:

cv.git_info() # Return information
cv.git_info(__file__) # Writes to disk
cv.git_info('covasim_version.gitinfo') # Writes to disk
cv.git_info('covasim_version.gitinfo', check=True) # Checks that current version matches saved file
covasim.misc.check_version(expected, die=False, verbose=True, **kwargs)

Check the installed Covasim version against the expected version, optionally raising an exception if they do not match.

Parameters
  • expected (str) – expected version information

  • die (bool) – whether or not to raise an exception if the check fails

covasim.misc.check_save_version(expected=None, filename=None, die=False, verbose=True, **kwargs)

A convenience function that bundles check_version with git_info and saves automatically to disk from the calling file. The idea is to put this at the top of an analysis script, and commit the resulting file, to keep track of which version of Covasim was used.

Parameters
  • expected (str) – expected version information

  • filename (str) – file to save to; if None, guess based on current file name

  • kwargs (dict) – passed to git_info()

Examples:

cv.check_save_version()
cv.check_save_version('1.3.2', filename='script.gitinfo', comments='This is the main analysis script')
covasim.misc.get_doubling_time(sim, series=None, interval=None, start_day=None, end_day=None, moving_window=None, exp_approx=False, max_doubling_time=100, eps=0.001, verbose=None)

Method to calculate doubling time.

Examples:

get_doubling_time(sim, interval=[3,30]) # returns the doubling time over the given interval (single float)
get_doubling_time(sim, interval=[3,30], moving_window=3) # returns doubling times calculated over moving windows (array)
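For intuition, one common definition of doubling time is the two-point exponential approximation, which the exp_approx flag name suggests. The sketch below applies it to a plain list; the helper name doubling_time_exp and the handling of non-growing series are assumptions, and Covasim's own calculation may differ.

```python
import math

def doubling_time_exp(series, start_day, end_day, max_doubling_time=100):
    """Two-point exponential estimate of doubling time, in days (hypothetical helper)."""
    y_start, y_end = series[start_day], series[end_day]
    if y_start <= 0 or y_end <= y_start:
        # No measurable growth: fall back to the cap (an assumption of this sketch)
        return max_doubling_time
    ratio = y_end / y_start
    t_double = (end_day - start_day) * math.log(2) / math.log(ratio)
    return min(t_double, max_doubling_time)

# Synthetic cumulative series that doubles every 3 days
series = [10 * 2 ** (day / 3) for day in range(31)]
result = doubling_time_exp(series, start_day=3, end_day=30)
flat = doubling_time_exp([5] * 31, start_day=3, end_day=30)

print(round(result, 6))  # 3.0
print(flat)              # 100
```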
covasim.misc.poisson_test(count1, count2, exposure1=1, exposure2=1, ratio_null=1, method='score', alternative='two-sided')

Test for ratio of two sample Poisson intensities

If the two Poisson rates are g1 and g2, then the Null hypothesis is

H0: g1 / g2 = ratio_null

against one of the following alternatives

H1_2-sided: g1 / g2 != ratio_null
H1_larger: g1 / g2 > ratio_null
H1_smaller: g1 / g2 < ratio_null

Parameters
  • count1 (int) – number of events in the first sample

  • exposure1 (float) – total exposure (time * subjects) in the first sample

  • count2 (int) – number of events in the second sample

  • exposure2 (float) – total exposure (time * subjects) in the second sample

  • ratio_null (float) – ratio of the two Poisson rates under the null hypothesis; default is 1

  • method (str) – method for the test statistic and the p-value; defaults to 'score'. Current methods are based on Gu et al. (2008). Implemented are 'wald', 'score', and 'sqrt', based on the asymptotic normal distribution, and the exact conditional test 'exact-cond' and its mid-point version 'cond-midp'; see Notes

  • alternative (str) – the alternative hypothesis, H1; has to be one of the following:

    'two-sided': H1: ratio of rates is not equal to ratio_null (default)
    'larger': H1: ratio of rates is larger than ratio_null
    'smaller': H1: ratio of rates is smaller than ratio_null

Returns

pvalue – the p-value of the test

Notes

'wald': method W1A, Wald test, variance based on separate estimates
'score': method W2A, score test, variance based on estimate under the null
'wald-log': W3A
'score-log': W4A
'sqrt': W5A, based on variance-stabilizing square root transformation
'exact-cond': exact conditional test based on binomial distribution
'cond-midp': midpoint p-value of exact conditional test

The latter two are only verified for a one-sided example.

References

Gu, Ng, Tang, Schucany (2008): Testing the Ratio of Two Poisson Rates, Biometrical Journal 50(2)

Author: Josef Perktold

License: BSD-3

destination statsmodels

From: https://stackoverflow.com/questions/33944914/implementation-of-e-test-for-poisson-in-python

Date: 2020feb24
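To make the default 'score' method concrete, the following standard-library sketch computes a two-sided score test for the ratio of two Poisson rates, with the variance estimated under the null. The function name poisson_score_test is hypothetical, the formula is the textbook score statistic as understood here, and it is not guaranteed to match the implementation above in every detail.

```python
import math

def poisson_score_test(count1, count2, exposure1=1.0, exposure2=1.0, ratio_null=1.0):
    """Two-sided score test of H0: rate1 / rate2 == ratio_null (illustrative sketch).

    Returns (stat, pvalue), where stat is asymptotically standard normal under H0.
    """
    total = count1 + count2
    if total == 0:
        raise ValueError('At least one event is needed to compute the statistic')
    # Expected ratio of counts under H0, adjusted for unequal exposures
    r_d = ratio_null * exposure1 / exposure2
    # count1 - count2 * r_d has mean zero under H0; its variance estimated
    # under the null is (count1 + count2) * r_d
    stat = (count1 - count2 * r_d) / math.sqrt(total * r_d)
    # Two-sided normal p-value: 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2))
    pvalue = math.erfc(abs(stat) / math.sqrt(2))
    return stat, pvalue

stat_same, p_same = poisson_score_test(30, 30)
stat_diff, p_diff = poisson_score_test(60, 30)
stat_expo, p_expo = poisson_score_test(60, 30, exposure1=2.0, exposure2=1.0)

print(stat_same, p_same)  # 0.0 1.0
print(p_diff < 0.01)      # True
print(stat_expo, p_expo)  # 0.0 1.0
```

In the last call, twice the count over twice the exposure gives equal rates, so the statistic is zero and the null is not rejected.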

covasim.misc.compute_gof(actual, predicted, normalize=True, use_frac=False, use_squared=False, as_scalar='none', eps=1e-09, skestimator=None, **kwargs)

Calculate the goodness of fit. By default, uses normalized absolute error, but is highly customizable. For example, mean squared error is equivalent to setting normalize=False, use_squared=True, as_scalar='mean'.

Parameters
  • actual (arr) – array of actual (data) points

  • predicted (arr) – corresponding array of predicted (model) points

  • normalize (bool) – whether to divide the values by the largest value in either series

  • use_frac (bool) – convert to fractional mismatches rather than absolute

  • use_squared (bool) – square the mismatches

  • as_scalar (str) – return as a scalar instead of a time series: choices are sum, mean, median

  • eps (float) – to avoid divide-by-zero

  • skestimator (str) – if provided, use this scikit-learn estimator instead

  • kwargs (dict) – passed to the scikit-learn estimator

Returns

array of goodness-of-fit values, or a single value if as_scalar is 'sum', 'mean', or 'median'

Return type

gofs (arr)

Examples:

x1 = np.cumsum(np.random.random(100))
x2 = np.cumsum(np.random.random(100))

e1 = compute_gof(x1, x2) # Default, normalized absolute error
e2 = compute_gof(x1, x2, normalize=False, use_frac=True) # Fractional error
e3 = compute_gof(x1, x2, normalize=False, use_squared=True, as_scalar='mean') # Mean squared error
e4 = compute_gof(x1, x2, skestimator='mean_squared_error') # Scikit-learn's MSE method
e5 = compute_gof(x1, x2, as_scalar='median') # Normalized median absolute error -- highly robust
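The relationship between the options can be seen in a small standard-library sketch that follows the parameter descriptions above. The helper gof_sketch is hypothetical, omits use_frac, eps, and skestimator, and should not be read as Covasim's implementation.

```python
from statistics import mean, median

def gof_sketch(actual, predicted, normalize=True, use_squared=False, as_scalar='none'):
    """Pointwise mismatch between two series, optionally reduced to a scalar (hypothetical helper)."""
    gofs = [abs(a - p) for a, p in zip(actual, predicted)]
    if normalize:
        # Divide by the largest absolute value found in either series
        largest = max(max(map(abs, actual)), max(map(abs, predicted)))
        if largest > 0:
            gofs = [g / largest for g in gofs]
    if use_squared:
        gofs = [g ** 2 for g in gofs]
    reducers = {'sum': sum, 'mean': mean, 'median': median}
    return reducers[as_scalar](gofs) if as_scalar in reducers else gofs

actual = [1.0, 2.0, 4.0]
predicted = [1.0, 3.0, 2.0]

normalized = gof_sketch(actual, predicted)
mse = gof_sketch(actual, predicted, normalize=False, use_squared=True, as_scalar='mean')

print(normalized)  # [0.0, 0.25, 0.5]
print(mse)         # Mean of [0, 1, 4], i.e. 5/3
```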