synthpops.contact_networks module

This module generates the household, school, and workplace contact networks.

synthpops.contact_networks.generate_household_sizes(Nhomes, hh_size_distr)

Given a number of homes and a household size distribution, generate the number of homes of each size.

Parameters
  • Nhomes (int) – The number of homes.

  • hh_size_distr (dict) – The distribution of household sizes.

Returns

An array with the count of households of size s at index s-1.
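
To illustrate the layout of the returned array, here is a minimal sketch. It is not the library's implementation. It assumes hh_size_distr maps each household size to a probability, and the helper name sketch_household_size_counts and its seed argument are hypothetical.

```python
import random

def sketch_household_size_counts(n_homes, hh_size_distr, seed=0):
    """Hypothetical helper: draw n_homes household sizes and tally them.

    Assumes hh_size_distr maps household size (1, 2, ...) to a probability.
    """
    rng = random.Random(seed)
    sizes = sorted(hh_size_distr)
    weights = [hh_size_distr[s] for s in sizes]
    draws = rng.choices(sizes, weights=weights, k=n_homes)
    counts = [0] * max(sizes)
    for s in draws:
        counts[s - 1] += 1  # households of size s are counted at index s-1
    return counts

hh_size_distr = {1: 0.3, 2: 0.35, 3: 0.2, 4: 0.15}
counts = sketch_household_size_counts(1000, hh_size_distr)
print(counts)  # four counts that sum to 1000
```

The counts always sum to the requested number of homes, while the split across sizes varies with the random draws.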

synthpops.contact_networks.generate_household_sizes_from_fixed_pop_size(N, hh_size_distr)

Given a number of people and a household size distribution, generate the number of homes of each size needed to place everyone in a household.

Parameters
  • N (int) – The number of people in the population.

  • hh_size_distr (dict) – The distribution of household sizes.

Returns

An array with the count of households of size s at index s-1.

synthpops.contact_networks.get_totalpopsize_from_household_sizes(hh_sizes)

Sum the population across all household sizes from the count array.

Parameters

hh_sizes (array) – The count of household size s at index s-1.

Returns

An integer indicating the total number of people across all household sizes.
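
The arithmetic on the count array can be sketched as follows. This assumes the total is taken across all household sizes, as the function name suggests; sketch_total_pop_size is a hypothetical name used only for this example.

```python
def sketch_total_pop_size(hh_sizes):
    """Hypothetical helper: total people implied by a household count array."""
    # index s-1 holds the number of households of size s
    return sum(count * (index + 1) for index, count in enumerate(hh_sizes))

hh_sizes = [3, 2, 0, 1]  # 3 singles, 2 pairs, no triples, 1 household of four
total = sketch_total_pop_size(hh_sizes)
print(total)  # 3*1 + 2*2 + 0*3 + 1*4 = 11
```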

synthpops.contact_networks.generate_household_head_age_by_size(hha_by_size_counts, hha_brackets, hh_size, single_year_age_distr)

Generate the age of the head of the household, also known as the reference person of the household, conditional on the size of the household.

Parameters
  • hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.

  • hha_brackets (dict) – The age brackets for the heads of household.

  • hh_size (int) – The household size.

  • single_year_age_distr (dict) – The age distribution.

Returns

Age of the head of the household or reference person.

synthpops.contact_networks.generate_living_alone(hh_sizes, hha_by_size_counts, hha_brackets, single_year_age_distr)

Generate the ages of those living alone.

Parameters
  • hh_sizes (array) – The count of household size s at index s-1.

  • hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.

  • hha_brackets (dict) – The age brackets for the heads of household.

  • single_year_age_distr (dict) – The age distribution.

Returns

An array of households of size 1 where each household is a row and the value in the row is the age of the household member.

synthpops.contact_networks.generate_larger_households(size, hh_sizes, hha_by_size_counts, hha_brackets, age_brackets, age_by_brackets_dic, contact_matrix_dic, single_year_age_distr)

Generate ages of those living in households of greater than one individual. Reference individual is sampled conditional on the household size. All other household members have their ages sampled conditional on the reference person’s age and the age mixing contact matrix in households for the population under study.

Parameters
  • size (int) – The household size.

  • hh_sizes (array) – The count of household size s at index s-1.

  • hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.

  • hha_brackets (dict) – The age brackets for the heads of household.

  • age_brackets (dict) – A dictionary mapping age bracket keys to age bracket range.

  • age_by_brackets_dic (dict) – A dictionary mapping age to the age bracket range it falls within.

  • contact_matrix_dic (dict) – A dictionary of the age-specific contact matrix for different physical contact settings.

  • single_year_age_distr (dict) – The age distribution.

Returns

An array of households of the given size, where each household is a row and the values in the row are the ages of the household members. The first age in the row is the age of the reference individual.
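
The conditional sampling described above can be sketched for a single household as follows. This is a simplified illustration under stated assumptions, not the library's code: the contact matrix is a small made-up list of lists indexed by bracket key, ages within a bracket are drawn uniformly rather than from single_year_age_distr, and sketch_household_ages is a hypothetical name.

```python
import random

def sketch_household_ages(size, ref_age, age_brackets, age_by_brackets, matrix, rng):
    """Hypothetical helper: ages for one household of the given size.

    matrix[i][j] is assumed to weight household contacts between
    age bracket i and age bracket j.
    """
    ages = [ref_age]  # the reference person's age comes first
    row = matrix[age_by_brackets[ref_age]]
    bracket_keys = sorted(age_brackets)
    weights = [row[k] for k in bracket_keys]
    for _ in range(size - 1):
        # pick a bracket in proportion to the reference person's matrix row
        b = rng.choices(bracket_keys, weights=weights, k=1)[0]
        # simplification: pick an age uniformly within that bracket
        ages.append(rng.choice(age_brackets[b]))
    return ages

# made-up inputs for illustration only
age_brackets = {0: list(range(0, 20)), 1: list(range(20, 65)), 2: list(range(65, 100))}
age_by_brackets = {a: b for b, ages in age_brackets.items() for a in ages}
matrix = [[5, 3, 1], [3, 4, 1], [1, 1, 2]]

household = sketch_household_ages(4, 40, age_brackets, age_by_brackets,
                                  matrix, random.Random(1))
print(household)  # four ages, starting with the reference age 40
```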

synthpops.contact_networks.generate_all_households(N, hh_sizes, hha_by_size_counts, hha_brackets, age_brackets, age_by_brackets_dic, contact_matrix_dic, single_year_age_distr)

Generate the ages of those living in households together. First create households of people living alone, then larger households. For households larger than 1, a reference individual’s age is sampled conditional on the household size, while all other household members have their ages sampled conditional on the reference person’s age and the age mixing contact matrix in households for the population under study.

Parameters
  • N (int) – The number of people in the population.

  • hh_sizes (array) – The count of household size s at index s-1.

  • hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.

  • hha_brackets (dict) – The age brackets for the heads of household.

  • age_brackets (dict) – The dictionary mapping age bracket keys to age bracket range.

  • age_by_brackets_dic (dict) – The dictionary mapping age to the age bracket range it falls within.

  • contact_matrix_dic (dict) – The dictionary of the age-specific contact matrix for different physical contact settings.

  • single_year_age_distr (dict) – The age distribution.

Returns

An array of all households where each household is a row and the values in the row are the ages of the household members. The first age in the row is the age of the reference individual. Households are randomly shuffled by size.

synthpops.contact_networks.assign_uids_by_homes(homes, id_len=16, use_int=True)

Assign IDs to everyone in order by their households.

Parameters
  • homes (array) – The generated synthetic ages of household members.

  • id_len (int) – The length of the UID.

  • use_int (bool) – If True, use ints for the uids of individuals; otherwise use strings of length ‘id_len’.

Returns

A copy of the generated households with IDs in place of ages, and a dictionary mapping ID to age.
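
A minimal sketch of the two return values, assuming sequential integer IDs. The library's actual ID scheme, including how id_len is applied, is not shown here, and sketch_assign_uids is a hypothetical name.

```python
def sketch_assign_uids(homes):
    """Hypothetical helper: replace ages with sequential integer IDs."""
    homes_by_uids = []
    age_by_uid = {}
    next_uid = 0
    for home in homes:
        uids = []
        for age in home:
            age_by_uid[next_uid] = age  # remember the age behind each ID
            uids.append(next_uid)
            next_uid += 1
        homes_by_uids.append(uids)
    return homes_by_uids, age_by_uid

homes = [[34], [41, 39, 7], [68, 70]]
homes_by_uids, age_by_uid = sketch_assign_uids(homes)
print(homes_by_uids)  # [[0], [1, 2, 3], [4, 5]]
print(age_by_uid[3])  # 7
```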

synthpops.contact_networks.get_uids_in_school(datadir, n, location, state_location, country_location, age_by_uid_dic=None, homes_by_uids=None, folder_name=None, use_default=False)

Identify who in the population is attending school based on enrollment rates by age.

Parameters
  • datadir (string) – The file path to the data directory.

  • n (int) – The number of people in the population.

  • location (string) – The name of the location.

  • state_location (string) – The name of the state the location is in.

  • country_location (string) – The name of the country the location is in.

  • age_by_uid_dic (dict) – A dictionary mapping ID to age for all individuals in the population.

  • homes_by_uids (list) – A list of lists where each sublist is a household and the IDs of the household members.

  • folder_name (string) – The name of the folder the location is in, e.g. ‘contact_networks’

  • use_default (bool) – If True, first try to use the other parameters to find data specific to the location under study and, if none is found, fall back to default data drawn from Seattle, Washington.

Returns

A dictionary of students in schools mapping their ID to their age, a dictionary of students in school mapping age to the list of IDs with that age, and a dictionary mapping age to the number of students with that age.
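
The shape of the three returned dictionaries can be sketched as follows. This illustration assumes enrollment rates are given as a plain dictionary mapping age to a probability, whereas the function itself reads them from datadir; sketch_uids_in_school and its seed argument are hypothetical.

```python
import random

def sketch_uids_in_school(age_by_uid, enrollment_rates, seed=0):
    """Hypothetical helper: enroll each person with the rate for their age."""
    rng = random.Random(seed)
    uids_in_school = {}         # ID -> age
    uids_in_school_by_age = {}  # age -> list of IDs
    ages_in_school_count = {}   # age -> number of students
    for uid, age in age_by_uid.items():
        # ages without a rate are treated as not enrolled (an assumption)
        if rng.random() < enrollment_rates.get(age, 0.0):
            uids_in_school[uid] = age
            uids_in_school_by_age.setdefault(age, []).append(uid)
            ages_in_school_count[age] = ages_in_school_count.get(age, 0) + 1
    return uids_in_school, uids_in_school_by_age, ages_in_school_count

age_by_uid = {0: 7, 1: 40, 2: 7, 3: 16}
enrollment_rates = {7: 1.0, 16: 0.5, 40: 0.0}
in_school, by_age, counts_by_age = sketch_uids_in_school(age_by_uid, enrollment_rates)
print(by_age[7])  # [0, 2]
```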

synthpops.contact_networks.generate_school_sizes(school_size_distr_by_bracket, school_size_brackets, uids_in_school)

Given a number of students in school, generate a list of school sizes to place everyone in a school.

Parameters
  • school_size_distr_by_bracket (dict) – The distribution of binned school sizes.

  • school_size_brackets (dict) – A dictionary of school size brackets.

  • uids_in_school (dict) – A dictionary of students in school mapping ID to age.

Returns

A list of school sizes whose sum is the length of uids_in_school.

synthpops.contact_networks.send_students_to_school(school_sizes, uids_in_school, uids_in_school_by_age, ages_in_school_count, age_brackets, age_by_brackets_dic, contact_matrix_dic, verbose=False)

A method to send students to school together. Constructing schools from contact matrices alone is imperfect, so some assignments are forced beyond what the matrices alone would produce. This method models schools using matrices and therefore does not create explicit school types.

Parameters
  • school_sizes (list) – A list of school sizes.

  • uids_in_school (dict) – A dictionary of students in school mapping ID to age.

  • uids_in_school_by_age (dict) – A dictionary of students in school mapping age to the list of IDs with that age.

  • ages_in_school_count (dict) – A dictionary mapping age to the number of students with that age.

  • age_brackets (dict) – A dictionary mapping age bracket keys to age bracket range.

  • age_by_brackets_dic (dict) – A dictionary mapping age to the age bracket range it falls within.

  • contact_matrix_dic (dict) – A dictionary of the age-specific contact matrix for different physical contact settings.

  • verbose (bool) – If True, print statements about the generated schools as they’re being generated.

Returns

Two lists of lists and a third flat list. In the first, each sublist contains the ages of the students in the same school. The second has the same structure but holds the ID of each student in place of their age. The third lists the school type of each school, with a single string representing its school type.

synthpops.contact_networks.send_students_to_school_with_school_types(school_size_distr_by_type, school_size_brackets, uids_in_school, uids_in_school_by_age, ages_in_school_count, school_types_by_age, school_type_age_ranges, verbose=False)

A method to send students to school together. This method uses the dictionaries school_types_by_age, school_type_age_ranges, and school_size_distr_by_type to first determine the type of school based on the age of a sampled reference student. The school type then determines the age range of the school. Next, the size of the school is sampled conditional on the school type, and the rest of the students are chosen from the lists of students available in the dictionary uids_in_school_by_age. This method is not perfect and requires a strict definition of school type by age. For now, it cannot model mixed school types such as schools with Kindergarten through Grade 8 (K-8) or Kindergarten through Grade 12. These mixed types of schools may be common in some settings, and this feature may be added later.

Parameters
  • school_size_distr_by_type (dict) – A dictionary of school size distributions binned by size groups or brackets for each school type.

  • school_size_brackets (dict) – A dictionary of school size brackets.

  • uids_in_school (dict) – A dictionary of students in school mapping ID to age.

  • uids_in_school_by_age (dict) – A dictionary of students in school mapping age to the list of IDs with that age.

  • ages_in_school_count (dict) – A dictionary mapping age to the number of students with that age.

  • school_types_by_age (dict) – A dictionary of the school type for each age.

  • school_type_age_ranges (dict) – A dictionary of the age range for each school type.

  • verbose (bool) – If True, print statements about the generated schools as they’re being generated.

Returns

Two lists of lists and a third flat list. In the first, each sublist contains the ages of the students in the same school. The second has the same structure but holds the ID of each student in place of their age. The third lists the school type of each school, with a single string representing its school type.
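
The steps described for this method can be sketched as below. This is a simplified illustration, not the library's implementation. The school type labels ('es', 'ms', 'hs'), the input values, the uniform draw within a size bracket, and the name sketch_schools_with_types are all assumptions made for the example.

```python
import random

def sketch_schools_with_types(uids_in_school, school_types_by_age,
                              school_type_age_ranges, size_distr_by_type,
                              size_brackets, seed=0):
    """Hypothetical helper: group students into typed schools."""
    rng = random.Random(seed)
    remaining = dict(uids_in_school)  # ID -> age, not yet placed
    schools_ages, schools_uids, school_types = [], [], []
    while remaining:
        # 1. sample a reference student; their age fixes the school type
        ref_uid = rng.choice(sorted(remaining))
        school_type = school_types_by_age[remaining[ref_uid]]
        # 2. the school type fixes the age range of the school
        age_range = school_type_age_ranges[school_type]
        # 3. sample a size bracket for this type, then a size within it
        distr = size_distr_by_type[school_type]
        brackets = sorted(distr)
        b = rng.choices(brackets, weights=[distr[k] for k in brackets], k=1)[0]
        size = rng.choice(size_brackets[b])
        # 4. fill the school from remaining students in the age range
        eligible = [u for u in sorted(remaining)
                    if u != ref_uid and remaining[u] in age_range]
        rng.shuffle(eligible)
        members = [ref_uid] + eligible[:size - 1]
        schools_uids.append(members)
        schools_ages.append([remaining[u] for u in members])
        school_types.append(school_type)
        for u in members:
            del remaining[u]
    return schools_ages, schools_uids, school_types

# made-up inputs for illustration only
uids_in_school = {uid: 5 + uid % 13 for uid in range(200)}  # ages 5 to 17
school_type_age_ranges = {'es': range(5, 11), 'ms': range(11, 14), 'hs': range(14, 18)}
school_types_by_age = {a: t for t, ages in school_type_age_ranges.items() for a in ages}
size_distr_by_type = {'es': {0: 0.7, 1: 0.3}, 'ms': {0: 0.5, 1: 0.5}, 'hs': {0: 0.2, 1: 0.8}}
size_brackets = {0: range(10, 30), 1: range(30, 60)}

schools_ages, schools_uids, school_types = sketch_schools_with_types(
    uids_in_school, school_types_by_age, school_type_age_ranges,
    size_distr_by_type, size_brackets)
print(len(schools_uids), school_types[:3])
```

In this sketch the last school of a given type may be smaller than its sampled size, because it takes only the students still unplaced.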

synthpops.contact_networks.get_uids_potential_workers(syn_school_uids, employment_rates, age_by_uid_dic)

Get IDs for everyone who could be a worker by removing those who are students and those who can’t be employed officially.

Parameters
  • syn_school_uids (list) – A list of lists where each sublist represents a school with the IDs of students in the school.

  • employment_rates (dict) – The employment rates by age.

  • age_by_uid_dic (dict) – A dictionary mapping ID to age for individuals in the population.

Returns

A dictionary of potential workers mapping their ID to their age, a dictionary mapping age to the list of IDs for potential workers with that age, and a dictionary mapping age to the count of potential workers left to assign to a workplace for that age.
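
A minimal sketch of the filtering and the three return values. It assumes that an age with no positive entry in employment_rates counts as not officially employable, which may differ from the library's rule; sketch_potential_workers is a hypothetical name.

```python
def sketch_potential_workers(syn_school_uids, employment_rates, age_by_uid):
    """Hypothetical helper: keep non-students whose age has an employment rate."""
    students = {uid for school in syn_school_uids for uid in school}
    potential_worker_uids = {}             # ID -> age
    potential_worker_uids_by_age = {}      # age -> list of IDs
    potential_worker_ages_left_count = {}  # age -> count left to assign
    for uid, age in age_by_uid.items():
        if uid in students:
            continue  # students are not potential workers
        if employment_rates.get(age, 0) <= 0:
            continue  # assumed meaning of "can't be employed officially"
        potential_worker_uids[uid] = age
        potential_worker_uids_by_age.setdefault(age, []).append(uid)
        potential_worker_ages_left_count[age] = (
            potential_worker_ages_left_count.get(age, 0) + 1)
    return (potential_worker_uids, potential_worker_uids_by_age,
            potential_worker_ages_left_count)

age_by_uid = {0: 8, 1: 17, 2: 35, 3: 52, 4: 90, 5: 35}
syn_school_uids = [[0, 1]]
employment_rates = {17: 0.2, 35: 0.8, 52: 0.7}
workers, workers_by_age, left_count = sketch_potential_workers(
    syn_school_uids, employment_rates, age_by_uid)
print(workers)     # {2: 35, 3: 52, 5: 35}
print(left_count)  # {35: 2, 52: 1}
```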

synthpops.contact_networks.generate_workplace_sizes(workplace_size_distr_by_bracket, workplace_size_brackets, workers_by_age_to_assign_count)

Given a number of individuals employed, generate a list of workplace sizes to place everyone in a workplace.

Parameters
  • workplace_size_distr_by_bracket (dict) – The distribution of binned workplace sizes.

  • workplace_size_brackets (dict) – A dictionary of workplace size brackets.

  • workers_by_age_to_assign_count (dict) – A dictionary mapping age to the count of employed individuals of that age.

Returns

A list of workplace sizes.

synthpops.contact_networks.generate_usa_workplace_sizes(workplace_sizes_by_bracket, workplace_size_brackets, workers_by_age_to_assign_count)

Given a number of individuals employed, generate a list of workplace sizes to place everyone in a workplace. Specific to data from the US. Deprecated.

Parameters
  • workplace_sizes_by_bracket (dict) – The distribution of binned workplace sizes.

  • workplace_size_brackets (dict) – A dictionary of workplace size brackets.

  • workers_by_age_to_assign_count (dict) – A dictionary mapping age to the count of employed individuals of that age.

Returns

A list of workplace sizes.

synthpops.contact_networks.get_workers_by_age_to_assign(employment_rates, potential_worker_ages_left_count, uids_by_age_dic)

Get the number of people to assign to a workplace by age using those left who can potentially go to work and employment rates by age.

Parameters
  • employment_rates (dict) – A dictionary of employment rates by age.

  • potential_worker_ages_left_count (dict) – A dictionary of the count of workers to assign by age.

  • uids_by_age_dic (dict) – A dictionary mapping age to the list of ids with that age.

Returns

A dictionary with a count of workers to assign to a workplace.
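
One way to read this calculation is sketched below. It assumes the count for each age is the employment rate times the number of people of that age, capped by the number of potential workers left; the library's exact rounding may differ, and sketch_workers_by_age_to_assign is a hypothetical name.

```python
def sketch_workers_by_age_to_assign(employment_rates,
                                    potential_worker_ages_left_count,
                                    uids_by_age):
    """Hypothetical helper: expected workers per age, capped by who is left."""
    to_assign = {}
    for age, left in potential_worker_ages_left_count.items():
        n_people = len(uids_by_age.get(age, []))
        expected = int(round(employment_rates.get(age, 0) * n_people))
        to_assign[age] = min(expected, left)  # cannot assign more than are left
    return to_assign

employment_rates = {35: 0.8, 52: 0.5}
uids_by_age = {35: list(range(10)), 52: list(range(10, 14))}
left = {35: 6, 52: 4}
to_assign = sketch_workers_by_age_to_assign(employment_rates, left, uids_by_age)
print(to_assign)  # {35: 6, 52: 2}
```

For age 35 the rate implies 8 workers, but only 6 potential workers are left, so the cap applies. For age 52 the rate implies 2 of the 4 who are left.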

synthpops.contact_networks.assign_teachers_to_schools(syn_schools, syn_school_uids, employment_rates, workers_by_age_to_assign_count, potential_worker_uids, potential_worker_uids_by_age, potential_worker_ages_left_count, average_student_teacher_ratio=20, teacher_age_min=25, teacher_age_max=75, verbose=False)

Assign teachers to each school according to the average student-teacher ratio.

Parameters
  • syn_schools (list) – A list of lists where each sublist is a school with the ages of the students within.

  • syn_school_uids (list) – A list of lists where each sublist is a school with the IDs of the students within.

  • employment_rates (dict) – The employment rates by age.

  • workers_by_age_to_assign_count (dict) – A dictionary of the count of workers left to assign by age.

  • potential_worker_uids (dict) – A dictionary of potential workers mapping their ID to their age.

  • potential_worker_uids_by_age (dict) – A dictionary mapping age to the list of worker IDs with that age.

  • potential_worker_ages_left_count (dict) – A dictionary of the count of potential workers left that can be assigned by age.

  • average_student_teacher_ratio (float) – The average number of students per teacher.

  • teacher_age_min (int) – The minimum age for teachers. This should be location-specific.

  • teacher_age_max (int) – The maximum age for teachers. This should be location-specific.

  • verbose (bool) – If True, print statements about the generated schools as teachers are being added to each school.

Returns

List of lists of schools with the ages of individuals in each, lists of lists of schools with the ids of individuals in each, dictionary of potential workers mapping id to their age, dictionary mapping age to the list of potential workers of that age, dictionary with the count of workers left to assign for each age after teachers have been assigned.

synthpops.contact_networks.assign_additional_staff_to_schools(syn_school_uids, syn_teacher_uids, workers_by_age_to_assign_count, potential_worker_uids, potential_worker_uids_by_age, potential_worker_ages_left_count, average_student_teacher_ratio=20, average_student_all_staff_ratio=15, staff_age_min=20, staff_age_max=75, verbose=True)

Assign additional staff to each school according to the average student to all staff ratio.

Parameters
  • syn_school_uids (list) – A list of lists where each sublist is a school with the IDs of the students within.

  • syn_teacher_uids (list) – A list of lists where each sublist is a school with the IDs of the teachers within.

  • workers_by_age_to_assign_count (dict) – A dictionary of the count of workers left to assign by age.

  • potential_worker_uids (dict) – A dictionary of potential workers mapping their ID to their age.

  • potential_worker_uids_by_age (dict) – A dictionary mapping age to the list of worker IDs with that age.

  • potential_worker_ages_left_count (dict) – A dictionary of the count of potential workers left that can be assigned by age.

  • average_student_teacher_ratio (float) – The average number of students per teacher.

  • average_student_all_staff_ratio (float) – The average number of students per staff member at school (including both teachers and non-teachers).

  • staff_age_min (int) – The minimum age for non-teaching staff.

  • staff_age_max (int) – The maximum age for non-teaching staff.

  • verbose (bool) – If True, print statements about the generated schools as additional staff are being added to each school.

Returns

List of lists of schools with the IDs of non-teaching staff for each school, dictionary of potential workers mapping ID to their age, dictionary mapping age to the list of potential workers of that age, dictionary with the count of workers left to assign for each age after additional staff have been assigned.
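
The two ratios imply a simple staffing calculation per school, sketched here with the default values from the signatures above. Rounding up to whole people is an assumption, and sketch_staff_counts is a hypothetical name.

```python
import math

def sketch_staff_counts(n_students, student_teacher_ratio=20,
                        student_all_staff_ratio=15):
    """Hypothetical helper: teachers and additional staff for one school."""
    n_teachers = math.ceil(n_students / student_teacher_ratio)
    n_all_staff = math.ceil(n_students / student_all_staff_ratio)
    # additional staff are whatever the all-staff ratio requires beyond teachers
    n_additional_staff = max(n_all_staff - n_teachers, 0)
    return n_teachers, n_additional_staff

staff = sketch_staff_counts(300)
print(staff)  # (15, 5): 300/20 = 15 teachers, 300/15 = 20 staff in total
```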

synthpops.contact_networks.assign_rest_of_workers(workplace_sizes, potential_worker_uids, potential_worker_uids_by_age, workers_by_age_to_assign_count, age_by_uid_dic, age_brackets, age_by_brackets_dic, contact_matrix_dic, verbose=False)

Assign the rest of the workers to non-school workplaces.

Parameters
  • workplace_sizes (list) – A list of workplace sizes.

  • potential_worker_uids (dict) – A dictionary of potential workers mapping their ID to their age.

  • potential_worker_uids_by_age (dict) – A dictionary mapping age to the list of worker IDs with that age.

  • workers_by_age_to_assign_count (dict) – A dictionary of the count of workers left to assign by age.

  • age_by_uid_dic (dict) – A dictionary mapping ID to age for all individuals in the population.

  • age_brackets (dict) – A dictionary mapping age bracket keys to age bracket range.

  • age_by_brackets_dic (dict) – A dictionary mapping age to the age bracket range it falls within.

  • contact_matrix_dic (dict) – A dictionary of the age-specific contact matrix for different physical contact settings.

  • verbose (bool) – If True, print statements about the generated workplaces as workers are being assigned to each workplace.

Returns

List of lists where each sublist is a workplace with the ages of workers, list of lists where each sublist is a workplace with the ids of workers, dictionary of potential workers left mapping id to age, dictionary mapping age to a list of potential workers left of that age, dictionary mapping age to the count of workers left to assign.

synthpops.contact_networks.generate_synthetic_population(n, datadir, location='seattle_metro', state_location='Washington', country_location='usa', sheet_name='United States of America', school_enrollment_counts_available=False, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, teacher_age_min=25, teacher_age_max=75, average_student_all_staff_ratio=15, average_additional_staff_degree=20, staff_age_min=20, staff_age_max=75, verbose=False, plot=False, write=False, return_popdict=False, use_default=False)

Wrapper function that calls other functions to generate a full population with their contacts in the household, school, and workplace layers, and then optionally writes this population to the appropriate files.

Parameters
  • n (int) – The number of people in the population.

  • datadir (string) – The file path to the data directory.

  • location (string) – The name of the location.

  • state_location (string) – The name of the state the location is in.

  • country_location (string) – The name of the country the location is in.

  • sheet_name (string) – The name of the sheet in the Excel file with contact patterns.

  • school_enrollment_counts_available (bool) – If True, a list of school sizes is available and a count of the sizes can be constructed.

  • with_school_types (bool) – If True, create explicit school types.

  • average_class_size (float) – The average classroom size.

  • inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.

  • average_student_teacher_ratio (float) – The average number of students per teacher.

  • average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.

  • teacher_age_min (int) – The minimum age for teachers.

  • teacher_age_max (int) – The maximum age for teachers.

  • average_student_all_staff_ratio (float) – The average number of students per staff member at school (including both teachers and non-teachers).

  • average_additional_staff_degree (float) – The average number of contacts per additional non-teaching staff member in schools.

  • staff_age_min (int) – The minimum age for non-teaching staff.

  • staff_age_max (int) – The maximum age for non-teaching staff.

  • school_mixing_type (string) – The mixing type for schools.

  • verbose (bool) – If True, print statements as contacts are being generated.

  • plot (bool) – If True, plot and show a comparison of the generated age distribution in households vs. the expected age distribution of the population from census data being sampled.

  • write (bool) – If True, write population to file.

  • return_popdict (bool) – If True, return a dictionary of individuals in the population.

  • use_default (bool) – If True, first try to use the other parameters to find data specific to the location under study and, if none is found, fall back to default data drawn from Seattle, Washington.

Returns

If return_popdict is True, returns popdict, a dictionary of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. The available layers are households (‘H’), schools (‘S’), and workplaces (‘W’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is a contact of each other. Otherwise, returns None.
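
To make the shape of popdict concrete, the sketch below builds a household-only version by hand. The attribute keys age, hhid, scid, and wpid and the layer keys follow the description above. The 'contacts' key, the use of None for a missing school or workplace, the use of sets, and the name sketch_popdict are assumptions for this example.

```python
def sketch_popdict(homes_by_uids, age_by_uid):
    """Hypothetical helper: popdict-like structure with household contacts only."""
    popdict = {}
    for hhid, home in enumerate(homes_by_uids):
        for uid in home:
            popdict[uid] = {
                'age': age_by_uid[uid],
                'hhid': hhid,
                'scid': None,  # assumed placeholder: not in a school
                'wpid': None,  # assumed placeholder: not in a workplace
                'contacts': {
                    # all other household members are contacts
                    'H': set(home) - {uid},
                    'S': set(),
                    'W': set(),
                },
            }
    return popdict

homes_by_uids = [[0], [1, 2, 3]]
age_by_uid = {0: 34, 1: 41, 2: 39, 3: 7}
popdict = sketch_popdict(homes_by_uids, age_by_uid)
print(popdict[2]['contacts']['H'])  # {1, 3}
```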

Example

import synthpops as sp

datadir = sp.datadir  # point datadir to where your data folder lives

location = 'seattle_metro'
state_location = 'Washington'
country_location = 'usa'
sheet_name = 'United States of America'

n = 10000
verbose = False
plot = False

# this will generate a population with microstructure and age demographics that
# approximate those of the location selected
# with write=True, the population is also saved to file in:
#    datadir/demographics/contact_matrices_152_countries/state_location/
sp.generate_synthetic_population(n, datadir, location=location,
                                 state_location=state_location,
                                 country_location=country_location,
                                 sheet_name=sheet_name, verbose=verbose, plot=plot)