synthpops.contact_networks module¶
This module generates the household, school, and workplace contact networks.
-
synthpops.contact_networks.
generate_household_sizes
(Nhomes, hh_size_distr)¶ Given a number of homes and a household size distribution, generate the number of homes of each size.
- Parameters
Nhomes (int) – The number of homes.
hh_size_distr (dict) – The distribution of household sizes.
- Returns
An array with the count of households of size s at index s-1.
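The index convention of the returned array can be pictured with a minimal sketch. This is not the library's implementation: `sketch_household_size_counts` is a hypothetical helper, and it assumes `hh_size_distr` maps household size (starting at 1) to a probability and that homes lost to truncation go to the most common size.

```python
def sketch_household_size_counts(n_homes, hh_size_distr):
    """Hypothetical helper: count of households of size s stored at index s-1."""
    counts = [0] * max(hh_size_distr)
    for size, prob in hh_size_distr.items():
        counts[size - 1] = int(n_homes * prob)  # truncate the expected count
    # Assumption: homes lost to truncation are given to the most common size.
    most_common = max(hh_size_distr, key=hh_size_distr.get)
    counts[most_common - 1] += n_homes - sum(counts)
    return counts


distr = {1: 0.3, 2: 0.35, 3: 0.2, 4: 0.15}  # illustrative values, not real data
counts = sketch_household_size_counts(1000, distr)
```

Here `counts[0]` holds the number of single-person homes and `counts[3]` the number of four-person homes.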
-
synthpops.contact_networks.
generate_household_sizes_from_fixed_pop_size
(N, hh_size_distr)¶ Given a number of people and a household size distribution, generate the number of homes of each size needed to place everyone in a household.
- Parameters
N (int) – The number of people in the population.
hh_size_distr (dict) – The distribution of household sizes.
- Returns
An array with the count of households of size s at index s-1.
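One way to think about sizing households for a fixed population is sketched below. It is an illustration under stated assumptions, not the library's algorithm: `sketch_household_counts_for_population` is a hypothetical helper that estimates the number of homes from the mean household size and places any leftover people in single-person households.

```python
def sketch_household_counts_for_population(n_people, hh_size_distr):
    """Hypothetical helper: household counts (size s at index s-1) that house n_people."""
    avg_size = sum(size * prob for size, prob in hh_size_distr.items())
    n_homes = int(n_people / avg_size)  # rough number of homes needed
    counts = [int(n_homes * hh_size_distr.get(size, 0.0))
              for size in range(1, max(hh_size_distr) + 1)]
    placed = sum((index + 1) * count for index, count in enumerate(counts))
    # Assumption: people not yet placed each get a single-person household.
    counts[0] += n_people - placed
    return counts


distr = {1: 0.3, 2: 0.35, 3: 0.2, 4: 0.15}  # illustrative values, not real data
counts = sketch_household_counts_for_population(1000, distr)
people_housed = sum((index + 1) * count for index, count in enumerate(counts))
```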
-
synthpops.contact_networks.
get_totalpopsize_from_household_sizes
(hh_sizes)¶ Sum the total population across all household sizes from the count array.
- Parameters
hh_sizes (array) – The count of household size s at index s-1.
- Returns
An integer giving the total number of people across all households.
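Because the count of households of size s sits at index s-1, each entry contributes count × (index + 1) people. A minimal sketch of that arithmetic, using the hypothetical helper name `sketch_total_population`:

```python
def sketch_total_population(hh_sizes):
    """Hypothetical helper: people implied by a household-size count array."""
    return sum(count * (index + 1) for index, count in enumerate(hh_sizes))


hh_sizes = [300, 350, 200, 150]  # illustrative counts for sizes 1 to 4
total = sketch_total_population(hh_sizes)  # 300*1 + 350*2 + 200*3 + 150*4
```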
-
synthpops.contact_networks.
generate_household_head_age_by_size
(hha_by_size_counts, hha_brackets, hh_size, single_year_age_distr)¶ Generate the age of the head of the household, also known as the reference person of the household, conditional on the size of the household.
- Parameters
hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.
hha_brackets (dict) – The age brackets for the heads of household.
hh_size (int) – The household size.
single_year_age_distr (dict) – The age distribution.
- Returns
Age of the head of the household or reference person.
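The two-stage sampling implied by these parameters can be sketched as follows. This is an assumption-laden illustration, not the library's code: `sketch_head_age` is a hypothetical helper, `hha_brackets` is assumed to map a bracket index to a list of ages, and the `rng` argument is added here only to make the example reproducible.

```python
import random


def sketch_head_age(hha_by_size_counts, hha_brackets, hh_size,
                    single_year_age_distr, rng=random):
    """Hypothetical helper: sample a reference person's age given household size."""
    row = hha_by_size_counts[hh_size - 1]  # row for size s lives at index s-1
    # Stage 1: pick an age bracket in proportion to the counts in that row.
    bracket = rng.choices(range(len(row)), weights=row, k=1)[0]
    # Stage 2: pick a single year of age inside the bracket, weighted by the
    # population age distribution.
    ages = hha_brackets[bracket]
    weights = [single_year_age_distr[age] for age in ages]
    return rng.choices(ages, weights=weights, k=1)[0]


# Illustrative inputs, not real data.
hha_brackets = {0: list(range(18, 35)), 1: list(range(35, 65)), 2: list(range(65, 100))}
hha_by_size_counts = [[10, 20, 30], [25, 40, 10]]
single_year_age_distr = {age: 1 / 100 for age in range(100)}
age = sketch_head_age(hha_by_size_counts, hha_brackets, 2,
                      single_year_age_distr, rng=random.Random(0))
```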
-
synthpops.contact_networks.
generate_living_alone
(hh_sizes, hha_by_size_counts, hha_brackets, single_year_age_distr)¶ Generate the ages of those living alone.
- Parameters
hh_sizes (array) – The count of household size s at index s-1.
hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.
hha_brackets (dict) – The age brackets for the heads of household.
single_year_age_distr (dict) – The age distribution.
- Returns
An array of households of size 1 where each household is a row and the value in the row is the age of the household member.
-
synthpops.contact_networks.
generate_larger_households
(size, hh_sizes, hha_by_size_counts, hha_brackets, age_brackets, age_by_brackets_dic, contact_matrix_dic, single_year_age_distr)¶ Generate ages of those living in households of greater than one individual. Reference individual is sampled conditional on the household size. All other household members have their ages sampled conditional on the reference person’s age and the age mixing contact matrix in households for the population under study.
- Parameters
size (int) – The household size.
hh_sizes (array) – The count of household size s at index s-1.
hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.
hha_brackets (dict) – The age brackets for the heads of household.
age_brackets (dict) – A dictionary mapping age bracket keys to age bracket range.
age_by_brackets_dic (dict) – A dictionary mapping age to the age bracket range it falls within.
contact_matrix_dic (dict) – A dictionary of the age-specific contact matrix for different physical contact settings.
single_year_age_distr (dict) – The age distribution.
- Returns
An array of households for the given size, where each household is a row and the values in the row are the ages of the household members. The first age in the row is the age of the reference individual.
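Sampling the remaining members conditional on the reference person can be sketched like this. It is a simplified illustration rather than the library's implementation: `sketch_household_ages` is a hypothetical helper, the household contact matrix is assumed to be a list of rows indexed by bracket key, and ages within a bracket are drawn uniformly for brevity.

```python
import random


def sketch_household_ages(size, ref_age, age_brackets, age_by_brackets_dic,
                          household_matrix, rng=random):
    """Hypothetical helper: ages for one household, reference person first."""
    ref_bracket = age_by_brackets_dic[ref_age]
    bracket_keys = list(age_brackets)
    ages = [ref_age]
    for _ in range(size - 1):
        # Row of the matrix for the reference bracket gives the mixing weights.
        bracket = rng.choices(bracket_keys, weights=household_matrix[ref_bracket], k=1)[0]
        # Simplification: uniform choice of a single age inside the bracket.
        ages.append(rng.choice(age_brackets[bracket]))
    return ages


# Illustrative inputs, not real data.
age_brackets = {0: list(range(0, 20)), 1: list(range(20, 60)), 2: list(range(60, 100))}
age_by_brackets_dic = {age: key for key, ages in age_brackets.items() for age in ages}
household_matrix = [[0.5, 0.4, 0.1], [0.4, 0.4, 0.2], [0.1, 0.3, 0.6]]
household = sketch_household_ages(4, 42, age_brackets, age_by_brackets_dic,
                                  household_matrix, rng=random.Random(0))
```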
-
synthpops.contact_networks.
generate_all_households
(N, hh_sizes, hha_by_size_counts, hha_brackets, age_brackets, age_by_brackets_dic, contact_matrix_dic, single_year_age_distr)¶ Generate the ages of those living in households together. First create households of people living alone, then larger households. For households larger than 1, a reference individual’s age is sampled conditional on the household size, while all other household members have their ages sampled conditional on the reference person’s age and the age mixing contact matrix in households for the population under study.
- Parameters
N (int) – The number of people in the population.
hh_sizes (array) – The count of household size s at index s-1.
hha_by_size_counts (matrix) – A matrix in which each row contains the age distribution of the reference person for household size s at index s-1.
hha_brackets (dict) – The age brackets for the heads of household.
age_brackets (dict) – The dictionary mapping age bracket keys to age bracket range.
age_by_brackets_dic (dict) – The dictionary mapping age to the age bracket range it falls within.
contact_matrix_dic (dict) – The dictionary of the age-specific contact matrix for different physical contact settings.
single_year_age_distr (dict) – The age distribution.
- Returns
An array of all households where each household is a row and the values in the row are the ages of the household members. The first age in the row is the age of the reference individual. Households are randomly shuffled by size.
-
synthpops.contact_networks.
assign_uids_by_homes
(homes, id_len=16, use_int=True)¶ Assign IDs to everyone in order by their households.
- Parameters
homes (array) – The generated synthetic ages of household members.
id_len (int) – The length of the UID.
use_int (bool) – If True, use ints for the uids of individuals; otherwise use strings of length ‘id_len’.
- Returns
A copy of the generated households with IDs in place of ages, and a dictionary mapping ID to age.
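The shape of the two return values can be illustrated with a short sketch. It assumes sequential integer IDs, which may differ from how the library actually builds UIDs; `sketch_assign_uids` is a hypothetical name.

```python
def sketch_assign_uids(homes):
    """Hypothetical helper: replace ages with sequential integer IDs."""
    age_by_uid = {}
    homes_by_uids = []
    next_uid = 0
    for home in homes:
        uids = []
        for age in home:
            age_by_uid[next_uid] = age  # remember each person's age
            uids.append(next_uid)
            next_uid += 1
        homes_by_uids.append(uids)
    return homes_by_uids, age_by_uid


homes = [[34], [41, 39, 7], [68, 70]]  # illustrative household ages
homes_by_uids, age_by_uid = sketch_assign_uids(homes)
```

Household members receive consecutive IDs, so the household structure is preserved while ages move into the lookup dictionary.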
-
synthpops.contact_networks.
get_uids_in_school
(datadir, n, location, state_location, country_location, age_by_uid_dic=None, homes_by_uids=None, folder_name=None, use_default=False)¶ Identify who in the population is attending school based on enrollment rates by age.
- Parameters
datadir (string) – The file path to the data directory.
n (int) – The number of people in the population.
location (string) – The name of the location.
state_location (string) – The name of the state the location is in.
country_location (string) – The name of the country the location is in.
age_by_uid_dic (dict) – A dictionary mapping ID to age for all individuals in the population.
homes_by_uids (list) – A list of lists where each sublist is a household and the IDs of the household members.
folder_name (string) – The name of the folder the location is in, e.g. ‘contact_networks’
use_default (bool) – If True, first try to find data specific to the location under study using the other parameters, and fall back to default data drawn from Seattle, Washington if none is found.
- Returns
A dictionary of students in schools mapping their ID to their age, a dictionary of students in school mapping age to the list of IDs with that age, and a dictionary mapping age to the number of students with that age.
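The three returned dictionaries can be pictured with the sketch below. It does not read any data files: `sketch_students` is a hypothetical helper, and the enrollment rates are passed in directly as an assumed mapping from age to probability of enrollment.

```python
import random
from collections import defaultdict


def sketch_students(age_by_uid_dic, enrollment_rates, rng=random):
    """Hypothetical helper: decide who attends school from enrollment rates by age."""
    uids_in_school = {}
    uids_in_school_by_age = defaultdict(list)
    for uid, age in age_by_uid_dic.items():
        # Enroll each person with the probability given for their age.
        if rng.random() < enrollment_rates.get(age, 0.0):
            uids_in_school[uid] = age
            uids_in_school_by_age[age].append(uid)
    ages_in_school_count = {age: len(uids) for age, uids in uids_in_school_by_age.items()}
    return uids_in_school, dict(uids_in_school_by_age), ages_in_school_count


# Illustrative inputs: rates of 1.0 make the outcome deterministic.
age_by_uid_dic = {0: 6, 1: 6, 2: 10, 3: 40, 4: 15}
enrollment_rates = {6: 1.0, 10: 1.0, 15: 1.0}
in_school, in_school_by_age, in_school_count = sketch_students(age_by_uid_dic, enrollment_rates)
```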
-
synthpops.contact_networks.
generate_school_sizes
(school_size_distr_by_bracket, school_size_brackets, uids_in_school)¶ Given a number of students in school, generate a list of school sizes to place everyone in a school.
- Parameters
school_size_distr_by_bracket (dict) – The distribution of binned school sizes.
school_size_brackets (dict) – A dictionary of school size brackets.
uids_in_school (dict) – A dictionary of students in school mapping ID to age.
- Returns
A list of school sizes whose sum is the length of uids_in_school.
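Drawing sizes from a binned distribution until every student is placed can be sketched as follows. This is an illustration under assumptions, not the library's code: `sketch_school_sizes` is a hypothetical helper, brackets are assumed to map a key to a list of sizes, and the last school is trimmed to whoever is left.

```python
import random


def sketch_school_sizes(school_size_distr_by_bracket, school_size_brackets,
                        n_students, rng=random):
    """Hypothetical helper: school sizes that sum to n_students."""
    keys = list(school_size_distr_by_bracket)
    weights = [school_size_distr_by_bracket[key] for key in keys]
    sizes = []
    remaining = n_students
    while remaining > 0:
        bracket = rng.choices(keys, weights=weights, k=1)[0]  # pick a size bin
        size = rng.choice(school_size_brackets[bracket])      # pick a size in the bin
        size = min(size, remaining)  # assumption: final school takes the remainder
        sizes.append(size)
        remaining -= size
    return sizes


# Illustrative inputs, not real data.
school_size_brackets = {0: list(range(50, 100)), 1: list(range(100, 300)), 2: list(range(300, 600))}
school_size_distr_by_bracket = {0: 0.2, 1: 0.5, 2: 0.3}
sizes = sketch_school_sizes(school_size_distr_by_bracket, school_size_brackets,
                            2000, rng=random.Random(0))
```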
-
synthpops.contact_networks.
send_students_to_school
(school_sizes, uids_in_school, uids_in_school_by_age, ages_in_school_count, age_brackets, age_by_brackets_dic, contact_matrix_dic, verbose=False)¶ Send students to school together using the age-mixing contact matrix. Building schools from the contact matrix alone is imperfect, so some assignments are forced beyond what the matrix would produce on its own. Because schools are modeled from the matrix, this method does not create explicit school types.
- Parameters
school_sizes (list) – A list of school sizes.
uids_in_school (dict) – A dictionary of students in school mapping ID to age.
uids_in_school_by_age (dict) – A dictionary of students in school mapping age to the list of IDs with that age.
ages_in_school_count (dict) – A dictionary mapping age to the number of students with that age.
age_brackets (dict) – A dictionary mapping age bracket keys to age bracket range.
age_by_brackets_dic (dict) – A dictionary mapping age to the age bracket range it falls within.
contact_matrix_dic (dict) – A dictionary of age specific contact matrix for different physical contact settings.
verbose (bool) – If True, print statements about the generated schools as they’re being generated.
- Returns
Two lists of lists and a third flat list. In the first, each sublist holds the ages of students in the same school; the second is the same list with the ID of each student in place of their age. The third is a list of school types, with a single string representing each school's type.
-
synthpops.contact_networks.
send_students_to_school_with_school_types
(school_size_distr_by_type, school_size_brackets, uids_in_school, uids_in_school_by_age, ages_in_school_count, school_types_by_age, school_type_age_ranges, verbose=False)¶ A method to send students to school together. This method uses the dictionaries school_types_by_age, school_type_age_ranges, and school_size_distr_by_type to first determine the type of school based on the age of a sampled reference student. Then the school type is used to determine the age range of the school. After that, the size of the school is then sampled conditionally on the school type and then the rest of the students are chosen from the lists of students available in the dictionary uids_in_school_by_age. This method is not perfect and requires a strict definition of school type by age. For now, it is not able to model mixed school types such as schools with Kindergarten through Grade 8 (K-8), or Kindergarten through Grade 12. These mixed types of schools may be common in some settings and this feature may be added later.
- Parameters
school_size_distr_by_type (dict) – A dictionary of school size distributions binned by size groups or brackets for each school type.
school_size_brackets (dict) – A dictionary of school size brackets.
uids_in_school (dict) – A dictionary of students in school mapping ID to age.
uids_in_school_by_age (dict) – A dictionary of students in school mapping age to the list of IDs with that age.
ages_in_school_count (dict) – A dictionary mapping age to the number of students with that age.
school_types_by_age (dict) – A dictionary of the school type for each age.
school_type_age_ranges (dict) – A dictionary of the age range for each school type.
verbose (bool) – If True, print statements about the generated schools as they’re being generated.
- Returns
Two lists of lists and a third flat list. In the first, each sublist holds the ages of students in the same school; the second is the same list with the ID of each student in place of their age. The third is a list of school types, with a single string representing each school's type.
-
synthpops.contact_networks.
get_uids_potential_workers
(syn_school_uids, employment_rates, age_by_uid_dic)¶ Get IDs for everyone who could be a worker by removing those who are students and those who can’t be employed officially.
- Parameters
syn_school_uids (list) – A list of lists where each sublist represents a school with the IDs of students in the school.
employment_rates (dict) – The employment rates by age.
age_by_uid_dic (dict) – A dictionary mapping ID to age for individuals in the population.
- Returns
A dictionary of potential workers mapping their ID to their age, a dictionary mapping age to the list of IDs for potential workers with that age, and a dictionary mapping age to the count of potential workers left to assign to a workplace for that age.
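The filtering step and the three returned dictionaries can be sketched as below. The sketch assumes that an age with no entry in `employment_rates` cannot be employed officially, which may not match the library's rule; `sketch_potential_workers` is a hypothetical name.

```python
from collections import defaultdict


def sketch_potential_workers(syn_school_uids, employment_rates, age_by_uid_dic):
    """Hypothetical helper: everyone who is neither a student nor too young/old to work."""
    student_uids = {uid for school in syn_school_uids for uid in school}
    potential_worker_uids = {}
    potential_worker_uids_by_age = defaultdict(list)
    for uid, age in age_by_uid_dic.items():
        # Assumption: ages missing from employment_rates are not employable.
        if uid in student_uids or age not in employment_rates:
            continue
        potential_worker_uids[uid] = age
        potential_worker_uids_by_age[age].append(uid)
    ages_left_count = {age: len(uids) for age, uids in potential_worker_uids_by_age.items()}
    return potential_worker_uids, dict(potential_worker_uids_by_age), ages_left_count


# Illustrative inputs, not real data.
age_by_uid_dic = {0: 8, 1: 17, 2: 17, 3: 35, 4: 35, 5: 80}
syn_school_uids = [[0, 1]]
employment_rates = {17: 0.2, 35: 0.9}
workers, workers_by_age, ages_left = sketch_potential_workers(
    syn_school_uids, employment_rates, age_by_uid_dic)
```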
-
synthpops.contact_networks.
generate_workplace_sizes
(workplace_size_distr_by_bracket, workplace_size_brackets, workers_by_age_to_assign_count)¶ Given a number of individuals employed, generate a list of workplace sizes to place everyone in a workplace.
- Parameters
workplace_size_distr_by_bracket (dict) – The distribution of binned workplace sizes.
workplace_size_brackets (dict) – A dictionary of workplace size brackets.
workers_by_age_to_assign_count (dict) – A dictionary mapping age to the count of employed individuals of that age.
- Returns
A list of workplace sizes.
-
synthpops.contact_networks.
generate_usa_workplace_sizes
(workplace_sizes_by_bracket, workplace_size_brackets, workers_by_age_to_assign_count)¶ Given a number of individuals employed, generate a list of workplace sizes to place everyone in a workplace. Specific to data from the US. Deprecated.
- Parameters
workplace_sizes_by_bracket (dict) – The distribution of binned workplace sizes.
workplace_size_brackets (dict) – A dictionary of workplace size brackets.
workers_by_age_to_assign_count (dict) – A dictionary mapping age to the count of employed individuals of that age.
- Returns
A list of workplace sizes.
-
synthpops.contact_networks.
get_workers_by_age_to_assign
(employment_rates, potential_worker_ages_left_count, uids_by_age_dic)¶ Get the number of people to assign to a workplace by age using those left who can potentially go to work and employment rates by age.
- Parameters
employment_rates (dict) – A dictionary of employment rates by age.
potential_worker_ages_left_count (dict) – A dictionary of the count of workers to assign by age.
uids_by_age_dic (dict) – A dictionary mapping age to the list of ids with that age.
- Returns
A dictionary with a count of workers to assign to a workplace.
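A plausible reading of this calculation is sketched below: the employment rate is applied to everyone of a given age, and the result is capped by the number of potential workers still available. This is an assumption about the logic, not a copy of the library's code, and `sketch_workers_to_assign` is a hypothetical name.

```python
def sketch_workers_to_assign(employment_rates, potential_worker_ages_left_count,
                             uids_by_age_dic):
    """Hypothetical helper: number of workers to place in workplaces, by age."""
    to_assign = {}
    for age, n_left in potential_worker_ages_left_count.items():
        rate = employment_rates.get(age, 0.0)
        expected = int(rate * len(uids_by_age_dic.get(age, [])))
        # Cannot assign more workers than are still available at this age.
        to_assign[age] = min(expected, n_left)
    return to_assign


# Illustrative inputs, not real data.
employment_rates = {30: 0.5, 17: 0.25}
uids_by_age_dic = {30: list(range(10)), 17: list(range(10, 18))}
potential_worker_ages_left_count = {30: 9, 17: 1}
to_assign = sketch_workers_to_assign(employment_rates,
                                     potential_worker_ages_left_count, uids_by_age_dic)
```

In this example the rate for age 17 would employ two people, but only one potential worker of that age remains, so the count is capped at one.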
-
synthpops.contact_networks.
assign_teachers_to_schools
(syn_schools, syn_school_uids, employment_rates, workers_by_age_to_assign_count, potential_worker_uids, potential_worker_uids_by_age, potential_worker_ages_left_count, average_student_teacher_ratio=20, teacher_age_min=25, teacher_age_max=75, verbose=False)¶ Assign teachers to each school according to the average student-teacher ratio.
- Parameters
syn_schools (list) – list of lists where each sublist is a school with the ages of the students within
syn_school_uids (list) – list of lists where each sublist is a school with the ids of the students within
employment_rates (dict) – employment rates by age
workers_by_age_to_assign_count (dict) – dictionary of the count of workers left to assign by age
potential_worker_uids (dict) – dictionary of potential workers mapping their id to their age
potential_worker_uids_by_age (dict) – dictionary mapping age to the list of worker ids with that age
potential_worker_ages_left_count (dict) – dictionary of the count of potential workers left that can be assigned by age
average_student_teacher_ratio (float) – The average number of students per teacher.
teacher_age_min (int) – minimum age for teachers - should be location specific.
teacher_age_max (int) – maximum age for teachers - should be location specific.
verbose (bool) – If True, print statements about the generated schools as teachers are being added to each school.
- Returns
List of lists of schools with the ages of individuals in each, lists of lists of schools with the ids of individuals in each, dictionary of potential workers mapping id to their age, dictionary mapping age to the list of potential workers of that age, dictionary with the count of workers left to assign for each age after teachers have been assigned.
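How a student-teacher ratio translates into teachers for one school can be sketched as follows. The rounding rule (ceiling, with at least one teacher) and the uniform choice among eligible workers are assumptions made for illustration; `sketch_pick_teachers` is a hypothetical helper, not the library's function.

```python
import math
import random


def sketch_pick_teachers(n_students, potential_worker_uids,
                         average_student_teacher_ratio=20,
                         teacher_age_min=25, teacher_age_max=75, rng=random):
    """Hypothetical helper: choose teachers for one school from potential workers."""
    # Assumption: round up, and give every school at least one teacher.
    n_teachers = max(1, math.ceil(n_students / average_student_teacher_ratio))
    eligible = [uid for uid, age in potential_worker_uids.items()
                if teacher_age_min <= age <= teacher_age_max]
    teachers = rng.sample(eligible, min(n_teachers, len(eligible)))
    # Teachers are no longer available for other workplaces.
    remaining = {uid: age for uid, age in potential_worker_uids.items()
                 if uid not in teachers}
    return teachers, remaining


workers = {100 + i: 20 + i for i in range(40)}  # illustrative IDs with ages 20 to 59
teachers, remaining = sketch_pick_teachers(45, workers, rng=random.Random(1))
```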
-
synthpops.contact_networks.
assign_additional_staff_to_schools
(syn_school_uids, syn_teacher_uids, workers_by_age_to_assign_count, potential_worker_uids, potential_worker_uids_by_age, potential_worker_ages_left_count, average_student_teacher_ratio=20, average_student_all_staff_ratio=15, staff_age_min=20, staff_age_max=75, verbose=True)¶ Assign additional staff to each school according to the average student to all staff ratio.
- Parameters
syn_school_uids (list) – list of lists where each sublist is a school with the ids of the students within
syn_teacher_uids (list) – list of lists where each sublist is a school with the ids of the teachers within
workers_by_age_to_assign_count (dict) – dictionary of the count of workers left to assign by age
potential_worker_uids (dict) – dictionary of potential workers mapping their id to their age
potential_worker_uids_by_age (dict) – dictionary mapping age to the list of worker ids with that age
potential_worker_ages_left_count (dict) – dictionary of the count of potential workers left that can be assigned by age
average_student_teacher_ratio (float) – The average number of students per teacher.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
staff_age_min (int) – The minimum age for non teaching staff.
staff_age_max (int) – The maximum age for non teaching staff.
verbose (bool) – If True, print statements about the generated schools as additional staff are being added to each school.
- Returns
List of lists of schools with the ids of non teaching staff for each school, dictionary of potential workers mapping id to their age, dictionary mapping age to the list of potential workers of that age, dictionary with the count of workers left to assign for each age after additional staff have been assigned.
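The relationship between the two ratios can be shown with a small arithmetic sketch. It assumes the number of additional staff is the total staff implied by the student to all staff ratio minus the teachers already assigned, which is an interpretation of the parameter descriptions rather than the library's exact rule; `sketch_additional_staff_count` is a hypothetical name.

```python
import math


def sketch_additional_staff_count(n_students, n_teachers,
                                  average_student_all_staff_ratio=15):
    """Hypothetical helper: non-teaching staff needed for one school."""
    n_all_staff = math.ceil(n_students / average_student_all_staff_ratio)
    # Teachers already count toward all staff, so never return a negative number.
    return max(0, n_all_staff - n_teachers)


# 300 students at 20 students per teacher gives 15 teachers;
# at 15 students per staff member the school needs 20 staff in total.
extra_staff = sketch_additional_staff_count(300, 15)
small_school = sketch_additional_staff_count(10, 1)
```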
-
synthpops.contact_networks.
assign_rest_of_workers
(workplace_sizes, potential_worker_uids, potential_worker_uids_by_age, workers_by_age_to_assign_count, age_by_uid_dic, age_brackets, age_by_brackets_dic, contact_matrix_dic, verbose=False)¶ Assign the rest of the workers to non-school workplaces.
- Parameters
workplace_sizes (list) – list of workplace sizes
potential_worker_uids (dict) – dictionary of potential workers mapping their id to their age
potential_worker_uids_by_age (dict) – dictionary mapping age to the list of worker ids with that age
workers_by_age_to_assign_count (dict) – dictionary of the count of workers left to assign by age
age_by_uid_dic (dict) – dictionary mapping id to age for all individuals in the population
age_brackets (dict) – dictionary mapping age bracket keys to age bracket range
age_by_brackets_dic (dict) – dictionary mapping age to the age bracket range it falls in
contact_matrix_dic (dict) – dictionary of age specific contact matrix for different physical contact settings
verbose (bool) – If True, print statements about the generated workplaces as workers are being assigned to each workplace.
- Returns
List of lists where each sublist is a workplace with the ages of workers, list of lists where each sublist is a workplace with the ids of workers, dictionary of potential workers left mapping id to age, dictionary mapping age to a list of potential workers left of that age, dictionary mapping age to the count of workers left to assign.
-
synthpops.contact_networks.
generate_synthetic_population
(n, datadir, location='seattle_metro', state_location='Washington', country_location='usa', sheet_name='United States of America', school_enrollment_counts_available=False, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, teacher_age_min=25, teacher_age_max=75, average_student_all_staff_ratio=15, average_additional_staff_degree=20, staff_age_min=20, staff_age_max=75, verbose=False, plot=False, write=False, return_popdict=False, use_default=False)¶ Wrapper function that calls other functions to generate a full population with their contacts in the household, school, and workplace layers, and then writes this population to appropriate files.
- Parameters
n (int) – The number of people in the population.
datadir (string) – The file path to the data directory.
location (string) – The name of the location.
state_location (string) – The name of the state the location is in.
country_location (string) – The name of the country the location is in.
sheet_name (string) – The name of the sheet in the Excel file with contact patterns.
school_enrollment_counts_available (bool) – If True, a list of school sizes is available and a count of the sizes can be constructed.
with_school_types (bool) – If True, create explicit school types.
average_class_size (float) – The average classroom size.
inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
average_student_teacher_ratio (float) – The average number of students per teacher.
average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
teacher_age_min (int) – The minimum age for teachers.
teacher_age_max (int) – The maximum age for teachers.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
staff_age_min (int) – The minimum age for non teaching staff.
staff_age_max (int) – The maximum age for non teaching staff.
school_mixing_type (string) – The mixing type for schools.
verbose (bool) – If True, print statements as contacts are being generated.
plot (bool) – If True, plot and show a comparison of the generated age distribution in households vs. the expected age distribution of the population from census data being sampled.
write (bool) – If True, write population to file.
return_popdict (bool) – If True, returns a dictionary of individuals in the population
use_default (bool) – If True, first try to find data specific to the location under study using the other parameters, and fall back to default data drawn from Seattle, Washington if none is found.
- Returns
If return_popdict is True, returns popdict, a dictionary of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. Different layers available are households (‘H’), schools (‘S’), and workplaces (‘W’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is a contact of each other. Else, return None.
Example
    import synthpops as sp

    datadir = sp.datadir  # point datadir where your data folder lives

    location = 'seattle_metro'
    state_location = 'Washington'
    country_location = 'usa'
    sheet_name = 'United States of America'

    n = 10000
    verbose = False
    plot = False

    # this will generate a population with microstructure and age demographics that
    # approximate those of the location selected
    # also saves to file in:
    #    datadir/demographics/contact_matrices_152_countries/state_location/
    sp.generate_synthetic_population(n, datadir, location=location,
                                     state_location=state_location,
                                     country_location=country_location,
                                     sheet_name=sheet_name,
                                     verbose=verbose, plot=plot)
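The popdict layout described under Returns can be pictured with a small stand-alone sketch. It only fills the household layer and does not call synthpops: `sketch_popdict` is a hypothetical helper, and the `'contacts'` key name is an assumption about how the per-layer contact IDs are stored.

```python
def sketch_popdict(age_by_uid_dic, homes_by_uids):
    """Hypothetical helper: per-person attributes with clustered household contacts."""
    popdict = {
        uid: {'age': age, 'hhid': None, 'scid': None, 'wpid': None,
              'contacts': {'H': set(), 'S': set(), 'W': set()}}
        for uid, age in age_by_uid_dic.items()
    }
    for hhid, home in enumerate(homes_by_uids):
        for uid in home:
            popdict[uid]['hhid'] = hhid
            # Every other member of the household is a contact in layer 'H'.
            popdict[uid]['contacts']['H'] = set(home) - {uid}
    return popdict


# Illustrative inputs, not real data.
age_by_uid_dic = {0: 34, 1: 41, 2: 39, 3: 7, 4: 68, 5: 70}
homes_by_uids = [[0], [1, 2, 3], [4, 5]]
popdict = sketch_popdict(age_by_uid_dic, homes_by_uids)
```

School and workplace layers would be filled the same way from their own group lists, which is what makes the contacts in each layer clustered.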