synthpops.contacts module
Generate contacts between people in the population, with many options possible.
-
synthpops.contacts.make_popdict(n=None, uids=None, ages=None, sexes=None, location=None, state_location=None, country_location=None, use_demography=False, id_len=6)
Create a dictionary of n people with age, sex and loc keys.
- Parameters
n (int) – number of people
uids (list) – supplied uids of individuals
ages (list) – supplied ages of individuals
sexes (list) – supplied sexes of individuals
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
use_demography (bool) – If True, use demographic data
id_len (int) – length of the uid
- Returns
A dictionary where keys are the uid of each person and the values are another dictionary containing values for other attributes of the person
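As a rough illustration of the returned structure, the sketch below builds a similar dictionary by hand. It does not call synthpops: the helper name `sketch_popdict`, the uid alphabet, and the uniform age range are assumptions made for this example only.

```python
import random
import string

def sketch_popdict(n, id_len=6, seed=0):
    """Hypothetical stand-in: build {uid: {'age', 'sex', 'loc'}} for n people."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    popdict = {}
    while len(popdict) < n:
        uid = ''.join(rng.choice(alphabet) for _ in range(id_len))
        if uid in popdict:
            continue  # regenerate on the rare uid collision
        popdict[uid] = {
            'age': rng.randint(0, 100),  # assumed range, not demographic data
            'sex': rng.randint(0, 1),
            'loc': None,                 # location left unset in this sketch
        }
    return popdict

pop = sketch_popdict(5)
```

Each key is a uid of length `id_len`, and each value holds that person's attributes, which is the shape the contact-generating functions below expect as `popdict`.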
-
synthpops.contacts.make_contacts_generic(popdict, network_distr_args)
Create contact network regardless of age, according to network distribution properties. Can be used by webapp.
- Parameters
popdict (dict) – dictionary of all individuals
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality
- Returns
A dictionary of individuals with contacts drawn from given network distribution parameters.
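To make the idea of an age-independent network with a fixed average degree concrete, here is a minimal sketch of an undirected Erdos-Renyi style random graph in plain Python. It is not the library's implementation: the function name `sketch_generic_contacts` and the choice of edge probability `average_degree / (n - 1)` are assumptions for illustration.

```python
import itertools
import random

def sketch_generic_contacts(uids, average_degree, seed=0):
    """Hypothetical sketch: undirected random contacts with a target mean degree."""
    rng = random.Random(seed)
    n = len(uids)
    # Each pair is linked independently with probability p, so the
    # expected number of contacts per person is p * (n - 1).
    p = min(1.0, average_degree / (n - 1)) if n > 1 else 0.0
    contacts = {uid: set() for uid in uids}
    for a, b in itertools.combinations(uids, 2):
        if rng.random() < p:
            contacts[a].add(b)
            contacts[b].add(a)  # undirected: record the edge both ways
    return contacts

contacts = sketch_generic_contacts(list(range(200)), average_degree=10)
mean_degree = sum(len(c) for c in contacts.values()) / len(contacts)
```

For large n the degree distribution of such a graph approaches a Poisson distribution, so individual people end up with different numbers of contacts even though the mean is fixed.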
Create contact network according to overall age-mixing contact matrices. Does not capture clustering or microstructure, therefore exact households, schools, or workplaces are not created. However, this does separate agents according to their age and gives them contacts likely for their age. For all ages, the average number of contacts is constant with this method, although location specific data may prove this to not be true.
- Parameters
popdict (dict) – dictionary of all individuals
n_contacts_dic (dict) – number of contacts to draw on average by setting
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
sheet_name (string) – name of the sheet in the excel file with contact patterns
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality; can also include powerlaw exponents, block sizes (re: SBMs), clustering distribution, or other properties needed to generate network structures. Check out https://networkx.github.io/documentation/stable/reference/generators.html#module-networkx.generators for what’s possible. Default ‘network_type’ is ‘poisson_degree’ for Erdos-Renyi random graphs in the large n limit.
- Returns
A dictionary of individuals with attributes, including their age and the ids of their contacts drawn from given network distribution parameters and the ages of contacts drawn according to overall age mixing data. A single social setting or layer of contacts.
Create contact network according to overall age-mixing contact matrices. Does not capture clustering or microstructure, therefore exact households, schools, or workplaces are not created. However, this does separate agents according to their age and gives them contacts likely for their age specified by the social settings they are likely to participate in. For all ages, the average number of contacts is constant with this method, although location specific data may well prove this to not be true. In general, college students may also be workers; however, here they are only students and we assume that any of their contacts in the work environment are likely to look like their contacts at school.
Essentially recreates an age-specific compartmental model’s concept of contacts but for an agent based modeling framework.
- Parameters
popdict (dict) – dictionary of all individuals
n_contacts_dic (dict) – number of contacts to draw on average by setting
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
sheet_name (string) – name of the sheet in the excel file with contact patterns
activity_args (dict) – dictionary of age bounds for participating in different activities like going to school or working, also student-teacher ratio
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality; can also include powerlaw exponents, block sizes (re: SBMs), clustering distribution, or other properties needed to generate network structures. Check out https://networkx.github.io/documentation/stable/reference/generators.html#module-networkx.generators for what’s possible. Default ‘network_type’ is ‘poisson_degree’ for Erdos-Renyi random graphs in the large n limit.
- Returns
A dictionary of individuals with attributes, including their age and the ids of their contacts drawn from given network distribution parameters and the ages of contacts drawn according to age mixing data. Multiple social settings or layers so contacts are listed for different layers.
Create contact network according to overall age-mixing contact matrices for the US. Does not capture clustering or microstructure, therefore exact households, schools, or workplaces are not created. However, this does separate agents according to their age and gives them contacts likely for their age. For all ages, the average number of contacts is constant, although location specific data may prove this to not be true. Individuals also have a sex, though this in general does not have an impact on their contact patterns.
- Parameters
popdict (dict) – dictionary of all individuals
n_contacts_dic (dict) – number of contacts to draw on average by setting
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
sheet_name (string) – name of the sheet in the excel file with contact patterns
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality; can also include powerlaw exponents, block sizes (re: SBMs), clustering distribution, or other properties needed to generate network structures. Check out https://networkx.github.io/documentation/stable/reference/generators.html#module-networkx.generators for what’s possible. Default ‘network_type’ is ‘poisson_degree’ for Erdos-Renyi random graphs in the large n limit.
- Returns
A dictionary of individuals with attributes, including their age and the ids of their contacts drawn from given network distribution parameters and the ages of contacts drawn according to overall age mixing data. A single social setting or layer of contacts.
Create contact network according to overall age-mixing contact matrices for the US. Does not capture clustering or microstructure, therefore exact households, schools, or workplaces are not created. However, this does separate agents according to their age and gives them contacts likely for their age specified by the social settings they are likely to participate in. College students may also be workers, however here they are only students and we assume that any of their contacts in the work environment are likely to look like their contacts at school.
- Parameters
popdict (dict) – dictionary of all individuals
n_contacts_dic (dict) – number of contacts to draw on average by setting
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
sheet_name (string) – name of the sheet in the excel file with contact patterns
activity_args (dict) – dictionary of age bounds for participating in different activities like going to school or working, also student-teacher ratio
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality; can also include powerlaw exponents, block sizes (re: SBMs), clustering distribution, or other properties needed to generate network structures. Check out https://networkx.github.io/documentation/stable/reference/generators.html#module-networkx.generators for what’s possible. Default ‘network_type’ is ‘poisson_degree’ for Erdos-Renyi random graphs in the large n limit.
- Returns
A dictionary of individuals with attributes, including their age and the ids of their contacts drawn from given network distribution parameters and the ages of contacts drawn according to age mixing data. Multiple social settings or layers so contacts are listed for different layers.
-
synthpops.contacts.rehydrate(data)
Populate popdict with uids, ages and contacts from generated microstructure data that was saved to data object.
- Parameters
data (pop object) –
- Returns
Popdict (sc.objdict)
-
synthpops.contacts.save_synthpop(datadir, contacts, location)
Save pop data object to file.
- Parameters
datadir (string) – file path to the data directory
contacts (dict) – dictionary of people with contacts
location (string) – name of the location
- Returns
None
-
synthpops.contacts.create_reduced_contacts_with_group_types(popdict, group_1, group_2, setting, average_degree=20, p_matrix=None, force_cross_edges=True)
Create contacts between members of group 1 and group 2, fixing the average degree, with the probability of an edge between any two groups controlled by p_matrix if provided. With force_cross_edges equal to True, forces an inter-group edge for each individual in group 1. This means not everyone in group 2 will have a contact with group 1.
- Parameters
group_1 (list) – list of ids for group 1
group_2 (list) – list of ids for group 2
average_degree (int) – average degree across group 1 and 2
p_matrix (np.ndarray) – probability matrix for edges between any two groups
force_cross_edges (bool) – If True, force each individual to have at least one contact with a member from the other group
- Returns
Popdict with edges added for nodes in the two groups.
Notes
This method uses the Stochastic Block Model algorithm to generate contacts both between nodes in different groups and for nodes within the same group. The current version does not support fixing both the average degree and p_matrix, the matrix of probabilities for edges between any two groups. Future versions may add support for this.
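The following sketch illustrates the two-group idea with a forced cross-group edge. It is a simplified stand-in, not the Stochastic Block Model code used by the library: it draws every pair with one shared probability, and the name `sketch_two_group_contacts` and the assumption that the two groups are disjoint are made up for this example.

```python
import itertools
import random

def sketch_two_group_contacts(group_1, group_2, average_degree=20,
                              force_cross_edges=True, seed=0):
    """Hypothetical sketch: random contacts across two disjoint groups."""
    rng = random.Random(seed)
    members = list(group_1) + list(group_2)
    n = len(members)
    p = min(1.0, average_degree / (n - 1)) if n > 1 else 0.0
    contacts = {uid: set() for uid in members}
    for a, b in itertools.combinations(members, 2):
        if rng.random() < p:
            contacts[a].add(b)
            contacts[b].add(a)
    if force_cross_edges and group_2:
        others = set(group_2)
        for a in group_1:
            # Only group 1 members are guaranteed a cross-group contact.
            if not contacts[a] & others:
                b = rng.choice(list(group_2))
                contacts[a].add(b)
                contacts[b].add(a)
    return contacts

residents = list(range(30))
staff = list(range(30, 40))
ltcf = sketch_two_group_contacts(residents, staff, average_degree=4)
```

Because only group 1 is patched up after the random draw, a member of group 2 can still end up with no contact in group 1. The forced edges also push the realized mean degree slightly above the target.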
-
synthpops.contacts.make_contacts_from_microstructure(datadir, location, state_location, country_location, n, with_non_teaching_staff=True, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, average_student_all_staff_ratio=15, average_additional_staff_degree=20, school_type_by_age=None, with_industry_code=False, verbose=False)
Make a popdict from synthetic household, school, and workplace files with uids. If with_industry_code is True, then individuals will have a workplace industry code as well (default value is -1 to represent that this data is unavailable). Currently, industry codes are only available to assign to populations within the US.
- Parameters
datadir (string) – The file path to the data directory
location (string) – The name of the location
state_location (string) – The name of the state the location is in
country_location (string) – The name of the country the location is in
n (int) – The number of people in the population
with_non_teaching_staff (bool) – If True, includes non teaching staff.
with_school_types (bool) – If True, creates explicit school types.
school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise. ‘random’ means random graphs for each school, ‘age_clustered’ means random graphs but with students mostly mixing within the age/grade (inter_grade_mixing controls mixing between grades), ‘age_and_class_clustered’ means students cohorted into classes with their own teachers.
average_class_size (float) – The average classroom size.
inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
average_student_teacher_ratio (float) – The average number of students per teacher.
average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
with_industry_code (bool) – If True, assign workplace industry code read in from cached file
verbose (bool) – If True, print debugging statements.
- Returns
A popdict of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. Different layers available are households (‘H’), schools (‘S’), and workplaces (‘W’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is considered a contact of each other.
Notes
Methods to trim large groups of contacts down to better approximate a sense of close contacts (such as classroom sizes or smaller work groups) are available via sp.trim_contacts(); see below.
-
synthpops.contacts.make_contacts_from_microstructure_objects(age_by_uid_dic, homes_by_uids, schools_by_uids, teachers_by_uids, workplaces_by_uids, non_teaching_staff_uids=None, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, average_student_all_staff_ratio=15, average_additional_staff_degree=20, school_type_by_age=None, workplaces_by_industry_codes=None, verbose=False)
From microstructure objects (dictionary mapping ID to age, lists of lists in different settings, etc.), create a dictionary of individuals. Each key is the ID of an individual which maps to a dictionary for that individual with attributes such as their age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and contacts in different layers.
- Parameters
age_by_uid_dic (dict) – A dictionary mapping id to age for all individuals in the population
homes_by_uids (list) – A list of lists, where each sublist is a household and the IDs of the household members.
schools_by_uids (list) – A list of lists, where each sublist represents a school and the ids of the students within it
teachers_by_uids (list) – A list of lists, where each sublist represents a school and the ids of the teachers within it
workplaces_by_uids (list) – A list of lists, where each sublist represents a workplace and the ids of the workers within it
non_teaching_staff_uids (list) – None or a list of lists, where each sublist represents a school and the ids of the non teaching staff within it
with_school_types (bool) – If True, creates explicit school types.
school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise. ‘random’ means random graphs for each school, ‘age_clustered’ means random graphs but with students mostly mixing within the age/grade (inter_grade_mixing controls mixing between grades), ‘age_and_class_clustered’ means students cohorted into classes with their own teachers.
average_class_size (float) – The average classroom size.
inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
average_student_teacher_ratio (float) – The average number of students per teacher.
average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
school_type_by_age (dict) – A dictionary of probabilities for the school type likely for each age.
workplaces_by_industry_codes (np.ndarray or None) – array with workplace industry code for each workplace
verbose (bool) – If True, print debugging statements.
- Returns
A popdict of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. Different layers available are households (‘H’), schools (‘S’), and workplaces (‘W’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is considered a contact of each other.
Notes
Methods to trim large groups of contacts down to better approximate a sense of close contacts (such as classroom sizes or smaller work groups) are available via sp.trim_contacts(); see below.
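The mapping from lists of lists to a layered popdict can be sketched as follows. This is a simplified illustration that treats every group as fully connected and leaves out teachers, non-teaching staff, school mixing types, and industry codes. The function name `sketch_popdict_from_groups`, the use of the list index as the group ID, and `None` as the placeholder for "no group" are assumptions, not the library's behavior.

```python
def sketch_popdict_from_groups(age_by_uid, homes_by_uids, schools_by_uids,
                               workplaces_by_uids):
    """Hypothetical sketch: fully connected contacts within each group."""
    popdict = {
        uid: {'age': age, 'hhid': None, 'scid': None, 'wpid': None,
              'contacts': {'H': set(), 'S': set(), 'W': set()}}
        for uid, age in age_by_uid.items()
    }
    settings = (('H', 'hhid', homes_by_uids),
                ('S', 'scid', schools_by_uids),
                ('W', 'wpid', workplaces_by_uids))
    for layer, id_key, groups in settings:
        for group_id, group in enumerate(groups):
            members = set(group)
            for uid in group:
                popdict[uid][id_key] = group_id  # index in the list as group ID
                popdict[uid]['contacts'][layer] |= members - {uid}
    return popdict

ages = {0: 35, 1: 33, 2: 8, 3: 41, 4: 9}
pop = sketch_popdict_from_groups(
    ages,
    homes_by_uids=[[0, 1, 2], [3, 4]],
    schools_by_uids=[[2, 4]],
    workplaces_by_uids=[[0, 3], [1]],
)
```

In this toy population, person 2 has household contacts {0, 1} and one school contact {4}, while person 1 works alone and so has no workplace contacts.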
-
synthpops.contacts.make_contacts_with_facilities_from_microstructure(datadir, location, state_location, country_location, n, use_two_group_reduction=False, average_LTCF_degree=20, with_non_teaching_staff=True, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, average_student_all_staff_ratio=15, average_additional_staff_degree=20, school_type_by_age=None, verbose=False)
Make a popdict from synthetic household, school, and workplace files with uids. If with_industry_code is True, then individuals will have a workplace industry code as well (default value is -1 to represent that this data is unavailable). Currently, industry codes are only available to assign to populations within the US.
- Parameters
datadir (string) – file path to the data directory
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
n (int) – number of people in the population
use_two_group_reduction (bool) – If True, create long term care facilities with reduced contacts across both groups
average_LTCF_degree (int) – default average degree in long term care facilities
with_non_teaching_staff (bool) – If True, includes non teaching staff.
with_school_types (bool) – If True, creates explicit school types.
school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise. ‘random’ means random graphs for each school, ‘age_clustered’ means random graphs but with students mostly mixing within the age/grade (inter_grade_mixing controls mixing between grades), ‘age_and_class_clustered’ means students cohorted into classes with their own teachers.
average_class_size (float) – The average classroom size.
inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
average_student_teacher_ratio (float) – The average number of students per teacher.
average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
school_type_by_age (dict) – A dictionary of probabilities for the school type likely for each age.
verbose (bool) – If True, print debugging statements.
- Returns
A popdict of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. Different layers available are households (‘H’), schools (‘S’), workplaces (‘W’), and long term care facilities (‘LTCF’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is considered a contact of each other. If use_two_group_reduction is True, then contacts within ‘LTCF’ are reduced from fully connected.
Notes
Methods to trim large groups of contacts down to better approximate a sense of close contacts (such as classroom sizes or smaller work groups) are available via sp.trim_contacts() or sp.create_reduced_contacts_with_group_types(); see these methods for more details.
-
synthpops.contacts.make_contacts_with_facilities_from_microstructure_objects(age_by_uid_dic, homes_by_uids, schools_by_uids, teachers_by_uids, workplaces_by_uids, facilities_by_uids, facilities_staff_uids, non_teaching_staff_uids=None, use_two_group_reduction=False, average_LTCF_degree=20, with_school_types=False, school_mixing_type='random', average_class_size=20, inter_grade_mixing=0.1, average_student_teacher_ratio=20, average_teacher_teacher_degree=3, average_student_all_staff_ratio=15, average_additional_staff_degree=20, school_type_by_age=None, workplaces_by_industry_codes=None, verbose=False)
From microstructure objects (dictionary mapping ID to age, lists of lists in different settings, etc.), create a dictionary of individuals. Each key is the ID of an individual which maps to a dictionary for that individual with attributes such as their age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and contacts in different layers.
- Parameters
age_by_uid_dic (dict) – dictionary mapping id to age for all individuals in the population
homes_by_uids (list) – A list of lists where each sublist is a household and the IDs of the household members.
schools_by_uids (list) – A list of lists, where each sublist represents a school and the ids of the students and teachers within it
teachers_by_uids (list) – A list of lists, where each sublist represents a school and the ids of the teachers within it
workplaces_by_uids (list) – A list of lists, where each sublist represents a workplace and the ids of the workers within it
facilities_by_uids (list) – A list of lists, where each sublist represents a skilled nursing or long term care facility and the ids of the residents living within it
facilities_staff_uids (list) – A list of lists, where each sublist represents a skilled nursing or long term care facility and the ids of the staff working within it
non_teaching_staff_uids (list) – None or a list of lists, where each sublist represents a school and the ids of the non teaching staff within it
use_two_group_reduction (bool) – If True, create long term care facilities with reduced contacts across both groups
average_LTCF_degree (int) – default average degree in long term care facilities
with_school_types (bool) – If True, creates explicit school types.
school_mixing_type (str or dict) – The mixing type for schools, ‘random’, ‘age_clustered’, or ‘age_and_class_clustered’ if string, and a dictionary of these by school type otherwise. ‘random’ means random graphs for each school, ‘age_clustered’ means random graphs but with students mostly mixing within the age/grade (inter_grade_mixing controls mixing between grades), ‘age_and_class_clustered’ means students cohorted into classes with their own teachers.
average_class_size (float) – The average classroom size.
inter_grade_mixing (float) – The average fraction of mixing between grades in the same school for clustered school mixing types.
average_student_teacher_ratio (float) – The average number of students per teacher.
average_teacher_teacher_degree (float) – The average number of contacts per teacher with other teachers.
average_student_all_staff_ratio (float) – The average number of students per staff members at school (including both teachers and non teachers).
average_additional_staff_degree (float) – The average number of contacts per additional non teaching staff in schools.
school_type_by_age (dict) – A dictionary of probabilities for the school type likely for each age.
workplaces_by_industry_codes (np.ndarray or None) – array with workplace industry code for each workplace
verbose (bool) – If True, print debugging statements.
- Returns
A popdict of people with attributes. Dictionary keys are the IDs of individuals in the population and the values are a dictionary for each individual with their attributes, such as age, household ID (hhid), school ID (scid), workplace ID (wpid), workplace industry code (wpindcode) if available, and the IDs of their contacts in different layers. Different layers available are households (‘H’), schools (‘S’), workplaces (‘W’), and long term care facilities (‘LTCF’). Contacts in these layers are clustered and thus form a network composed of groups of people interacting with each other. For example, all household members are contacts of each other, and everyone in the same school is considered a contact of each other. If use_two_group_reduction is True, then contacts within ‘LTCF’ are reduced from fully connected.
Notes
Methods to trim large groups of contacts down to better approximate a sense of close contacts (such as classroom sizes or smaller work groups) are available via sp.trim_contacts() or sp.create_reduced_contacts_with_group_types(); see these methods for more details.
-
synthpops.contacts.make_graphs(popdict, layers)
Make a dictionary of NetworkX graphs by layer.
- Parameters
popdict (dict) – dictionary of individuals with attributes, including their age, household ID, school ID, workplace ID, and the ids of their contacts by layer
layers (list) – list of contact layers
- Returns
Dictionary of NetworkX graphs, one for each layer of contacts.
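As an illustration of how a popdict maps onto one graph per layer, the sketch below collects the undirected edges of each layer into a set. It deliberately avoids NetworkX so it runs with the standard library alone. The helper name `sketch_layer_edges` is hypothetical, and it assumes uids are mutually comparable so each edge can be stored in sorted order.

```python
def sketch_layer_edges(popdict, layers):
    """Hypothetical sketch: one set of undirected (uid, uid) edges per layer."""
    edges = {layer: set() for layer in layers}
    for uid, person in popdict.items():
        for layer in layers:
            for contact in person['contacts'].get(layer, ()):
                # Sort the pair so (a, b) and (b, a) collapse into one edge.
                edges[layer].add(tuple(sorted((uid, contact))))
    return edges

pop = {
    1: {'contacts': {'H': {2}, 'W': {3}}},
    2: {'contacts': {'H': {1}, 'W': set()}},
    3: {'contacts': {'H': set(), 'W': {1}}},
}
edges = sketch_layer_edges(pop, ['H', 'W'])
```

Each resulting edge set could then be loaded into a graph object, for example with the `add_edges_from` method of `networkx.Graph`.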
-
synthpops.contacts.write_edgelists(popdict, layers, G_dic=None, location=None, state_location=None, country_location=None)
Write edgelists for each layer of contacts.
- Parameters
popdict (dict) – dictionary of individuals with attributes, including their age, household ID, school ID, workplace ID, and the ids of their contacts by layer
layers (list) – list of contact layers
G_dic (dict) – dictionary of NetworkX graphs, one for each layer of contacts
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
- Returns
None
-
synthpops.contacts.make_contacts(popdict=None, n_contacts_dic=None, location=None, state_location=None, country_location=None, sheet_name=None, options_args=None, activity_args=None, network_distr_args=None)
Generates a list of contacts for everyone in the population. popdict is a dictionary with N keys (one for each person), with subkeys for age, sex, location, and potentially other factors. This function adds a new subkey, contacts, which is a list of contact IDs for each individual. If directed=False (default), then whenever person A is a contact of person B, person B is also a contact of person A.
Example output (input is the same, minus the “contacts” field):
popdict = {
    '8acf08f0': {
        'age': 57,
        'sex': 0,
        'loc': (47.6062, 122.3321),
        'contacts': {'M': [2, 34]},
    },
    '43da76b5': {
        'age': 55,
        'sex': 1,
        'loc': (47.2473, 122.6482),
        'contacts': {'M': [20, 8, 49]},
    },
}
- Parameters
popdict (dict) – dictionary, should have ages of individuals if not using cached microstructure data
n_contacts_dic (dict) – average number of contacts by setting
location (string) – name of the location
state_location (string) – name of the state the location is in
country_location (string) – name of the country the location is in
sheet_name (string) – name of the sheet in the excel file with contact patterns
options_args (dict) – dictionary of flags to set different population and contact generating options
activity_args (dict) – dictionary of age bounds for participating in different activities like going to school or working, also student-teacher ratio
network_distr_args (dict) – network distribution parameters dictionary for average_degree, network_type, and directionality; can also include powerlaw exponents, block sizes (re: SBMs), clustering distribution, or other properties needed to generate network structures. Check out https://networkx.github.io/documentation/stable/reference/generators.html#module-networkx.generators for what’s possible. Default ‘network_type’ is ‘poisson_degree’ for Erdos-Renyi random graphs in the large n limit.
- Returns
A dictionary of individuals with attributes, including their age and the ids of their contacts.
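The undirected convention described above (if A lists B, then B lists A) can be sketched as a small post-processing step. This is only an illustration of the property, not how the library enforces it: the helper name `sketch_symmetrize` is hypothetical, and it assumes contacts are stored as sets under a single layer key such as 'M'.

```python
def sketch_symmetrize(popdict, layer='M'):
    """Hypothetical sketch: make a directed contact layer undirected in place."""
    for uid, person in popdict.items():
        # Iterate over a copy, since other people's sets are modified below.
        for contact in list(person['contacts'][layer]):
            popdict[contact]['contacts'][layer].add(uid)
    return popdict

pop = {
    'a': {'contacts': {'M': {'b'}}},
    'b': {'contacts': {'M': set()}},
    'c': {'contacts': {'M': {'a'}}},
}
sketch_symmetrize(pop)
```

After the call, 'b' lists 'a' and 'a' lists both 'b' and 'c', so every contact is reciprocated.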
-
synthpops.contacts.choose_contacts(a, size)
Numba-compiled version of np.random.choice(); about twice as fast.
-
synthpops.contacts.trim_contacts(contacts, trimmed_size_dic=None, use_clusters=False, verbose=False)
Trim down contacts in school or work environments from everyone.
- Parameters
contacts (dict) – dictionary of individuals with attributes, including their age and the ids of their contacts
trimmed_size_dic (dict) – dictionary of threshold values for the number of contacts in school (‘S’) and work (‘W’) so that for individuals with more contacts than this, we select a smaller subset of contacts considered close contacts
use_clusters (bool) – If True, trimmed down contact networks will preserve clustering so that an individual’s close contacts in school or at work are also contacts of each other
verbose (bool) – If True, print average number of close contacts in school and at work
- Returns
A dictionary of individuals with attributes, including their age and the ids of their close contacts.
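A minimal sketch of threshold-based trimming, roughly corresponding to the use_clusters=False case, is shown below. It is an assumption-laden illustration rather than the library's algorithm: the name `sketch_trim_contacts`, the uniform random choice of which contacts to keep, and the storage of contacts as sets are all choices made for this example.

```python
import copy
import random

def sketch_trim_contacts(popdict, trimmed_size_dic, seed=0):
    """Hypothetical sketch: cap contacts per layer, keeping edges symmetric."""
    rng = random.Random(seed)
    trimmed = copy.deepcopy(popdict)  # leave the caller's popdict untouched
    for layer, max_size in trimmed_size_dic.items():
        for uid, person in trimmed.items():
            current = person['contacts'].get(layer, set())
            if len(current) <= max_size:
                continue
            keep = set(rng.sample(sorted(current), max_size))
            for dropped in current - keep:
                # Remove the reverse edge too, so contacts stay mutual.
                trimmed[dropped]['contacts'][layer].discard(uid)
            person['contacts'][layer] = keep
    return trimmed

school = list(range(10))
pop = {uid: {'age': 10, 'contacts': {'S': set(school) - {uid}}} for uid in school}
close = sketch_trim_contacts(pop, {'S': 3})
```

Because dropping an edge removes it from both endpoints, some people can end up with fewer contacts than the threshold. This simple version also does not preserve clustering among the contacts that remain.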
-
synthpops.contacts.show_layers(popdict, show_ages=False, show_n=20)
Print out the contacts for individuals in the different possible social settings or layers.
- Parameters
popdict (dict) – dictionary of individuals with attributes, including their age and the ids of their close contacts
show_ages (bool) – If True, show the ages of contacts, else show their ids
show_n (int) – number of individuals to show contacts for
- Returns
None
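To show the kind of per-layer listing described above, here is a small sketch that prints contacts for the first few people. The output format and the name `sketch_show_layers` are invented for illustration. Unlike the documented function, which returns None, this version also returns the printed lines so they can be inspected.

```python
def sketch_show_layers(popdict, show_ages=False, show_n=20):
    """Hypothetical sketch: list each person's contacts by layer."""
    lines = []
    for uid in list(popdict)[:show_n]:
        person = popdict[uid]
        lines.append(f"uid {uid}, age {person['age']}")
        for layer, contacts in person['contacts'].items():
            if show_ages:
                shown = sorted(popdict[c]['age'] for c in contacts)
            else:
                shown = sorted(contacts)
            lines.append(f"  {layer}: {shown}")
    print('\n'.join(lines))
    return lines

pop = {
    0: {'age': 40, 'contacts': {'H': {1}, 'W': set()}},
    1: {'age': 7, 'contacts': {'H': {0}, 'W': set()}},
}
lines = sketch_show_layers(pop, show_ages=True, show_n=1)
```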