Social contact patterns and implications for infectious disease transmission – a systematic review and meta-analysis of contact surveys

Background: Transmission of respiratory pathogens such as SARS-CoV-2 depends on patterns of contact and mixing across populations. Understanding this is crucial to predict pathogen spread and the effectiveness of control efforts. Most analyses of contact patterns to date have focused on high-income settings. Methods: Here, we conduct a systematic review and individual-participant meta-analysis of surveys carried out in low- and middle-income countries and compare patterns of contact in these settings to surveys previously carried out in high-income countries. Using individual-level data from 28,503 participants and 413,069 contacts across 27 surveys, we explored how contact characteristics (number, location, duration, and whether physical) vary across income settings. Results: Contact rates declined with age in high- and upper-middle-income settings, but not in low-income settings, where adults aged 65+ made similar numbers of contacts as younger individuals and mixed with all age groups. Across all settings, increasing household size was a key determinant of contact frequency and characteristics, with low-income settings characterised by the largest, most intergenerational households. A higher proportion of contacts were made at home in low-income settings, and work/school contacts were more frequent in high-income strata. We also observed contrasting effects of gender across income strata on the frequency, duration, and type of contacts individuals made. Conclusions: These differences in contact patterns between settings have material consequences for both spread of respiratory pathogens and the effectiveness of different non-pharmaceutical interventions. Funding: This work is primarily being funded by joint Centre funding from the UK Medical Research Council and DFID (MR/R015600/1).

[...] When stratifying by study methodology, the median number of daily contacts was higher in diary-based surveys than in interview- or questionnaire-based surveys, which was true across all income strata (Table 1, Figure 1D), but no effect of gender on total daily contacts for [...] Figure 1D [...] daily contact rates, we further explored the locations in which contacts were made. Contact location was known for 314,235 contacts, 42.7% of which occurred at home (13.1% at work, 12.5% at school and 31.7% in other locations). Across income strata, there was significant variation in the proportion of contacts made at home, being highest in LICs/LMICs (68.3%) and lowest in HICs (37.0%) (Figure 2B). Age differences were also observed in the number of contacts made at home, particularly for LICs/LMICs (Figure 2C-2D). Relatedly, a higher proportion of contacts occurred at work and school (14.6% and 11.3%, respectively) in HICs compared to LICs/LMICs (3.9% and 5.2%, respectively; Supplementary Figure 5). Strong, gender-specific patterns of contact location were also observed. Across all income strata, males made a higher proportion of their contacts at work compared to females, although this difference was largest for LICs/LMICs (Supplementary Figure 5). Further, we found significant variation between income strata in median household size (7 in LICs/LMICs, 5 in UMICs and 3 in HICs). This trend of decreasing household size with increasing country income was consistent with global data (Figure 2E). The larger households observed for LIC/LMIC settings were also more likely to be intergenerational: in LICs/LMICs, 59.4% of participants aged over 65 lived in households of at least 6 members, compared to 17.5% in UMICs and only 2.2% in HICs.
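The location breakdown reported above is a descriptive tabulation, and a minimal sketch of that kind of calculation may help make the denominator explicit. The records, stratum labels, and the `location_shares` helper below are hypothetical and are not the authors' code or data; the sketch only assumes, as stated in the text, that percentages are taken over contacts with a known location and that repeat contacts with the same person in different places are counted separately.

```python
from collections import Counter

# Hypothetical contact records: (income_stratum, location).
# A contact with the same person in two places appears twice, following the
# convention that such contacts are treated as separate contacts.
contacts = [
    ("LIC/LMIC", "home"), ("LIC/LMIC", "home"), ("LIC/LMIC", "other"),
    ("LIC/LMIC", "work"),
    ("HIC", "home"), ("HIC", "work"), ("HIC", "school"), ("HIC", "other"),
    ("HIC", None),  # location unknown: excluded from the denominator
]


def location_shares(records):
    """Return {stratum: {location: % of contacts with a known location}}."""
    counts = {}
    for stratum, location in records:
        if location is None:
            continue  # only contacts with a known location are tabulated
        counts.setdefault(stratum, Counter())[location] += 1
    shares = {}
    for stratum, counter in counts.items():
        total = sum(counter.values())
        shares[stratum] = {loc: 100.0 * n / total for loc, n in counter.items()}
    return shares


shares = location_shares(contacts)
for stratum, by_location in shares.items():
    print(stratum, {loc: round(pct, 1) for loc, pct in by_location.items()})
```

Because contacts with an unknown location are dropped before the percentages are computed, the shares within each stratum sum to 100%.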
Twelve studies collected information on the gender of the contact and eight studies contained information on age, allowing assignment of contacts to one of the three age groups described in [...] were assortative by gender for all income strata, as participants were more likely to mix with their own gender (Supplementary Text 3). Mixing was also assortative by age, with participants more likely to contact individuals who belonged to the same age group. This degree of age-assortativity was lowest for LICs/LMICs, where only 29% of contacts made by adults were with individuals of the same age group. By contrast, in HICs we observed a higher degree of assortative mixing, with most contacts (51.4%) made by older adults occurring with individuals belonging to the same age group.
Across the collated studies, the total number of contacts was highest for school-aged children. This is consistent with previous results from HICs (Béraud et [...] also. Interestingly, however, we observed differences in patterns of contact in adults across income strata. Whilst contact rates in HICs declined in older adults, this was not observed in LICs/LMICs, where contact rates did not differ in the oldest age group compared to younger ages. This is consistent with variation in household structure and size across settings, with nearly two-thirds of participants aged 65+ in included LIC/LMIC surveys living in large, likely intergenerational, households (6+ members), compared to only 2% in HICs. HICs were also characterised by more assortative mixing between age groups, with older adults in LICs/LMICs more likely to mix with individuals of younger ages, again consistent with the observed differences between household structures across the two settings. These results have important consequences for the viability and efficacy of protective policies centred around shielding of elderly individuals (i.e. those most at risk from COVID-19 or influenza).
In these settings, other strategies may be required to effectively shield vulnerable populations, as has been previously suggested (Dahab et al., 2020).
Our results support the idea of households as a key site for transmission of respiratory pathogens (Thompson et al., 2021), with the majority of contacts made at home. Our analysis highlights that the number of contacts made at home is mainly driven by household size. However, the relative importance of households compared to other locations is likely to vary across settings. We observed significant differences across income settings in the distribution of contacts made at home, work and school. The proportion of contacts made at home was highest for LICs/LMICs, where larger average household sizes were associated with more contacts, more physical contacts, and longer-lasting contacts. By contrast, participants in HICs tended to report more contacts occurring at work and school. The lower number of contacts at work in LICs/LMICs may be explained by the types of employment (e.g. agriculture in rural surveys) and by selection bias (women at home/homemakers being more likely to be surveyed in questionnaire-based surveys). Our analyses similarly highlighted significant variation in the duration and nature of contacts across settings. Contacts made by female participants in LICs/LMICs were more likely to be physical than those made by male participants, whilst the opposite effect was observed for HICs and UMICs, potentially reflecting context-specific gender roles. In all settings, we observed a general decline of physical contacts with age, except in the [...]
Heterogeneity between studies was larger for LICs/LMICs and UMICs, which we partly accounted for by fitting random study effects. These study differences may be attributed to the way individual contact surveys were conducted, making comparisons of contact patterns among surveys more difficult (e.g.
prospective/retrospective diary surveys, online/paper questionnaires, face-to-face/phone interviews, and different contact definitions). For instance, there is evidence suggesting that prospective reporting, which is less affected by recall bias, can often lead to a higher number of contacts being reported (Mikolajczyk and Kretzschmar, 2008) and a lower probability of casual or short-lasting contacts being missed. The relatively high contact rates observed in HICs may be explained by the fact that all but two HIC surveys used diary methods. Our study highlights that a unified definition of "contact" and standard practice in data collection could help increase the quality of collected data, leading to more robust and reliable conclusions about contact patterns. Whilst we aggregate results by income strata due to the limited availability of data (particularly in lower- and middle-income countries), it is important to note that the outcomes considered here are likely to be shaped by several factors other than country-level income. Whilst some of these factors will be correlated with a country's income status (e.g. household size (Walker et al., 2020)), many others will be unique to a particular setting or geographical area, or correlate only weakly with country-level data. Examples include patterns of employment, the role of women, and other contextual factors. These analyses are therefore intended primarily to provide indications of prevailing patterns, rather than a definitive description of contact patterns in a specific context, and highlight the significant need for further studies to be carried out in a diversity of locations.
Despite these limitations, however, our results highlight significant differences in the structure and nature of contact patterns across settings. These differences suggest that the comparative importance of different locations and age groups to transmission will likely vary across settings and have critical consequences for the efficacy and suitability of strategies aimed at controlling the spread of respiratory pathogens such as SARS-CoV-2. Most importantly, our study highlights the limited amount of work that has been undertaken to date to better understand and quantify patterns of contact across a range of settings, particularly in lower- and middle-income countries, which is vital in informing control strategies to reduce the spread of such pathogens.
[...] Table 4). Collated records underwent title and abstract screening for relevance, before full-text screening using pre-determined criteria. Studies were included if they reported on any type of face-to-face or close contact with humans and were carried out in LICs, LMICs or UMICs only. No restrictions on collection method (e.g. prospective diary-based surveys or retrospective surveys based on a face-to-face/phone interview or questionnaire) were applied. Studies were excluded if they did not report contacts relevant to airborne diseases (e.g. sexual contacts), were conducted in HICs, were contact tracing studies of infected cases, or were conference abstracts. All studies were screened independently by two reviewers (AM and CW). Differences were resolved through consensus and [...] Table 3).
Each item was attributed a zero or a one, and a quality score was assigned to each study, ranging from 0% ("poor" quality) to 100% ("good" quality [...]
A negative binomial regression model was used to explore the association between the total number of daily contacts and the participant's age, sex, employment/student status and household size, as well as methodology and survey day. Incidence rate ratios from these regressions are referred to as "Contact Rate Ratios" (CRRs). A sensitivity analysis was carried out that excluded additional contacts (such as additional work contacts, group contacts, and number missed out, which were recorded separately and in less detail by participants compared to their other contacts ([...] explore determinants of contact duration (<1hr/1hr+) and type (physical/non-physical), using the same explanatory variables as in the total contacts analyses. There were differences in the contact duration categories defined by studies, and the threshold of 1 hour for longer durations was used to maximise sample size by allowing inclusion of all available data. An additional sensitivity analysis, weighting all studies equally within an income stratum, explored the impact of study size on the estimated CRRs and ORs for all main outcomes (total contacts, duration and whether physical). The proportion of contacts made at each location (home, school, work and other) was explored descriptively, and contacts made with the same individual in separate locations/instances were considered as separate contacts.
All analyses were done in a Bayesian framework using the probabilistic programming language Stan, with uninformative priors in all analyses, and were implemented in R via the package brms (Bürkner, 2017, 2018).
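As a rough illustration of what a Contact Rate Ratio measures, the sketch below computes a crude, unadjusted ratio of mean daily contacts between two groups. In a log-link count model with a single binary covariate, the exponentiated coefficient corresponds to this ratio of group means. The data, group labels, and the `crude_contact_rate_ratio` helper are hypothetical; the sketch does not reproduce the Bayesian negative binomial models described above, which also adjust for age and gender and include random study effects.

```python
from statistics import mean

# Hypothetical daily contact counts for two groups of participants.
contacts_household_small = [4, 6, 5, 7, 3, 5]      # e.g. small households
contacts_household_large = [9, 12, 8, 10, 11, 10]  # e.g. large households


def crude_contact_rate_ratio(group, reference):
    """Ratio of mean daily contacts in `group` to that in `reference`."""
    ref_mean = mean(reference)
    if ref_mean == 0:
        raise ValueError("reference group has zero mean contacts")
    return mean(group) / ref_mean


crr = crude_contact_rate_ratio(contacts_household_large, contacts_household_small)
print(f"Crude CRR (large vs small households): {crr:.2f}")
```

A ratio above 1 indicates more daily contacts in the comparison group than in the reference group. An adjusted CRR from a regression model can differ from this crude ratio when the groups differ in age, gender, or study composition.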
All analyses were stratified by three income strata (LICs and LMICs were combined to preserve statistical power) and included random effects by study, to account for heterogeneity between studies. The only exceptions to this were any models adjusting for methodology which did not vary by study. The effect of each factor was explored in an age- and gender-adjusted model. All models exploring the effect of student status or employment status were restricted to children aged between 5 and 18 years and adults over 18, respectively. In the remaining models including all ages, age was adjusted for as a categorical variable (<15, 15 to 65 and over 65 years). CRRs, Odds Ratios (ORs) and their associated 95% Credible Intervals are presented for all regression models. Here, we report estimates adjusted for age and gender (referred to as adjCRR or adjOR). Studies which collated contact-level data were used to assess assortativity of mixing by age and gender for different country-income strata by calculating the proportions of contacts made by participants that are male or female and those that belong to three broad age groups (children, adults, and older adults; Supplementary Text 3).

Table 1-Summary table of total daily contacts. The total number of observations, as well as the mean, median and interquartile range (p25 and p75) of total daily contacts, shown by participant and study characteristics.

Sample median total number of contacts, shown by gender (right) and by 5-year age group up to ages 80+, for A) LICs/LMICs, B) UMICs and C) HICs. Grey lines denote individual studies, and the solid black line is the median across all studies within that income group. Studies with a diary-based methodology are represented by a solid grey line and those with a questionnaire or interview design are shown as a dashed line.
For UMICs, one outlier study with an extremely high number of contacts is excluded (online Thai survey with a "snowball" design by Stein et al., 2014). Contact Rate Ratios and associated 95% Credible Intervals from a negative binomial model with random study effects are shown in D (LICs/LMICs), E (UMICs) and F (HICs). All models were adjusted for age and gender and were run separately for each key variable (weekday/weekend, household size, survey methodology, student/employment status).

Figure 2-Contact location and household size.
A) Sample median number of contacts by household size in review data, stratified by income strata. Shaded area denotes the interquartile range. B) Sample mean percentage of contacts made at each location (home, school, work, other) by income group. C) Total daily contacts (sample mean number) made at each location by 5-year age group. D) Sample median number of contacts made at home by 5-year age group and income strata. Shaded area denotes the interquartile range. E) Average household size and GDP; red circles represent median household size in single studies from the review. GDP information was obtained from the World Bank Group and global household size data from the Department of Economic and Social Affairs, Population Division, United Nations.

Figure 3-Physical contacts.
Mean proportion of contacts that are physical, shown by gender (right) and by 5-year age group up to ages 80+, for A) LICs/LMICs, B) UMICs and C) HICs. Grey lines denote individual studies, and the solid black line is the mean across all studies within that income group. Studies with a diary-based methodology are represented by a solid grey line and those with a questionnaire or interview design are shown as a dashed line. Odds Ratios and associated 95% Credible Intervals from a logistic regression model with random study effects are shown in D (LICs/LMICs), E (UMICs) and F (HICs). All models were adjusted for age and gender and were run separately for each key variable (weekday/weekend, household size, survey methodology, student/employment status).

Figure 4-Contact duration.
Mean proportion of contacts that last at least an hour, shown by gender (right) and by 5-year age group up to ages 80+, for A) LICs/LMICs, B) UMICs and C) HICs. Grey lines denote individual studies, and the solid black line is the mean across all studies within that income group. Studies with a diary-based methodology are represented by a solid grey line and those with a questionnaire or interview design are shown as a dashed line. Odds Ratios and associated 95% Credible Intervals from a logistic regression model with random study effects are shown in D (LICs/LMICs), E (UMICs) and F (HICs). All models were adjusted for age and gender and were run separately for each key variable (weekday/weekend, household size, survey methodology, student/employment status).