Characteristics of human encounters and social mixing patterns relevant to infectious diseases spread by close contact: a survey in Southwest Uganda

Background Quantification of human interactions relevant to infectious disease transmission through social contact is central to predict disease dynamics, yet data from low-resource settings remain scarce. Methods We undertook a social contact survey in rural Uganda, whereby participants were asked to recall details about the frequency, type, and socio-demographic characteristics of any conversational encounter that lasted for ≥5 min (henceforth defined as ‘contacts’) during the previous day. An estimate of the number of ‘casual contacts’ (i.e. < 5 min) was also obtained. Results In total, 566 individuals were included in the study. On average participants reported having routine contact with 7.2 individuals (range 1-25). Children aged 5-14 years had the highest frequency of contacts and the elderly (≥65 years) the fewest (P < 0.001). A strong age-assortative pattern was seen, particularly outside the household and increasingly so for contacts occurring further away from home. Adults aged 25-64 years tended to travel more often and further than others, and males travelled more frequently than females. Conclusion Our study provides detailed information on contact patterns and their spatial characteristics in an African setting. It therefore fills an important knowledge gap that will help more accurately predict transmission dynamics and the impact of control strategies in such areas. Electronic supplementary material The online version of this article (10.1186/s12879-018-3073-1) contains supplementary material, which is available to authorized users.


Background
Quantification of human interactions relevant to the spread of infectious diseases transmitted by close contact is essential to accurately predict their infection dynamics and better predict the impact of control strategies [1,2]. Detailed studies of social mixing patterns have now been undertaken in a number of settings [2][3][4][5][6][7][8][9][10][11][12]. Those studies have shown that people tend to mix with other individuals of their own age (i.e. assortative mixing); however, the frequency of contact, the degree of intergenerational mixing and the characteristics of mixing tend to vary between settings, depending on factors such as household size, population density and local activities, among others [3][4][5][6][7][8][9][10][11][13].
Importantly, while the burden of infectious diseases remains disproportionally high in low-income settings, social contact data to help improve our understanding of infectious disease dynamics in such settings remain scarce. Three studies in Africa have been published to date, in Kenya, South Africa and Zimbabwe [10,12,13], but no study has been undertaken in Uganda. Evidence from these studies shows strong age-assortativity in young age groups (children tend to interact proportionally more with children in their age group than with others), but also important intergenerational mixing, more so than seen in European or other high-income settings [4].
In addition, with the exception of a recent study from China [11], the spatial dispersal of social contacts relevant for transmission has often been overlooked, and there is, to our knowledge, no published information from low-income settings on the spatial characteristics of social contacts. Spatial mobility is particularly important for epidemic risk prediction of novel and re-emergent diseases, and for the optimization of routine control programmes [14].
To address this knowledge gap, we set up a study of social contacts relevant to the spread of infections transmitted through the respiratory route or by close contact, in rural southwest Uganda.

Methods
The study was conducted in four sub-counties of Sheema North Sub-District (southwest Uganda), an area with a total of about 80,000 inhabitants. About half (49%) of the district's population is < 15 years. The area is primarily rural.

Study design
Between January and March 2014 survey teams undertook interviews of a subset of individuals who were also included in a survey of Streptococcus pneumoniae carriage [15], asking about their social contacts in the 24 h preceding the survey, including the frequency, type and duration of encounters.
Our target sample size was 687, including all 327 individuals aged ≥15 years included in the nasopharyngeal carriage study within which this study was nested [15] and a subsample of 90 children in each of the following age groups: < 2 years, 2-4 years, 5-9 years and 10-14 years. Based on estimates from previous findings available at the time [12,16], such a sample size provided a precision of just over 1 contact on the mean number of contacts per day, and enabled detection of a 20% difference in the average number of daily contacts by age group.
Individuals were selected from 60 clusters randomly sampled from the 215 villages and two small towns in the sub-county, with an inclusion probability proportional to the size of the village or town. Within each cluster 11 or 12 households were selected at random from a list of households. A household was defined as a group of individuals living under the same roof and sharing the same kitchen on a daily basis. One individual from each household was randomly selected from a list of predefined age groups to sample from within each cluster. When nobody in the household was from that age group, either someone from another age group was selected providing that the quota for that age group had not been reached in the cluster, or the closest neighbouring household was visited instead. In case of non-response, another attempt was made later in the day or the following Saturday. After the second attempt, the individuals were not replaced.
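The cluster selection step can be illustrated with a short sketch. The text specifies only that inclusion probability was proportional to size; the systematic selection scheme used here, the function name `pps_systematic` and the village sizes are assumptions made for illustration, not the procedure actually used in the survey.

```python
import random

def pps_systematic(sizes, n_clusters, rng):
    """Systematic PPS selection: returns n_clusters unit names.

    sizes maps each unit name to its population size. A unit larger than
    the sampling interval can be selected more than once.
    """
    total = sum(sizes.values())
    step = total / n_clusters            # sampling interval
    start = rng.random() * step          # random start in [0, step)
    targets = iter(start + k * step for k in range(n_clusters))
    target = next(targets)
    selected = []
    cumulative = 0
    for name, size in sizes.items():
        cumulative += size
        # Select this unit once for every target falling inside its size band.
        while target is not None and target < cumulative:
            selected.append(name)
            target = next(targets, None)
    return selected

# Hypothetical sampling frame: 215 villages and 2 towns with made-up sizes.
frame = {f"village_{i}": 200 + 15 * (i % 20) for i in range(215)}
frame.update({"town_A": 4000, "town_B": 3000})
clusters = pps_systematic(frame, 60, random.Random(2014))
```

With this scheme a unit's chance of selection grows with its size, so the two larger towns in the made-up frame can contribute more than one cluster.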

Data collection
Informed consent was sought from individuals aged > 17 years, and from a parent or carer for children < 18 years. In addition, assent was sought from children aged 7-17 years. Participants were asked to recall information on the frequency, type and duration of social encounters from the time they woke up the day before the survey until when they woke up on the survey day (~24 h).
We defined contacts as two-way conversational encounters lasting for ≥5 min. Participants were first asked to list all the places they had visited in the previous 24 h, the number of people they had contact with, their relationship with each individual mentioned, the age (or estimated age) of each listed contact and how long the encounter lasted. Contacts involving skin-to-skin touch or sharing utensils passed directly from mouth to mouth were defined as 'physical' contacts. The questionnaire can be found in Additional file 1.
We defined short contacts lasting less than 5 min as 'casual contacts'. Participants were only asked to estimate the number of casual contacts they had, based on pre-defined categories (< 10, 10-19, 20-29, ≥30), but were not asked to provide detailed information about the nature of the encounter or the socio-demographic characteristics of the person met. Casual contacts are generally inaccurately reported in social contact surveys [7], particularly in a retrospective design, and most contacts important for the transmission of respiratory infections are believed to be close rather than casual [6].
The questionnaire was designed in English, translated to Runyankole, the local language, and back-translated to English for consistency. For children < 5 years, parents were asked about their child's encounters and whereabouts. Children aged 5-14 years were interviewed directly, using a questionnaire with wording slightly adapted from that used for adults.
Geographical coordinates from each participant's household and the centre of each village were taken using handheld GPS devices. The spatial identification of each location in the area was done by the research team during the preparation phase of the study. Georeferencing of each village, hamlet or town in the area, was done using GIS imagery as well as by travelling to the different villages to collect that information using handheld GPS devices. Given that some villages had very similar names, interviewers carried with them a list of all of those (> 300), so as to avoid data entry problems.
Questionnaires completed in the field were double entered on a preformatted data entry tool (www2.voozanoo.net, Epiconcept, France) by two data managers working independently. Data entry conflicts were identified automatically and resolved as the data entry progressed.

Characteristics of social contacts by time, person and place
We analysed the frequency distribution of contacts for a set of covariates, including age, sex, occupation, day of the week, distance travelled and type of contact. Encounters reported with the same individual in different settings counted as one contact only. Straight-line distances between the centre points of all villages and towns in the dataset were calculated, and these were then used to evaluate how far people travelled, based on the reported names of the villages and towns where each reported encounter took place, and their own village or town of residence.
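As an illustration of the distance calculation, the sketch below computes the distance between two village centre points from their GPS coordinates. The use of the haversine (great-circle) formula is an assumption; over the short distances involved it is practically indistinguishable from a straight line. The village names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_travelled(home, place, centres):
    """Distance between the centre points of the home village and the contact location."""
    if home == place:
        return 0.0
    return haversine_km(*centres[home], *centres[place])

# Hypothetical centre points (latitude, longitude) for two villages.
centres = {"village_A": (-0.58, 30.38), "village_B": (-0.60, 30.42)}
d = distance_travelled("village_A", "village_B", centres)
```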
We used negative binomial regression to estimate the ratio of the mean contacts as a function of the different covariates of interest. Negative binomial was preferred over Poisson regression given evidence of overdispersion (variance > mean, and likelihood ratio significant (P < 0.05) for the over-dispersion parameter). We considered variables associated with contact frequency at p < 0.10 for multivariable analysis, and retained them in multivariable models if they resulted in a reduction of the Bayesian Information Criterion (BIC).
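A quick way to see why a Poisson model may be inadequate is to compare the sample variance of the contact counts with their mean. The sketch below does this with a simple moment-based summary. It is a crude illustration only, not the likelihood ratio test described above; the function name, the moment estimator of the dispersion parameter and the example counts are assumptions.

```python
from statistics import mean, variance

def dispersion_summary(counts):
    """Mean, sample variance, variance-to-mean index and a moment estimate of
    the negative binomial dispersion alpha, where Var = mean + alpha * mean**2."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if m == 0:
        return {"mean": m, "variance": v, "index": float("nan"), "alpha": 0.0}
    alpha = max(0.0, (v - m) / m ** 2)
    return {"mean": m, "variance": v, "index": v / m, "alpha": alpha}

# Made-up daily contact counts for seven participants.
summary = dispersion_summary([0, 2, 4, 6, 8, 10, 25])
```

An index well above 1 (equivalently, a positive `alpha`) points to overdispersion relative to the Poisson assumption.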
Next, we explored whether people reporting a high frequency of casual contacts (≥10 casual contacts) differed from those reporting fewer contacts with regard to their socio-demographic characteristics. We did so using log-binomial regression to compute crude and adjusted relative risks (RRs) for having a high frequency. In all analyses we accounted for possible within-cluster correlation by using linearization-based variance estimators [17]. Analyses were also weighted for the unequal probabilities of sampling selection by age group.

Age-specific social contact patterns
We analysed the age-specific contact patterns through matrices of the mean number of contacts between participants of age group $j$ and individuals in age group $i$, adjusting for reciprocity, as in Melegaro et al. [6]. If $x_{ij}$ denotes the total number of contacts in age group $i$ reported by individuals in age group $j$, the mean number of reported contacts ($m_{ij}$) is calculated as $x_{ij}/p_j$, where $p_j$ is the study population size of age group $j$. At the population level the frequency of contacts made between age groups should be equivalent, such that $m_{ij}P_j = m_{ji}P_i$. The expected number of contacts between the two groups is therefore $C_{ij} = (m_{ij}P_j + m_{ji}P_i)/2$. Hence, the mean number of contacts corrected for reciprocity ($m^C_{ij}$) can be expressed as $C_{ij}/P_j$. We tested the null hypothesis of proportionate mixing by computing the ratio of observed mixing patterns to that of expected mixing patterns if social contact occurred at random. Under the assumption of random mixing, the probability of encounter between age groups thus depends on the population distribution in each age group, and the contact matrix under this random mixing hypothesis was calculated based on the percentage of population in each age group. The ratio of observed over expected contacts was then computed, and confidence intervals were obtained through bootstrapping, with replacement, for a total of 1000 iterations. This approach is similar to that taken by others [11].
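The reciprocity correction can be written out in a few lines. The sketch below follows the definitions given above, taking P to be the population size of each age group (as implied by "at the population level"); the function names and the two-group numbers are made up for illustration.

```python
def mean_contacts(x, p):
    """m[i][j] = x[i][j] / p[j]: mean number of contacts with age group i
    reported per participant in age group j."""
    n = len(p)
    return [[x[i][j] / p[j] for j in range(n)] for i in range(n)]

def reciprocity_corrected(m, P):
    """m_c[i][j] = C[i][j] / P[j], with C[i][j] = (m[i][j]*P[j] + m[j][i]*P[i]) / 2."""
    n = len(P)
    return [[(m[i][j] * P[j] + m[j][i] * P[i]) / (2 * P[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical two-group example.
x = [[30, 20],      # x[i][j]: contacts in group i reported by participants in group j
     [40, 90]]
p = [10, 20]        # number of participants per age group
P = [1000, 3000]    # population size per age group (assumed values)

m = mean_contacts(x, p)
m_c = reciprocity_corrected(m, P)
```

After correction the total number of contacts between two groups is the same whichever group reports it, i.e. `m_c[i][j] * P[j]` equals `m_c[j][i] * P[i]`.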

Epidemic simulations
Finally, in order to explore the infection transmission dynamics resulting from our contact pattern data, we simulated the spread of an immunizing respiratory infection transmitted through close contact in a totally susceptible population, thus assuming a Susceptible-Infected-Recovered (SIR) model. The model contained nine mixing age groups, with a transmission rate $\beta_{ij}$ at which individuals in age group $j$ come into routine contact with individuals in age group $i$ computed as $\beta_{ij} = q m^C_{ij}/\omega_i$, where $\omega_i$ is the proportion of individuals in age group $i$, and $q m^C_{ij}$ is the next generation matrix, with $q$ representing the probability of successful transmission per contact event [18]. We assumed $q$ to be homogeneous and constant across all age groups and conducted a set of simulations for fixed values of $q$ between 25% and 40%, in line with what has been reported with influenza pandemic strains [18,19]. The basic reproduction number ($R_0$), which corresponds to the average number of people infected by one infectious individual in a totally susceptible population, was calculated as the dominant eigenvalue of the next generation matrix. We took uncertainty estimates in the contact matrices (and hence final size outputs) into account by iterating the model on bootstrapped matrices.
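To make the calculation of the basic reproduction number concrete, the sketch below computes the dominant eigenvalue of a next generation matrix by power iteration. Power iteration is one possible numerical method and is an assumption here, as are the function names and the small example matrix; the text does not state how the eigenvalue was obtained.

```python
def dominant_eigenvalue(K, iterations=1000, tol=1e-12):
    """Dominant eigenvalue of a non-negative square matrix by power iteration."""
    n = len(K)
    v = [1.0 / n] * n                      # start vector, entries sum to 1
    lam = 0.0
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = sum(w)                   # v sums to 1, so sum(K v) estimates the eigenvalue
        if new_lam == 0:
            return 0.0
        v = [wi / new_lam for wi in w]     # renormalise so entries sum to 1 again
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam

def r0_from_contacts(q, m_c):
    """R0 as the dominant eigenvalue of the next generation matrix q * m_c."""
    K = [[q * value for value in row] for row in m_c]
    return dominant_eigenvalue(K)

# Hypothetical reciprocity-corrected contact matrix for two age groups.
example_r0 = r0_from_contacts(0.5, [[2.0, 1.0], [1.0, 2.0]])
```

Because the next generation matrix is simply $q$ times the contact matrix, $R_0$ scales linearly with $q$ for a fixed contact matrix.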
We did not develop a model to simulate the dynamics of the epidemic in each setting. Rather, we computed the final epidemic size (i.e. the number of individuals who would have been infected during the epidemic) for each specific age group, based on a simple mass action model adapted to account for multiple age classes, as described in Kucharski et al. [20], with the following equation: where F represents the final epidemic size by age group (i.e. the proportion of individuals who are infected in each age group).
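A final size calculation of this kind can be sketched as follows. The sketch assumes the standard final size relation for a multi-group SIR model, $F_j = 1 - \exp(-q \sum_i m^C_{ij} F_i)$, written with the indexing used above (contacts with group $i$ per person in group $j$). This form is an assumption and may differ in detail from the equation given in [20]. The function name, the fixed-point solver and the example values are likewise illustrative.

```python
import math

def final_size(m_c, q, iterations=10000, tol=1e-12):
    """Solve F[j] = 1 - exp(-q * sum_i m_c[i][j] * F[i]) by fixed-point iteration.

    m_c[i][j] is the mean number of contacts with group i per person in group j.
    Returns the proportion infected in each age group.
    """
    n = len(m_c)
    F = [0.5] * n                          # start away from the trivial solution F = 0
    for _ in range(iterations):
        new_F = [1.0 - math.exp(-q * sum(m_c[i][j] * F[i] for i in range(n)))
                 for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_F, F)) < tol:
            return new_F
        F = new_F
    return F

# Single-group check: q * m = 2 corresponds to R0 = 2.
single = final_size([[4.0]], 0.5)

# Hypothetical two-group contact matrix (reciprocity-corrected, made-up values).
two_groups = final_size([[3.0, 7.0 / 6.0], [3.5, 4.5]], 0.33)
```

With a single group and $R_0 = 2$ the iteration settles at roughly 80% infected, the familiar result for a homogeneous SIR model.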
Estimates obtained using the contact data from Uganda were compared to those for Great Britain, using data from the POLYMOD study [4] for the latter and a similar approach to compute the mixing matrix. The model was parameterised with social contact data on physical contacts only, lasting ≥5 min, rather than all contacts, given that physical contacts generally seem to better capture contact structures relevant for the transmission of respiratory infections [6]. Data for Great Britain were available for the same age groups, and for physical contacts specifically, defined in the same way as in our study, which made the data for physical contacts more comparable between studies than those for overall contacts.

Study population
A total of 566 individuals participated in the survey. This corresponded to an overall response rate of 83%, higher among those aged ≥15 years (98%), and lower among under-2s (68%), 2-4 year olds (64%), 5-9 year olds (82%) and 10-14 year olds (57%). There were more female (58%) than male respondents, but this differed by age group, with fewer females in young age groups and more adult females than males (Table S1 in Additional file 2).
The mean household size was 5.3 (median 5, range 1-18). Almost all (98%) school-aged children aged 6-14 years attended school or college. Among adults, agriculture was the main occupation and about 27% of the females were homemakers/housewives (Table 1).

Characteristics of contacts
A total of 3965 contacts with different individuals were reported, corresponding to an average of 7.2 contacts per person (median 7, range 0-25) (Fig. 1). The majority of contacts were physical, thus involving skin-to-skin contact (mean 5.1, median 5, range 0-18). The age and sex distribution of study participants can be found in Table S1 (Additional file 2 with supplementary figures and tables).
Most contacts (82%) were with individuals who would be normally seen daily, 520 (13%) with people normally seen at least weekly, 4% with people met more rarely and 1% of the reported contacts were with people that the participants had never met before.
We found marked differences in the number of contacts by age group, but not by sex. School-aged children reported the highest daily number of contacts, while the elderly had the fewest (Table 1). There was no difference in the mean number of contacts for individuals living in the district towns of Kabwohe and Itendero (n = 43) and the 523 others living in surrounding villages (chi-square test P-value = 0.79). Table 1 provides further details about the population characteristics, the mean number of contacts by socio-demographic and other covariates, as well as the ratio of mean contacts by covariate. Age was the only confounding factor.
Overall, contacts tended to be assortative, as shown by the strong diagonal feature in Fig. 2, with most of the intergenerational mixing occurring within households (Fig. 3). Only teenagers and adults reported non-physical contacts (Fig. 3). The quantification of assortativity can be seen in Figures S2a-c (Additional file 2), which show the ratios of observed contacts, as obtained in the survey but corrected for reciprocity, to that of expected contacts under the proportionality assumption, for all contacts and physical contacts only. The results show age-assortativity of contacts for all age groups (other than < 2 year olds) for all contacts, and primarily for school-aged children when considering physical contacts only.
Reciprocity correction accounted for differences in reporting of contacts by age groups, particularly a proportionally higher frequency of contacts reported by young children with older age groups than older age groups reported with young children, as shown in Figure S3 (Additional file 2).
There was no statistical difference in the average number of contacts between weekend (Sunday) and weekdays (Monday, Tuesday, Thursday and Friday) (Table 1). As shown in Table 1, the mean number of reported contacts on Sundays was 7.50 (95%CI 6.56; 8.44), slightly higher than on weekdays, where the average was 7.14 (6.79; 7.49), but the difference was not statistically significant (P = 0.229). The balance of respondents reporting contacts from weekdays and weekends reflected the proportion of week vs. weekend days in a normal week.
About a quarter (n = 136 (24%)) of participants reported social encounters outside their village of residence, and about 12% of contacts occurred outside participants' village of residence. The majority (56%) of people who travelled outside their village went to places located within a 5 km radius from the centre point of their village of residence, and 90% stayed within 12 km (Fig. 4). Adult males tended to travel more than females (Fig. 4). Overall, 29% of males had contact with someone outside their village, compared to 20% of females (χ², P = 0.0406). Most (87%) children under 5 years of age stayed in their village, whereas about a quarter or more individuals travelled outside [...]. Overall, 30% of males travelled outside their village compared to 20% of females. When stratifying by age, the differences between sexes were more marked, with no statistical difference between males and females < 5 years of age (P = 0.296) or 5-14 year olds (P = 0.272), but marked differences among adults (≥15 years old), with 42% of males travelling outside of their village compared to 24% of females (P = 0.0037).
The proportion of individuals who travelled outside their village differed by occupation (although not statistically significantly), with shop keepers (38%), those working in agriculture (31%) and office [...]
Contacts made outside the household, as well as those with individuals outside participants' village, were mostly assortative (Fig. 3), and the proportion of contacts outside the village differed by age group (P < 0.001): it was higher among adults, increasingly so as distance from home increased (Fig. 4).

'Casual' contacts (< 5 min long)
Information on the number of casual contacts was reported by 490 (87%) participants. Among those, 64% (n = 315) estimated they had fewer than 10 different contacts, 24% reported between 10 and 19 casual contacts, 6% reported between 20 and 29 contacts and 6% reported an estimated 30 contacts or more.
Individuals who reported high levels (i.e. ≥10 contacts) of casual contacts also tended to report more contacts (Table 1). We found no difference between those reporting a high number of casual contacts (≥10) and others, by age, sex or day of the week (Additional file 2: Table S2). However, people whose primary activity was at home tended to report fewer casual contacts than others, and there were about 60% more individuals reporting high levels of casual contacts among those who travelled outside their village.
The 76 (13%) study participants for whom the number of casual contacts was not known reported more non-casual contacts than others, with a mean number of contacts of 8.7, compared to 7.0 among the 490 for whom the number of casual contacts was estimated (ratio of means 1.24 (95%CI 1.11-1.39)). This may be due to the age distribution of individuals for whom estimates of casual contacts were missing: the proportion missing was significantly higher among school-aged 5-9 year olds (29% with no information on number of casual contacts), who also report more contacts overall, and lower in all older age groups (chi-square P-value < 0.001).

Epidemic simulations
Finally, we compared patterns of reported physical contacts in Uganda and Great Britain, and explored differences in the relative and absolute epidemic size by age group, as well as the corresponding $R_0$, for a hypothetical respiratory infection in an immune-naive population.
The number of reported physical contacts was similar between Uganda and Great Britain, with the average [...] (Fig. 5a & b), some of which might be related to differences in household structures and number of household contacts, as contacts outside the household were mostly assortative (Fig. 3). The computed mean values of $R_0$ for a per contact infectivity value ($q$) ranging from 0.25 to 0.40 were slightly higher in Great Britain than in Uganda (1.51 to 2.41 vs. 1.40 to 2.24). Figure 5f shows the values for an infectivity parameter of 0.33. The proportion of people infected in younger age groups was also higher in Great Britain, and there were proportionally more adults infected in Uganda. However, given the differences in population structure, the total number of infections in the population was higher in Uganda than in Great Britain (Fig. 5c-e).

Discussion
To our knowledge this is only the fourth study of its kind in Africa [10,12,13], and the first one to specifically explore spatial patterns of social contacts. The quantification of mixing patterns is key to accurately model transmission dynamics and inform infectious disease control strategies [4]. Having such data thus fills an important gap, particularly given the high burden of respiratory infections in low-income settings [22,23], and the risk of emerging and re-emerging diseases transmitted by close interpersonal contact, such as influenza [24], measles [25] or meningococcal meningitis [26].
Our findings share similarities with studies from Africa [10,12,13] and other low or lower-middle income settings [16,27], including the high contact frequency among school-aged children and the tendency for most contacts to be age-assortative. Age-assortativity was not confined to young age groups, but was also prominent among adults, which contrasts with a recent study from Zimbabwe in which proportionate rather than age-assortative mixing was reported in older age groups. We also found substantial mixing between age groups, largely driven by intra-household mixing. This may result in a higher force of infection from children to adults than would be seen in high-income settings such as Great Britain, as our final size epidemic model suggests. The final size model should be seen as an illustration of how different social mixing patterns impact on disease epidemiology in different settings, rather than a specific quantification of the differences. It shows the importance of using setting-specific data when modelling disease dynamics and evaluating control strategies. Our data could be best applied to evaluate transmission dynamics and the impact of interventions for endemic diseases and current epidemics in non-naïve populations, for example the recent large measles outbreak in the Democratic Republic of the Congo [25]. It is also likely that our retrospective design resulted in underreporting compared to a prospective diary-based approach [28], which hampers comparisons between countries in our final size model. In sensitivity analyses we explored the impact of this potential underreporting, assuming a 25% under-ascertainment compared to a diary-based study, with homogeneous underreporting across age groups. In such a scenario, the proportion of infections across all age groups is predicted to be higher in Uganda than in Britain, disproportionally so in adults, and the $R_0$ to be higher too (see Additional file 2: Figure S4).
Our results also provide important insights into the local spatial dynamics of routine daily human interactions in rural Uganda, showing that most contacts tend to occur within the vicinity of people's area of residence, that working-age adult males travel most and young children and the elderly the least, and that contacts tend to be increasingly age-assortative as people travel further away from home. Similar patterns were observed in rural and semi-urban China [11]. Such findings have important implications for predicting outbreak dynamics and designing control strategies, given that interconnectedness between geographic patches is an essential factor driving epidemic extinction or the persistence of epidemic hotspots, and the effectiveness of control strategies. Studies of measles in Niger suggest that dynamics differ from those observed in high-income countries in the pre-vaccination era, likely due to different mixing patterns and weaker spatial connectivity [29,30]. This, together with important variations in vaccination coverage between local geographic patches [31][32][33], strengthens the need to account for spatial mobility when designing efficient control strategies in those settings. Optimal targeted interventions tailored to specific geographic clusters of high transmission have also been key considerations in recent cholera outbreaks in Africa, given the limited available vaccine doses [34,35]. Spatially targeted approaches were also central to outbreak control in the recent West African Ebola epidemic [36], and in a recent measles epidemic in the Democratic Republic of the Congo, sustained in part due to inadequate coverage of populations in less accessible geographical clusters [25,37].
Importantly, our data provide a better basis for parameterising transmission models that take such spatial dynamics into consideration (meta-population models), by providing quantitative evidence of mixing within and between local areas in a rural East African population.
In our study the frequency of contacts was about half that reported in Kenya [10] or South Africa [12]. Although differences between settings are expected, some of these are likely to be due to the exclusion of 'casual contacts' from our contact count. There might be further differences linked to the definition of social contacts, which was based on conversational encounters in our study but not in the Kenyan study [10]. When defining contacts based on conversational exchanges the household setting tends to dominate over other settings, compared to a more inclusive definition [8].
Both our contact definition and the retrospective study design may have resulted in more stable, regular contacts being reported over others. However, the extent to which a more inclusive definition reflects contact events relevant for transmission remains unclear. Modelling studies suggest that close interpersonal rather than short casual contacts matter more for transmission of respiratory infections [6]. In addition, for modelling purposes the age-specific structure of relative contact frequency matters more than the actual reported frequency, as matrices are scaled to fit epidemiological data. Our retrospective interview-based design thus offers a simpler and easier alternative to prospective diary-based approaches, particularly in such settings. Further research should explore what contact information is most relevant and how such data should best be captured. An additional analysis, which combines data from this contact study with data from the pneumococcal carriage study alongside which our study was conducted [15], provides some insight into this question, by exploring contact types associated with pneumococcal carriage and acute respiratory symptoms using data collected at the same time from the same individuals (le Polain de Waroux et al., in preparation and available here [38]).
Selection bias may have occurred to some extent, particularly given that more adult women were included than men. However, there was no significant difference in the number of contacts reported between males and females, including at the weekend, suggesting that selection bias was unlikely to be major. We also tried to reduce selection bias by interviewing on Saturdays people who were initially absent on the survey day.

Conclusion
In conclusion, our study fills an important gap for two main reasons. First, we provide information by detailed age groups about social contacts and mixing patterns relevant to the spread of infectious diseases in a region where such data are scarce. Second, we also provide some insights into spatial characteristics of social encounters. Although this has increasingly been recognized as an important component in evaluating epidemic risk and in the design of efficient control strategies, it has not previously been quantified in low-income settings, and should be explored further. Our study thus provides essential evidence to inform further research and infectious disease modelling work, particularly in similar rural African settings.