Using wearable proximity sensors to characterize social contact patterns in a village of rural Malawi

Measuring close proximity interactions between individuals can provide key information on social contacts in human communities and related behaviours. This is even more essential in rural settings in low- and middle-income countries, where contact patterns need to be understood in order to implement social protection interventions. We report the quantitative assessment of contact patterns in a village in rural Malawi, based on proximity sensor technology that allows for high-resolution measurements of social contacts. Our results revealed that the community structure of the village was highly correlated with the household membership of the individuals, thus confirming the importance of family ties within the village. Social contacts within households occurred mainly between adults and children and between adults and adolescents, whereas most of the inter-household social relationships occurred among adults and among adolescents. At the individual level, age and gender assortativity was observed in the inter-household network, while age disassortativity was observed in intra-household networks. Moreover, we obtained a clear trend in the daily contact activity of the village. Family members congregated in the early morning, at lunch time and at dinner time. In contrast, inter-household contact activity grew from the morning and reached a maximum in the afternoon. The proximity sensor technology used in this study provided high-resolution temporal data on timescales comparable with those intrinsic to social dynamics, and thus gave access to the level of information needed to understand the social context of the village.


Introduction
Describing close proximity interactions makes it possible to build contact networks that represent the frequency of social contacts in human communities. Contact network analysis can be used to better understand social interaction patterns and related behaviours (Borgatti et al. [3]; Chami et al. [8]) and the transmission of diseases (Funk et al. [17]; Danon et al. [12]). Collecting high-resolution data on the contact rates between individuals is a major challenge in most settings, particularly in rural low- and middle-income areas. Furthermore, many infectious diseases have emerged or re-emerged in rural low- and middle-income settings in the last century (Fenollar et al. [15]). Including observed contact data from these harder-to-reach populations in stochastic models of transmissible diseases could help to better predict epidemics and is useful in the design of preventive and control measures such as vaccination and social distancing (Mossong et al. [35]; Salathé et al. [41]).

Related works
Contact diaries have usually been used to record contact information in low- and middle-income countries. Few studies have investigated social contact patterns in the Asian continent. Social networks in household-structured communities have been estimated in Vietnam (Horby et al. [19]) and in rural India (Johny et al. [23]) through the use of paper-based questionnaires. Banerjee et al. [1] studied the social network of rural villages of Southern India using questionnaires and developed a model of information diffusion. In Africa, a robust literature documents the use of contact diaries to understand and quantify households' connectivity (Hosegood and Timaeus [20]; Cassidy and Barnes [5]) and social contacts relevant for disease transmission (Johnstone-Robertson et al. [22]; Crampin et al. [11]; Chami et al. [8]; Kiti et al. [26]). Specifically, in Malawi, Rock et al. [40] have studied the social networks and social participation of youth living in extreme poverty in rural sites, and Helleringer and Kohler [18] have investigated the relationship between the sexual networks of the young adult population of several villages and the position of HIV-positive individuals within these networks. However, such survey-based approaches yield only an approximate estimate of the number of contacts and of their durations, especially when repeated over many days or weeks. Moreover, young or illiterate participants may have difficulty understanding how to complete the questionnaire and may need the assistance of trained workers (Johnstone-Robertson et al. [22]).
Recently, the use of proximity sensing technology to collect social contact information has emerged as an alternative. Mobile phones have been used to continuously collect proximity information within rural communities by Bluetooth devices within the scanning range of the user (typically 5-10 m) (Yoneki and Crowcroft [48]). On the other hand, wireless proximity sensors can gather proximity interactions less than 1-2 meters separation distance every few seconds (Cattuto et al. [6]) in an objective way. These proximity events are relevant to detect social interactions, such as conversations, or spread of infections through physical touch or via aerosol. Proximity sensors were successfully deployed in rural Kenya to characterize contact patterns that may shape the transmission of respiratory infections in schools and households (Kiti et al. [28]; Kiti et al. [27]).

Objectives of the study
In this study, we report on the use of wearable proximity sensors to measure face-to-face proximity and contact patterns between individuals in a village in rural Malawi. The present study is part of the wider Child and Youth Development Study project funded by UNICEF Malawi (Leal Neto et al. [30]), which aims to track children's health and to better understand child development in poor country settings. The primary aim of this study is to estimate social contact patterns in order to understand the social context within a village in rural Malawi. Specifically, we investigated the social interactions within and across households to better understand social relationships by age-category. We analyzed intra-household and inter-household social relationships separately to assess the differences in contact patterns among individuals belonging to the same family group and to different family groups, in particular the relationships of children with the other age-categories. We measured group-level patterns by testing for community structure in the whole social network of the village, and we evaluated a community detection algorithm by its ability to find so-called ground truth communities, i.e., communities that align with age, gender or household membership. We also aimed to understand sociality within the village at the level of dyadic social interactions. Thus, we explored individual-level patterns by measuring the social assortment of individuals by age and by gender, to test the tendency of individuals to interact with others with whom they share similar attributes. Finally, we assessed social contact patterns at the temporal level by studying the daily activity profile of contacts among individuals, split into intra-household and inter-household contacts. This study is an extension of the Kiti et al. [28] study that took place in rural Kenya. Here, we used a larger sample and we analyzed inter-household contact patterns.
The remainder of the paper is organized as follows. The Material and methods section presents the proximity sensing technology used and introduces the setting in which the data collection took place, including the ethical aspects of the research. It also describes how the data were cleaned and how the data analysis was performed. The empirical findings on the social structure of the village are provided in the Results section. The Discussion section then discusses the results and describes the related works more extensively. Conclusions and limitations of the study are presented in the Conclusion section.

Data collection
The present study is part of a wider project on Child and Youth Development Study funded by UNICEF Malawi (Leal Neto et al. [30]). The data collection was conducted between 16th December 2019 and 10th January 2020 in Mdoliro village in Dowa district in the Central Region of Malawi. Mdoliro is a small village with an estimated 2019 population of 147 distributed over 32 households (average household size is 4.5). The majority of the inhabitants were Christians, and the Chewas are the main ethnic group. Farming is the major source of income.
The data were obtained and processed using a proximity-sensing application previously used to measure individuals' social contacts in a variety of real-world settings such as hospital wards (Isella et al. [21]; Voirin et al. [46]), schools (Stehlé et al. [45]), social events (Smieszek et al. [44]), and households (Ozella et al. [38]; Kiti et al. [28]); more recently it was used to detect proximity events between animals (e.g., Wilson-Aggarwal et al. [47]). This technology is based on wearable proximity sensors that exchange ultra-low-power radio packets; the use of the sensors to detect face-to-face interactions is described in detail in Cattuto et al. [6]. Sensors in close proximity exchange with one another a maximum of about one low-power packet per second, and the exchange of low-power radio packets is used as a proxy for the spatial proximity of the individuals wearing the sensors (Cattuto et al. [6]). To estimate how close individuals are, the attenuation of the signals with distance is computed as the difference between the received and transmitted power. Proximity between individuals is asserted when the median attenuation over a given time interval exceeds a specified attenuation threshold (in dBm). In this study, we set the attenuation threshold at -75 dBm. This threshold was already used in previous deployments (e.g., Ozella et al. [37]) and allowed the detection of proximity events between devices situated within 1-1.5 m of one another. This distance allows detection of close-contact situations during which social interactions might occur and a communicable disease might be directly transmitted. A 'contact event' between two individuals was identified when the devices exchanged at least one radio packet during a time interval of 20 s. After a contact is established, it is considered ongoing as long as the devices continue to exchange at least one radio packet in every subsequent 20 s interval. Conversely, a contact was considered broken if a 20 s interval elapsed with no exchange of radio packets (Stehlé et al. [45]; Kiti et al. [27]). Each device has a unique identification (ID) number that is used to link the information on the contacts established by the individual carrying the device. For the present study, the technology was operated in a distributed fashion: contact data were stored in the local memory of individual devices. After the devices were collected at the end of the study, data from individual devices were downloaded, and the (temporal) contact networks recorded by individual devices were combined to build a time-resolved proximity graph. In addition to contact information, each device periodically logs its orientation in space as measured by a tri-axial accelerometer.
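The contact-event rule described above can be illustrated with a minimal Python sketch. It is not the authors' processing pipeline: the function name `contact_events`, the input format (one tuple per received packet that already passed the attenuation threshold) and the alignment of the 20 s intervals on a fixed grid are all assumptions made for illustration.

```python
from collections import defaultdict

INTERVAL = 20  # seconds; interval length stated in the text


def contact_events(packets, interval=INTERVAL):
    """Group packet receptions into contact events.

    packets: iterable of (timestamp_seconds, id_a, id_b) tuples, one per
             received radio packet (hypothetical input format).
    Returns a dict mapping an unordered pair (i, j) to a list of
    (start, end) times in seconds, with end exclusive.
    """
    # Collect, per pair, the set of 20 s slots containing at least one packet.
    # Assumption: slots are aligned on multiples of `interval`.
    slots = defaultdict(set)
    for t, a, b in packets:
        pair = (a, b) if a <= b else (b, a)
        slots[pair].add(int(t // interval))

    events = {}
    for pair, active in slots.items():
        runs = []
        for s in sorted(active):
            if runs and s == runs[-1][1]:
                # Slot directly follows the current run: contact is ongoing.
                runs[-1][1] = s + 1
            else:
                # At least one empty slot was skipped: a new contact starts.
                runs.append([s, s + 1])
        events[pair] = [(lo * interval, hi * interval) for lo, hi in runs]
    return events
```

For instance, packets received at 1 s, 25 s and 45 s between the same two devices fall into three consecutive slots and form a single 60 s contact, whereas a later packet at 100 s opens a second contact because the slots in between are empty.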
The participants wore a sensor enclosed in a pouch and pinned to the front of a blouse/shirt in order to detect close-range proximity. The low-power radio frequency in use cannot propagate through the human body, and the position of the sensor favours capturing face-to-face interactions. In addition, metadata on individuals were collected, i.e., gender, age and which household they belonged to, through the use of the app Survey CTO. A household was defined as the group of people living in the same house and eating from the same kitchen (Hosegood and Timaeus [20]). Participants were grouped into three age-categories: <10 years old (children), 11-18 years old (adolescents), and >18 years old (adults). Training sessions were conducted with Health Surveillance Assistants (HSAs) and volunteers in the use of sensors and how participants should have worn them over the study period, and HSAs visited the village in order to check if participants were wearing the sensors properly. The Ministry of Health (MOH) defines an HSA as a primary healthcare worker serving as a link between a health facility and the community (Chikaphupha et al. [9]). HSAs' tasks include community health, family health, environmental health, prevention and control of communicable diseases, and community case management. Specifically, two HSAs were involved in the study: one of them lived in Mdoliro (i.e., one adult participant that had also the role of HSA), and one HSA was external from the village's population, and visited the village at least once a week.

Ethical aspects
Only participants who gave documented written consent were included in the research. In the case of children, consent was obtained from their guardians. In the case of adolescents, consent was obtained from both themselves and their guardians. The study was approved by the Ethical Committee at the University of Zurich (OEC IRB #2018-046) and the Ethical Committee at the College of Medicine in Malawi (P.10/19/2825).

Data analysis
The proximity data were extracted from devices and cleaned by identifying anomalies in the recorded data that might point to sensors that were tampered with or suffered hardware/battery issues resulting in data loss or low-quality data. Participants were asked to remove the sensor overnight. Night contacts were disregarded from the analysis by using the tri-axial accelerometer data to identify the time periods during which the sensor did not move. This also allowed us to identify the time periods during which the sensors were not worn by the participants. This data was also disregarded from the analyses.

Network analysis and contact matrices
For each participant in the study, we computed the number of contact events and the duration of each contact. Time-aggregated, weighted contact networks were generated: nodes correspond to individuals, and an edge between two individuals indicates that at least one contact event involving those individuals was recorded during the whole experimental period. The weight $w_{ij}$ of an edge between nodes $i$ and $j$ is defined as the cumulative duration of the contact events recorded between those individuals. Network edges are undirected and the weights on the edges are regarded as symmetric ($w_{ij} = w_{ji}$). The degree $k_i$ of a node $i$ in the above network corresponds to the number of distinct individuals with whom individual $i$ has been in contact. Intra-household and inter-household contact matrices were generated based on the daily number and on the daily duration of contacts by age-category. We aimed to obtain the daily mean of the contact durations and the daily mean of contact events per capita for each pair of age-categories. To obtain the intra-household matrices, we divided the total contact durations and the total number of contacts by the number of days during which the family members wore sensors simultaneously (days of overlap), and by the number of persons belonging to the two age-categories ($a$ and $b$). To obtain the inter-household matrices, we divided the total contact durations and the total number of contacts by the number of days during which all the participants wore the sensors simultaneously. The observed values were compared to those obtained from a null model: we shuffled the node attributes of the intra- and inter-household edge lists and computed, for those realizations, the daily number and the daily duration of contacts by age-category.
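As a rough illustration of the aggregation step, the following Python sketch builds the edge weights (cumulative contact duration per pair) and the node degrees (number of distinct contacts) from a list of contact events. The function name `aggregate_network` and the input format are hypothetical and are not taken from the study's code.

```python
from collections import defaultdict


def aggregate_network(events):
    """Aggregate contact events into a weighted, undirected network.

    events: iterable of (i, j, duration_seconds) tuples (hypothetical format).
    Returns (weights, degree): weights maps an unordered pair to the
    cumulative contact duration; degree maps a node to its number of
    distinct contacts.
    """
    weights = defaultdict(float)
    neighbours = defaultdict(set)
    for i, j, duration in events:
        # Store each pair in a canonical order so that w_ij == w_ji.
        pair = (i, j) if i <= j else (j, i)
        weights[pair] += duration
        neighbours[i].add(j)
        neighbours[j].add(i)
    degree = {node: len(nbrs) for node, nbrs in neighbours.items()}
    return dict(weights), degree
```

Storing each pair once in a canonical order is one simple way to keep the weights symmetric; an adjacency matrix would work equally well for a network of this size.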

Community detection
Community detection seeks to describe the large-scale structure of a network by dividing its nodes into communities (or groups), based only on the pattern of links among those nodes (Contisciani et al. [10]). Nodes belonging to communities are more highly connected to each other than to the rest of the network and probably share common properties. We used the Louvain algorithm (Blondel et al. [2]) to identify community structure in the aggregated networks. The modularity is a measure of the structure of networks which measures the strength of division of a network into communities, and this method maximizes a modularity score for each community. The algorithm assesses how much more closely connected the nodes within a group are, compared to how connected they would be in a random network (Borgatti et al. [3]; Lu et al. [32]). We used the Normalized Mutual Information (NMI) score to test the relationship between the community membership of participants and their attributes (i.e., gender, age-category and household membership). We aimed to evaluate if the communities detected by the algorithm corresponded to participants' attributes. NMI score ranges between 0 (no mutual information) and 1 (perfect correlation). NMI scores closer to 1 imply a greater overlap between community membership and attributes.
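To make the two scores concrete, here is a small pure-Python sketch that evaluates the modularity of a given partition and the NMI between two labelings. It does not implement the Louvain optimization itself; it only scores a partition that is already known. The function names are hypothetical, and the NMI normalization (arithmetic mean of the two entropies) is an assumption, since the text does not state which normalization was used.

```python
import math
from collections import Counter


def modularity(edges, community):
    """Modularity of a partition of a weighted, undirected network.

    edges: iterable of (i, j, weight), each pair listed once, no self-loops.
    community: dict mapping each node to a community label.
    """
    total = 0.0
    inside = Counter()    # weight of edges inside each community
    strength = Counter()  # summed node strengths of each community
    for i, j, w in edges:
        total += w
        strength[community[i]] += w
        strength[community[j]] += w
        if community[i] == community[j]:
            inside[community[i]] += w
    return sum(inside[c] / total - (strength[c] / (2 * total)) ** 2
               for c in strength)


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings.

    Assumption: normalized by the arithmetic mean of the two entropies.
    """
    n = len(labels_a)
    pa, pb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = sum(c / n * math.log(c * n / (pa[a] * pb[b]))
             for (a, b), c in joint.items())
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both labelings are trivial and identical in structure
    return 2 * mi / (ha + hb)
```

In use, `labels_a` would hold the detected community of each participant and `labels_b` the attribute being tested (household, gender or age-category), listed in the same participant order.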

Assortativity
We studied the assortativity (i.e., the tendency of the individuals to associate with individuals of similar characteristics) in the gender and the age-category. Our aim was to understand if individuals with the same gender and the same age-category will be more likely to interact, and if there are differences between intra and inter-households contacts.
We computed the number of contacts and the total time in contacts (weights) between individuals in aggregated observed networks. We compared the values obtained by the observed networks, with those obtained from a null model. We created an ensemble of realizations of a null model by shuffling nodes' attributes and computing for those realizations the assortativity. Then, we compared the empirical results with the distribution of values obtained from the null model using the z-test.
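The attribute-shuffling null model can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names, the use of the same-attribute fraction of contact weight as the test statistic, and the default of 1000 realizations are assumptions. Passing a weight of 1 for every edge gives the unweighted variant.

```python
import random
import statistics


def same_attribute_fraction(edges, attribute):
    """Fraction of total edge weight linking nodes with equal attribute."""
    total = sum(w for _, _, w in edges)
    same = sum(w for i, j, w in edges if attribute[i] == attribute[j])
    return same / total


def shuffle_test(edges, attribute, n_null=1000, seed=0):
    """Compare the observed fraction with a node-attribute shuffling null.

    Returns (observed, null_mean, z_score).
    """
    rng = random.Random(seed)
    nodes = list(attribute)
    values = [attribute[n] for n in nodes]
    observed = same_attribute_fraction(edges, attribute)
    null = []
    for _ in range(n_null):
        # Reassign attributes at random while keeping the network fixed.
        rng.shuffle(values)
        null.append(same_attribute_fraction(edges, dict(zip(nodes, values))))
    mean = statistics.mean(null)
    sd = statistics.pstdev(null)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    return observed, mean, z
```

Shuffling attributes rather than edges keeps the network topology and the weights intact, so any deviation from the null distribution reflects how attributes are arranged on the network rather than its density.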

Daily activity profiles
We studied the daily activity profile of contacts among individuals, extracting the probability of observing a contact as a function of the time of day. We computed these activity profiles for each household, split into intra-household and inter-household contacts. Additionally, we created two aggregated datasets that join all the timestamps of the observed contacts: the aggregated data for intra-household contacts and the aggregated data for inter-household contacts. We computed the Kolmogorov-Smirnov statistical distance $d_{KS}$ between the aggregated data of all the households and the data observed for each household (for the intra-household daily activity profile), and between the aggregated data of all individuals and the data observed for each individual (for the inter-household daily activity profile). The Kolmogorov-Smirnov distance between two probability distributions $i$ and $j$ depending on time $\tau$ is the maximum, among all the times, of the absolute difference between the cumulative distribution functions $C_i(\tau)$ and $C_j(\tau)$: $d_{KS}(i, j) = \max_{\tau} |C_i(\tau) - C_j(\tau)|$. This distance is bounded between 0, when the two compared distributions are identical, and 1, when the overlap between both distributions is null. In this particular case, we studied the distributions arising from the empirically observed contacts. The times of the contacts of each household (for intra-household data) and of each individual (for inter-household data) $i$ are described by the cumulative distribution function $C_i$ given by $C_i(\tau) = \sum_{t \le \tau} N_i(t) / \sum_{t} N_i(t)$, where $N_i(t)$ is the number of observed contacts at time $t$, computed identically for the aggregated data. We considered a range of times of 24 h duration. We observe that the periodicity of this range may influence the measurement of $d_{KS}$. For example, if we consider the origin of times at midnight and there are contacts between 23:00 and 01:00 in the aggregated dataset, but there is no contact in one specific household within this range, the maximum difference between the cumulative distributions would be influenced by the origin of times, such that this difference would be lower if the origin of times is included within the observed range. To tackle this issue, we generated the origin of times as a uniform random number between 0 and 24 h. For each pair of distributions, we generated 100 different origins of times and computed the Kolmogorov-Smirnov distance for each of them; the $d_{KS}$ considered was the minimum among all these samples.
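The distance computation with randomized origins of time can be sketched in Python as below. This is a minimal illustration under stated assumptions: contact times are given as seconds within the day, the function names `ks_distance` and `min_ks_over_origins` are hypothetical, and the empirical cumulative distributions are evaluated only at the observed contact times.

```python
import random

DAY = 24 * 3600  # seconds in the 24 h range considered in the text


def ks_distance(times_i, times_j, origin=0.0, day=DAY):
    """Maximum absolute difference between two empirical CDFs of
    time-of-day values, with the day taken to start at `origin`."""
    a = sorted((t - origin) % day for t in times_i)
    b = sorted((t - origin) % day for t in times_j)
    d, ia, ib = 0.0, 0, 0
    for tau in sorted(set(a) | set(b)):
        # Advance each pointer past all contacts occurring at or before tau.
        while ia < len(a) and a[ia] <= tau:
            ia += 1
        while ib < len(b) and b[ib] <= tau:
            ib += 1
        d = max(d, abs(ia / len(a) - ib / len(b)))
    return d


def min_ks_over_origins(times_i, times_j, n_origins=100, seed=0, day=DAY):
    """Minimum KS distance over uniformly random origins of time."""
    rng = random.Random(seed)
    return min(ks_distance(times_i, times_j, rng.uniform(0, day), day)
               for _ in range(n_origins))
```

Taking the minimum over many random origins reduces the dependence of the result on where the 24 h window is cut, which is the issue raised by the midnight example.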

Results
The whole study duration was 26 days (from 16th December 2019 to 10th January 2020), however, the proximity sensors were deployed and collected at different times within this time window. Therefore, each sensor had a different deployment period, and this varied from 16 days to 23 days (median 20 days). We included in the data analysis the contact data for all the participants when they wore the sensors simultaneously, and this overlapping deployment period was 13 days. On the other hand, we studied the intra-household contact data when the family members wore sensors simultaneously, and this overlapping deployment period was different according to the household (range 16-22 days). The initial number of sensors deployed was 99. However, a total of 86 sensors were included in the data analysis, since we excluded the sensors that had anomalies, the sensors that were not properly worn by the participants, and the sensors that did not register contacts for the entire duration of the study.

Intra-household contact data
Overall, 28 households were included in the data analysis, accounting for a total of 84 sensors, distributed as follows: 17 children, 24 adolescents, and 43 adults. The mean number of household members wearing a sensor was 3 (range 2-5). We generated contact matrices based on the duration of contacts (in seconds) and on the number of contacts (contact events) by age-category (Fig. 2). The highest mean contact durations and mean number of contacts corresponded to the contacts between adults and children, and between adults and adolescents. However, we did not find significant differences between the empirical values observed between adolescents and adults and those obtained from the null model. On the other hand, the comparison between the empirical values and the null model showed that adults and children had a significantly higher mean number of contacts and higher mean contact durations than expected by chance. The mean contact durations and the mean number of contacts among adults living in the same household were significantly lower than expected by chance. We did not find significant differences between the empirical values observed and those obtained from the null model for all the other age-categories (see Additional file 1).

Inter-household contact data
The aggregated contact network for the participants that had contacts with individuals not belonging to their household had 74 nodes and 264 edges. The nodes are distributed as follows: 10 children, 24 adolescents, 40 adults (one of them also had the role of Health Surveillance Assistant (HSA)), and one HSA was external to the village's population. All the adolescents involved in the study had contacts with individuals not belonging to their household. Overall, the median degree (i.e., number of connections with other individuals) was 6 (range 1-33). The median degree of children was 7 (range 1-11) and of adults was 5 (range 1-33). The adolescents had the highest median degree: 8 (range 1-15). As we expected, the individual with the highest number of connections was an HSA (degree 33); however, the HSA external to the village had links with 10 people. We also generated contact matrices based on the daily mean of the contact durations and the daily mean of contact events per capita (Fig. 3). The highest mean contact durations and mean number of contacts corresponded to the contacts among adults and among adolescents. The comparison between the empirical values and the null model showed that the mean contact durations and the mean number of contacts among adults living in different households were higher than expected by chance. We did not find significant differences between the empirical values observed and those obtained from the null model for all the other age-categories (see Additional file 1).

Figure 3 Inter-household contact matrices. Contact matrices giving the daily mean contact durations (± standard deviation) in seconds per capita (left panel) and the daily mean contact events (± standard deviation) per capita (right panel) by age-category

Community detection
We studied the community structure of the aggregated network by considering all the participants, both intra-household and inter-households contacts (86 nodes and 355 edges). Community analysis using Louvain's algorithm showed high modularity in the unweighted network (0.48) and the weighted network (0.75). The algorithm detected 8 communities for the unweighted network and 13 communities for the weighted network. Networks with high modularity have dense connections between the nodes within communities but sparse connections between nodes in different communities. The community membership was strongly correlated with household membership in both the unweighted network (NMI = 0.73) and weighted network (NMI = 0.82). Community membership had no significant relationship with either the individual's gender (unweighted NMI = 0.02; weighted NMI = 0.04) or age-category (unweighted NMI = 0.03; weighted NMI = 0.11).

Assortativity
We computed the fraction of contact events and the fraction of total time in contact involving individuals with the same gender and the same age-category, both intra- and inter-household. We compared the fraction of contacts and the fraction of total time in contact obtained from the observed networks with those obtained from null models of graphs. To do so, we produced 1000 randomized equivalents of the observed networks by shuffling individuals' characteristics. Then, we tested for homophilous behaviour by computing the distribution of the fraction of contacts and the fraction of total time in contact linking nodes of the same characteristics in the null model, and comparing it to the empirical value of this ratio. We considered the following hypothesis in relation to a null model: (H0) The observed fraction of contacts and fraction of time in contact involving individuals with the same characteristic are included in the 95% confidence intervals of the null distribution. Regarding gender, we do not reject the null hypothesis H0 for unweighted (p-value = 0.680) and weighted contacts (p-value = 0.424) among individuals living in the same household, or for weighted contacts (p-value = 0.426) among individuals living in different households: the empirical values are included in the 95% confidence intervals of the null distribution. We can reject H0 for unweighted inter-household contacts (p-value < 0.001): the empirical value is above the 95% confidence intervals of the null distribution, demonstrating the preference of individuals living in different households to have contacts with individuals of the same gender (i.e., gender assortativity) (Fig. 4, upper panels).
Regarding the age-categories, we do not reject the null hypothesis H0 for unweighted contacts (p-value = 0.06) among individuals living in the same household: the empirical values are included in the 95% confidence intervals of the null distribution. We can reject H0 for weighted intra-household contacts (p-value = 0.006): the empirical value is below the 95% confidence intervals of the null distribution, demonstrating the preference of individuals living in the same household to spend time with individuals of a different age (i.e., age disassortativity). We can also reject H0 for unweighted and weighted inter-household contacts (p-values < 0.001 and 0.023, respectively): the empirical value is above the 95% confidence intervals, demonstrating the preference of individuals living in different households to interact with individuals of the same age (i.e., age assortativity) (Fig. 4, bottom panels).

Daily activity profiles
We aggregated the observed contacts at the intra-household and inter-household level, focusing on the time of day at which they occur. Figures 5 and 6 show the number of contacts happening at each hour divided by the total number of contacts observed. Intra-household contacts showed an intense growth during the early morning (from 5 am) and two activity peaks around lunch time (1 pm) and dinner time (7 pm) (Fig. 5). Likewise, inter-household contacts increased from 5 am, but with a moderate intensity, and showed a peak of activity around 4 pm (Fig. 6).

Discussion
With the present study, we report the quantitative assessment of contact patterns in a village in rural Malawi, based on proximity sensor technology that allows for high-resolution measurements of social contacts. This technology provided information on the community structure of the village, on social relationships and social assortment between individuals, and on daily contact activity within the village, both intra- and inter-household. We described the social context of the village and the social interactions between individuals with respect to age-category and gender.

Community detection
Our findings revealed that the social network presented communities that were highly correlated with household membership, thus confirming the importance of family ties within the village. The household variable is the most important one in shaping the sub-groups of the network, while we did not find communities correlated with the age-category or the gender of the participants. The household is usually the fundamental social and economic unit in African villages, where individuals have more frequent and intense interactions (Hosegood and Timaeus [20]), and where a significant part of children's development occurs (Bradley and Putnick [4]). Families not only offer access to basic necessities, such as food and shelter, but also provide a safe environment for young children and adolescents. The quality of care and parenting practices plays a key role in child growth (Engle et al. [14]), and appropriate stimulation promotes optimal child development through responsive and appropriate interactions with caregivers (Landry et al. [29]). Moreover, households play a key role in the transmission of infectious diseases, and household composition may influence transmission risks (Fraser [16]; Horby et al. [19]).
Nevertheless, a household is not an independent entity, but it is embedded in the broader structure of the village, where there are kinships and other relationships between individuals. Understanding and quantifying interactions both within and across households can offer a general picture of social life. Children's early life experience is shaped not only by family contexts but also by the social ties formed within the village and social fabric characteristics play an important role in determining child development.

Network analysis and contact matrices
We obtained contact matrices stratified by age-category on the number of contact events and the time spent in proximity. Our results showed a clear difference between intra- and inter-household interaction patterns. Adults had a greater number of contacts and more time spent in proximity with children and adolescents living in the same household. This result suggests that the role of adults within the family is to care for the youngest. In addition to age, gender plays an important role in shaping the burden of youth care. In our study, adult women had more interactions with children and adolescents than adult men (see Additional file 1). In developing countries, women are more likely than men to take on caregiving activities, and it is mainly elderly women who have to cope with the care of grandchildren and children (Schatz [43]; Kalomo and Besthorn [24]). However, we found that the role of men is not negligible, in particular in relation to children. Men in families represent an important resource for children's well-being. In West African countries, children are reared in large extended families with a clan-based kinship centered around a polygynous headman with little parental involvement (Nsamenang [36]). Nonetheless, his role is crucial to guarantee a social position to the young since he provides social connections with the rest of the clan (Nsamenang [36]).
On the other hand, our results showed that most of the inter-household interactions occurred among adults. Kiti et al. [28] found similar results in rural Kenya, where most of the contacts and most of the total time spent in proximity across households were recorded between adults. We did not find relevant gender differences, which indicates that both adult men and adult women have social contacts outside the household. Adolescents also had a high number of contact events, and spent a large amount of time in proximity, with individuals of the same age-category not belonging to their household. Moreover, adolescents had the highest median degree (i.e., number of distinct individuals with whom an individual has been in contact) in the inter-household network compared to the other age-categories, which points to the high sociality of young people. Rock et al. [40] studied how the sociality of adolescents can positively influence their mental and physical health in a resource-poor setting in Malawi. However, a survey of sexual partnerships among young adults in several villages of Likoma, an island on Lake Malawi, showed that the high connectivity of the social network leads to an increased risk of HIV infections (Helleringer and Kohler [18]).
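As an illustration of how age-stratified contact matrices and the inter-household degree can be derived from sensor output, the following is a minimal Python sketch. It is not the pipeline used in this study: the participant table, the contact list, and the function names `contact_matrices` and `inter_household_degree` are hypothetical, and the toy values are invented for the example.

```python
from collections import defaultdict

# Hypothetical participants: id -> (household, age category). Toy data, not from the study.
participants = {
    "a": ("h1", "adult"),
    "b": ("h1", "child"),
    "c": ("h2", "adolescent"),
    "d": ("h2", "adult"),
    "e": ("h3", "adolescent"),
}

# Hypothetical contact events: (id_i, id_j, duration in seconds).
contacts = [
    ("a", "b", 120),
    ("a", "b", 60),
    ("c", "d", 40),
    ("c", "e", 200),
    ("a", "d", 80),
]


def contact_matrices(participants, contacts):
    """Aggregate contacts by pair of age categories, separately for
    intra-household and inter-household pairs.

    Returns {"intra": {...}, "inter": {...}}; each inner dict maps a sorted
    pair of age categories to [number of events, total duration in seconds].
    """
    matrices = {
        "intra": defaultdict(lambda: [0, 0]),
        "inter": defaultdict(lambda: [0, 0]),
    }
    for i, j, duration in contacts:
        house_i, age_i = participants[i]
        house_j, age_j = participants[j]
        kind = "intra" if house_i == house_j else "inter"
        key = tuple(sorted((age_i, age_j)))  # the matrix is symmetric
        cell = matrices[kind][key]
        cell[0] += 1
        cell[1] += duration
    return {kind: dict(cells) for kind, cells in matrices.items()}


def inter_household_degree(participants, contacts):
    """Number of distinct contacts each participant has outside the household."""
    neighbours = defaultdict(set)
    for i, j, _ in contacts:
        if participants[i][0] != participants[j][0]:
            neighbours[i].add(j)
            neighbours[j].add(i)
    return {p: len(neighbours[p]) for p in participants}
```

Storing each matrix cell under a sorted pair of categories keeps the matrices symmetric by construction. Comparing age-categories fairly would additionally require normalizing by the number of participants in each category, which this sketch omits.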

Assortativity
At an individual level, we studied the tendency of participants to interact with individuals with whom they shared similar attributes (i.e., assortativity); in particular, we tested the influence of age and gender. In human societies, assortativity is known to be a prime factor in bond formation between pairs of individuals (McPherson et al. [34]). Social assortment might have a large impact on several significant social phenomena, such as segregation (Leszczensky & Pink [31]), inequality (Karimi et al. [25]), and the transmission of information between groups of individuals (McPherson et al. [34]). The study of assortativity in rural low- and middle-income societies is of paramount importance to understand the interactions people experience and the information they share. We observed age and gender assortativity in the inter-household network, showing that individuals not belonging to the same family group prefer to interact with people with whom they share similar characteristics. Age disassortativity was instead observed in intra-household networks; this is easily explained by the parenthood and caregiving relationships between adults and youth within family groups. Similarly, intra-household age assortativity was observed in rural Kenya by Kiti et al. [28], in particular between children aged 6-14 years old, while age disassortativity was observed for adult groups.
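To make the notion of assortativity concrete, the sketch below computes one common measure for a categorical attribute such as gender or age-category: Newman's attribute assortativity coefficient, r = (sum_c e_cc - sum_c a_c^2) / (1 - sum_c a_c^2), where e is the normalized mixing matrix of the undirected network and a_c is the fraction of edge ends attached to category c. This is an illustrative assumption and not necessarily the statistic used in this study; the function name and the toy networks are hypothetical.

```python
from collections import defaultdict


def attribute_assortativity(edges, attribute):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges: iterable of undirected pairs (i, j).
    attribute: dict mapping each node to its category.
    Returns a value in [-1, 1]: positive when contacts occur preferentially
    within categories, negative when they occur preferentially across them.
    Undefined (division by zero) if every edge end falls in a single category.
    """
    mixing = defaultdict(float)
    for i, j in edges:
        # Count each undirected edge in both directions so the matrix is symmetric.
        mixing[(attribute[i], attribute[j])] += 1
        mixing[(attribute[j], attribute[i])] += 1
    total = sum(mixing.values())
    categories = {c for pair in mixing for c in pair}
    # Fraction of edge ends attached to each category.
    ends = {
        c: sum(v for (x, _), v in mixing.items() if x == c) / total
        for c in categories
    }
    within = sum(mixing.get((c, c), 0.0) for c in categories) / total
    expected = sum(ends[c] ** 2 for c in categories)
    return (within - expected) / (1 - expected)


# Toy example: two women (1, 2) and two men (3, 4).
gender = {1: "F", 2: "F", 3: "M", 4: "M"}
same_gender_only = [(1, 2), (3, 4)]
mixed_gender_only = [(1, 3), (2, 4)]
```

On these toy networks the coefficient takes its extreme values: contacts only within gender give the maximum, and contacts only across gender give the minimum. Real contact networks fall in between, and the coefficient can also be weighted by contact duration, which this sketch does not do.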

Daily activity profiles
We collected data continuously, 24 hours a day, and we obtained a clear trend of the temporal activity of the village. Family members congregate in the early morning, at lunch time and at dinner time. These results agree with those obtained in a study in rural Kenya (Kiti et al. [28]), thus suggesting a typical family behaviour in African rural villages. Among individuals not belonging to the same household, contact activity grew from the morning and reached a maximum in the afternoon. We suppose that people congregate in the afternoon for work or other engagements (e.g., play for the youth). However, we have contact information only for the study participants, and we are not aware of any other contacts they may have had with people outside of the study.
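A daily activity profile of this kind can be obtained by binning contact events by hour of the day, separately for intra-household and inter-household pairs. The following is a minimal sketch under stated assumptions: timestamps are taken to be seconds elapsed since midnight of the first day of data collection, and the function name, household table and events are hypothetical toy values rather than study data.

```python
def hourly_profile(events, household):
    """Count contact events per hour of day, split by household relationship.

    events: iterable of (timestamp_seconds, id_i, id_j), with timestamps
            measured from midnight of the first day (an assumption).
    household: dict mapping each participant id to a household id.
    Returns {"intra": [24 counts], "inter": [24 counts]}.
    """
    profile = {"intra": [0] * 24, "inter": [0] * 24}
    for timestamp, i, j in events:
        hour = (timestamp // 3600) % 24  # fold all days onto a single 24-hour axis
        kind = "intra" if household[i] == household[j] else "inter"
        profile[kind][hour] += 1
    return profile


# Toy data: "a" and "b" share a household, "c" lives elsewhere.
household = {"a": "h1", "b": "h1", "c": "h2"}
events = [
    (7 * 3600 + 10, "a", "b"),         # day 1, 07:00, same household
    (15 * 3600, "a", "c"),             # day 1, 15:00, different households
    (86400 + 15 * 3600 + 5, "b", "c"), # day 2, 15:00, different households
]
profile = hourly_profile(events, household)
```

Folding all days onto one 24-hour axis gives the aggregate daily rhythm; dividing each count by the number of days observed would turn it into an average per day.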

Conclusion
The study has a number of limitations to be considered before its findings can be generalized. Overall, it was based on data for a relatively small population in a single village, and the high-resolution contact network data we used span less than one month. Due to the spatially restricted data collection site and the relatively low participation rate (58.5%), these results cannot be generalized with certainty to other rural locations or to urban localities that have different social, economic, and demographic characteristics. Another limitation is that only bivariate analyses were performed. Future studies should test the multivariate relationships between social interactions and individuals' attributes. Despite these limitations, the data collection infrastructure used in this study provided the level of information needed to understand the social context of the village in detail. Proximity sensors captured the dynamics of contacts by collecting high-resolution temporal data characterized by timescales comparable with those intrinsic to social dynamics, without the recall bias that affects paper diaries. Bias due to non-compliance, such as not wearing a sensor correctly, and behaviour change when participants wore the sensors were potential problems. In our study, the participants wore the sensor enclosed in a pouch and pinned to the front of a blouse or shirt, in order to avoid attracting attention to themselves. There was relevant anecdotal evidence that these biases did not affect the data collection, which could be attributed to familiarity, as individuals became accustomed to wearing the sensor over time.
The proximity interactions detected by the sensors can provide key information on transmission opportunities of infections within the village. The extreme vulnerability of African communities to threats of infectious diseases is well known (Fenollar and Mediannikov [15]), as are the weaknesses of the disease surveillance systems in several African countries (Marston et al. [33]). Specifically, in Malawi much remains to be done to respond to pandemics, with substantial improvements in preparedness needed in key areas such as local surveillance and control guidelines (Sambala and Manderson [42]). The successful transmission of an infectious disease in a population depends on many factors, one of the key factors being the frequency of contacts between infected and susceptible individuals (Horby et al. [19]; Cauchemez et al. [7]). The collection of reliable proximity data can provide crucial information on network properties, such as the presence of superspreaders, who are more likely to spread infections because of the number and duration of their interactions. Mathematical models of infectious disease transmission can help to understand transmission dynamics and to investigate the impact of intervention strategies, and these models generally incorporate age as the key structural feature governing transmission patterns (Mossong et al. [35]). The age-stratified contact matrices we obtained have a crucial role in the evaluation of empirically driven mathematical models that aim to inform intervention strategies. In rural areas of the developing world, access to healthcare is scarce; it is therefore paramount to understand the transmission of communicable diseases and to identify control measures such as vaccination and social distancing.
Our study represents a first step towards tracking proximity interactions in rural Malawi, which we hope will provide the basis for more detailed and expansive studies. Describing and understanding close proximity interactions can provide key information on behavioural patterns of individuals, such as social isolation and social conflicts. Previous studies showed how social interaction dynamics can negatively influence mental health, leading to depressive symptoms (Elmer & Stadtfeld [13]), including in early adolescence (Pachucki et al. [39]). Although family members are centrally important as a primary source of care and support, household-level networks can be informative on negative relationships and family conflicts that might be detrimental to the proper development and quality of life of children. Moreover, the social network at village level can show the existence of separation between individuals or population groups and help identify social segregation.
For future work, it would be interesting to consider several inter-connected villages, in order to understand the social contacts and to simulate the spread of infection in a wider and more inter-connected community. A larger and more representative sample would be desirable, and the study design should allow for a longer period of data collection in each village. Future studies should aim at characterizing social contact patterns across different spatial regions in Malawi and elsewhere (e.g., high-income countries), particularly in urban settings, which are rapidly growing, in order to generate more generalizable insights into the network characteristics of different regions.