Dynamic network analysis of contact diaries



Introduction
Interpersonal relations are shaped by, and at the same time shape, the membership of actors in informal groups (Breiger, 1974). Such latent group membership is often reflected in observable patterns of co-participation in social events (Davis et al., 1941; Freeman, 2003). Previous work has suggested a variety of analytic methods (Borgatti and Everett, 1997; Seidman, 1981; Everett and Borgatti, 1993; Everett et al., 2018) and statistical models (Wang et al., 2013; Koskinen and Edling, 2012; Snijders et al., 2013; Butts, 2008; Stadtfeld and Geyer-Schulz, 2011; Conaldi et al., 2012; Conaldi and Lomi, 2013; Lerner and Lomi, 2020; Chodrow and Mellor, 2020) for two-mode networks or, equivalently, hypergraphs. As we discuss more formally in Section 2.1, a hypergraph contains hyperedges that may link any number of nodes (Bretto, 2013). In contrast, edges in a graph link exactly two nodes.
Events typically have associated time points (or time intervals), and this is, almost by definition, true for meeting events transcribed from contact diaries, whose entries have associated dates and/or times (Fu et al., 2013; Yen et al., 2016). A major shortcoming of most [...] themselves particularly well to illustrate the applicability and empirical value of RHEM. According to Alaszewski (2006, p. 1), a diary is ''a document created by an individual who has maintained a regular, personal and contemporaneous record''. While appointment diaries regularly appear in the private papers of politicians, civil servants, artists (Hackett, 1989), and business executives, they have rarely been analyzed through the lens of longitudinal network models explaining co-participation of alters in meetings with the diarist. Instead, diary research in the social sciences and humanities has typically focused on the narrative content of unsolicited personal diaries, or on their solicited use as a research tool; for overviews, see Mehl and Conner (2012) and Bartlett and Milligan (2015). Personal and contact diaries have been used mostly, albeit not exclusively, as sources of data for qualitative research (Alaszewski, 2006). By specifying RHEM for the study of appointment diaries, however, it becomes possible to forge a new and more factual understanding of prominent public figures, based on the evolving structural dynamics of their interpersonal relationships. Findings from such studies would directly inform debates in contemporary history, political science, and elite studies, thus opening new avenues of research in those fields.
We illustrate the empirical value of RHEM in an analysis of event data extracted from former British Prime Minister (PM) Margaret Thatcher's appointment diaries. While the objective of this paper is methodological, we illustrate how the RHEM framework could help to illuminate existing research debates on topics such as the nature of the policy process in the British core executive (Heffernan, 2005; Richardson, 2018), the development of Thatcherism (Williamson, 2015; Jackson and Saunders, 2012), and many others. We highlight the analytical potential of RHEM by presenting an exploratory analysis of Thatcher's meetings with cabinet ministers during the term of the 1979 Parliament (May 1979 to June 1983). We hypothesize that participation in meetings, in which ministers can potentially influence the PM and/or exert influence on the PM's interaction with other ministers, can reflect and establish latent competition and power differences among cabinet ministers. More specifically, we demonstrate how the following three research questions could be rigorously addressed with RHEM. Based on a binary actor-level covariate labeling ''dry'' ministers,¹ we test whether (i) the PM has a preference (or reluctance) to meet with dry ministers and (ii) she has a preference (or reluctance) to meet with homogeneous groups of ministers that are either mostly dry or mostly non-dry. Furthermore, as an example of a structural effect in hyperevent networks, we test whether (iii) there is a significant tendency for or against triadic closure in the PM meeting diary data. We conjecture that a negative closure effect, together with a tendency to partially repeat meetings, could lead to the emergence of stable structural holes (Burt, 1992) and might be a signal of competition and systematic power differences among ministers. We emphasize that it is not the goal of this paper to obtain conclusive evidence on these questions, but rather to illustrate how such questions could be tackled with RHEM.

Background
In this paper we propose statistical models for networks of time-stamped multi-actor events. Formally, the data we consider involve a sequence of events $E = (e_1, \dots, e_N)$, where each event $e_i = (t_i, h_i) \in E$ comprises the event time $t_i$ and a set of participants $h_i \subseteq V_{t_i}$, which is a subset of a given sample of actors at the event time. For example, in the case of contact diaries that we examine in the empirical case study of this paper, the actors are the participants in the various meetings recorded in the diary, that is, the actors are the diarist's alters. In the following we recall in Section 2.1 the well-known equivalence between hypergraphs and two-mode actor-event networks (Seidman, 1981), and in Section 2.2 we discuss the applicability of related statistical network models to networks of time-stamped multi-actor events.

¹ Cabinet ministers who fully supported Thatcher's free-market agenda, which prioritized controlling inflation over controlling unemployment, were labeled ''dry''. On the other hand, ministers who were critical of her hardline economic policies were labeled ''wet'', because Thatcher perceived their inclination towards consensus politics and state intervention to be a weakness (Heppell, 2020). In our analysis we make a binary split between dry and non-dry ministers, where the latter category contains Wets but also ministers that are labeled neither dry nor wet.

Hypergraphs and two-mode actor-event networks
The underlying data structure for RHEM is a hypergraph containing hyperedges that may connect any number of nodes (Bretto, 2013). Hypergraphs may therefore be viewed as a generalization of graphs, in which every edge connects exactly two nodes. Formally, a hypergraph $G = (V, H)$ comprises a set of nodes $V$ (representing, e.g., the actors of a social network) and a set of hyperedges $H \subseteq \mathcal{P}(V)$, where each hyperedge $h \in H$ is a subset $h \subseteq V$ of any size (representing, e.g., the set of actors attending a social event). A relational hyperevent $e = (t, h)$ is a hyperedge $h \subseteq V$ (recording the participants in the event) together with a timestamp $t$ (recording the time when the event takes place). In the data that we consider in this paper, the same hyperedge $h$ (i.e., the identical set of participants) can experience more than one event at different points in time.
It is well known that a hypergraph $(V, H)$ gives rise to a bipartite, two-mode actor-event network and vice versa (Seidman, 1981). The first set of nodes of the bipartite network (comprising the ''actor nodes'') is identical with the node set $V$ of the hypergraph, the second set of nodes (comprising the ''event nodes'') is identical with the set of hyperedges $H$, and an actor node $v \in V$ is connected to an event node $h \in H$ in the bipartite graph if and only if the hyperedge $h$ contains the actor $v$ in the hypergraph. Fig. 1 displays a sequence of relational hyperevents (middle) and the associated two-mode actor-event network (left) and hypergraph (right). The one-mode projection (to the set of actors) of a two-mode network is a graph whose nodes are the actor nodes of the two-mode network and in which two nodes $u$ and $v$ are connected by an edge if and only if there is an event node in the two-mode network that is connected to both $u$ and $v$; in other words, if $u$ and $v$ co-participate in a common event. Fig. 1 (right) displays the one-mode projection by dashed lines within the hyperedges. We recall that two-mode networks and hypergraphs are equivalent representations of the same data. In contrast, a one-mode projection does not uniquely represent a two-mode network or its associated hypergraph. This can be seen in the example from Fig. 1, where the triads { , , } and { , , } are identically connected in the one-mode projection but are structurally different in both the two-mode network and the hypergraph. Note that actors in the first triad experience one common event, while actors in the second triad only have pairwise co-participation in different events.
One possibility to analyze two-mode networks, or hypergraphs, is to consider the associated one-mode projection (Borgatti and Everett, 1997), which, however, has its drawbacks. First, as discussed above, one-mode projections do not uniquely represent the original two-mode data. Second, one-mode projections introduce structural artifacts such as high local density and clustering. For instance, a single event with 22 participants (the size of the largest events in our illustrative data) introduces 231 actor-actor ties and 1,540 closed triangles in the one-mode projection. Thus, analyzing one-mode projections can bias findings, at least towards ''detecting'' a tendency for triadic closure. For such reasons, network analytic methods have been developed that operate directly on two-mode networks or, equivalently, hypergraphs. These include methods to compute centrality or clusters (Borgatti and Everett, 1997), role colorings (Everett and Borgatti, 1993), blockmodels (Doreian et al., 2004), information flows (Everett et al., 2018), or the structure of overlapping subsets (Foster and Seidman, 1982) in two-mode networks or hypergraphs. These network analytic methods have a different objective than the statistical models that we propose in this paper.

Fig. 1. Middle: Stylized example of five hyperevents $e_1, \dots, e_5$ occurring at event times $t_1 < \dots < t_5$. Participating actors are denoted by uppercase letters. Left: Representation of the sequence of hyperevents as a two-mode actor-event network. Events are displayed as rectangular nodes ordered from top to bottom by their time (older events are displayed in a darker shade than younger events) and are connected to their participants by solid lines. Right: Representation of the sequence of hyperevents as a hypergraph. Events are represented by hyperedges displayed as convex hulls enclosing their participants. Dashed lines represent ties in the one-mode projection to the set of actors.
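The two drawbacks of one-mode projections can be checked numerically. The following is a minimal Python sketch, not part of the original analysis; the function names and the actor labels are hypothetical and chosen only for illustration. It reproduces the counts for a single 22-participant event and shows two different hypergraphs that share the same projection.

```python
from itertools import combinations
from math import comb


def one_mode_projection(hyperedges):
    """Return the set of actor-actor ties induced by co-participation."""
    ties = set()
    for h in hyperedges:
        ties.update(frozenset(pair) for pair in combinations(sorted(h), 2))
    return ties


def count_triangles(ties):
    """Count closed triangles in an undirected graph given as 2-element frozensets."""
    nodes = sorted({v for tie in ties for v in tie})
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= ties
    )


# A single event with 22 participants (hypothetical actor labels 0..21).
big_event_ties = one_mode_projection([set(range(22))])
n_ties = len(big_event_ties)
n_triangles = count_triangles(big_event_ties)
print(n_ties, n_triangles)  # 231 ties and 1540 triangles, i.e. C(22,2) and C(22,3)

# Two structurally different hypergraphs with the identical projection:
one_joint_event = [{"A", "B", "C"}]
three_pairwise_events = [{"A", "B"}, {"B", "C"}, {"A", "C"}]
same_projection = (
    one_mode_projection(one_joint_event)
    == one_mode_projection(three_pairwise_events)
)
print(same_projection)  # True
```

The projection cannot tell one joint meeting of three actors from three separate pairwise meetings, which is why the models below are defined directly on hyperedges.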
Our framework allows analysts to conduct statistical tests of whether certain characteristics of sets of actors (such as being homogeneous with respect to an actor-level covariate or being indirectly connected via third actors) increase or decrease the probability of experiencing common events (compare the three exemplary research questions formulated in Section 4). Thus, we need methods that estimate interaction probabilities for hyperedges (i.e., sets of actors) by comparing those on which events are observed with alternative hyperedges that could potentially have experienced events. We therefore turn our discussion to related statistical models defining probability distributions on a given space of two-mode networks (or hypergraphs).

Related statistical models for two-mode networks and relational event networks
Statistical models for 2-mode networks (or hypergraphs) have been proposed in different frameworks for (temporal) ERGM (Wang et al., 2013;Krivitsky and Handcock, 2014), SAOM (Koskinen and Edling, 2012;Snijders et al., 2013), relational event models (REM) (Butts, 2008;Conaldi et al., 2012;Lerner and Lomi, 2020;Valeeva et al., 2020), or by defining configuration models for hypergraphs that condition on observed degree sequences (Chodrow and Mellor, 2020). However, (T)ERGM, SAOM, or the configuration model are designed for networks of relational states and cannot cope with the fine-grained time information associated with contact events, where each event might have its own unique time stamp.
REM, on the other hand, are explicitly designed to model networks of relational events with fine-grained time information. However, REM typically specify event rates for dyads having one sending node and one receiving node. This is inappropriate for multi-actor interaction resulting from contact diaries, which requires that event rates are specified for all subsets of actors (i.e., hyperedges) in the risk set. We point out that, in the case of meeting events transcribed from contact diaries, the ''event nodes'' in the two-mode actor-event representation are not exogenously given but arise, simultaneously with their ties, in an endogenous process. Event nodes are ''active'' at a single point in time, at which they are created by the joint interaction among actors, simultaneously with all their incident ties. This marks a considerable difference from settings in which dyadic relational events connect single actors to existing nodes in the second mode: for instance, Valeeva et al. (2020) specify and estimate event rates for dyads comprising one director and one company board (one event recording that a director joins a board); Lerner and Lomi (2020) estimate dyadic REM for two-mode networks connecting Wikipedia users to the articles they contribute to (one event recording that a user edits an article). In contrast, events in our data do not ''exist'' over extended periods of time in which actors could decide to join or leave (event duration is very short compared to the observation period and events do not overlap); in our data, events record simultaneous interaction of a set of actors. The distinction between events associated with dyads and events associated with groups of actors of any size is also reflected in the size of the risk set, that is, the set of instances that could experience a common event. If events are associated with dyads, the risk set can comprise pairwise combinations of nodes, leading to a maximal risk set size that is quadratic in the number of nodes.
If events are associated with hyperedges, the maximal risk set size can be exponential in the number of nodes, since any subset can constitute the participant list of the next event. Moreover, if events are associated with groups of any size, models can test for more complex effects, such as subset repetition of order larger than two (see Section 3.2.1), that cannot be included in dyadic REM. Related to our paper in a different way, Marcum and Butts (2015) propose REM for data extracted from diaries, but their events are time-use data rather than contact events.
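To give a sense of scale for the two risk-set sizes, the following short Python sketch compares them. It assumes directed dyads (one sender, one receiver) and counts all non-empty subsets as candidate hyperedges; the population size of 20 actors is a hypothetical value, not taken from the empirical data.

```python
def dyadic_risk_set_size(n):
    """Maximal number of directed sender-receiver dyads among n nodes."""
    return n * (n - 1)


def hyperedge_risk_set_size(n):
    """Maximal number of non-empty subsets (hyperedges) of n nodes."""
    return 2**n - 1


n_actors = 20  # hypothetical population size
print(dyadic_risk_set_size(n_actors))     # 380
print(hyperedge_risk_set_size(n_actors))  # 1048575
```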
As discussed in Lerner et al. (2019), some attempts to generalize REM to multi-actor interaction exist. Butts (2008) recommends creating ''virtual'' nodes representing subsets of elementary nodes to treat interaction with multiple senders and/or receivers. This can be a feasible approach, e.g., to represent the entire set of actors by a node that is the receiver of broadcast messages (Gibson, 2005; DuBois et al., 2013), but this strategy is restricted to a limited number of predefined sets of actors, since there are exponentially many possible subsets. Following a different approach, Kim et al. (2018) propose the hyperedge event model for directed multicast events that have exactly one sender and any number of receivers (or conversely any number of senders and exactly one receiver). Their model specifies for every possible sender-receiver dyad $(i, j)$ an intensity function $\lambda_{ij}$. These dyadic intensity functions then stochastically determine (1) who is the sender of the next event and (2) which is the set of receivers, given the chosen sender. The difference from RHEM proposed in Lerner et al. (2019), and elaborated for co-attendance data in this paper, is that in the model from Kim et al. (2018) the intensities are defined for dyads, while RHEM can define separate intensities for all hyperedges of any size. Bartolucci et al. (2018) define models for bipartite event networks including first-order effects (e.g., propensity of individual actors to participate in events) and second-order effects (e.g., propensity of pairs of actors to co-participate in events). However, generalizing their model to higher-order effects (like the ones that can be included in RHEM) seems to lead to inefficient estimation algorithms.
Another related model has been defined by Hoffman et al. (2020), who propose a model for the dynamics of face-to-face interaction, such as the formation and dissolution of informal conversation groups during social gatherings. This model maintains a two-mode network of actors and groups in which, at any point in time, each actor is connected to exactly one group (which may be a single-actor group representing an isolate). Actors who are currently isolates can decide to connect to existing groups; actors who are currently members of some multi-actor group can decide to leave the group. Similar to REM for two-mode networks, discussed above, the model from Hoffman et al. (2020) does not seem to apply to empirical data of the kind that we consider in this paper. The atomic observations in data that we gather from contact diaries comprise meeting events jointly experienced by groups of actors, rather than events in which individual actors join or leave ongoing meetings. Indeed, we consider the two models as complementary: Hoffman et al. (2020) model individual actors' decisions to join and leave groups, while we model events jointly experienced by groups of actors.

Relational hyperevent models for networks of time-stamped multi-actor events
In this section we develop relational hyperevent models for timestamped multi-actor events extracted from contact diaries, building on the framework proposed in Lerner et al. (2019). Contact diaries (Fu, 2007) differ from the more common personal diaries (Bolger et al., 2003). The latter record personal histories that may or may not include systematic accounts of contacts with others. Personal diaries are typically retrospective, i. e., based on the diarist's memories of time past. The former are specifically kept to record systematic information about relational events involving multiple alters simultaneously. Contact diaries are typically prospective, i. e., their objective is to organize and structure the diarist's time in the future.
We assume that we have a time-varying population of actors that could potentially appear in the participant list of events at time $t$, denoted by $V_t$. Moreover, we assume that we are given a sequence of observed relational hyperevents $E = (e_1, \dots, e_N)$, where each event $e_i = (t_i, h_i) \in E$ comprises the event time $t_i$ and a set of participants $h_i \subseteq V_{t_i}$. For instance, in our exemplary empirical data, discussed in Section 4, $V_t$ is the set of all cabinet ministers at time $t$, and an event $e_i = (t_i, h_i)$ represents a meeting with the PM (i.e., the diarist) that takes place at $t_i$ and whose participants are the cabinet ministers listed in $h_i$.

General model framework
Following Lerner et al. (2019), RHEM specify an event intensity $\lambda(t; h)$ (also called event rate or hazard rate) associated with hyperedges $h$, which intuitively is the expected number of events on $h$ in a time interval of length one, starting at $t$ (Lawless, 2003). Formally, if $N(t; h) = |\{e_i \in E \colon t_i \le t \wedge h_i = h\}|$ is the number of events on hyperedge $h$ up to and including time $t$, the intensity is defined by

$$\lambda(t; h) = \lim_{\Delta t \to 0} \frac{\mathbb{E}[N(t + \Delta t; h) - N(t; h)]}{\Delta t}. \tag{1}$$

Given a sequence of relational hyperevents $E = (e_1, \dots, e_N)$ and a point in time $t$, we denote by $G[E; t]$ the network of past events, which is a function of all events from $E$ that happen strictly before $t$ (Brandes et al., 2009). We specify the likelihood of the sequence of hyperevents based on the Cox proportional hazard model (Cox, 1972). We decompose the event rate $\lambda(t; h)$ into a time-varying baseline rate $\lambda_0(t)$ and a relative event rate $\lambda_1(t; h; \theta; G[E; t])$, conditional on a vector of hyperedge statistics $s(t; h; G[E; t]) \in \mathbb{R}^k$, being a function of the network of past events, and a vector of associated parameters $\theta \in \mathbb{R}^k$:

$$\lambda(t; h; \theta; G[E; t]) = \lambda_0(t) \cdot \lambda_1(t; h; \theta; G[E; t]), \qquad \lambda_1(t; h; \theta; G[E; t]) = \exp\left(\sum_{j=1}^{k} \theta_j \cdot s_j(t; h; G[E; t])\right). \tag{2}$$

The baseline rate $\lambda_0$ represents time-variation in the intensity of events in the whole network and is left unspecified. The partial likelihood based on the observed event sequence $E$ is

$$L(\theta) = \prod_{e_i \in E} \frac{\lambda_1(t_i; h_i; \theta; G[E; t_i])}{\sum_{h \in R_{t_i}} \lambda_1(t_i; h; \theta; G[E; t_i])}, \tag{3}$$

where the $R_{t_i} \subseteq \mathcal{P}(V_{t_i})$ are a suitable definition of the risk sets at the event times. As pointed out in Lerner et al. (2019), this framework for relational hyperevents is very close to some specifications of dyadic REM (Butts, 2008; Perry and Wolfe, 2013; Vu et al., 2015; Lerner and Lomi, 2020). The difference is that separate event rates are specified for all hyperedges in the risk set, rather than for all pairs of nodes as in dyadic REM, and that the statistics are functions of sets of any number of nodes, rather than being functions of dyads. The hyperedge statistics $s(t; h; G[E; t])$ can operationalize hypothetical effects explaining multi-actor interaction, for instance, by preferential attachment (popularity effects), prior shared activity (familiarity), closure, or covariate effects (e.g., homophily). The associated parameters $\theta$, estimated by maximizing the likelihood function, provide statistical tests for such hypotheses.
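The partial likelihood of a Cox-type model with a log-linear relative rate can be evaluated in a few lines. The Python sketch below is an illustration under that assumption and is not the authors' implementation; the function name, the input layout, and the toy statistic values are hypothetical.

```python
import math


def log_partial_likelihood(theta, observations):
    """Log partial likelihood for a sequence of hyperevents.

    observations holds one (event_stats, risk_set_stats) pair per observed
    event: event_stats is the statistic vector of the event hyperedge, and
    risk_set_stats lists the statistic vectors of every hyperedge in the
    risk set (including the event hyperedge itself).
    """
    def linear_predictor(stats):
        return sum(th * s for th, s in zip(theta, stats))

    total = 0.0
    for event_stats, risk_set_stats in observations:
        denominator = sum(math.exp(linear_predictor(s)) for s in risk_set_stats)
        total += linear_predictor(event_stats) - math.log(denominator)
    return total


# Hypothetical toy data: two events, one statistic per hyperedge.
observations = [
    ([1.0], [[1.0], [0.0], [0.0]]),  # risk set with three hyperedges
    ([0.5], [[0.5], [0.2]]),         # risk set with two hyperedges
]
ll_null = log_partial_likelihood([0.0], observations)
ll_positive = log_partial_likelihood([1.0], observations)
print(ll_null, ll_positive)
```

With all parameters at zero every hyperedge in a risk set is equally likely, so the toy value equals log(1/3) + log(1/2). A positive parameter raises the likelihood here because both observed hyperedges have the largest statistic in their risk sets.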
We make two further decisions in defining the model framework. The first decision, constraining the risk set at the time of the event $e_i = (t_i, h_i)$ to those hyperedges that have the same size as $h_i$, is due to a very strong dependence of event intensity on hyperedge size. The second decision is taken to achieve computational tractability and consists of sampling from the risk sets.
Conditioning the model on the size of the observed hyperevents seems to be necessary, at least until we have found better ways to control for size, to avoid the confounding effect of hyperedge size on event intensity. To illustrate this, we note that the baseline probability of experiencing events typically depends strongly on the size of hyperedges. For instance, in our exemplary empirical data (compare Fig. 6), hyperedges of size one have a baseline event intensity that is about 50 times higher than the intensity on hyperedges containing exactly two actors and more than 500,000 times higher than the baseline intensity on hyperedges of size 11. Such examples illustrate that event intensities should not be compared between hyperedges of different size. Indeed, the preliminary analysis in Lerner et al. (2019) has shown that models with unconstrained risk sets yield results that strongly depend on how models control for the effect of hyperedge size on event intensity.
While we do not claim that this approach is preferable in all settings, we consider in this paper conditional-size models (Lerner et al., 2019), where the risk set at the time of an observed event $e_i = (t_i, h_i)$ consists of all hyperedges of size $|h_i|$, i.e., $R_{t_i} = \{h \subseteq V_{t_i};\ |h| = |h_i|\}$. This implies that for each of the factors in Eq. (3) the hyperedges $h$ appearing in the denominator have the same size as the hyperedge $h_i$ in the numerator, on which the event has been observed. Thus, event intensities are only compared among hyperedges of the same size. The alternative hyperedges $h$ can be obtained from the event hyperedge $h_i$ by removing a subset of actors $h' \subseteq h_i$ of any size from $h_i$ and then adding a set of new participants $h''$ of the same size as $h'$. Formally, the event hyperedge $h_i$ is only compared with alternative hyperedges

$$h = (h_i \setminus h') \cup h'', \qquad h' \subseteq h_i,\quad h'' \subseteq V_{t_i} \setminus h_i,\quad |h''| = |h'|.$$

Thus, as discussed in Lerner et al. (2019), conditional-size models are consistent with the point of view that (groups of) actors compete for participation in meetings. We further discuss the implications of conditional-size models, and the need for future work on this issue, after having presented the empirical results in Section 5.
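The conditional-size risk set can be enumerated directly for small populations. The Python sketch below is illustrative only; the function name and the five-actor population are hypothetical. It also checks the swap view described above: every alternative hyperedge removes some participants of the observed event and adds equally many outsiders.

```python
from itertools import combinations


def conditional_size_risk_set(actors, event_hyperedge):
    """All hyperedges over `actors` with the same size as the observed event."""
    size = len(event_hyperedge)
    return [frozenset(c) for c in combinations(sorted(actors), size)]


# Hypothetical population of five actors and an observed two-actor meeting.
actors = {"A", "B", "C", "D", "E"}
event = frozenset({"A", "B"})
risk_set = conditional_size_risk_set(actors, event)
print(len(risk_set))  # 10 hyperedges of size two

# Swap view: each alternative drops a subset of the event's participants
# and adds the same number of actors from outside the event.
swaps_balanced = all(len(event - h) == len(h - event) for h in risk_set)
print(swaps_balanced)  # True
```

Full enumeration is feasible only for toy examples, since the number of hyperedges of a given size grows as a binomial coefficient in the population size.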
The second decision, taken to maintain computational tractability, is that the risk sets $R_{t_i}$ at the event times are replaced by sampled risk sets $\tilde{R}_{t_i}$ (Lerner et al., 2019), obtained via case-control sampling (Borgan et al., 1995). Sampling has become established in estimating approximate parameters for dyadic REM on large networks. For instance, Butts (2008) points out that estimating parameters via approximate likelihood functions, obtained by sampling, could make REM applicable to larger networks. Vu et al. (2015) specifically apply case-control sampling (Borgan et al., 1995), where all events and, for each event, a fixed number of randomly drawn ''controls'' (non-events from the risk set) are considered. Lerner and Lomi (2020) experimentally assess the reliability of case-control sampling for REM estimated on large networks. Following Lerner et al. (2019), we draw, for a given number $m$ of non-events per event, sampled risk sets $\tilde{R}_{t_i}$ that contain the hyperedge $h_i$ of the observed event $e_i = (t_i, h_i)$ and $m$ non-event hyperedges drawn uniformly at random from $R_{t_i} = \{h \subseteq V_{t_i};\ |h| = |h_i|\}$. This leads to the following sampled likelihood function:

$$\tilde{L}(\theta) = \prod_{e_i \in E} \frac{\lambda_1(t_i; h_i; \theta; G[E; t_i])}{\sum_{h \in \tilde{R}_{t_i}} \lambda_1(t_i; h; \theta; G[E; t_i])}. \tag{4}$$

Given the values of the statistics $s(t_i; h; G[E; t_i])$ for all hyperedges $h$ in the sampled risk sets $\tilde{R}_{t_i}$, maximum likelihood estimates for the parameters $\theta$ in Eq. (4) can be computed with standard statistical software. For the empirical analysis reported in this paper we compute hyperedge statistics with an extension of eventnet² (Lerner and Lomi, 2020) and use the R package survival³ (Therneau and Grambsch, 2013) to estimate parameters.
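The construction of a sampled risk set can be sketched as follows. This is a hypothetical Python illustration, not the eventnet implementation: the function name is invented, controls are drawn with replacement by rejection sampling, and whether the original software samples with or without replacement is not stated in the text.

```python
import random


def sample_risk_set(actors, event_hyperedge, n_controls, rng):
    """Event hyperedge plus n_controls same-size non-event hyperedges.

    Controls are drawn uniformly at random (with replacement) among all
    hyperedges of the same size that differ from the observed one.
    """
    event = frozenset(event_hyperedge)
    pool = sorted(actors)
    size = len(event)
    if n_controls > 0 and (size == 0 or size >= len(pool)):
        raise ValueError("no alternative hyperedge of this size exists")
    controls = []
    while len(controls) < n_controls:
        candidate = frozenset(rng.sample(pool, size))
        if candidate != event:  # reject the observed hyperedge itself
            controls.append(candidate)
    return [event] + controls


rng = random.Random(0)  # fixed seed for reproducibility of the illustration
sampled = sample_risk_set(range(10), {1, 2, 3}, n_controls=5, rng=rng)
print(len(sampled))  # 6: the observed hyperedge followed by five controls
```

The statistics of these sampled hyperedges would then be passed to a conditional logistic or Cox regression routine, with each event and its controls forming one stratum.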

Model specification: network effects
Network effects included in the empirical model specification that we estimate in our illustrative analysis fall into three classes. First, we control for the tendency to repeat previous events, either in exact repetition with the identical set of participants or in partial repetition where a subset of participants of a previous meeting co-participates in a future meeting, potentially with yet other participants. Second, we introduce hyperedge statistics based on actor-level covariates, assessing first-order effects and homophily effects of the respective covariate. (In our exemplary case study we use the binary covariate labeling ministers as dry or non-dry.) Third, we define a hyperedge statistic for how strongly groups of actors have previously co-participated in meetings with common third actors, to assess a tendency for or against triadic closure in meeting data. Technically, effects are added to the model by defining the vector of hyperedge statistics $s(t; h; G[E; t])$ in the specification of the relative event rate in Eq. (2). These statistics assign real numbers to hyperedges $h$ at time $t$ based on the network of past events $G[E; t]$.

Repetition and subset-repetition
A very basic effect in hyperevent networks is that events are repeated or partially repeated (Lerner et al., 2019). In all models reported in this paper we let the effect of past events decay over time, as suggested in Brandes et al. (2009). For a given half-life period $T_{1/2}$, we set the decay factor $w(t - t_i)$ for the time difference $t - t_i$ to

$$w(t - t_i) = \exp\left(-(t - t_i) \cdot \frac{\ln 2}{T_{1/2}}\right).$$
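A half-life decay of this kind can be written as a one-line function. The Python sketch below assumes the standard exponential form in which the weight of an event halves after every half-life period; the function name and the half-life of 30 time units are hypothetical.

```python
import math


def decay_weight(elapsed, half_life):
    """Weight of a past event after `elapsed` time units, halving every half_life."""
    return math.exp(-elapsed * math.log(2) / half_life)


half_life = 30.0  # hypothetical half-life, in the same unit as the event times
print(decay_weight(0.0, half_life))   # 1.0 for an event that has just happened
print(decay_weight(30.0, half_life))  # close to 0.5 after one half-life
print(decay_weight(60.0, half_life))  # close to 0.25 after two half-lives
```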
The repetition statistic associated with a given hyperedge $h$ at time $t$ counts the number of previous events in $E_{<t} = \{e_i = (t_i, h_i) \in E;\ t_i < t\}$ whose set of participants $h_i$ is identical with $h$, weighted by the respective decay factor. Formally, it is defined by

$$\mathrm{repetition}(t; h; G[E; t]) = \sum_{e_i \in E_{<t}} w(t - t_i) \cdot \mathbb{1}(h_i = h),$$

where $\mathbb{1}$ is the indicator function that is one if the argument is true and zero otherwise. Often the set of participants of one event is not exactly repeated in another event, as is required in the definition of repetition, but only partially and possibly in co-participation with yet other actors. We first define a time-varying indicator, denoted as hypergraph degree, assessing to what extent a given set of actors $h'$ co-participated in past events. This indicator counts the number of events in which all actors in $h'$ co-participated, weighted by the respective decay factor, and is formally defined by

$$\mathrm{hy.deg}(t; h'; G[E; t]) = \sum_{e_i \in E_{<t}} w(t - t_i) \cdot \mathbb{1}(h' \subseteq h_i).$$

Subset repetition is a family of hyperedge statistics parametrized by the order (also size or cardinality) of subsets. Formally, for a given integer $p \in \mathbb{N}$, subset repetition of order $p$ is defined by

$$\mathrm{sub.rep}^{(p)}(t; h; G[E; t]) = \sum_{h' \in \binom{h}{p}} \frac{\mathrm{hy.deg}(t; h'; G[E; t])}{\binom{|h|}{p}},$$

where $\binom{h}{p}$ denotes the set of all subsets of $h$ that have exactly $p$ elements. The formula above takes the average hypergraph degree over all subsets of size $p$ of $h$. Thus, subset-repetition of order $p$ can assess whether sets of $p$ actors who have co-participated in previous events (potentially with other participants) are more or less likely to co-participate in future events (potentially with other participants).

² https://github.com/juergenlerner/eventnet.
³ https://CRAN.R-project.org/package=survival.
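The three statistics described above (exact repetition, hypergraph degree, and subset repetition as an average of hypergraph degrees) can be sketched in Python as follows. This is a hypothetical illustration rather than the eventnet implementation: function names, the half-life decay form, and the toy event list are assumptions. An infinite half-life is used in the example so that every past event has weight one and the values can be checked by hand.

```python
import math
from itertools import combinations


def decay_weight(elapsed, half_life):
    """Exponential half-life decay (assumed form)."""
    return math.exp(-elapsed * math.log(2) / half_life)


def repetition(t, h, events, half_life):
    """Decayed count of past events whose participant set equals h."""
    h = frozenset(h)
    return sum(decay_weight(t - ti, half_life)
               for ti, hi in events if ti < t and frozenset(hi) == h)


def hypergraph_degree(t, subset, events, half_life):
    """Decayed count of past events in which all actors of `subset` took part."""
    subset = frozenset(subset)
    return sum(decay_weight(t - ti, half_life)
               for ti, hi in events if ti < t and subset <= frozenset(hi))


def subset_repetition(t, h, order, events, half_life):
    """Average hypergraph degree over all subsets of h with `order` elements."""
    subsets = list(combinations(sorted(h), order))
    if not subsets:
        return 0.0
    return sum(hypergraph_degree(t, s, events, half_life) for s in subsets) / len(subsets)


# Hypothetical toy history of (time, participants); no decay for readability.
events = [(1, {"A", "B", "C"}), (2, {"A", "B"}), (3, {"C", "D"})]
h = {"A", "B", "C"}
no_decay = math.inf
print(repetition(4, h, events, no_decay))            # 1.0
print(subset_repetition(4, h, 1, events, no_decay))  # 2.0
print(subset_repetition(4, h, 2, events, no_decay))  # 4/3
print(subset_repetition(4, h, 3, events, no_decay))  # 1.0
```

In the toy history each of A, B, and C attended two past events (order one), the pair {A, B} shared two events while {A, C} and {B, C} shared one each (order two), and the full triad met once (order three).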
Repetition and subset-repetition of various order are illustrated in Fig. 2. Note that subset-repetition of order one assigns the identical value to all three triads; subset-repetition of order two can distinguish between $h_1$ on one hand and $h_2$ and $h_3$ on the other hand, but fails to distinguish between the latter two; subset-repetition of order three can recognize that $h_3$ has one previous joint event, in contrast to $h_1$ and $h_2$.
We emphasize that subset-repetition statistics are not merely ''control'' variables, but may be able to uncover important aspects of participant selection based on familiarity of different order. Subset-repetition of order one accounts for individual past activity (i. e., the number of prior events of individual actors) and can model a tendency for or against preferential attachment. Subset-repetition of order two accounts for prior shared events on dyads (dyadic familiarity); subset-repetition of order three considers prior shared events on triads (triadic familiarity), and so on.
Although we do not have event types or weights in the empirical data analyzed in Section 4, we note that types or weights could be taken into account in the repetition or subset-repetition statistics in a straightforward way. Different statistics could count only past events of certain types and/or could add up the weights of past events. This would be very similar to the approach for typed and weighted dyadic events proposed in Lerner et al. (2013a).

Covariate effects (first-order and homophily effects)
We assume that we are given a binary, time-constant actor-level covariate $x \colon V \to \{0, 1\}$. (In the exemplary case study of this paper we use the binary covariate dry, where $x(v) = 1$ indicates that $v$ is a dry minister.) We define two hyperedge statistics modeling (1) a first-order effect of the covariate $x$, assessing whether actors $v$ with $x(v) = 1$ are more likely to be among the participants of meetings than actors $v'$ with $x(v') = 0$, and (2) a second-order effect on homophily with respect to $x$. Both statistics are independent of time and independent of the network of past events.
For a hyperedge $h$, the statistic average-$x$ is the ratio of actors $v$ with $x(v) = 1$ in $h$; formally,

$$\text{average-}x(h) = \frac{\sum_{v \in h} x(v)}{|h|}.$$

In our exemplary case study (with $x$ operationalizing whether ministers are dry) this statistic can assess whether dry ministers participate more often in meetings or, from the other point of view, whether the PM has a preference to meet with Dries. If a positive parameter is associated with average-$x$, then in the example given in Fig. 3, the right-most hyperedge would be predicted to have the highest event rate, followed by the hyperedge in the middle, and then the left-most hyperedge.
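As a small illustration of the first-order covariate statistic, the following Python sketch computes the share of actors with covariate value one in a hyperedge. The function name, the actor labels, and the covariate values are hypothetical and do not correspond to Fig. 3 or to the empirical data.

```python
def average_x(hyperedge, x):
    """Share of actors in `hyperedge` whose binary covariate x equals one."""
    return sum(x[v] for v in hyperedge) / len(hyperedge)


# Hypothetical covariate: 1 marks a ''dry'' actor, 0 a non-dry actor.
x = {"A": 1, "B": 1, "C": 0, "D": 0}
print(average_x({"A", "B"}, x))       # 1.0, all participants are dry
print(average_x({"A", "B", "C"}, x))  # 2/3
print(average_x({"C", "D"}, x))       # 0.0, no participant is dry
```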
For [...] If $|h|$ is odd, then it is not possible that the two groups have the same size, so that zero (minimally homogeneous) would not be attainable. To correct this, we take the value [...] In our exemplary case study (with $x$ operationalizing whether ministers are dry), the statistic $x$-homogeneity can assess whether meetings reveal a homophily effect with respect to the dry/non-dry characteristic of ministers or, from another point of view, whether the PM has a preference to meet dry ministers separately from non-dry ministers. If a positive parameter is associated with $x$-homogeneity, then in the example given in Fig. 3, the right-most and the left-most hyperedge would be predicted to have a higher event rate than the hyperedge in the middle. If $x$-homogeneity has a negative parameter, then the ''mixed'' hyperedge in the middle would be predicted to have a higher event rate than the two ''pure'' hyperedges.

Closure
The closure statistic measures to what extent the members of a hyperedge $h$ have co-participated in previous events with common third actors $a$. In contrast to subset-repetition of order three, these third actors may be outside of the focal hyperedge $h$, and different members of $h$ may have co-participated with $a$ in different past events. Formally, [...] closure($t_4$; $h_4$) = 1. Understanding the implication of the closure effect in hyperevent networks, and its interplay with subset-repetition of various order, is challenging. For illustration we again use Fig. 4, which recalls five hyperevents (assumed to be observed in the past) and defines two additional hyperedges, $h$ = { , , } and $h'$ = { , , }, on which events could happen at a time point $t > t_5$.
A superficial look at the one-mode projection (given by the dashed lines) seems to suggest that the five observed events reveal a tendency for triadic closure, since the one-mode projection contains many closed triangles. However, a more detailed inspection of the actual hyperevents tells us that just one of the five events (namely 4 connecting actors and ) has a closure statistic different from zero ( and have co-participated in past events with actor ). None of the other four events closes any two-path that was present before the respective event. Most triangles in the one-mode projection are actually created by hyperevents of size three or larger. Thus, a first insight is that the existence of densely connected groups, local clustering, or an overrepresentation of closed triangles in hyperevent networks does not give any evidence for a positive closure effect but can alternatively be explained by events of size three or larger, or by the tendency to (partially) repeat such events.

5 The closure statistic in our paper would have been denoted by (1,1,1) in Lerner et al. (2019), who define closure statistics of varying order but do not estimate any closure effect in their empirical analysis. Another difference is that Lerner et al. (2019) propose a different normalization by also dividing over all possibilities of third actors; we deviate from this to be more consistent with usual definitions of closure in networks of dyadic events.
Next we consider the two hyperedges ℎ = { , , } and ℎ ′ = { , , }, on which events could potentially occur at the next time point, and discuss whether such events would provide evidence for a closure effect in a model that controls for subset-repetition of various order. A hypothetical event on the hyperedge ℎ = { , , } would give evidence for closure. Indeed , and have previously co-participated each in one event with the common third actor . Subset-repetition, on the other hand, does not appear to explain a hypothetical event on ℎ. We note that , and individually have participated in only one event each, so that they are rather inactive actors compared to others in the same network. Among the three unordered pairs in { , , } there is just one that has one previous joint event, so that subset-repetition of order two on the hyperedge ℎ is equal to 1/3 and is therefore also below average (which is $12/\binom{8}{2} \approx 0.43$ in this network). Finally, { , , } have never jointly co-participated in any event, so that subset-repetition of order three (or higher) could not explain an event on ℎ either. Indeed, closure seems to be the only satisfying explanation for a hypothetical event on ℎ, beyond random chance. This is very different for a hypothetical event on ℎ ′ = { , , }. While the closure statistic on ℎ ′ has a positive value (actually, it is equal to two), a hypothetical event on ℎ ′ could alternatively be explained by subset-repetition. Indeed, the triad { , , } has co-participated in a joint event before and subset-repetition of order two and three on ℎ ′ are above average, both being equal to one. Thus, assuming that we have positive subset-repetition effects, an event on ℎ ′ would provide no evidence for closure. (It is obvious that from observing just a handful of events, as in this stylized example, we could not statistically separate the effects of closure from subset-repetition. However, in our exemplary empirical data we have thousands of events and, as it will turn out, we can actually separate these effects.)

Closure and its interplay with subset-repetition have implications for the macro-structure of hyperevent networks on one hand and for the existence of stable structural holes on the other. First we note that repetition and subset-repetition lead to a reinforcement of densely connected groups of actors. For instance, the data from Fig. 4 (left) suggest the dense groups { , , }, { , , , }, and { , , }. Subset-repetition and repetition predict that these groups will be reinforced.
Crucially, these groups are overlapping in two ''bridging'' actors and , respectively. These two actors are surrounded by structural holes which could point to a position of power (Burt, 1992). Given such a precondition, the future evolution of the network crucially depends on whether we have a positive or a negative closure effect. A positive closure effect -which could lead for instance to an event on ℎ = { , , } -would imply that overlapping dense groups have a tendency to merge over time. In turn, this would close structural holes and and would potentially lose their power positions. In contrast, a negative closure effect would prevent, for instance, events on ℎ = { , , }. In turn, overlapping dense groups would typically not merge and structural holes would remain open. In conclusion, a negative closure effect -in combination with positive (subset-)repetition -could explain overlapping but stable dense groups in dynamic coparticipation networks or, from another point of view, is a way to sustain structural holes and thus power positions of actors. We take up this point again when discussing the empirical results on the closure effect in Section 5.
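To make the arithmetic behind subset-repetition in the stylized example more tangible, the following Python sketch counts, for every subset of a given order within a candidate hyperedge, the past events that contain it, and averages these counts. This is a simplified reading of the statistic: it ignores the temporal decay of past events, the function name `subset_repetition` is hypothetical, and the event list is invented for illustration rather than taken from Fig. 4.

```python
from itertools import combinations


def subset_repetition(hyperedge, past_events, order):
    """Average number of past events containing a subset of size `order`.

    Assumption: past events are counted without temporal decay.
    """
    subsets = list(combinations(sorted(hyperedge), order))
    if not subsets:  # order exceeds the hyperedge size
        return 0.0
    total = sum(
        sum(1 for event in past_events if frozenset(s) <= event)
        for s in subsets
    )
    return total / len(subsets)


# Invented past events (participant lists only).
past_events = [frozenset("ABC"), frozenset("CD"), frozenset("DE")]

h = frozenset("ABD")        # only the pair {A, B} has met before
h_prime = frozenset("ABC")  # exactly repeats a past triad

order_two_h = subset_repetition(h, past_events, 2)
order_three_h = subset_repetition(h, past_events, 3)
order_two_h_prime = subset_repetition(h_prime, past_events, 2)
order_three_h_prime = subset_repetition(h_prime, past_events, 3)
print(order_two_h, order_three_h, order_two_h_prime, order_three_h_prime)
```

In this toy data, one of the three pairs in `h` has a single joint event, giving an order-two value of 1/3, whereas `h_prime` reaches a value of one for both order two and order three, mirroring the contrast drawn in the text between a hyperedge that subset-repetition cannot explain and one that it can.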

Illustrative case study: Thatcher meetings
We seek to establish the empirical value of RHEM in an analysis of empirical data sourced from Margaret Thatcher's appointment diaries. While the contribution of our paper is methodological, we provide in this section a brief overview of the empirical setting from which the exemplary data stems. Indeed, to demonstrate that RHEM can be fruitfully applied in empirical social network analysis, we illustrate how relevant research questions could be addressed and how findings could advance existing empirical research. While details of the empirical setting are necessarily context-specific, we emphasize that RHEM provide a general model framework for time-stamped multi-actor events from contact diaries.

Background on the illustrative empirical setting
Network studies are firmly established in political science research (Victor et al., 2017;Ward et al., 2011). The majority are based on cross-sectional designs, but dynamic models such as temporal exponential random graph models (TERGM) and stochastic actor-oriented models (SAOM) have also been applied to longitudinal network data; for an overview, see Desmarais and Cranmer (2017). Longitudinal studies using time-stamped relational event data are increasingly adopted also in the political sciences (Lerner et al., 2013a;Stadtfeld et al., 2017;Brandenberger, 2019), but models are typically for dyadic events and do not directly apply to multi-actor events transcribed from contact diaries. This is a severe limitation for political science since appointment diaries exist widely and contain masses of information about the working lives of individuals, particularly leaders and managers. Indeed, the empirical data on which we illustrate our models are sourced from the appointment diaries of former British PM Margaret Thatcher, which are publicly accessible and cover a significant portion of her premiership (Margaret Thatcher Foundation, 2019). Such data make it possible to study the British core executive using advanced network analytic methods.
Many research themes have emerged in the literature that relate either directly or indirectly to the executive branch of the British government. For example, there are long-standing debates on whether policy implementation is driven by a prime ministerial or collectivist process (Bennister and Heffernan, 2012;Burch and Holliday, 2004;Burnham and Jones, 1993;Byrne and Theakston, 2019;Heffernan, 2003), and investigations into different leadership styles (Bowles et al., 2007;Kaarbo and Hermann, 1998;Theakston, 2011). Our case study illustrates how RHEM can add to this literature by specifically focusing on power dynamics within Thatcher's early cabinets. When she entered Downing Street in May 1979, Thatcher faced a lack of internal support from several members of her conservative party -known as the ''Wets'' -who were critical of her hard-line economic policies and considered her to be an untried extremist with a limited shelf life as PM (Cannadine, 2016, p. 29). To increase the chances of implementing her domestic policy agenda, therefore, Thatcher placed her small band of cabinet supporters -i. e., the Thatcherites, or ''Dries'' -into key economic positions. Then, at subsequent cabinet reorganizations, she took the step of using promotions almost exclusively for policy ends, to gradually shift the balance of power in her favor (King, 1985, p. 132).

Orienting questions
It is within this context that we demonstrate the potential of the RHEM framework. To illustrate how RHEM could be applied in empirical research, we pose exemplary questions to explore the power balance between the Dries and Non-dries (i. e., Wets and other ministers not labeled as dry) in Thatcher's early cabinets, and how it impacted her approach to ministerial meetings. Our first question relates to a first-order effect of the actor-level covariate dry on the propensity to participate in events. Essentially, it seeks to determine whether the PM preferred to meet with dry ministers over non-dry ministers:
RQ1 Does the chance of a group of ministers jointly meeting the PM increase as the proportion of dry ministers in that group increases?
Our second question also explores the general composition of Thatcher's ministerial meetings. It addresses a second-order effect of the same covariate to test whether ministerial meetings tend to be homogeneous (mostly dry or mostly non-dry ministers) or heterogeneous (a balance of dry and non-dry ministers) in nature:
RQ2 Does the chance of a group of ministers jointly meeting the PM increase if the group is homogeneous, that is, if it is composed mostly of dry or mostly of non-dry ministers?
Our final question tests for the structural effect of triadic closure in the data. Positive triadic closure would imply that ministers who co-attended meetings with common third ministers have an increased probability of co-attending meetings themselves. Negative triadic closure, on the other hand, would imply that brokers (i. e., those who bridge between different groups) typically keep their power positions since their different contacts are kept separate:
RQ3 Does the chance of a group of ministers jointly meeting the PM increase if they have previously co-attended meetings with common third ministers?
The models that we fit later in this paper to address these research questions control for other basic effects in networks of time-stamped multi-actor events, such as the tendency to repeat meetings with the identical or overlapping participant lists. We note that a tendency against triadic closure could point to the existence of stable structural holes and in turn to competition or power differences among cabinet ministers. This aspect has been already discussed in Section 3.2.3 and we will discuss it again in light of our results in Section 5.

Illustrative empirical data
Our illustrative analysis is based on a sequence of 1,989 meeting events of Margaret Thatcher with her cabinet ministers, which took place between 5th May 1979 and 8th June 1983, i. e., Thatcher's full first term. These events are sourced from the PM's appointment diaries, which are publicly accessible and cover a significant portion of her premiership (Margaret Thatcher Foundation, 2019). The cabinet consisted of 21 ministers (in addition to the PM), with the exception of eight months from September 1981 when there were 22 ministers. Cabinet composition -that is, the population of actors -changed in four cabinet reorganizations with a total of eight ministers exiting and eight ministers entering cabinet. Individual meeting events in the data are listed by the minute and do not overlap in time. The participant lists of events (i. e., their hyperedges) comprise all attending ministers except the PM, since she attended every meeting by definition. Meetings, therefore, involve one, several, or all cabinet ministers. In many cases, other non-cabinet members were involved in meetings. These have been removed from the data to focus explicitly on the PM's interactions with ministers. In this paper we use a binary actor-level covariate that labels ministers as ''dry'' (that is, supportive of the PM's economic agenda) or ''non-dry''. Of the 29 ministers who featured in cabinet during Thatcher's first administration, eight were Dries. 6 Note that the category of ''non-dries'' contains ministers considered Wets (those who were opposed to Thatcherism) and those that were considered neither dry nor wet. This covariate does not vary over time.
In Fig. 5 we visualize the co-attendance network after selected points in time (to avoid visual clutter we only display the networks after the first few events in the sequence of almost 2,000 meetings). We display only events with two or more attending participants (multi-actor events), since meetings with only one attending minister result in uninformative event nodes of degree one. The layout has been computed by an algorithm for dynamic network visualization proposed by Brandes and Mader (2011) and implemented in the visone 7 software (Baur et al., 2001). Note that this is an offline algorithm in which node positions for the network at time $t$ take into account network data from time points $t' > t$. We visualize the co-attendance networks for the first 14 multi-actor events in the Appendix in Fig. 7.
The number of sets of actors that could potentially experience joint events depends exponentially on the size of the population. In our illustrative empirical analysis we get $2^{21} \approx 2.1$ million different sets for cabinets with 21 members, and for cabinets with 22 members we get $2^{22} \approx 4.2$ million different sets of ministers that could potentially constitute the participant list of any meeting. (Strictly speaking we have to subtract one from these numbers since the empty subset, i. e., a meeting with no participating minister, would not be considered an event in our data.) The distribution of meeting sizes in our empirical data is given in Fig. 6 (left). This distribution of observed event sizes is contrasted in Fig. 6 (right) with the distribution of the sizes of subsets that are drawn uniformly at random from the set of all cabinet ministers at event times. (The latter are draws from the binomial distribution $P(k) = \binom{n}{k} / 2^{n}$ with $n = 21$ or $n = 22$, respectively.) It can be seen that small meetings are over-represented in the empirical data, meetings of intermediate size are very infrequent compared to the large number of subsets of intermediate size, and large meetings are also over-represented, although not to the same degree as small events. As discussed above, the strong dependency of event frequency on subset size is the main reason to consider in this empirical study only RHEM with size-constrained risk sets. We discuss this aspect again after having presented the results. The largest size-constrained risk set in our empirical data contains more than 700,000 hyperedges and results from meetings with 11 participating ministers from a cabinet of size 22.
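The counts quoted in this paragraph can be checked with a few lines of Python. The sketch below uses only the standard library; the function names are hypothetical, and the size distribution is the one implied by drawing a subset uniformly at random from all subsets of an `n`-member cabinet.

```python
from math import comb


def n_candidate_sets(n):
    """Number of non-empty subsets of a population of n actors."""
    return 2 ** n - 1


def size_constrained_risk_set_size(n, k):
    """Number of hyperedges of size k among n actors."""
    return comb(n, k)


def uniform_subset_size_distribution(n):
    """P(size = k) for a subset drawn uniformly from all 2^n subsets."""
    return [comb(n, k) / 2 ** n for k in range(n + 1)]


for n in (21, 22):
    print(n, n_candidate_sets(n))

largest = size_constrained_risk_set_size(22, 11)
print(largest)  # hyperedges of size 11 in a cabinet of 22

dist = uniform_subset_size_distribution(22)
print(max(range(23), key=lambda k: dist[k]))  # most likely size
```

The binomial coefficient for 11 out of 22 ministers is consistent with the statement that the largest size-constrained risk set contains more than 700,000 hyperedges, and the distribution peaks at intermediate sizes, which is where observed meetings are rare.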

Results
We estimate RHEM parameters from the empirical PM meeting data by maximizing the sampled likelihood given in Eq. (4). For each observed event $e = (t_e, h_e)$ we sample 100 non-event hyperedges from the size-constrained risk set (i. e., sets of cabinet ministers at $t_e$ of the same size as $h_e$). If the size-constrained risk set has less than 100 elements (which happens for instance for observed events of size one), we take the full risk set. We set the half life for the decay of past events to 30 days. Thus, an event counts close to one right after the event, one month later it counts 1/2, and so on. 8 Effects in the hyperevent model therefore assess the impact of recent events. Before estimating the parameters, we standardize statistics to mean equal to zero and standard deviation equal to one. Since statistics have no natural units, this facilitates the interpretability of the relative effect sizes.

Table 1 Cox proportional hazard models for event intensities associated with hyperedges (sets of ministers). All three models are estimated on 1,989 events and 101,246 observations (the number of observations is the number of events plus the number of sampled controls).

Effect         Model 1           Model 2           Model 3
Repetition     0.38 (0.02)***    0.37 (0.02)***    0.38 (0.02)***
Sub.rep (1)    0.07 (0.04) a     0.10 (0.04)**     0.07 (0.04) a
Sub.rep (2)    2.34 (0.13)***    2.37 (0.14)***    2.36 (0.14)***
Sub.rep (3)    2.33 (0.14)***    2.31 (0.14)***    2.30 (0.14)***
closure

Table 1 reports estimated parameters of three models. All models include repetition, subset-repetition of order one, two, and three, and the closure statistic. The first model additionally includes the statistic average-dryness of sets of ministers, the second model includes dry-homogeneity, and the third contains both covariate effects. Below, we discuss the results mostly independent of the specific empirical context. Recall that it is not the objective of this paper to actually draw conclusive empirical insight, but rather to present a model that can be fruitfully applied to this and related empirical settings.
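The half-life weighting and the standardization step described above can be sketched as follows. The snippet assumes that the weight of a past event is $2^{-\Delta/T}$, where $\Delta$ is the elapsed time in days and $T = 30$ is the half life; the exact functional form used for estimation is not restated in this section, so this form, the function names `decay_weight` and `standardize`, and the use of the population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev

HALF_LIFE_DAYS = 30.0  # half life stated in the text


def decay_weight(elapsed_days, half_life=HALF_LIFE_DAYS):
    """Weight of a past event after `elapsed_days` (assumed exponential decay)."""
    if elapsed_days < 0:
        raise ValueError("elapsed_days must be non-negative")
    return 0.5 ** (elapsed_days / half_life)


def standardize(values):
    """Rescale values to mean zero and (population) standard deviation one."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant statistic")
    return [(v - mu) / sigma for v in values]


weights = [decay_weight(d) for d in (0, 30, 60)]
print(weights)  # weight right after the event, after one and two half lives

z = standardize([0.0, 1.0, 2.0, 5.0])  # made-up statistic values
print(z)
```

Under this assumed form, an event that happened one half life ago contributes half as much to a statistic as an event that has just occurred, which is why the estimated effects mainly reflect recent meetings.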
Repetition and subset-repetition. The parameter associated with repetition is significantly positive. Thus, there is a tendency to repeat (recent) meetings with the identical set of participants. We also find a positive effect of subset-repetition of order one (in two of the models, this effect is only significant at the 10%-level). Thus, actors who (recently) participated in more meetings are more likely to be included among the participants of future events. This can be seen as a preferential attachment effect where ''popular'' actors (i. e., those who were more frequent participants in the past) accumulate future interaction at a higher rate. Moreover, we find a positive effect of subset-repetition of order two and three. Thus, sets of actors containing dyads (or triads) that previously co-participated in joint meetings experience future meetings at a higher rate. This can be understood as a familiarity effect: actors have a tendency to co-attend meetings with others they have recently met. This effect is not restricted to dyadic familiarity, where it just matters whether actors have met pairwise before, but also applies to triadic familiarity. For instance, in the example given in Fig. 2 the hyperedge ℎ 3 would be predicted to have a higher event rate than the hyperedge ℎ 2 and ℎ 2 would be predicted to have a higher rate than ℎ 1 . As discussed above, subset-repetition and repetition can reinforce dense groups of actors, or local clustering.
Covariate effects: dry and homogeneous meetings. The parameter associated with the average-dryness statistic (instantiating the statistic ''average-$x$'' in this case study) is significantly positive. Thus, dry ministers are more likely to be among the participants of meetings than non-dry ministers. This finding could be interpreted in the sense that the dry ministers are in more powerful positions (having more opportunities to meet the PM) or, from another angle, that the PM has a preference to meet dry ministers. Moreover, we find a significantly negative parameter associated with dry-homogeneity. Thus, there seems to be a tendency to mix dry and non-dry ministers when assembling the participants of meetings. Taking these effects together, and looking at the example given in Fig. 3, we would expect a tendency for meetings somewhere ''right of the middle'' (i. e., towards the ''drier'' hyperedges) but not at the extreme right. These two covariate effects keep their sign when both are included in the same model. We note that there are fewer dry ministers than non-dry ministers in the cabinets (i. e., population of actors) and therefore, if this covariate had no effect on meeting frequency, then by chance alone we would expect more Non-dries than Dries among the participants of observed meetings. Thus, the negative effect of dry-homogeneity, which pushes hyperedge composition towards balanced meetings with about the same number of dry and non-dry ministers, also tends to increase the number of dry meeting attendees over random chance.
Closure effects. We consistently find a negative closure effect. Thus, looking at the example given in Fig. 4, the hyperedge ℎ, which closes several open two-paths but could not be well explained by subset-repetition, would be predicted to have a rather low probability of experiencing meetings. As discussed in Section 3.2.3, a negative closure effect, together with positive subset-repetition, can explain overlapping but stable (i. e., non-merging) dense subgroups and the maintenance of stable structural holes. This, in turn, could point to actors that maintain power positions by bridging structural holes (Burt, 1992).
Interplay between subset-repetition and closure. How does the estimated negative closure effect depend on whether we control for subset-repetition or not? To shed light on this question, we fit models, reported in Table 2, that control only for subset-repetition of order one (i. e., for past activity of individual actors) or for subset-repetition of order one and two, and compare these to the model including subset-repetition up to order three.
The crucial difference is between the leftmost model, which controls only for individual activity, and the two others, which also control for prior shared events of dyads or triads, respectively. If we control only for past individual participation in meetings, but fail to control for past co-participation, we spuriously estimate a positive closure effect. This can be well explained with the example given in Fig. 4. The hyperedge ℎ ′ closes several triangles but could alternatively be explained by subset-repetition of order two or three. However, a model that fails to control for prior shared events misses the difference between events on the hyperedge ℎ (which would point to closure) and events on ℎ ′ (which would give no evidence for closure since they can be explained by subset-repetition of order two or higher). Since events on hyperedges like ℎ ′ are apparently frequent in our data (due to subset-repetition), the first model in Table 2 falsely estimates a positive closure effect.
Explanatory power of estimated RHEM. We might ask how well the fitted models succeed in ''recognizing'' the true observed events among their associated alternatives. Here we do not consider out-of-sample prediction, since we believe that picking the right event out of 4 million alternatives (or up to 700,000 for conditional-size models) is a nearly impossible task. We also argue that RHEM, as we specify and apply them in this paper, are made for testing hypotheses in dynamic co-attendance networks, rather than to foresee the future. This view is consistent with the observation in Block et al. (2018) that in longitudinal network studies, models with the best performance in out-of-sample prediction often have limited relevance for inferential network modeling. Therefore, we fit RHEM to the whole observed data and then compare, separately for each observed event $e = (t_e, h_e)$, the implied relative event rate on the event hyperedge $h_e$ with the implied rate on all associated controls sampled from the risk set; we write $\tilde{R}_e$ for the sampled set consisting of $h_e$ and its controls. The relative event rate of a hyperedge $h'$ at the time $t_e$ (implied by a fitted model, that is, with given parameters $\theta$ and hyperedge statistics $s$) is

$$\mathrm{rel.rate}_e(h') = \frac{\exp\big(\theta \cdot s(t_e; h')\big)}{\sum_{h \in \tilde{R}_e} \exp\big(\theta \cdot s(t_e; h)\big)},$$

compare Eq. (4). We report results for the largest model; see the right-most column of Table 1. This model seems to succeed fairly well in assigning high relative event rates to hyperedges of observed events. More than 39% of all observed events are assigned the highest event rate among all sampled alternatives (note that more than one hyperedge can be assigned the same maximum event rate). Moreover, we compute for each observed event $e = (t_e, h_e)$ the percentile of the associated relative rate $\mathrm{rel.rate}_e(h_e)$ in the vector of values $(\mathrm{rel.rate}_e(h))_{h \in \tilde{R}_e}$. For comparison, under random guessing we would expect percentiles equal to 0.5. The model seems to perform reasonably well also from that perspective: the median percentile over all observed events is more than 0.95, implying that the ''typical'' event hyperedge is surpassed by less than 5% of its associated alternatives. Note that this is nevertheless consistent with our view that predicting the correct event hyperedge is a difficult task, since 5% of 4 million (or up to 700,000 in conditional-size models) is still a large number. We think that developing more sophisticated methods to assess the fit of RHEM is a relevant task for future work.
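The comparison of an observed event with its sampled alternatives can be sketched in Python as follows. The sketch assumes that relative rates are obtained by normalizing exponentiated linear predictors over the observed hyperedge and its sampled controls, and that the percentile is the share of hyperedges in this set whose rate does not exceed that of the observed hyperedge; both readings, the function names, and the score values are assumptions made for illustration.

```python
import math


def relative_rates(scores):
    """Normalize exp(score) over one observed hyperedge and its controls.

    `scores` holds the linear predictor (parameters times statistics)
    of each hyperedge; subtracting the maximum avoids overflow.
    """
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def percentile_of_observed(rates, observed_index=0):
    """Share of hyperedges whose rate is not larger than the observed one."""
    observed = rates[observed_index]
    return sum(1 for r in rates if r <= observed) / len(rates)


def observed_has_top_rate(rates, observed_index=0):
    """True if no alternative has a strictly higher rate than the observed one."""
    return rates[observed_index] == max(rates)


# Made-up linear predictors: index 0 is the observed hyperedge.
scores = [2.0, 0.5, 1.0, 3.0, -1.0]
rates = relative_rates(scores)
pct = percentile_of_observed(rates)
is_top = observed_has_top_rate(rates)
print(rates, pct, is_top)
```

Applied to every observed event, the median of such percentiles and the share of events with the top rate would give summary measures of the kind reported in the text.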

Limitations of the empirical case study
We recall that the contribution of this paper is methodological, that the empirical case study has been included for illustrating the usefulness of RHEM, and that it is not our objective here to draw conclusive insights into Thatcher's government. In fact, the concrete model specification that we applied to our empirical data is quite representative for basic RHEM and is almost independent of the concrete empirical setting. It seems plausible that (subset-)repetition, closure, and covariate effects (main effect and homophily) are present in most instances of relational hyperevent networks, although the nature of the covariate will most likely be different in other settings. We avoided including effects that are specific to the concrete setting. For that reason, we discuss some of the limitations of our case study and how these limitations could be tackled in future work that seeks to draw empirical insights.

Fig. 7. Co-attendance networks at the time of the first 14 multi-actor events in the Thatcher meeting data. Circles represent ministers (circles with a dark shade represent dry ministers), squares represent events, and lines connect events to their participants.
In our study we considered only one covariate, ''dry'', which intuitively identifies ministers supporting Thatcher's political agenda, and we did not distinguish any type of meetings. A more sophisticated analysis might consider ministerial roles, membership in committees, or restricted access to some meetings, e. g., due to national security interests. Some meeting events are ''formal'' cabinet meetings or cabinet committee meetings which, in the medium term, have static attendee lists, while other events correspond to ''ad-hoc'' meetings with unconstrained participant lists. There are a number of possibilities to extend RHEM in order to control for such constraints. Ministerial roles or committee membership could be included via covariate effects and it is possible to specify separate models for, e. g., formal and ad-hoc meetings, or to interact effects with dummy variables indicating the type of meetings. If attendee lists of meetings are constrained, either by mandatory participation or restricted access, then this could be incorporated by specifying the risk set accordingly: if it is known that, say, minister $i$ necessarily has to attend a meeting event $e = (t_e, h_e)$, then the risk set at time $t_e$ should comprise only hyperedges that contain $i$; if it is known that minister $i$ must not attend such a meeting (for instance, due to national security interests), then the associated risk set should comprise only hyperedges not containing $i$. Since such constraints could explain latent dense groups or separation between groups of actors, not including them in the model might distort findings on effects such as repetition, homogeneity, or closure.
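The idea of restricting the risk set by mandatory or forbidden attendees can be sketched as follows. This is a minimal Python illustration for a size-constrained risk set; the function name `constrained_risk_set`, its parameters, and the actor labels are hypothetical, and enumerating all hyperedges is only feasible for small populations or sizes.

```python
from itertools import combinations


def constrained_risk_set(actors, size, required=(), excluded=()):
    """All hyperedges of the given size that contain every required actor
    and none of the excluded actors."""
    required = frozenset(required)
    excluded = frozenset(excluded)
    if required & excluded:
        raise ValueError("an actor cannot be both required and excluded")
    free = sorted(set(actors) - required - excluded)
    n_free = size - len(required)
    if n_free < 0:
        return []  # more mandatory attendees than seats
    return [required | frozenset(c) for c in combinations(free, n_free)]


actors = ["A", "B", "C", "D", "E"]  # made-up population

unconstrained = constrained_risk_set(actors, 3)
restricted = constrained_risk_set(actors, 3, required={"A"}, excluded={"E"})

print(len(unconstrained), len(restricted))
for hyperedge in restricted:
    print(sorted(hyperedge))
```

Controls would then be sampled from the restricted list rather than from all hyperedges of the observed size, so that observed events are only compared with alternatives that were actually possible.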
Another issue worth mentioning is our approach of conditioning the risk sets on the observed event size. We argue that in our empirical data the event rate associated with hyperedges depends strongly on hyperedge size, with factors up to 500,000. Conditional-size RHEM seem to be a way to avoid comparing hyperedges with such different baseline rates. The decision to condition on hyperedge size, however, has implications for the interpretation of results. For instance, a finding such as the preference for meeting dry ministers in a conditional-size RHEM does not imply a tendency to add more and more dry ministers to the set of attendees (since this would change the size of the meeting event) but rather a tendency to add dry ministers whilst removing the same number of non-dry ministers, thereby increasing the overall proportion of dry attendees at the meeting event. Thus, the point of view in conditional-size models is that actors compete for participation in meetings. Another implication is that hypotheses concerning event size (such as a preference for having large or small meetings) cannot be tested with conditional-size RHEM. It is worth noting that conditioning on hyperedge size cannot be justified by substantive arguments in our empirical data, as it seems unlikely that the PM can only meet with groups of ministers of a given size at a given point in time. Rather, we consider conditional-size RHEM a way to prevent hyperedges of observed events from being compared with alternative hyperedges of different size that are known to have very different baseline intensities. For robustness checks we estimated RHEM with unconstrained risk sets, controlling for a linear and quadratic effect of hyperedge size on meeting intensity, similar to the approach taken by Lerner et al. (2019). Some of the resulting findings were consistent with those made via conditional-size models: we found positive effects for repetition, subset-repetition of order two, and average-dryness and a negative closure effect. However, subset-repetition of order one and three, as well as dry-homogeneity, were no longer significant in RHEM with unconstrained risk sets. Moreover, findings made via unconstrained models are more sensitive to the inclusion or exclusion of effects. In summary, while some of the empirical findings seem to be robust, we argue that conditional-size models are a way to filter out the strong dependency of event rates on hyperedge size. An alternative to this approach would be to develop RHEM with unconstrained risk sets that better control for the effect of size. In general, the question whether to condition RHEM on observed event sizes needs future work (also see the discussion below).

Conclusion
Relational event models enable the analysis of time-stamped social interaction data that frequently represent the micro-structure of social networks (Pallotti et al., 2020). Because of their emphasis on temporally ordered event sequences, available relational event models are ill-suited as models for clusters of simultaneous events like those recorded in contact diaries, where one person (the diarist) typically meets multiple alters simultaneously at a specific point in time, giving rise to a cluster of dyadic events with the same time-stamp. This one-to-many and many-to-one ''multicasting'' situation is more common than our empirical illustration might suggest. For example, situations where ego interacts with multiple alters simultaneously are commonly encountered in empirical studies of conversation (Gibson, 2005) and in the analysis of venture capital investment syndicates (Sorenson and Stuart, 2008).
The main methodological contribution of this paper was to extend REMs by proposing RHEM as a general statistical framework for the analysis of time-stamped events involving multiple actors simultaneously. We have shown how RHEM may be used to estimate event frequencies associated with all subsets (i. e., hyperedges) of a given set of actors. By specifying appropriate hyperedge statistics, it is possible to rigorously test a wide range of hypotheses about how characteristics of a group may influence the probability that group members experience common events. Examples of characteristics of a group that may be of theoretical or empirical interest include its socio-demographic composition and its embeddedness in the network of prior interaction events.
We have exemplified the empirical applicability of RHEM using data extracted from the contact diaries of Margaret Thatcher. Our illustrative analysis revealed relevant patterns in the structure and dynamics of hyperevent networks. We discussed that an over-representation of closed triangles in one-mode projections of co-attendance networks does not necessarily give evidence for triadic closure but can in a more straightforward manner be explained by ''large'' events (i. e., events involving three or more actors), or by the tendency to (partially) repeat such events. In our empirical analysis we rather find a tendency against triadic closure, given that we control for repetition and subset-repetition. We argued that negative triadic closure can explain the existence of overlapping but stable (i. e., non-merging) dense groups and, in turn, the existence of actors occupying stable broker positions. In contrast, positive triadic closure would predict that overlapping groups tend to merge over time, thereby closing structural holes.
In closing, it may be useful to repeat that while we demonstrated the value of RHEM in a specific empirical context, the model itself is general and may be applied to data produced in a variety of social settings. RHEM applies to network data taking the form of a sequence of relational hyperevents, provided that the following scope conditions hold. Event times have to be available at a sufficient level of resolution to allow considering events as conditionally independent of each other, given the network of previous events (Lerner et al., 2013b). The framework allows simultaneous events if conditional independence can be credibly assumed. Moreover, specifying event rates at a given time as a function of all previous events assumes that decisions regarding participant lists are made at event time, an assumption that might be violated if events (with a fixed set of attendees) are scheduled significantly in advance. A possible way to deal with such situations would be to specify event rates as a function of the events that preceded the time at which the event was planned. Another scope condition deserving notice is that event participants have to be drawn from a larger set of ''possible contacts'' containing a population of known actors. This set itself may vary over time, but the identities of the actors at risk of experiencing an event must be known at the time of observed events.
Further developing RHEM opens up several avenues for future work. As we discussed above, conditioning RHEM on observed event sizes may not be appropriate in all settings. An alternative would be to control for the effect of event size on relative rates by including (functions of) hyperedge size in the model statistics. A preliminary analysis suggests that quadratic polynomials of hyperedge size are not sufficient, at least for the empirical data we considered in this paper. Future work might explore fixed effects for (some) event sizes or interactions of other explanatory variables with size. Another direction for advancing RHEM methodology would be to further develop methods for model selection and assessment of model fit. For instance, we might compare statistics of possible events predicted by a RHEM with those of observed events to assess whether models are likely to generate plausible events.
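One way to read conditioning on observed event size is sketched below: the observed hyperedge is compared only with alternative hyperedges of the same size drawn from the risk set, in a conditional-logit form. The function name `conditional_size_probability`, the exponential link, and the toy scoring functions are assumptions made for illustration and should not be taken as the exact specification used in this paper. Full enumeration of alternatives is feasible only for very small risk sets.

```python
import math
from itertools import combinations


def conditional_size_probability(observed, risk_set, score):
    """Probability of `observed` among all hyperedges of the same size
    in `risk_set`, with weights exp(score(h)) (illustrative only)."""
    size = len(observed)
    alternatives = [frozenset(c) for c in combinations(sorted(risk_set), size)]
    weights = {h: math.exp(score(h)) for h in alternatives}
    return weights[frozenset(observed)] / sum(weights.values())


risk_set = {"A", "B", "C", "D"}
observed = {"A", "B"}

# Under a null model all six possible pairs are equally likely.
p_null = conditional_size_probability(observed, risk_set, lambda h: 0.0)

# Hypothetical score that favours hyperedges containing actor "A".
p_favour_a = conditional_size_probability(
    observed, risk_set, lambda h: 1.0 if "A" in h else 0.0
)
```

Because every alternative has the same size as the observed event, any statistic that depends on size alone cancels out of this ratio, which is why the alternative discussed above requires adding size terms to the model explicitly.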

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix. Visualization of co-attendance networks for the first 14 events
In Fig. 7 we visualize the co-attendance network before and after the first 14 multi-actor events. To avoid visual clutter, we choose the earliest time points in the sequence of almost 2000 events. We display only events with two or more attending participants (multi-actor events), since events with only one attending minister result in uninformative event nodes of degree one. The layout has been computed by an algorithm for dynamic network visualization proposed by Brandes and Mader (2011) and implemented in the visone software (Baur et al., 2001). Note that this is an offline algorithm in which node positions for the network at time t take into account network data from time points t′ > t.