Advances in exponential random graph (p*) models applied to a large social network
Data set: AddHealth school group 42
The dataset we analyze here is one of the sets of schools from the National Longitudinal Study of Adolescent Health (AddHealth). AddHealth was a stratified school-based sample of US students in grades 7–12. It featured extensive questionnaires on individual characteristics, as well as a module on friendship networks. Students were provided with a roster listing all students in the school by name and by unique ID number, and were asked to list the IDs of up to five best male and five best female friends
Methods
The ERG modeling class defines the probability of a network with a given set of actors n as
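In the standard ERG form this probability can be written as below. The sketch uses the statistics g_A(y), coefficients η_A, and κ that appear in the text, with κ taken to be the usual normalizing constant; the symbol Y for the random network is an assumption of this reconstruction.

```latex
P(Y = y \mid n \text{ actors}) = \frac{\exp\left( \sum_{A} \eta_A \, g_A(y) \right)}{\kappa}
```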
The notation g_A(y) represents any possible network statistic, where A indexes the multiple statistics included in a model vector g(y); we will see numerous examples in the next section. η_A represents the coefficients for these terms; their values reflect the change in the conditional log-odds of a tie for each unit increase in g_A that the tie would create. κ represents the
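The log-odds reading of the coefficients can be illustrated with a small sketch. It assumes a toy five-node undirected network, two illustrative statistics (edge count and triangle count), and made-up coefficient values; none of these come from the paper, and all function names are hypothetical.

```python
import math
from itertools import combinations


def triangle_count(edges, nodes):
    """Count triangles in an undirected graph stored as a set of frozensets."""
    return sum(
        1
        for a, b, c in combinations(sorted(nodes), 3)
        if frozenset((a, b)) in edges
        and frozenset((a, c)) in edges
        and frozenset((b, c)) in edges
    )


def stats(edges, nodes):
    """Model vector g(y): here (number of edges, number of triangles)."""
    return (len(edges), triangle_count(edges, nodes))


def change_stats(edges, nodes, i, j):
    """Change in each statistic when tie {i, j} is present versus absent."""
    tie = frozenset((i, j))
    with_tie = stats(edges | {tie}, nodes)
    without_tie = stats(edges - {tie}, nodes)
    return tuple(w - wo for w, wo in zip(with_tie, without_tie))


def conditional_tie_probability(edges, nodes, i, j, eta):
    """Probability of tie {i, j} given the rest of the network."""
    delta = change_stats(edges, nodes, i, j)
    log_odds = sum(e * d for e, d in zip(eta, delta))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Toy network (assumption): a 4-cycle 0-1-2-3-0 plus an isolated node 4.
nodes = {0, 1, 2, 3, 4}
edges = {frozenset(p) for p in [(0, 1), (1, 2), (2, 3), (0, 3)]}
eta = (-2.0, 0.5)  # illustrative coefficients for (edges, triangles)

delta = change_stats(edges, nodes, 0, 2)
prob = conditional_tie_probability(edges, nodes, 0, 2, eta)
print(delta, round(prob, 4))
```

In this toy case the tie between nodes 0 and 2 would add one edge and close two triangles, so its conditional log-odds are the edge coefficient plus twice the triangle coefficient.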
Model terms
The ERG model class is general; it includes an infinite number of potential network statistics. Here we focus on statistics that are common in the literature (including many of those discussed in earlier papers in this issue), which are theoretically relevant to these nondirected friendship data, and which are feasible to calculate for networks of roughly 1600 actors. Emphasis is on relatively “local” statistics (those in which the probability of a given edge is directly dependent on only a
Model selection and goodness of fit
To examine the goodness of fit of our models, we use three general approaches:
1. Check for degeneracy and model convergence: a minimum requirement for a model to fit well is for estimation of parameters to converge on finite parameter values. It must also be non-degenerate, that is, not place all of its probability mass on a few networks entirely unlike the observed network, such as a full or empty network.
2. Compare the Akaike information criterion (AIC) between models. Models that exhibit dyadic
Results
We begin with the Bernoulli model (model 1), the model with only a single term to capture the density of the network. The AIC for this model is listed in Table 1, the parameter estimates (along with their standard errors, estimated using the method of Geyer, 1994) in Table 2, and the goodness-of-fit plots in Fig. 1. Not surprisingly, this simplistic model does not capture the larger statistics of the original network compared in the goodness-of-fit plots.
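For a Bernoulli model of an undirected network, the single coefficient and the AIC have closed forms, which the following sketch illustrates. The node and edge counts are made-up illustrative values, not those of the AddHealth network; the function name is hypothetical; and AIC is taken in its usual form 2k - 2 log L with k = 1 parameter.

```python
import math


def bernoulli_fit(n_nodes, n_edges):
    """Closed-form fit of a density-only model for an undirected network.

    Returns (eta, log_lik, aic), where eta is the log-odds of a tie.
    """
    n_dyads = n_nodes * (n_nodes - 1) // 2
    if not 0 < n_edges < n_dyads:
        # An empty or full network has no finite maximum likelihood estimate.
        raise ValueError("edge count must lie strictly between 0 and the number of dyads")
    density = n_edges / n_dyads
    eta = math.log(density / (1.0 - density))
    log_lik = (
        n_edges * math.log(density)
        + (n_dyads - n_edges) * math.log(1.0 - density)
    )
    aic = 2 * 1 - 2 * log_lik  # one parameter: the edge coefficient
    return eta, log_lik, aic


# Illustrative values (assumption): 10 actors, 9 observed ties.
eta, log_lik, aic = bernoulli_fit(n_nodes=10, n_edges=9)
print(eta, log_lik, aic)
```

Because every dyad is independent under this model, the likelihood factorizes and no simulation is needed; that is not the case for models with dependence terms.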
We next consider the standard Markov model
Discussion
The advances discussed in earlier papers in this issue have placed generalized statistical inference for dependence models in large social networks on a firmer footing than previously existed. In the process of applying these models to the friendship network in this paper, we have obtained some insight into the underlying social processes that could (or could not) have generated that set of friendships. First, there appears to be both exogenous (attribute-based) and endogenous (shared
References (22)
- et al. An introduction to exponential random graph (p*) models for social networks. Social Networks (2007)
- et al. Recent developments in exponential random graph (p*) models for social networks. Social Networks (2007)
- Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society Series B (1974)
- On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society Series B (1986)
- et al. Markov graphs. Journal of the American Statistical Association (1986)
- On the convergence of Monte Carlo maximum likelihood calculations. Journal of the Royal Statistical Society Series B (1994)
- et al. Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society Series B (1992)
- Goodreau, S.M., Hunter, D.R., Morris, M., 2005. Statistical Modeling of Social Networks: Practical Advances and...
- Statistical models for social networks: degeneracy and inference
- Handcock, M.S., 2003. Assessing Degeneracy in Statistical Models of Social Networks. Center for Statistics and the...