Abstract

The modeling and analysis of networks and network data has seen an
explosion of interest in recent years and represents an exciting direction for
potential growth in statistics. Despite the already substantial amount of work
done in this area to date by researchers from various disciplines, however,
there remain many questions of a decidedly foundational nature — natural
analogues of standard questions already posed and addressed in more
classical areas of statistics — that have yet to even be posed, much less addressed.
Here we raise and consider one such question in connection with
network modeling. Specifically, we ask, “Given an observed network, what
is the sample size?” Using simple, illustrative examples from the class of
exponential random graph models, we show that the answer to this question
can very much depend on basic properties of the networks expected under
the model, as the number of vertices $n_V$ in the network grows. In particular,
adopting the (asymptotic) scaling of the variance of the maximum likelihood
parameter estimates as a notion of effective sample size, say $n_{\mathrm{eff}}$, we show
that whether the networks are sparse or not under our model (i.e., having
relatively few or many edges between vertices, respectively) is sufficient to
yield an order of magnitude difference in $n_{\mathrm{eff}}$, from $O(n_V)$ to $O(n_V^2)$. We
then explore some practical implications of this result, using both simulation
and data on food sharing from Lamalera, Indonesia.