Summary
In large-scale network data, the interconnected entities often carry additional descriptive information. This metadata may provide insight that can be exploited to detect anomalous events. In this paper, we use a generalized linear model for random attributed graphs to model connection probabilities as a function of vertex metadata. For a class of such models, we show that an approximation to the exact model yields an exploitable structure in the edge probabilities. This structure allows a spectral framework for anomaly detection, based on the analysis of graph residuals, to scale efficiently, and it provides a fast and simple procedure for estimating the model parameters. In simulation, we demonstrate that accounting for both attributes and dynamics in this analysis has a much greater impact on the detection of an emerging anomaly than accounting for either dynamics or attributes alone. We also present an analysis of a large, dynamic citation graph, demonstrating that taking document metadata into account emphasizes parts of the graph that would not otherwise be considered significant.