Research
A mutually exciting model for continuous-time networks
Abstract
Networks and temporal point processes serve as fundamental building blocks for modeling complex dynamic relational data in various domains. We propose the latent space Hawkes (LSH) model, a novel generative model for continuous-time networks of relational events, using a latent space representation for nodes. We model relational events between nodes using mutually exciting Hawkes processes with baseline intensities that depend on the distances between the nodes in the latent space and on sender- and receiver-specific effects. We demonstrate that our proposed LSH model can replicate many features observed in real temporal networks, including reciprocity and transitivity, while also achieving superior prediction accuracy and providing more interpretable fits than existing models.
Motivation and Contributions
Latent space approaches and Hawkes process models have both been used for dynamic networks. To the best of our knowledge, however, there is little work applying the latent space approach to continuous-time relational event data, even though such data are common in practice. Yang et al. proposed a dual latent space (DLS) model that uses multiple latent spaces to capture both homophily and reciprocity. This yields a richer model that also improves link prediction accuracy. However, much of the interpretability of the latent space, which was the original motivation for the latent space model, is lost when multiple high-dimensional latent spaces are used.
We consider using a single latent space representation to provide a more interpretable model. A single latent space limits the flexibility of the model compared to the DLS model, so we increase flexibility by adding self-excitation and sender and receiver effects. We demonstrate that our proposed latent space Hawkes (LSH) model is competitive with other models in predictive and generative tasks on four real network datasets while providing more interpretable and stable model fits.
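The ingredients named above (a distance-dependent baseline, sender and receiver effects, and excitation from past events) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the exact parameterization from the full paper: the log-linear baseline exp(-||z_u - z_v||^2 + delta_u + gamma_v), the exponential kernel, and the names alpha_self, alpha_recip, and beta are all hypothetical choices made here for illustration.

```python
import numpy as np

def baseline_intensity(z, delta, gamma, u, v):
    """Assumed baseline for the directed pair (u, v):
    exp(-||z_u - z_v||^2 + delta_u + gamma_v).
    z: (n, d) latent positions; delta: sender effects; gamma: receiver effects.
    """
    sq_dist = float(np.sum((z[u] - z[v]) ** 2))
    return float(np.exp(-sq_dist + delta[u] + gamma[v]))

def intensity(t, u, v, events, z, delta, gamma, alpha_self, alpha_recip, beta):
    """Conditional intensity of events u -> v at time t.

    events: iterable of (time, sender, receiver) tuples.
    Assumption: past u -> v events excite the pair itself (self-excitation)
    and past v -> u events excite it as well (reciprocal excitation), both
    through an exponential kernel beta * exp(-beta * (t - s)).
    """
    lam = baseline_intensity(z, delta, gamma, u, v)
    for s, a, b in events:
        if s >= t:
            continue  # only events strictly before t contribute
        decay = beta * np.exp(-beta * (t - s))
        if (a, b) == (u, v):
            lam += alpha_self * decay
        elif (a, b) == (v, u):
            lam += alpha_recip * decay
    return float(lam)

# Toy example: three nodes on a line, no sender or receiver effects.
z = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
delta = np.zeros(3)
gamma = np.zeros(3)
events = [(1.0, 0, 1)]  # a single event from node 0 to node 1 at time 1.0
lam_01 = intensity(2.0, 0, 1, events, z, delta, gamma, 0.5, 0.2, 1.0)
lam_10 = intensity(2.0, 1, 0, events, z, delta, gamma, 0.5, 0.2, 1.0)
```

Under these assumptions, nearby nodes have a larger baseline than distant ones, and the single 0 -> 1 event raises the intensity of both the 0 -> 1 and 1 -> 0 pairs, which is one simple way a model of this kind can produce reciprocity.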
Results
Fig. 1 2-D latent space plot for MID data. Country pairs with more frequent incidents tend to be placed closer together (zoomed-in view).
Fig. 2 2-D latent space plot for MID data. Countries on the same continent tend to be clustered together. (A zoomed-in version can be found in our full paper.)
Case study on MID dataset
We apply our proposed LSH model to explore the Militarized Interstate Disputes (MID) incident network, a real continuous-time network consisting of timestamped edges that correspond to individual incidents in disputes between countries. As negative relationships are indicated by incidents in the MID network, we might expect the network to be disassortative.
The most active nodes appear centrally, and the node pairs with the most frequent incidents tend to be close together. In Figure 1, colored nodes and pairs connected by arrows indicate the countries with the most incidents in the dataset. For example, the latent positions of Israel (ISR) and Lebanon (LEB) are close together, consistent with their high number of incidents.
Additionally, countries that are geographically close mostly appear close together in the latent space. This can be seen in Figure 2, where nodes are colored by continent.
We can also modify the LSH model so that the relationship between latent node positions and relations is assortative with respect to positive relationships. In this case, node pairs with the most frequent incidents tend to be far apart. For example, Israel and Lebanon are placed on opposite sides of the latent space. More discussion can be found in our full paper.
Prediction Accuracy
Tasks:
Future event prediction (mean test log-likelihood).
Dynamic link prediction: predict whether an edge appears between each pair of nodes in the test (future) period, evaluated by computing the AUC of the probabilities predicted by the model.
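The AUC evaluation in the second task can be sketched as follows. This is a minimal illustration and not the evaluation code used for the reported results. The helper edge_probability is a hypothetical constant-rate (Poisson) simplification that ignores excitation from past events, so it is only an assumed stand-in for the model's predicted probabilities. The AUC itself is computed from its standard definition: the probability that a randomly chosen pair with an edge is scored above a randomly chosen pair without one, with ties counted as one half.

```python
import numpy as np

def edge_probability(rate, window):
    """Assumed simplification: probability of at least one event in a window
    of the given length under a constant rate (excitation is ignored)."""
    return float(1.0 - np.exp(-rate * window))

def auc_from_scores(scores, labels):
    """AUC by pairwise comparison of positive and negative node pairs.

    scores: predicted edge probabilities, one per node pair.
    labels: 1 if an edge appeared in the test period, else 0.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative pair")
    # Compare every positive score with every negative score.
    wins = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return float((wins + 0.5 * ties) / (pos.size * neg.size))

# Toy example: four node pairs with assumed rates over a window of length 10.
rates = [0.5, 0.2, 0.01, 0.05]
labels = [1, 1, 0, 0]
probs = [edge_probability(r, 10.0) for r in rates]
auc = auc_from_scores(probs, labels)
```

The pairwise comparison uses memory proportional to the number of positive pairs times the number of negative pairs, which is fine for a sketch but would call for a rank-based formula on large networks.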
DLS does not scale to the FB-forum data, and CTDNE is not generative, so the test log-likelihood is not applicable to it. In both tasks, our LSH model outperforms the other models on most datasets.
Table 2: Evaluation metrics for predictive accuracy on real network datasets. Bold entries denote the highest accuracy for each metric on a dataset. The test log-lik. column shows the mean test set log-likelihood per event and the number of latent dimensions d or blocks K that maximizes it. The AUC column shows the mean (standard deviation) of the AUC across 100 time points for dynamic link prediction.
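To make the "mean test set log-likelihood per event" metric concrete, here is a hedged sketch for a single node pair with self-excitation only. The exponential kernel and the parameter names mu, alpha, and beta are assumptions made for illustration; for a full network, the log-likelihoods of all node pairs would be summed and divided by the total number of test events. The log-likelihood is the standard point-process form: the sum of log-intensities at the test events minus the integral of the intensity over the test window, with earlier (training) events kept as history.

```python
import numpy as np

def hawkes_loglik(times, t_start, t_end, mu, alpha, beta):
    """Log-likelihood of the events in (t_start, t_end] for a univariate
    Hawkes process with intensity
        mu + alpha * beta * sum_{s < t} exp(-beta * (t - s)),
    where events at or before t_start act only as history."""
    times = np.sort(np.asarray(times, dtype=float))
    times = times[times <= t_end]
    test = times[times > t_start]
    log_sum = 0.0
    for t in test:
        past = times[times < t]
        lam = mu + alpha * beta * np.sum(np.exp(-beta * (t - past)))
        log_sum += np.log(lam)
    # Integral of the intensity over the test window (the compensator):
    # each event s contributes its kernel integrated from max(s, t_start) to t_end.
    lower = np.maximum(times, t_start)
    excite = alpha * np.sum(np.exp(-beta * (lower - times)) - np.exp(-beta * (t_end - times)))
    return float(log_sum - mu * (t_end - t_start) - excite)

def mean_test_loglik(times, t_start, t_end, mu, alpha, beta):
    """Test log-likelihood divided by the number of test events."""
    arr = np.asarray(times, dtype=float)
    n_test = int(np.sum((arr > t_start) & (arr <= t_end)))
    if n_test == 0:
        raise ValueError("no events in the test window")
    return hawkes_loglik(times, t_start, t_end, mu, alpha, beta) / n_test

# Toy example: one training event, two test events in the window (1, 3].
score = mean_test_loglik([0.5, 1.5, 2.5], 1.0, 3.0, mu=2.0, alpha=0.0, beta=1.0)
```

With alpha set to 0 the process reduces to a homogeneous Poisson process, which gives a simple sanity check: the mean test log-likelihood equals log(mu) minus mu times the window length divided by the number of test events.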