TY - GEN
T1 - Learning mixed kronecker product graph models with simulated method of moments
AU - Moreno, Sebastian
AU - Neville, Jennifer
AU - Kirshner, Sergey
N1 - Publisher Copyright:
Copyright © 2013 ACM.
PY - 2013/8/11
Y1 - 2013/8/11
N2 - There has recently been a great deal of work focused on de- veloping statistical models of graph structure|with the goal of modeling probability distributions over graphs from which new, similar graphs can be generated by sampling from the estimated distributions. Although current graph models can capture several important characteristics of social network graphs (e.g., degree, path lengths), many of them do not generate graphs with sufficient variation to reect the natural variability in real world graph domains. One exception is the mixed Kronecker Product Graph Model (mKPGM), a generalization of the Kronecker Product Graph Model, which uses parameter tying to capture variance in the underlying distribution [10]. The enhanced representation of mKPGMs enables them to match both the mean graph statistics and their spread as observed in real network populations, but un- fortunately to date, the only method to estimate mKPGMs involves an exhaustive search over the parameters. In this work, we present the first learning algorithm for mKPGMs. The O(jEj) algorithm searches over the contin- uous parameter space using constrained line search and is based on simulated method of moments, where the objective function minimizes the distance between the observed moments in the training graph and the empirically estimated moments of the model. We evaluate the mKPGM learning algorithm by comparing it to several different graph models, including KPGMs. We use multi-dimensional KS distance to compare the generated graphs to the observed graphs and the results show mKPGMs are able to produce a closer match to real-world graphs (10-90% reduction in KS distance), while still providing natural variation in the generated graphs.
AB - There has recently been a great deal of work focused on de- veloping statistical models of graph structure|with the goal of modeling probability distributions over graphs from which new, similar graphs can be generated by sampling from the estimated distributions. Although current graph models can capture several important characteristics of social network graphs (e.g., degree, path lengths), many of them do not generate graphs with sufficient variation to reect the natural variability in real world graph domains. One exception is the mixed Kronecker Product Graph Model (mKPGM), a generalization of the Kronecker Product Graph Model, which uses parameter tying to capture variance in the underlying distribution [10]. The enhanced representation of mKPGMs enables them to match both the mean graph statistics and their spread as observed in real network populations, but un- fortunately to date, the only method to estimate mKPGMs involves an exhaustive search over the parameters. In this work, we present the first learning algorithm for mKPGMs. The O(jEj) algorithm searches over the contin- uous parameter space using constrained line search and is based on simulated method of moments, where the objective function minimizes the distance between the observed moments in the training graph and the empirically estimated moments of the model. We evaluate the mKPGM learning algorithm by comparing it to several different graph models, including KPGMs. We use multi-dimensional KS distance to compare the generated graphs to the observed graphs and the results show mKPGMs are able to produce a closer match to real-world graphs (10-90% reduction in KS distance), while still providing natural variation in the generated graphs.
KW - Kronecker models
KW - Link analysis
KW - Method of moments estimation
KW - Statistical graph models
UR - http://www.scopus.com/inward/record.url?scp=84963756237&partnerID=8YFLogxK
U2 - 10.1145/2487575.2487675
DO - 10.1145/2487575.2487675
M3 - Conference contribution
AN - SCOPUS:84963756237
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1052
EP - 1060
BT - KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
A2 - Parekh, Rajesh
A2 - He, Jingrui
A2 - Inderjit, Dhillon S.
A2 - Bradley, Paul
A2 - Koren, Yehuda
A2 - Ghani, Rayid
A2 - Senator, Ted E.
A2 - Grossman, Robert L.
A2 - Uthurusamy, Ramasamy
PB - Association for Computing Machinery
T2 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013
Y2 - 11 August 2013 through 14 August 2013
ER -