TY - GEN
T1 - Sampling of attributed networks from hierarchical generative models
AU - Robles, Pablo
AU - Moreno, Sebastian
AU - Neville, Jennifer
N1 - Publisher Copyright:
© 2016 ACM.
PY - 2016/8/13
Y1 - 2016/8/13
N2 - Network sampling is a widely used procedure in social network analysis where a random network is sampled from a generative network model (GNM). Recently proposed GNMs, allow generation of networks with more realistic structural characteristics than earlier ones. This facilitates tasks such as hypothesis testing and sensitivity analysis. However, sampling of networks with correlated vertex attributes remains a challenging problem. While the recent work of [16] has provided a promising approach for attributed-network sampling, the approach was developed for use with relatively simple GNMs and does not work well with more complex hierarchical GNMs (which can model the range of characteristics and variation observed in real world networks more accurately). In contrast to simple GNMs where the probability mass is spread throughout the space of edges more evenly, hierarchical GNMs concentrate the mass to smaller regions of the space to reflect dependencies among edges in the network|this produces more realistic network characteristics, but also makes it more difficult to identify candidate networks from the sampling space. In this paper, we propose a novel sampling method, CSAG, to sample from hierarchical GNMs and generate networks with correlated attributes. CSAG constrains every step of the sampling process to consider the structure of the GNM|in order to bias the search to regions of the space with higher likelihood. We implemented CSAG using mixed Kronecker Product Graph Models and evaluated our approach on three real-world datasets. The results show that CSAG jointly models the correlation and structure of the networks better than the state of the art. Specifically, CSAG maintains the variability of the underlying GNM while providing a ≥ 5X reduction in attribute correlation error.
AB - Network sampling is a widely used procedure in social network analysis where a random network is sampled from a generative network model (GNM). Recently proposed GNMs, allow generation of networks with more realistic structural characteristics than earlier ones. This facilitates tasks such as hypothesis testing and sensitivity analysis. However, sampling of networks with correlated vertex attributes remains a challenging problem. While the recent work of [16] has provided a promising approach for attributed-network sampling, the approach was developed for use with relatively simple GNMs and does not work well with more complex hierarchical GNMs (which can model the range of characteristics and variation observed in real world networks more accurately). In contrast to simple GNMs where the probability mass is spread throughout the space of edges more evenly, hierarchical GNMs concentrate the mass to smaller regions of the space to reflect dependencies among edges in the network|this produces more realistic network characteristics, but also makes it more difficult to identify candidate networks from the sampling space. In this paper, we propose a novel sampling method, CSAG, to sample from hierarchical GNMs and generate networks with correlated attributes. CSAG constrains every step of the sampling process to consider the structure of the GNM|in order to bias the search to regions of the space with higher likelihood. We implemented CSAG using mixed Kronecker Product Graph Models and evaluated our approach on three real-world datasets. The results show that CSAG jointly models the correlation and structure of the networks better than the state of the art. Specifically, CSAG maintains the variability of the underlying GNM while providing a ≥ 5X reduction in attribute correlation error.
KW - Attributed networks
KW - Complex networks
KW - Network sampling
KW - Statistical network models
UR - http://www.scopus.com/inward/record.url?scp=84985020793&partnerID=8YFLogxK
U2 - 10.1145/2939672.2939808
DO - 10.1145/2939672.2939808
M3 - Conference contribution
AN - SCOPUS:84985020793
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1155
EP - 1164
BT - KDD 2016 - Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016
Y2 - 13 August 2016 through 17 August 2016
ER -