TY - GEN
T1 - Fast generation of large scale social networks while incorporating transitive closures
AU - Pfeiffer, Joseph J.
AU - La Fond, Timothy
AU - Moreno, Sebastian
AU - Neville, Jennifer
PY - 2012
Y1 - 2012
N2 - A key challenge in the social network community is the problem of network generation - that is, how can we create synthetic networks that match characteristics traditionally found in most real-world networks? Important characteristics that are present in social networks include a power law degree distribution, small diameter, and large amounts of clustering. However, most current network generators, such as the Chung Lu and Kronecker models, largely ignore the clustering present in a graph and focus on preserving other network statistics, such as the power law distribution. Models such as the exponential random graph model have a transitivity parameter that can capture clustering, but they are computationally difficult to learn, making scaling to large real-world networks intractable. In this work, we propose an extension to the Chung Lu random graph model, the Transitive Chung Lu (TCL) model, which incorporates the notion of transitive edges. Specifically, it combines the standard Chung Lu model with edges that are formed through transitive closure (e.g., by connecting a 'friend of a friend'). We prove that TCL's expected degree distribution is equal to the degree distribution of the original input graph, while still providing the ability to capture the clustering in the network. The single parameter required by our model can be learned in seconds on graphs with millions of edges, and networks can be generated in time that is linear in the number of edges. We demonstrate the performance of TCL on four real-world social networks, including an email dataset with hundreds of thousands of nodes and millions of edges, showing that TCL generates graphs that match the degree distribution, clustering coefficients, and hop plots of the original networks.
AB - A key challenge in the social network community is the problem of network generation - that is, how can we create synthetic networks that match characteristics traditionally found in most real-world networks? Important characteristics that are present in social networks include a power law degree distribution, small diameter, and large amounts of clustering. However, most current network generators, such as the Chung Lu and Kronecker models, largely ignore the clustering present in a graph and focus on preserving other network statistics, such as the power law distribution. Models such as the exponential random graph model have a transitivity parameter that can capture clustering, but they are computationally difficult to learn, making scaling to large real-world networks intractable. In this work, we propose an extension to the Chung Lu random graph model, the Transitive Chung Lu (TCL) model, which incorporates the notion of transitive edges. Specifically, it combines the standard Chung Lu model with edges that are formed through transitive closure (e.g., by connecting a 'friend of a friend'). We prove that TCL's expected degree distribution is equal to the degree distribution of the original input graph, while still providing the ability to capture the clustering in the network. The single parameter required by our model can be learned in seconds on graphs with millions of edges, and networks can be generated in time that is linear in the number of edges. We demonstrate the performance of TCL on four real-world social networks, including an email dataset with hundreds of thousands of nodes and millions of edges, showing that TCL generates graphs that match the degree distribution, clustering coefficients, and hop plots of the original networks.
KW - network generation
KW - social network analysis
UR - http://www.scopus.com/inward/record.url?scp=84873690863&partnerID=8YFLogxK
U2 - 10.1109/SocialCom-PASSAT.2012.130
DO - 10.1109/SocialCom-PASSAT.2012.130
M3 - Conference contribution
AN - SCOPUS:84873690863
SN - 9780769548487
T3 - Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012
SP - 154
EP - 165
BT - Proceedings - 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust and 2012 ASE/IEEE International Conference on Social Computing, SocialCom/PASSAT 2012
T2 - 2012 ASE/IEEE International Conference on Social Computing, SocialCom 2012 and the 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT 2012
Y2 - 3 September 2012 through 5 September 2012
ER -