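1. Data sources: http://snap.stanford.edu/data/egonets-Twitter.html (Social circles: Twitter); http://snap.stanford.edu/data/higgs-twitter.html (Higgs Twitter Dataset)
2. Time span: twitter
3. Geographic scope: nationwide
4. Indicator description:
(1) Social circles: Twitter
This dataset consists of "circles" (or "lists") from Twitter. The Twitter data was collected from public sources. The dataset includes node features (profiles), circles, and ego networks.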
Dataset statistics |
Nodes | 81306 |
Edges | 1768149 |
Nodes in largest WCC | 81306 (1.000) |
Edges in largest WCC | 1768149 (1.000) |
Nodes in largest SCC | 68413 (0.841) |
Edges in largest SCC | 1685163 (0.953) |
Average clustering coefficient | 0.5653 |
Number of triangles | 13082506 |
Fraction of closed triangles | 0.06415 |
Diameter (longest shortest path) | 7 |
90-percentile effective diameter | 4.5 |
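The WCC and SCC rows report the size of the largest weakly and strongly connected component, with the share of the whole graph in parentheses. As a minimal sketch of how such figures can be obtained, the following uses only the Python standard library on a small made-up directed graph. The function names `largest_wcc` and `largest_scc` and the toy edge list are illustrative assumptions, not part of the dataset or of the SNAP tooling.

```python
from collections import defaultdict

def largest_wcc(nodes, edges):
    """Size of the largest weakly connected component (edge direction ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def largest_scc(nodes, edges):
    """Size of the largest strongly connected component (Kosaraju's algorithm)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
    # Pass 1: record nodes in order of DFS finishing time on the forward graph.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(fwd[v])))
                    break
            else:
                order.append(u)  # all successors of u are finished
                stack.pop()
    # Pass 2: traverse the reversed graph in decreasing finishing time;
    # each traversal started from an unseen node covers exactly one SCC.
    seen, best = set(), 0
    for start in reversed(order):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in rev[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

# Toy directed graph (hypothetical): a 3-cycle, one dangling node, one separate edge.
NODES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6)]

wcc, scc = largest_wcc(NODES, EDGES), largest_scc(NODES, EDGES)
print(f"Nodes in largest WCC | {wcc} ({wcc / len(NODES):.3f}) |")
print(f"Nodes in largest SCC | {scc} ({scc / len(NODES):.3f}) |")
```

On the real edge list the same two functions would take the node and edge lists parsed from the downloaded file. The iterative traversal avoids recursion limits on graphs of this size.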
Citation:
J. McAuley and J. Leskovec. Learning to Discover Social Circles in Ego Networks. NIPS, 2012.
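(2) Higgs Twitter Dataset
The Higgs dataset was built by monitoring the spreading processes on Twitter before, during, and after the announcement, on 4 July 2012, of the discovery of a new particle with the features of the Higgs boson. The messages considered were posted on Twitter between 1 and 7 July 2012.
The dataset provides four directed networks extracted from user activity on Twitter (items 1 to 4), together with a record of Twitter activity during the discovery period (item 5):
1. Retweeting (retweet network)
2. Replying (reply network) to existing tweets
3. Mentioning (mention network) other users
4. Friend/follower social relationships among the users involved in the above activities
5. Information about activity on Twitter during the discovery of the Higgs boson
Note that the user IDs have been anonymized and that the same user ID is used across all networks. This choice allows the Higgs dataset to be used in studies of large-scale interdependent/interconnected multiplex/multilayer networks, where one layer accounts for the social structure and three layers encode different types of user dynamics.
This dataset was last updated on March 31, 2015.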
Social Network statistics |
Nodes | 456626 |
Edges | 14855842 |
Nodes in largest WCC | 456290 (0.999) |
Edges in largest WCC | 14855466 (1.000) |
Nodes in largest SCC | 360210 (0.789) |
Edges in largest SCC | 14102605 (0.949) |
Average clustering coefficient | 0.1887 |
Number of triangles | 83023401 |
Fraction of closed triangles | 0.002901 |
Diameter (longest shortest path) | 9 |
90-percentile effective diameter | 3.7 |
Retweet Network statistics |
Nodes | 256491 |
Edges | 328132 |
Nodes in largest WCC | 223833 (0.873) |
Edges in largest WCC | 308596 (0.940) |
Nodes in largest SCC | 984 (0.004) |
Edges in largest SCC | 3850 (0.012) |
Average clustering coefficient | 0.0156 |
Number of triangles | 21172 |
Fraction of closed triangles | 0.0001085 |
Diameter (longest shortest path) | 19 |
90-percentile effective diameter | 6.8 |
Reply Network statistics |
Nodes | 38918 |
Edges | 32523 |
Nodes in largest WCC | 12839 (0.330) |
Edges in largest WCC | 14944 (0.459) |
Nodes in largest SCC | 322 (0.008) |
Edges in largest SCC | 708 (0.022) |
Average clustering coefficient | 0.0058 |
Number of triangles | 244 |
Fraction of closed triangles | 0.0001561 |
Diameter (longest shortest path) | 29 |
90-percentile effective diameter | 10 |
Mention Network statistics |
Nodes | 116408 |
Edges | 150818 |
Nodes in largest WCC | 91606 (0.787) |
Edges in largest WCC | 132068 (0.876) |
Nodes in largest SCC | 1801 (0.015) |
Edges in largest SCC | 7069 (0.047) |
Average clustering coefficient | 0.0825 |
Number of triangles | 23068 |
Fraction of closed triangles | 0.0002417 |
Diameter (longest shortest path) | 18 |
90-percentile effective diameter | 6.5 |
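The parenthesised values in the four Higgs tables are the component counts divided by the totals for that network. As a quick consistency check, the sketch below recomputes them from the counts listed above. The `STATS` dictionary and the `fractions` helper are illustrative names introduced here.

```python
# Counts copied from the four Higgs tables, in the order:
# (nodes, edges, WCC nodes, WCC edges, SCC nodes, SCC edges)
STATS = {
    "social":  (456626, 14855842, 456290, 14855466, 360210, 14102605),
    "retweet": (256491, 328132, 223833, 308596, 984, 3850),
    "reply":   (38918, 32523, 12839, 14944, 322, 708),
    "mention": (116408, 150818, 91606, 132068, 1801, 7069),
}

def fractions(row):
    """Return the WCC and SCC shares (nodes, edges) formatted to 3 decimals."""
    nodes, edges, wcc_n, wcc_e, scc_n, scc_e = row
    shares = (wcc_n / nodes, wcc_e / edges, scc_n / nodes, scc_e / edges)
    return tuple(f"{x:.3f}" for x in shares)

for name, row in STATS.items():
    print(name, *fractions(row))
```

The recomputed shares agree with the tables. They also make the contrast between layers explicit: the largest SCC covers about 79% of the nodes in the social network but under 2% of the nodes in each of the three activity networks.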
Citation:
M. De Domenico, A. Lima, P. Mougel and M. Musolesi. The Anatomy of a Scientific Rumor. (Nature Open Access) Scientific Reports 3, 2980 (2013).
Related research:
[1] Boyd D M, Ellison N B. Social Network Sites: Definition, History, and Scholarship[J]. Journal of Computer-Mediated Communication, 2007, 13(1), article 11.
[2] Pachucki M A, Jacques P F, Christakis N A. Social Network Concordance in Food Choice Among Spouses, Friends, and Siblings[J]. American Journal of Public Health, 2011, 101(11):2170-2177.
[3] Gordon I R, McCann P. Industrial Clusters: Complexes, Agglomeration and/or Social Networks?[J]. Urban Studies, 2000, 37(3):513-532.
[4] Robins G, Pattison P, Kalish Y, et al. An introduction to exponential random graph models for social networks[J]. Social Networks, 2007, 29(2):173-191.