2021-06-01

1. Data sources: http://snap.stanford.edu/data/egonets-Twitter.html (Social circles: Twitter); http://snap.stanford.edu/data/higgs-twitter.html (Higgs Twitter Dataset)

2. Time span: Twitter

3. Geographic scope: nationwide

4. Indicator description:

(1)Social circles: Twitter

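This dataset consists of "circles" (or "lists") from Twitter. The Twitter data was collected from public sources. The dataset includes node features (profiles), circles, and ego networks.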
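
As a rough illustration of how such a network might be read, the sketch below parses a directed edge list in which each line holds a source ID and a target ID separated by whitespace. That line format, the `read_edge_list` helper, and the sample IDs are all assumptions made for this example; check the files on the SNAP page before relying on it.

```python
from io import StringIO


def read_edge_list(handle):
    """Parse whitespace-separated 'source target' lines into directed edges."""
    edges = set()
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        source, target = line.split()[:2]
        edges.add((int(source), int(target)))
    return edges


# Made-up sample standing in for a real file (hypothetical user IDs).
sample = StringIO("1 2\n2 3\n3 1\n3 4\n")
edges = read_edge_list(sample)
nodes = {node for edge in edges for node in edge}
print(len(nodes), "nodes,", len(edges), "edges")
```

For a gzip-compressed download, the same function should work on a handle opened with `gzip.open(path, "rt")` from the standard library.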

Dataset statistics
Nodes                               81306
Edges                               1768149
Nodes in largest WCC                81306 (1.000)
Edges in largest WCC                1768149 (1.000)
Nodes in largest SCC                68413 (0.841)
Edges in largest SCC                1685163 (0.953)
Average clustering coefficient      0.5653
Number of triangles                 13082506
Fraction of closed triangles        0.06415
Diameter (longest shortest path)    7
90-percentile effective diameter    4.5
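
The WCC and SCC rows refer to the largest weakly and strongly connected components, with the share of all nodes or edges in parentheses. The sketch below shows one way to compute those figures on a tiny made-up graph. The `reachable` and `largest_components` helpers are hypothetical names, and the SCC step uses a simple forward/backward reachability intersection that is easy to read but far too slow for the full dataset.

```python
from collections import defaultdict, deque


def reachable(start, adjacency):
    """Return every node reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def largest_components(edges):
    """Return (all nodes, largest WCC, largest SCC) of a directed graph."""
    out_adj, in_adj, undirected = defaultdict(set), defaultdict(set), defaultdict(set)
    nodes = set()
    for u, v in edges:
        out_adj[u].add(v)
        in_adj[v].add(u)
        undirected[u].add(v)
        undirected[v].add(u)
        nodes.update((u, v))

    largest_wcc, assigned = set(), set()
    for node in nodes:
        if node not in assigned:
            component = reachable(node, undirected)  # ignore edge direction
            assigned |= component
            largest_wcc = max(largest_wcc, component, key=len)

    largest_scc, assigned = set(), set()
    for node in nodes:
        if node not in assigned:
            # Nodes that can both reach and be reached from this node.
            component = reachable(node, out_adj) & reachable(node, in_adj)
            assigned |= component
            largest_scc = max(largest_scc, component, key=len)
    return nodes, largest_wcc, largest_scc


# Toy graph: a directed 3-cycle plus two extra nodes (made-up IDs).
edges = {(1, 2), (2, 3), (3, 1), (3, 4), (5, 4)}
nodes, wcc, scc = largest_components(edges)
scc_edges = [e for e in edges if e[0] in scc and e[1] in scc]
print(f"Nodes in largest WCC: {len(wcc)} ({len(wcc) / len(nodes):.3f})")
print(f"Nodes in largest SCC: {len(scc)} ({len(scc) / len(nodes):.3f})")
print(f"Edges in largest SCC: {len(scc_edges)} ({len(scc_edges) / len(edges):.3f})")
```

For graphs of the size listed here, a linear-time method such as Tarjan's or Kosaraju's algorithm, or a graph library, would be the practical choice.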

Citation:

J. McAuley and J. Leskovec. Learning to Discover Social Circles in Ego Networks. NIPS, 2012.

(2) Higgs Twitter Dataset

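The Higgs dataset was built by monitoring the spreading processes on Twitter before, during, and after the announcement, on July 4, 2012, of the discovery of a new particle with the features of the Higgs boson. The messages considered were posted on Twitter between July 1 and July 7, 2012.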

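The four directed networks provided here were extracted from user activity on Twitter:

1. Retweeting (retweet network)

2. Replying (reply network) to existing tweets

3. Mentioning (mention network) other users

4. Friend/follower social relationships among the users involved in the above activities

5. Information about activity on Twitter during the discovery of the Higgs boson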

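Note that the user IDs have been anonymized and that all networks use the same user IDs. This choice makes the Higgs dataset usable for research on large-scale interdependent/interconnected multiplex/multilayer networks, in which one layer accounts for the social structure and three layers encode different types of user dynamics.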
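
Because every layer shares the same user IDs, the four networks can be held side by side and compared directly. The sketch below is one minimal way to do that with plain Python sets. The layer names, the toy edges, and the `layers_of_user` and `socially_linked_share` helpers are all made up for illustration, and edge direction is deliberately ignored when comparing layers because the direction conventions of the real files are not stated here.

```python
# Toy multilayer network: one edge set per layer, same user IDs throughout.
layers = {
    "social": {(1, 2), (2, 3), (3, 1), (4, 1)},
    "retweet": {(2, 1), (4, 1)},
    "reply": {(3, 2)},
    "mention": {(1, 3), (5, 1)},
}


def layers_of_user(user, layers):
    """List the layers in which a user appears as either endpoint."""
    return sorted(
        name for name, edges in layers.items()
        if any(user in edge for edge in edges)
    )


def socially_linked_share(activity, social):
    """Share of activity edges whose endpoints are also linked socially.

    Direction is ignored on both sides (an assumption of this sketch).
    """
    linked = {frozenset(edge) for edge in social}
    hits = sum(1 for edge in activity if frozenset(edge) in linked)
    return hits / len(activity)


print(layers_of_user(1, layers))
print(layers_of_user(5, layers))
for name in ("retweet", "reply", "mention"):
    share = socially_linked_share(layers[name], layers["social"])
    print(name, share)
```

Keeping each layer as its own edge set keyed by a layer name makes it easy to add or drop layers without touching the comparison code.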

This dataset was last updated on March 31, 2015.

Social Network statistics
Nodes                               456626
Edges                               14855842
Nodes in largest WCC                456290 (0.999)
Edges in largest WCC                14855466 (1.000)
Nodes in largest SCC                360210 (0.789)
Edges in largest SCC                14102605 (0.949)
Average clustering coefficient      0.1887
Number of triangles                 83023401
Fraction of closed triangles        0.002901
Diameter (longest shortest path)    9
90-percentile effective diameter    3.7
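
To make the clustering and triangle rows more concrete, the sketch below computes an average local clustering coefficient and a triangle count on a small made-up graph. It treats every edge as undirected and assigns a coefficient of 0 to nodes with fewer than two neighbors. Both choices are assumptions of this example, so the exact definitions behind the published figures may differ. `clustering_summary` is a hypothetical helper name.

```python
from collections import defaultdict
from itertools import combinations


def clustering_summary(edges):
    """Return (average local clustering coefficient, number of triangles)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)

    local = {}
    closed_corners = 0
    for node, nbrs in neighbors.items():
        possible = len(nbrs) * (len(nbrs) - 1) // 2
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbors[a])
        closed_corners += links
        local[node] = links / possible if possible else 0.0

    average = sum(local.values()) / len(local)
    triangles = closed_corners // 3  # each triangle is seen from 3 corners
    return average, triangles


# Toy graph: one triangle (1, 2, 3) plus a pendant node 4 attached to 3.
average, triangles = clustering_summary({(1, 2), (2, 3), (3, 1), (3, 4)})
print(f"Average clustering coefficient: {average:.4f}")
print(f"Number of triangles: {triangles}")
```

In the toy graph, nodes 1 and 2 have coefficient 1, node 3 has 1/3, and node 4 has 0, which gives an average of 7/12.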

Retweet Network statistics
Nodes                               256491
Edges                               328132
Nodes in largest WCC                223833 (0.873)
Edges in largest WCC                308596 (0.940)
Nodes in largest SCC                984 (0.004)
Edges in largest SCC                3850 (0.012)
Average clustering coefficient      0.0156
Number of triangles                 21172
Fraction of closed triangles        0.0001085
Diameter (longest shortest path)    19
90-percentile effective diameter    6.8
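
The last two rows describe path lengths. As a hedged sketch, the code below runs a breadth-first search from every node of a small made-up graph, takes the longest shortest path as the diameter, and takes the smallest hop count that covers at least 90% of connected pairs as the effective diameter. Treating edges as undirected is an assumption of this example. The fractional values in the tables suggest the published figures are interpolated, whereas this simplified version returns whole hops. `distance_histogram` and `diameter_and_effective` are hypothetical helper names.

```python
from collections import defaultdict, deque


def distance_histogram(edges):
    """Count ordered node pairs at each shortest-path distance (undirected)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    histogram = defaultdict(int)
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    histogram[dist[nxt]] += 1
                    queue.append(nxt)
    return histogram


def diameter_and_effective(edges, percentile=90):
    """Return (diameter, smallest distance covering `percentile`% of pairs)."""
    histogram = distance_histogram(edges)
    total = sum(histogram.values())
    running = 0
    effective = None
    for distance in sorted(histogram):
        running += histogram[distance]
        # Integer comparison avoids floating-point rounding at the threshold.
        if effective is None and running * 100 >= percentile * total:
            effective = distance
    return max(histogram), effective


# Toy graph: a path 1-2-3-4-5.
diameter, effective = diameter_and_effective({(1, 2), (2, 3), (3, 4), (4, 5)})
print("Diameter:", diameter)
print("90-percentile effective diameter (no interpolation):", effective)
```

An all-pairs search like this is only feasible for small graphs. On networks of the size listed here, the distances would normally be estimated from a sample of source nodes.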

Reply Network statistics
Nodes                               38918
Edges                               32523
Nodes in largest WCC                12839 (0.330)
Edges in largest WCC                14944 (0.459)
Nodes in largest SCC                322 (0.008)
Edges in largest SCC                708 (0.022)
Average clustering coefficient      0.0058
Number of triangles                 244
Fraction of closed triangles        0.0001561
Diameter (longest shortest path)    29
90-percentile effective diameter    10

Mention Network statistics
Nodes                               116408
Edges                               150818
Nodes in largest WCC                91606 (0.787)
Edges in largest WCC                132068 (0.876)
Nodes in largest SCC                1801 (0.015)
Edges in largest SCC                7069 (0.047)
Average clustering coefficient      0.0825
Number of triangles                 23068
Fraction of closed triangles        0.0002417
Diameter (longest shortest path)    18
90-percentile effective diameter    6.5

Citation

M. De Domenico, A. Lima, P. Mougel and M. Musolesi. The Anatomy of a Scientific Rumor. (Nature Open Access) Scientific Reports 3, 2980 (2013).



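For academic research use only; not for commercial use. If anything here is inappropriate, please get in touch to have it removed.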
