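The Stanford Large Network Dataset Collection (SNAP) is compiled by Stanford assistant professor Jure Leskovec. The datasets are free, cleaned, and downloadable. The collection covers more than ten types of network datasets (social, online communities, email, citation, Web, ...). The Friendster dataset has 65 million nodes and 1.8 billion edges.
URL: https://snap.stanford.edu/index.html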
Social networks : online social networks, edges represent interactions between people
Networks with ground-truth communities : ground-truth network communities in social and information networks
Communication networks : email communication networks with edges representing communication
Citation networks : nodes represent papers, edges represent citations
Collaboration networks : nodes represent scientists, edges represent collaborations (co-authoring a paper)
Web graphs : nodes represent webpages and edges are hyperlinks
Amazon networks : nodes represent products and edges link commonly co-purchased products
Internet networks : nodes represent computers and edges communication
Road networks : nodes represent intersections and edges roads connecting the intersections
Autonomous systems : graphs of the internet
Signed networks : networks with positive and negative edges (friend/foe, trust/distrust)
Location-based online social networks : social networks with geographic check-ins
Wikipedia networks and metadata : talk, editing and voting data from Wikipedia
Twitter and Memetracker : Memetracker phrases, links and 467 million Tweets
Online communities : data from online communities such as Reddit and Flickr
Online reviews : data from online review systems such as BeerAdvocate and Amazon
Information cascades : ...
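
Many of the graph datasets above are distributed as plain-text edge lists. The following is a minimal Python sketch for reading such a file, assuming one whitespace-separated "source target" pair of integer node IDs per line and `#` at the start of comment lines. That layout is an assumption and should be checked against each dataset's description page. The helper names `read_edge_list` and `degree_counts` are hypothetical and not part of any SNAP tooling.

```python
from collections import defaultdict
import io


def read_edge_list(lines):
    """Parse an edge list from an iterable of text lines.

    Assumed format (verify per dataset): one 'src dst' pair of integer
    node IDs per line, separated by whitespace; lines starting with '#'
    are comments; blank lines are skipped.
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))
    return edges


def degree_counts(edges):
    """Count how many edges touch each node (direction ignored)."""
    deg = defaultdict(int)
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return dict(deg)


# Tiny in-memory sample standing in for a downloaded file.
sample = io.StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n")
edges = read_edge_list(sample)
nodes = {n for edge in edges for n in edge}
print(len(nodes), "nodes,", len(edges), "edges")
```

For a real download, the same function can be fed an open file handle, for example `gzip.open(path, "rt")` for a gzip-compressed file. Because the parser reads line by line, it does not need the whole text in memory, although the resulting edge list for a graph the size of Friendster would still be very large.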