2019-08-08
Download link: https://u20150046.ctfile.com/fs/20150046-392010184

Size: 5.21 MB

Format: PDF

Patterns for Learning from Data at Scale

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example.
You’ll start with an introduction to Spark and its ecosystem, and then dive into patterns that apply common techniques—classification, collaborative filtering, and anomaly detection among others—to fields such as genomics, security, and finance. If you have an entry-level understanding of machine learning and statistics, and you program in Java, Python, or Scala, you’ll find these patterns useful for working on your own data applications.
Patterns include:
  • Recommending music and the Audioscrobbler data set
  • Predicting forest cover with decision trees
  • Anomaly detection in network traffic with K-means clustering
  • Understanding Wikipedia with Latent Semantic Analysis
  • Analyzing co-occurrence networks with GraphX
  • Geospatial and temporal data analysis on the New York City Taxi Trips data
  • Estimating financial risk through Monte Carlo simulation
  • Analyzing genomics data and the BDG project
  • Analyzing neuroimaging data with PySpark and Thunder
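
This e-book is shared mainly as a preview; if you like it, please support the official edition.

Reading changes lives and brightens them. Let's keep each other going.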


