2017-01-05
Beginning Apache Pig: Big Data Processing Made Easy
Balaswamy Vaddeman

Hyderabad, Andhra Pradesh, India




ISBN 978-1-4842-2336-9
e-ISBN 978-1-4842-2337-6
DOI 10.1007/978-1-4842-2337-6
Library of Congress Control Number: 2016961514
© Balaswamy Vaddeman 2016

Contents

Chapter 1: MapReduce and Its Abstractions
Small Data Processing

Parallel Computing

Problems with MapReduce
Summary



Chapter 2: Data Types
Simple Data Types
Summary of Simple Data Types


Complex Data Types
Summary of Complex Data Types


Schema

Casting
Casting Error


Comparison Operators

Identifiers

Boolean Operators

Summary


Chapter 3: Grunt
Summary of Commands

Auto-completion

Summary


Chapter 4: Pig Latin Fundamentals
Running Pig Latin Code
Grunt Shell

Pig -e

Pig -f

Embed Pig Code in a Java Program

Hue


Pig Operators and Commands
Load

store

dump

version

Foreach Generate

filter

Limit

Assert

SPLIT

SAMPLE

FLATTEN

import

define

distinct

RANK

Union

ORDER BY

GROUP

Stream

MAPREDUCE

CUBE


Parameter Substitution
-param

-paramfile


Summary


Chapter 5: Joins and Functions
Summary


Chapter 6: Creating and Scheduling Workflows Using Apache Oozie
Types of Oozie Jobs
Workflow


Using a Pig Latin Script as Part of a Workflow
Writing job.properties

workflow.xml

Uploading Files to HDFS

Submit the Oozie Workflow


Scheduling a Pig Script
Writing the job.properties File

Writing coordinator.xml

Upload Files to HDFS

Submitting Coordinator


Bundle

oozie pig Command

Command-Line Interface
Job Submitting, Running, and Suspending

Killing Job

Retrieving Logs

Information About a Job


Oozie User Interface

Developing Oozie Applications Using Hue

Summary


Chapter 7: HCatalog
Summary


Chapter 8: Pig Latin in Hue
Pig Module
My Scripts

Pig Helper

Auto-suggestion

UDF Usage in Script

Query History


File Browser

Job Browser

Summary


Chapter 9: Pig Latin Scripts in Apache Falcon
Summary


Chapter 10: Macros
Structure

Macro Use Case

Macro Types
Internal Macro

External Macro


dryrun

Macro Chaining

Macro Rules
Define Before Usage

Valid Macro Chaining

No Macro Within Nested Block

No Grunt Shell Commands

Invisible Relations


Macro Examples
Macro Without Input Parameters Is Possible

Macro Without Returning Anything Is Possible


Summary


Chapter 11: User-Defined Functions
Summary


Chapter 12: Writing Eval Functions
MapReduce and Pig Features


Chapter 13: Writing Load and Store Functions

Chapter 14: Troubleshooting
Chapter 15: Data Formats




Chapter 16: Optimization
Chapter 17: Hadoop Ecosystem Tools



Apache Spark
Core

SQL


Apache Tez

Presto
Architecture

Connectors

Pushdown Operations


Summary


Appendix A: Built-in Functions
Appendix B: Apache Pig in Apache Ambari
Modifying Properties
Service Check
Installing Pig
Pig Status
Check All Available Services
Summary
Appendix C: HBaseStorage and ORCStorage Options
HBaseStorage
Row-Based Conditions
Timestamp-Based Conditions
Other Conditions
OrcStorage
Attachment: BegApaPigByVadBal.zip
Size: 1.76 MB
This attachment contains:

  • BegApaPigByVadBal.epub




English | 2016 | EPUB



All replies
2017-1-12 20:51:54
Thanks, OP!!!