Pig Design Patterns: Simplify Hadoop programming to create complex end-to-end enterprise Big Data solutions with Pig, 1st Edition, by Pradeep Pasupuleti (ebook, PDF). ISBN: 978-1783285556, 1783285559

Product details:
ISBN 10: 1783285559
ISBN 13: 978-1783285556
Author: Pradeep Pasupuleti
Simplify Hadoop programming to create complex end-to-end enterprise Big Data solutions with Pig
About This Book
Quickly understand how to use Pig to design end-to-end Big Data systems
Implement a hands-on programming approach using design patterns to solve commonly occurring enterprise Big Data challenges
Enhance your ability to use Pig and create your own design patterns wherever applicable
Who This Book Is For
This book is for experienced developers who are already familiar with Pig and want a use-case perspective on the problems of data ingestion, profiling, cleansing, transformation, and egress encountered in enterprises. Knowledge of Hadoop and Pig is necessary to grasp the intricacies of Pig design patterns.
What You Will Learn
Understand Pig’s relevance in an enterprise context
Use Pig in design patterns that enable data movement across platforms during and after analytical processing
See how Pig can co-exist with other components of the Hadoop ecosystem to create Big Data solutions using design patterns
Simplify the process of creating complex data pipelines using transformations, aggregations, enrichment, cleansing, filtering, reformatting, lookups, and data type conversions
Apply knowledge of Pig in design patterns that deal with integration of Hadoop with other systems to enable multi-platform analytics
Comprehend design patterns and use Pig in cases related to complex analysis of pure structured data
In Detail
Pig Design Patterns is a comprehensive guide to design patterns that simplify the creation of complex data pipelines at various stages of data management. The book focuses on using Pig in an enterprise context, bridging the gap between theoretical understanding and practical implementation. Each chapter contains a set of design patterns that pose and then solve technical challenges relevant to enterprise use cases.
The book covers the journey of Big Data from the time it enters the enterprise to its eventual use in analytics, in the form of a report or a predictive model. By the end of the book, readers will appreciate Pig's real power in addressing the problems encountered when creating an analytics-based data product. Each design pattern comes with a suggested solution, an analysis of the trade-offs of implementing the solution in a different way, an explanation of how the code works, and the results.
Table of contents:
Chapter 1: Setting the Context for Design Patterns in Pig
Understanding design patterns
The scope of design patterns in Pig
Hadoop demystified – a quick reckoner
The enterprise context
Common challenges of distributed systems
The advent of Hadoop
Hadoop under the covers
Understanding the Hadoop Distributed File System
HDFS design goals
Working of HDFS
Understanding MapReduce
Understanding how MapReduce works
The MapReduce internals
Pig – a quick intro
Understanding the rationale of Pig
Understanding the relevance of Pig in the enterprise
Working of Pig – an overview
Firing up Pig
The use case
Code listing
The dataset
Understanding Pig through the code
Pig’s extensibility
Operators used in code
The EXPLAIN operator
Understanding Pig’s data model
Primitive types
Complex types
Summary
Chapter 2: Data Ingest and Egress Patterns
The context of data ingest and egress
Types of data in the enterprise
Ingest and egress patterns for multistructured data
Considerations for log ingestion
The Apache log ingestion pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The Custom log ingestion pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The image ingress and egress pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The ingress and egress patterns for the NoSQL data
MongoDB ingress and egress patterns
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The HBase ingress and egress pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Table c
The ingress and egress patterns for structured data
The Hive ingress and egress patterns
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The ingress and egress patterns for semi-structured data
The mainframe ingestion pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
XML ingest and egress patterns
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
JSON ingress and egress patterns
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Summary
Chapter 3: Data Profiling Patterns
Data profiling for Big Data
Big Data profiling dimensions
Sampling considerations for profiling Big Data
Sampling support in Pig
Rationale for using Pig in data profiling
The data type inference pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Pig script
Java UDF
Results
Additional information
The basic statistical profiling pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Pig script
Macro
Results
Additional information
The pattern-matching pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Pig script
Macro
Results
Additional information
The string profiling pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Pig script
Macro
Results
Additional information
The unstructured text profiling pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Pig script
Java UDF for stemming
Java UDF for generating TF-IDF
Results
Additional information
Summary
Chapter 4: Data Validation and Cleansing Patterns
Data validation and cleansing for Big Data
Choosing Pig for validation and cleansing
The constraint validation and cleansing design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The regex validation and cleansing design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The corrupt data validation and cleansing design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The unstructured text data validation and cleansing design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Summary
Chapter 5: Data Transformation Patterns
Data transformation processes
The structured-to-hierarchical transformation pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The data normalization pattern
Background
Motivation
Use cases
Pattern implementation
Results
Additional information
The data integration pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The aggregation pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The data generalization pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Summary
Chapter 6: Understanding Data Reduction Patterns
Data reduction – a quick introduction
Data reduction considerations for Big Data
Dimensionality reduction – the Principal Component Analysis design pattern
Background
Motivation
Use cases
Pattern implementation
Limitations of PCA implementation
Code snippets
Results
Additional information
Numerosity reduction – the histogram design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Numerosity reduction – sampling design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Numerosity reduction – clustering design pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Summary
Chapter 7: Advanced Patterns and Future Work
The clustering pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The topic discovery pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The natural language processing pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
The classification pattern
Background
Motivation
Use cases
Pattern implementation
Code snippets
Results
Additional information
Future trends
Emergence of data-driven patterns
The emergence of solution-driven patterns
Patterns addressing programmability constraints
Summary


