Programming for Big Data Analytics
GFS vs HDFS
File Structure:
HDFS:
Files are divided into 128 MB blocks (the default block size)
Each DataNode stores a block replica as 2 files: one for the data, one for the checksum
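To make the block division concrete, here is a minimal Python sketch of how a file's byte range could be cut into fixed-size blocks. It assumes the 128 MB block size quoted above. The function name `split_into_blocks` is hypothetical and not part of any Hadoop API; real HDFS performs this split internally rather than in user code.

```python
# Assumed block size: 128 MB, as stated in the notes above.
HDFS_BLOCK_SIZE = 128 * 1024 * 1024


def split_into_blocks(file_size, block_size=HDFS_BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block.

    Hypothetical helper for illustration only. Every block is full
    except possibly the last, which holds whatever bytes remain.
    """
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    mb = 1024 * 1024
    for offset, length in split_into_blocks(300 * mb):
        print(f"block at offset {offset // mb} MB, length {length // mb} MB")
```

Under these assumptions a 300 MB file yields two full 128 MB blocks and one final 44 MB block, since the last block only covers the remaining data.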
GFS:
Files are divided into 64 MB chunks
Each chunk is divided into 64 KB blocks
Each block has a 32-bit checksum
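The GFS layout above (64 MB chunks, 64 KB blocks, one 32-bit checksum per block) can be sketched in Python as follows. This is an illustration under stated assumptions, not GFS code: `zlib.crc32` is used as a stand-in 32-bit checksum because the notes do not say which function GFS uses, and the helper names `block_checksums` and `find_corrupt_blocks` are hypothetical.

```python
import zlib

GFS_CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunk, as stated above
GFS_BLOCK_SIZE = 64 * 1024         # 64 KB block, as stated above

# Number of checksummed blocks in one full chunk.
BLOCKS_PER_CHUNK = GFS_CHUNK_SIZE // GFS_BLOCK_SIZE  # 1024


def block_checksums(chunk, block_size=GFS_BLOCK_SIZE):
    """Compute one 32-bit checksum per block of the given chunk data.

    CRC32 is an assumed stand-in for the 32-bit checksum.
    """
    return [
        zlib.crc32(chunk[i:i + block_size]) & 0xFFFFFFFF
        for i in range(0, len(chunk), block_size)
    ]


def find_corrupt_blocks(chunk, expected, block_size=GFS_BLOCK_SIZE):
    """Return indices of blocks whose checksum differs from the stored one."""
    actual = block_checksums(chunk, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    data = bytes(3 * GFS_BLOCK_SIZE)        # three zero-filled blocks
    stored = block_checksums(data)          # checksums kept alongside the data
    damaged = bytearray(data)
    damaged[GFS_BLOCK_SIZE + 5] ^= 0xFF     # flip one byte inside block 1
    print(find_corrupt_blocks(bytes(damaged), stored))
```

Keeping a checksum per 64 KB block rather than per chunk means a reader only has to verify the blocks it actually touches, and a corruption can be localized to a single block instead of invalidating the whole 64 MB chunk.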
HDFS is slower than GFS