Big Data - A Game Changer!!! - Rahul Aggarwal's EdTech

As data volumes keep growing, Big Data analysis and the Hadoop platform are gaining popularity day by day. The Hadoop architecture is designed to handle huge data sets on clusters of cheap commodity hardware.

HDFS and MapReduce are the 2 core building blocks of Hadoop.
1. HDFS [Hadoop Distributed File System] is the Storage Layer. It splits huge data files into blocks and replicates them across machines, giving a highly available and fault-tolerant system.
2. MR [MapReduce] is the Processing Layer. It processes the huge data in parallel on the commodity machines for higher throughput. It is batch oriented, so it is tuned for throughput rather than low latency.
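
To make the MapReduce idea concrete, here is a minimal single-machine sketch of the classic word count in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample lines are made up for illustration, and a real job would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    # Each line could be mapped on a different machine; here we do it in a loop.
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, standing in for lines of a file stored in HDFS.
sample_lines = ["big data is big", "hadoop stores big data"]
counts = word_count(sample_lines)
print(counts)
```

The point of the split is that mappers never need to see each other's data, and each reducer only needs the values for its own keys, which is what lets the work spread over commodity machines.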

But to work with Hadoop, we should also know the other ecosystem technologies listed below:
1. Oozie – Workflow Scheduler for end-to-end process execution
2. Sqoop – Data Ingestion Tool for static [at rest] data, for example bulk transfers between relational databases and Hadoop
3. Flume/Kafka – Data Ingestion Tools for dynamic [in motion] data i.e. Real-Time Data
4. Pig – Data Pre-Processing Tool, mainly for ETL/ELT kind of logic using Pig Latin scripting
5. Hive – Data-Warehouse layer for Hadoop that offers SQL-like querying [HiveQL] over data stored in HDFS
6. HBase – NoSQL database for Hadoop.

Apache Spark [an Apache project; Databricks was founded by its creators]:

A faster framework for Data Processing than traditional MapReduce.

Works in memory and builds its execution plans as DAGs [Directed Acyclic Graphs].
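
The DAG idea can be sketched in plain Python without Spark installed. The `LazyDataset` class below is a hypothetical toy, not Spark's API: it only records each transformation and runs nothing until `collect()` is called, which is roughly how Spark defers work until an action is requested. The chain here is linear, the simplest form of a DAG.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (not a Spark class)."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)  # recorded transformations, in order

    def map(self, fn):
        # Record the step and return a new dataset; no data is touched yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        # Same idea: only the plan grows, nothing is computed.
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def lineage(self):
        """Return the recorded plan as a list of step names."""
        return ["source"] + [kind for kind, _ in self._steps]

    def collect(self):
        """Action: run the whole recorded plan in one pass and return a list."""
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


numbers = LazyDataset(range(1, 11))
evens_squared = numbers.filter(lambda n: n % 2 == 0).map(lambda n: n * n)

plan = evens_squared.lineage()    # the plan exists before any computation
result = evens_squared.collect()  # only now is the data actually processed
print(plan)
print(result)
```

Because the full plan is known before anything runs, an engine can keep intermediate data in memory and avoid writing it to disk between steps, which is a large part of why Spark tends to outrun MapReduce.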

Learn the RDD, DataFrame and Dataset APIs for Data Processing.

Built in Scala, and it supports the Scala, Java, Python and R languages.

In upcoming blog posts, I will elaborate on these latest technologies further.

Rahul Aggarwal

http://guardiancoder.in
