Hive and SQL

Before working with any SQL database, learn the five categories of SQL statements:

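DDL (Data Definition Language) – create, alter, drop, truncate
DML (Data Manipulation Language) – insert, update, delete
DQL (Data Query Language) – select query
DCL (Data Control Language) – grant, revoke
TCL (Transaction Control Language) – commit, rollback, savepoint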
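
As a rough illustration of the five categories, here is one statement of each kind in generic, ANSI-style SQL. The table `employees`, its columns, and the user `report_user` are hypothetical names chosen for this sketch, and the exact syntax (especially for starting a transaction) varies between databases.

```sql
-- DDL: define a table (hypothetical name and columns)
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name   VARCHAR(100),
    salary DECIMAL(10, 2)
);

-- DML: add a row, then change it
INSERT INTO employees (emp_id, name, salary) VALUES (1, 'Asha', 50000.00);
UPDATE employees SET salary = 55000.00 WHERE emp_id = 1;

-- DQL: read rows back
SELECT emp_id, name, salary FROM employees WHERE salary > 40000;

-- DCL: give and take away read access (report_user is a hypothetical user)
GRANT SELECT ON employees TO report_user;
REVOKE SELECT ON employees FROM report_user;

-- TCL: group changes into a transaction
-- (some databases use BEGIN instead of START TRANSACTION)
START TRANSACTION;
SAVEPOINT before_delete;
DELETE FROM employees WHERE emp_id = 1;
ROLLBACK TO SAVEPOINT before_delete;  -- undoes only the DELETE
COMMIT;                               -- makes the remaining work permanent
```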

Hive offers a SQL interface, but it is not a traditional database/RDBMS. It is a data warehouse within the Hadoop ecosystem and stores all of its data in HDFS (Hadoop Distributed File System).

HQL (Hive Query Language) is quite similar to SQL (Structured Query Language).

The major difference lies in how the backend processes a query: Hive internally translates it into jobs for the Java-based MapReduce framework, which processes Big Data in parallel across the cluster.
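
One way to see this translation is to ask Hive for the execution plan of a query. The sketch below assumes a table like the sample `Passengers` table defined later in this post, and it explicitly selects the classic MapReduce engine; newer Hive versions may prefer another engine such as Tez, so treat the first line as an assumption about your cluster.

```sql
-- Assumption: the cluster still allows the classic MapReduce engine
SET hive.execution.engine=mr;

-- EXPLAIN prints the plan (stages and operators) instead of running the query
EXPLAIN
SELECT Pclass, COUNT(*) AS passenger_count
FROM Passengers
GROUP BY Pclass;
```

The plan lists the stages Hive would launch for the aggregation, which makes it visible that a single HQL statement turns into distributed jobs behind the scenes.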

Hive 2.0 introduced LLAP (Live Long and Process), which, together with features such as vectorization, further improves data processing performance.
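
These features are typically switched on through session settings. The following is a minimal sketch, assuming a cluster where Tez and the LLAP daemons are already installed and running; whether each setting is available or useful depends on your Hive version and file format (vectorization has traditionally worked best with ORC tables).

```sql
-- Vectorized execution: process rows in batches instead of one at a time
SET hive.vectorized.execution.enabled=true;
SET hive.vectorized.execution.reduce.enabled=true;

-- LLAP: assumes LLAP daemons are running on the cluster
SET hive.execution.engine=tez;
SET hive.execution.mode=llap;
SET hive.llap.execution.mode=all;
```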

Hive supports parallel/distributed processing via MapReduce and is considerably faster than traditional RDBMSs such as MySQL or Oracle when dealing with terabytes of data.

Important Concepts in Hive:
1. Internal Table
2. External Table
3. Partitioning and Bucketing
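
The three concepts above can be sketched in HiveQL as follows. The table names, the bucket count, and the HDFS path `/data/passengers` are hypothetical and only serve as an illustration.

```sql
-- Internal (managed) table: Hive owns the data, so DROP TABLE also deletes the files.
-- It is partitioned by Pclass (one directory per class value) and
-- bucketed by PassengerId (rows are hashed into 4 files per partition).
CREATE TABLE IF NOT EXISTS passengers_managed (
    PassengerId INT,
    Name        STRING,
    Age         INT
)
PARTITIONED BY (Pclass INT)
CLUSTERED BY (PassengerId) INTO 4 BUCKETS
STORED AS ORC;

-- External table: Hive only stores the metadata and reads files that
-- already exist at LOCATION; DROP TABLE leaves those files in place.
CREATE EXTERNAL TABLE IF NOT EXISTS passengers_ext (
    PassengerId INT,
    Name        STRING,
    Age         INT,
    SeatNum     STRING,
    Pclass      INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/passengers';  -- hypothetical HDFS directory
```

Partitioning helps when queries filter on the partition column, because Hive can skip whole directories; bucketing mainly helps with sampling and joins on the bucketed column.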

DDL for a sample Hive Table

drop table if exists Passengers;
create table if not exists Passengers (
PassengerId int,
Name string,
Age int,
SeatNum string,
Pclass int)
row format delimited
fields terminated by ','
;
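
Once the table exists, it can be filled and queried. The sketch below assumes a comma-separated file at the hypothetical local path `/tmp/passengers.csv` whose columns match the table definition above.

```sql
-- Copy a local CSV file into the table's storage (hypothetical path)
LOAD DATA LOCAL INPATH '/tmp/passengers.csv' INTO TABLE Passengers;

-- Count passengers and compute the average age per class
SELECT Pclass,
       COUNT(*) AS passenger_count,
       AVG(Age) AS avg_age
FROM Passengers
GROUP BY Pclass
ORDER BY Pclass;
```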

Hive is facing competition from:

Kite
Presto
Kudu
Apache Phoenix, etc.

Refer to my GitHub account for Big Data and Hive examples and code.

Refer to the GitHub link below for use cases on housing data analytics, along with the solutions.

Rahul Aggarwal

http://guardiancoder.in
