Partitioning in Hive - Demo

Implement Partitioning in Hive

Partitioning in Hive means dividing a table into parts based on the values of a particular column, such as date, course, city, or country.

Let's assume we have data on 10 million students studying at an institute, and we need to fetch the students of a particular course. With a traditional approach, the query has to read the entire dataset, which degrades performance. A better approach is to partition the table in Hive so that the data is divided into separate datasets based on the values of particular columns. Because the data is stored in slices, a query that filters on the partition column reads only the relevant slices, so the query response time is faster.
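As a minimal sketch of this idea, the HiveQL below creates a table partitioned by course and then queries a single course. The table name `student`, its columns, the comma delimiter, and the course value 'java' are assumptions made for illustration only.

```sql
-- Hypothetical table: student records partitioned by course.
-- The partition column is declared in PARTITIONED BY, not in the column list.
CREATE TABLE student (
  id   INT,
  name STRING,
  age  INT
)
PARTITIONED BY (course STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Filtering on the partition column lets Hive read only the
-- data stored for that partition instead of the whole table.
SELECT id, name
FROM student
WHERE course = 'java';
```

Each distinct value of `course` is stored separately, which is why the filter above can skip the data of every other course.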

Types of Partitioning

There are two types of partitioning in Hive:

Static Partitioning

The values of the partition columns must be passed manually while loading data into the table.
Inserting input data files individually into each partition of the table is static partitioning.
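A static load might look like the following sketch. It assumes a table `student` partitioned by a `course` column already exists; the file paths and course values are hypothetical, and each file is assumed to contain rows for one course only, without the course column itself.

```sql
-- Assumes: student (id INT, name STRING, age INT) PARTITIONED BY (course STRING).
-- The partition value is written out by hand for every file that is loaded.
LOAD DATA LOCAL INPATH '/home/user/student_java.csv'   -- hypothetical path
INTO TABLE student
PARTITION (course = 'java');

LOAD DATA LOCAL INPATH '/home/user/student_hadoop.csv' -- hypothetical path
INTO TABLE student
PARTITION (course = 'hadoop');
```

Hive does not inspect the rows to decide where they belong here. It places each file in the partition named in the statement, so a file with mixed courses would end up in the wrong partition.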

Dynamic Partitioning

A single insert into the partitioned table that populates all partitions in one go is known as dynamic partitioning.
Usually, dynamic partitioning loads the data from a non-partitioned table.
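A dynamic load could be sketched as follows. It assumes a partitioned table `student` (partition column `course`) and a non-partitioned staging table `student_staging` that holds the course as an ordinary column. Both table names and their columns are hypothetical.

```sql
-- Assumes: student (id INT, name STRING, age INT) PARTITIONED BY (course STRING)
--          student_staging (id INT, name STRING, age INT, course STRING), non-partitioned.

-- Enable dynamic partitioning; nonstrict mode allows every partition
-- column to be resolved from the data rather than given by hand.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- One insert populates all partitions. The partition column must come
-- last in the SELECT list; Hive derives each row's partition from its value.
INSERT INTO TABLE student PARTITION (course)
SELECT id, name, age, course
FROM student_staging;
```

Unlike the static case, no partition value appears in the statement. Hive creates whichever partitions the staging data requires.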