Tag Archives: partition

Partitioning and Bucketing Hive table

In the previous article, we used sample datasets to join two tables in Hive. To improve the performance of table joins, we can also use partitioning or bucketing. Let’s first create a Parquet-format table with partitions and buckets:
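The original statement is not shown in this excerpt. As a minimal sketch, a partitioned and bucketed Parquet table can be declared roughly as follows. The table name `employee_part`, its columns, and the bucket count of 8 are hypothetical and chosen only for illustration:

```sql
-- Hypothetical table: name, columns, and bucket count are assumptions.
CREATE TABLE employee_part (
  emp_id INT,
  name   STRING,
  salary DOUBLE
)
-- Each distinct dept value gets its own directory under the table path.
PARTITIONED BY (dept STRING)
-- Rows inside each partition are hashed on emp_id into 8 files.
CLUSTERED BY (emp_id) INTO 8 BUCKETS
STORED AS PARQUET;
```

The partition column (`dept` here) is declared only in `PARTITIONED BY`, not in the column list. The bucketing column is usually the join key, so that a bucketed join only has to match corresponding buckets.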

Then import data into it:
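The exact statement the author ran is not shown in this excerpt. As a sketch only, loading such a table from an existing one usually looks something like the following, assuming a hypothetical target table `employee_part` partitioned by `dept` and a hypothetical source table `employee_raw` with matching columns:

```sql
-- Allow partitions to be created from the values in the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
-- Older Hive releases (before 2.0) also need this so that inserts
-- respect the declared bucketing; newer releases always enforce it.
SET hive.enforce.bucketing=true;

-- The dynamic partition column (dept) must come last in the SELECT list.
INSERT OVERWRITE TABLE employee_part PARTITION (dept)
SELECT emp_id, name, salary, dept
FROM employee_raw;
```

Whether these settings are required depends on the Hive version and configuration in use.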

But it reports an error:


Some tips about Hive


Here are some tips about Hive that I found while learning it: 1. When I started “bin/hive” for the first time, these errors were reported:

The solution is simple:

Actually, it is better to use MySQL instead of Derby for the metastore in a multi-user environment. 2. Control the number of mappers for SQL jobs. If a SQL job…