Pig Overview

Hive vs. Pig
Feature                        Hive                Pig
Language                       SQL-like            PigLatin
Schemas/Types                  Yes (explicit)      Yes (implicit)
Partitions                     Yes                 No
Server                         Optional (Thrift)   No
User Defined Functions (UDF)   Yes (Java)          Yes (Java)
Custom…

Hive Internal & External Table

A Hive table is a logical concept that is physically composed of a number of files in HDFS. Tables can be either internal or external. Internal table: If our data available into…

Hive Services

cli: The command-line interface to Hive (the shell). This is the default service.
hiveserver: Runs Hive as a server exposing a Thrift service, enabling access from a range of clients…

Hive Shell Runs in Two Modes

The shell is the primary way that we will interact with Hive, by issuing commands in HiveQL. HiveQL is Hive’s query language, a dialect of SQL. It is heavily influenced…

Aggregate Functions in Hive

The following built-in aggregate functions are supported in Hive: count(*), count(expr), count(DISTINCT expr[, expr...]). count(*) - Returns the total number of retrieved rows, including rows containing NULL values; count(expr) - Returns…
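The difference between the three count forms is easiest to see on a tiny data set. The sketch below is not Hive itself: it assumes Python's built-in sqlite3 module as a stand-in engine, because standard SQL gives count(*), count(expr), and count(DISTINCT expr) the same NULL handling. The visits table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for Hive here; the table and column
# names (visits, user_name, city) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (user_name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("ann", "Pune"),
        ("bob", None),      # NULL city: counted by count(*) only
        ("cid", "Pune"),    # duplicate city: collapsed by DISTINCT
        ("dee", "Delhi"),
    ],
)

# count(*) counts every row, count(city) skips NULL cities,
# count(DISTINCT city) counts each non-NULL city once.
total_rows, non_null_cities, distinct_cities = conn.execute(
    "SELECT count(*), count(city), count(DISTINCT city) FROM visits"
).fetchone()

print(total_rows, non_null_cities, distinct_cities)  # prints: 4 3 2
```

The same SELECT statement should carry over to HiveQL unchanged, though that is worth confirming against the Hive version in use.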

Hive Built-In Functions

Functions in Hive are categorized as below. Numeric and Mathematical Functions: These functions are mainly used to perform mathematical calculations. Date Functions: These functions are used to perform operations on date…

Hive Data Types

Hive data types fall into two categories: primitive and complex. The primitive data types include integers, Booleans, floating-point numbers, and strings. The below…