Pig Latin
The Pig Latin is a data flow language used by Apache Pig to analyze the data in Hadoop. It is a textual language that abstracts the programming from the Java MapReduce idiom into a notation.
Pig Latin Statements
The Pig Latin statements are used to process the data. It is an operator that accepts a relation as an input and generates another relation as an output.
- It can span multiple lines.
- Each statement must end with a semi-colon.
- It may include expression and schemas.
- By default, these statements are processed using multi-query execution
Pig Latin Conventions
Convention |
Description |
( ) |
The parenthesis can enclose one or more items. It can also be used to indicate the tuple data type.
Example - (10, xyz, (3,6,9)) |
[ ] |
The straight brackets can enclose one or more items. It can also be used to indicate the map data type.
Example - [INNER | OUTER] |
{ } |
The curly brackets enclose two or more items. It can also be used to indicate the bag data type
Example - { block | nested_block } |
... |
The horizontal ellipsis points indicate that you can repeat a portion of the code.
Example - cat path [path ...] |
Latin Data Types
Simple Data Types
Type |
Description |
int |
It defines the signed 32-bit integer.
Example - 2 |
long |
It defines the signed 64-bit integer.
Example - 2L or 2l |
float |
It defines 32-bit floating point number.
Example - 2.5F or 2.5f or 2.5e2f or 2.5.E2F |
double |
It defines 64-bit floating point number.
Example - 2.5 or 2.5 or 2.5e2f or 2.5.E2F |
chararray |
It defines character array in Unicode UTF-8 format.
Example - javatpoint |
bytearray |
It defines the byte array. |
boolean |
It defines the boolean type values.
Example - true/false |
datetime |
It defines the values in datetime order.
Example - 1970-01- 01T00:00:00.000+00:00 |
biginteger |
It defines Java BigInteger values.
Example - 5000000000000 |
bigdecimal |
It defines Java BigDecimal values.
Example - 52.232344535345 |
Complex Types
Type |
Description |
tuple |
It defines an ordered set of fields.
Example - (15,12) |
bag |
It defines a collection of tuples.
Example - {(15,12), (12,15)} |
map |
It defines a set of key-value pairs.
Example - [open#apache] |
|