Tag Archives: Pig

Hadoop Introduction by easydata

The Big Data and Hadoop training course is designed to provide the knowledge and skills needed to become a successful Hadoop developer. The course covers, in depth, concepts such as the Hadoop Distributed File System (HDFS), single-node and multi-node Hadoop clusters, Hadoop 2.0, Flume, Sqoop, MapReduce, Pig, Hive, HBase, ZooKeeper, and Oozie.

Hadoop Hangover: Introduction To Apache Bigtop and Installing Hive, HBase and Pig

In the previous post we learnt how easy it was to install Hadoop with Apache Bigtop!
We know it's not just Hadoop; there are sub-projects around the table too! So, let's have a look at how to install Hive, HBase and Pig in this post.

How to use parameter substitution with Pig Latin and PowerShell

When running Pig in a production environment, you’ll likely have one or more Pig Latin scripts that run on a recurring basis (daily, weekly, monthly, etc.) and need to locate their input data based on when or where they are run. For example, you may have a Pig job that performs daily log ingestion by geographic region. It would be costly and error-prone to manually edit the script to reference the location of the input data each time log data needs to be ingested. Ideally, you’d pass the date and geographic region to the Pig script as parameters at the time the script is executed. Fortunately, Pig provides this capability via parameter substitution. There are four different mechanisms to define parameters that can be referenced in a Pig Latin script:

  • Parameters can be defined as command line arguments; each parameter is passed to Pig as a separate argument using -param switches at script execution time
  • Parameters can be defined in a parameter file that’s passed to Pig using the -param_file command line argument when the script is executed
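  • Parameters can be defined inside a Pig Latin script using the “%declare” preprocessor statement
  • Default values, used when a parameter is not supplied any other way, can be defined inside a Pig Latin script using the “%default” preprocessor statement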

You can use none, one or any combination of the above options.
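
To make the options above concrete, here is a minimal sketch of the daily log ingestion example. The script name (ingest.pig), the parameter names (DATE, REGION, OUTPUT_DIR), the HDFS paths, and the two-column schema are all hypothetical and chosen only for illustration. The comments at the top show how the same parameters might be supplied from the command line or from a parameter file.

```pig
-- ingest.pig (hypothetical script name)
--
-- Possible invocations, assuming the pig launcher is on the PATH:
--   pig -param DATE=2013-01-15 -param REGION=emea -f ingest.pig
--   pig -param_file daily.params -f ingest.pig
--
-- where daily.params (hypothetical file) holds one name=value pair per line:
--   DATE=2013-01-15
--   REGION=emea

-- Fallback value, used only if REGION is not supplied any other way.
%default REGION 'us-west'

-- Value fixed inside the script itself.
%declare OUTPUT_DIR '/data/ingested'

-- $DATE has no default, so it has to be supplied when the script is run.
logs = LOAD '/logs/$REGION/$DATE' USING PigStorage('\t')
       AS (ts:chararray, url:chararray);

STORE logs INTO '$OUTPUT_DIR/$REGION/$DATE';
```

Running the script with the -dryrun switch should write out the script with the parameters already substituted, which is a convenient way to check the resolved paths before touching any data.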