Pig & Hive in Hadoop

Pig is an Apache open-source project and one of the components of the Hadoop ecosystem.
Pig provides a high-level data flow scripting language, Pig Latin, and runs on Hadoop clusters.
Pig uses HDFS for storing and retrieving data and Hadoop MapReduce for processing big data.
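
To make the "data flow" idea concrete, the following is a minimal Python sketch of the kind of pipeline a Pig script typically expresses: load records, filter them, group them by a key, and aggregate each group. It is not Pig Latin and does not touch HDFS or MapReduce; the record layout, the names RECORDS and run_flow, and the min_bytes threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical input: (user, bytes_transferred) records.
# In Pig these rows would come from a LOAD of a file in HDFS.
RECORDS = [
    ("alice", 120),
    ("bob", 30),
    ("alice", 80),
    ("carol", 5),
    ("bob", 70),
]


def run_flow(records, min_bytes=10):
    """Filter, group, and sum records, mimicking a simple data flow."""
    # Step 1 (comparable to FILTER): drop small transfers.
    kept = [(user, size) for user, size in records if size >= min_bytes]

    # Step 2 (comparable to GROUP ... BY): collect sizes per user.
    groups = defaultdict(list)
    for user, size in kept:
        groups[user].append(size)

    # Step 3 (comparable to FOREACH ... GENERATE SUM): aggregate each group.
    return {user: sum(sizes) for user, sizes in groups.items()}


if __name__ == "__main__":
    print(run_flow(RECORDS))
```

Each step consumes the output of the previous one, which is the sense in which Pig is described as a data flow language. On a real cluster, Pig would compile such steps into MapReduce jobs rather than run them in a single process.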

Hive is a data warehouse system for Hadoop.
Hive facilitates ad hoc queries and aids analysis of data sets stored in Hadoop.
Hive provides an SQL-like language called HiveQL (HQL).
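
As a rough illustration of what an ad hoc, SQL-like query looks like, the sketch below runs a simple aggregate query from Python. It uses the standard-library sqlite3 module purely as a local stand-in, not Hive itself; the table page_views, its columns, and the sample rows are hypothetical. The SELECT ... GROUP BY form shown here is the kind of statement HiveQL is designed to accept, but Hive would run it against data stored in Hadoop.

```python
import sqlite3

# Hypothetical sample rows: (user_name, page).
SAMPLE_ROWS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/docs"),
    ("carol", "/home"),
]


def views_per_page(rows):
    """Count views per page with a SQL-style GROUP BY query."""
    # Assumption: an in-memory SQLite database stands in for a Hive table.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE page_views (user_name TEXT, page TEXT)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT page, COUNT(*) AS views "
            "FROM page_views "
            "GROUP BY page "
            "ORDER BY views DESC, page"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for page, views in views_per_page(SAMPLE_ROWS):
        print(page, views)
```

The point of the sketch is the shape of the query rather than the engine: an analyst writes a declarative statement, and the system decides how to execute it.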

Hadoop Cloudera

Cloudera is a commercial vendor for deploying Hadoop in an enterprise.
Cloudera offers Cloudera Manager for system management and Cloudera Navigator for data management.

Apache Mahout

Apache Mahout is a library of machine learning algorithms. Among other things, it helps with clustering, which allows the system to group various entities into separate clusters based on certain characteristics or features.
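
The grouping idea behind clustering can be sketched with k-means, one common clustering algorithm. The code below is plain Python, not Mahout's API, and it makes simplifying assumptions that are labeled in the comments: the first k points are used as starting centroids, and the number of iterations is fixed. The function name kmeans and the sample points are hypothetical.

```python
import math

# Hypothetical 2-D feature vectors forming two visibly separate groups.
POINTS = [
    (1.0, 1.0),
    (9.0, 9.0),
    (1.2, 0.8),
    (8.8, 9.1),
    (0.9, 1.1),
    (9.2, 8.9),
]


def kmeans(points, k, iterations=10):
    """Return (assignments, centroids) after a fixed number of k-means rounds."""
    # Assumption: seed centroids with the first k points so the sketch
    # is deterministic; real implementations usually seed more carefully.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )

    return assignments, centroids


if __name__ == "__main__":
    labels, centers = kmeans(POINTS, k=2)
    print(labels)
    print(centers)
```

Points that share similar feature values end up with the same cluster label, which is the "group entities by characteristics" behavior described above. A library such as Mahout is aimed at running this kind of algorithm at scale rather than in a single process.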