This Web UI lets the user access the cluster's data,
run tools to analyse and process the data, and define the analysis workflow.

This Web UI helps the user monitor the cluster. Furthermore, it embeds key security components.

This tool lets the user define an analysis workflow. For example: every day, import the data, transform it, analyse it, and export the result.
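
The daily example above can be pictured as a chain of steps, each feeding its output to the next. The following is only a minimal sketch in plain Python, not the workflow tool's own definition format: the function names, the sample records, and the report layout are all hypothetical, and the daily scheduling itself is left to the workflow tool.

```python
def import_data():
    # Hypothetical stand-in for pulling raw records from an external source.
    return [" 12 ", "6", "oops", "30"]

def transform(raw):
    # Strip whitespace and keep only the records that are valid integers.
    return [int(r) for r in (x.strip() for x in raw) if r.isdigit()]

def analyse(values):
    # Compute simple statistics on the cleaned values.
    return {"count": len(values), "mean": sum(values) / len(values)}

def export(result):
    # Render the statistics as a one-line report.
    return ", ".join(f"{key}={value}" for key, value in sorted(result.items()))

def run_workflow(source, steps):
    # Run the source, then pass the data through each step in order.
    data = source()
    for step in steps:
        data = step(data)
    return data

DAILY_WORKFLOW = [transform, analyse, export]
report = run_workflow(import_data, DAILY_WORKFLOW)
print(report)
```

Keeping the steps in a list makes it easy to reorder them or add a step without touching the runner.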

Cleans and formats the data so that it has a consistent shape in HBase.

Takes the data from HBase and lets the user manipulate it so that it can be correlated in whatever way is needed.

This tool lets the user analyse and process the data in different ways. Hive helps compute statistics on the data.

This tool lets the user apply algorithms or machine learning to the statistics or to the data itself.

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
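
To make the programming model concrete, here is a minimal single-process sketch of the classic word-count example in plain Python. It does not use the Hadoop API: the function names are hypothetical, and the `shuffle` step only imitates the grouping by key that a real framework performs across the nodes of the cluster.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input document.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key between the two phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)

def word_count(documents):
    # Each document could be mapped on a different node; here they run in turn.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data on a cluster", "data on Hadoop"])
print(counts)
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the calls can run in parallel on separate machines, which is what makes the model suited to large data sets.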

Apache Spark is an open-source cluster computing framework for data analytics. It promises performance up to 100 times faster than Hadoop MapReduce.

Unstructured data, accessible through the HBase/HDFS API.

Stores the semi-structured data (HBase), accessible through the Web UI (HUE) or the query tools (Pig / Hive / Impala).

This tool lets the user import data from external data sources into Hadoop.

Hive