Big Data is the talk of the town, and Hadoop is
emerging as the platform of choice for running analysis on both structured and unstructured
Big Data.
One of the main strengths of Hadoop is ad-hoc,
massive-scale analysis of the data stored in it. In a typical Hadoop
deployment, enterprises dump the majority of their unstructured data in HDFS and periodically
run Map-Reduce analysis to gain insights into new data and, optionally, structure
it for storage in external data sources for reuse by other applications.
The following diagram (though overly simplified)
reflects this usage of Map-Reduce.
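The batch pattern described above can be sketched in a few lines of plain Python. This is only an illustration of the map, group and reduce steps run over data that has already been stored; it does not use Hadoop, and the function names and sample records are made up for the example.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for each word in one unstructured text record."""
    for word in record.lower().split():
        yield word, 1


def reduce_phase(key, values):
    """Collapse all values emitted for one key into a single count."""
    return key, sum(values)


def run_batch_job(records):
    """Run map, group-by-key and reduce over records that are already stored."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            groups[key].append(value)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for data previously dumped into HDFS.
    stored_records = ["login failed", "login ok", "payment failed"]
    print(run_batch_job(stored_records))
```

The point to notice is that nothing happens until the job is launched: records that arrive after a run wait for the next periodic run.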
While this model works quite well for
offline, batched analytics, its serious limitation is that it cannot be used for
real time decision-making. Business use cases that demand quick action on their
data (e.g. security markets, fraud detection, fault detection, location-based
services, Facebook Insights, Twitter trends etc.) cannot use Hadoop
Map-Reduce for immediate, real time analytics on new data, and have to turn to
alternative technologies to meet their needs.
There is a popular belief that Hadoop Map-Reduce cannot be real time, and so far that has been true. HFlame (www.hflame.com, a product from Data Advent) removes this limitation without reinventing Hadoop or any of its components. HFlame turns a customer's existing Hadoop infrastructure (e.g. Apache Hadoop, CDH, HDP) into a real time data analysis infrastructure.
The following diagram explains the change in
Map-Reduce processing with HFlame:
HFlame Map-Reduce jobs run continuously
(i.e. a job stays active even when no data is available in HDFS to process).
As soon as new data is written to HDFS, it is immediately passed to the
appropriate real time Map-Reduce jobs. A real time Map-Reduce job will either
- Produce immediate insights on the new data, or
- Collect the new data for a specific amount of time and produce analytics results on the collected data.
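The two behaviours listed above can be illustrated with a small, self-contained Python sketch. It is a conceptual model only: `ContinuousJob`, `on_new_data` and `window_seconds` are hypothetical names invented for this example and are not HFlame's or Hadoop's API. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class ContinuousJob:
    """Hypothetical model of a job that stays active and reacts to new data.

    With window_seconds=None every record is analyzed as soon as it arrives.
    Otherwise records are collected and analyzed together once the window
    has elapsed.
    """

    def __init__(self, analyze, window_seconds=None):
        self.analyze = analyze              # function applied to a list of records
        self.window_seconds = window_seconds
        self.buffer = []                    # records collected in the current window
        self.window_start = None            # arrival time of the window's first record

    def on_new_data(self, record, now):
        """Handle one newly written record; return a result or None."""
        if self.window_seconds is None:
            # Immediate mode: produce insight on the new record right away.
            return self.analyze([record])

        # Windowed mode: collect, then analyze the collected data.
        if self.window_start is None:
            self.window_start = now
        self.buffer.append(record)
        # In this sketch the window is only checked when a record arrives.
        if now - self.window_start >= self.window_seconds:
            batch, self.buffer = self.buffer, []
            self.window_start = None
            return self.analyze(batch)
        return None


if __name__ == "__main__":
    immediate = ContinuousJob(analyze=len)
    print(immediate.on_new_data("trade-1", now=0.0))

    windowed = ContinuousJob(analyze=len, window_seconds=10)
    for record, ts in [("trade-1", 0.0), ("trade-2", 4.0), ("trade-3", 10.0)]:
        print(windowed.on_new_data(record, now=ts))
```

In immediate mode each call returns a result for the single new record. In windowed mode calls return `None` until the window has elapsed, at which point the collected records are analyzed together and a new window begins with the next record.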
HFlame
continuous analysis places Hadoop right at the center of real time business
solutions. Businesses can analyze the data stream as it arrives and apply
patterns such as continuous queries and complex event processing without introducing
any further complexity to their infrastructure.
HFlame
is completely transparent to Hadoop users and works with their own Hadoop
distribution and installation. HFlame builds on the core of Hadoop: HDFS and
the Map-Reduce data processing framework.
Check
out http://www.hflame.com or http://www.dataadvent.com for more details.