
Up-to-date information, shorter delays and speed as a competitive advantage have always been key requirements for any enterprise. With the global economy changing rapidly, the pressure to shrink decision cycles and shorten time-to-market has increased. Whether at a bank, a retail chain or a manufacturing company, decision makers want to base their decisions on data that is current and relevant rather than on older, less relevant data.
Conventional business intelligence solutions deal primarily with historical data. Because of technical limitations, they do not let enterprises process data ''instantaneously'' or ''on arrival''. According to an Aberdeen Group survey, enterprises with real-time visibility have seen noticeably higher performance across several tactical operating metrics, such as growth in operating cash flow, increase in inventory turns and reduction in operating cost.
As a combined effect of business needs and technology trends, a variety of technologies are available for deriving business insight from raw operational data. Such technologies range from simple operational dashboards based on conventional Database Management Systems to advanced techniques like In-Memory Real Time Data Analytics.
To bridge the gap between operational and analytical systems, the concept of Data Stream Processing was developed: transient data is processed as soon as it arrives, even before it is persisted. The premise is to process and analyze all data on the fly. Data Stream Processing is built around Stream Computing technology, a programming paradigm related to the ''Single Instruction Multiple Data (SIMD)'' parallel programming design pattern, in which the same operations are applied to every element of a stream. In particular, the paradigm relies on ''Pipeline Parallel Processing'', where successive processing stages work on different elements at the same time.
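The idea of processing each record on arrival, before it is persisted, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `account,amount` input format, the function names and the threshold are hypothetical, and the generator stages run one after another in a single thread, whereas a real stream engine would run the stages concurrently.

```python
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Stage 1: turn each raw 'account,amount' line into a record."""
    for line in lines:
        account, amount = line.strip().split(",")
        yield {"account": account, "amount": float(amount)}


def flag_large(records: Iterable[dict], threshold: float) -> Iterator[dict]:
    """Stage 2: annotate each record as soon as it leaves stage 1."""
    for record in records:
        record["large"] = record["amount"] >= threshold
        yield record


def run_pipeline(lines: Iterable[str], threshold: float = 1000.0):
    """Pull records through both stages one at a time.

    Each record is acted upon (alerted) before it is stored, which
    mirrors the 'analyze first, persist later' premise.
    """
    alerts, store = [], []  # in-memory stand-ins for an alert channel and a database
    for record in flag_large(parse(lines), threshold):
        if record["large"]:
            alerts.append(record["account"])  # act on the record first
        store.append(record)                  # persist it afterwards
    return alerts, store


alerts, store = run_pipeline(["a,50", "b,2500", "c,1000"])
```

Because the stages are generators, no stage waits for the whole input: a record flows through parsing and flagging before the next line is even read.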

To help understand the concept, refer to the figure on the left. In most cases, enterprises continuously receive data that needs to be processed. This data can be viewed as a ''Stream of Data over Time''. For a real-time response, this stream of data needs to be processed, refined and acted upon in real time. The concept of Data Stream Processing enables real-time processing of such continuous data streams.
The concept differs from conventional data processing frameworks and solutions in several ways:
1. Data streams are usually unbounded.
2. No assumption can be made about the order in which data arrives.
3. Size and time constraints make it difficult to store and process data stream elements after their arrival.
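The first and third points above suggest why stream operators usually keep only a small, fixed amount of state. As a minimal sketch (the function name and the window size are assumptions chosen for illustration), a sliding average over an endless source can be computed while holding at most `window_size` values in memory:

```python
from collections import deque
from itertools import count, islice
from typing import Iterable, Iterator


def sliding_average(values: Iterable[float], window_size: int) -> Iterator[float]:
    """Yield the average of the most recent `window_size` values.

    Memory use stays bounded no matter how long the stream runs,
    because the deque silently drops its oldest element when full.
    """
    window = deque(maxlen=window_size)
    total = 0.0
    for value in values:
        if len(window) == window.maxlen:
            total -= window[0]  # the oldest value is about to be evicted
        window.append(value)
        total += value
        yield total / len(window)


# Usage: count(1) never ends, so only the first five results are consumed.
first_five = list(islice(sliding_average(count(1), 3), 5))
```

The average is computed in arrival order only; handling late or out-of-order elements would need additional logic, such as windows keyed by event timestamps.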
Business Benefits
1. Smarter integration of real-time business intelligence across the organization.
2. Ability to make decisions in real time.
3. Improved business agility, business innovation, and business continuity.
4. Reduced development time and cost for real-time data integration through the use of standards-based SQL.
5. Reduced storage cost, as data is not required to be persisted before it can be analyzed.
Technology Benefits
1. Allows for real-time data collection, transformation, aggregation and reporting.
2. Lower latency, as data can be analyzed in-memory, before it is persisted to the storage medium.
3. Data independence, that is, logical/physical separation, leading to loosely coupled applications that need less tuning and are more flexible.
4. Can be integrated with multiple stream processing solutions, such as StreamBase, SQLstream and Esper.
Conclusion:
The Real Time Analysis Blueprint aims to augment existing data warehouse and business intelligence solutions with real-time data processing rather than replace them. Enterprises can therefore keep using their existing solutions while adding the capability of real-time data processing, which helps them manage their data effectively.
References:
- http://goto.jackbe.com/rs/leadmdjackbe/images/Aberdeen_Operational_Intelligence_Part_1.pdf
- http://gigaom.com/cloud/big-data-in-real-time-is-no-fantasy/
- http://www.neilconway.org/talks/stream_intro.pdf
- www.cc.gatech.edu/~lingliu/courses/cs4440/notes/6.DataStream.ppt
- http://en.wikipedia.org/wiki/Stream_processing#Parallel_Stream_paradigm_.28SIMD.2FMIMD.29
*If you find anything misleading or incorrect, please point it out.