High-Performance In-Memory Database

Almost every major database management system on the market tries to keep a portion of the database resident in main memory and the rest on a persistent storage device (HDD or SSD). An in-memory database (IMDB) resides completely in main memory. If you have the required memory, you benefit from faster access to data. Random access to large datasets with very low latency simplifies development and enables new applications to access large data more intensively than has previously been possible.

IMDBs have all the properties of a traditional RDBMS but are fine-tuned for data residing in main memory. Unlike NoSQL databases, IMDBs can serve systems that were written for an earlier generation of RDBMS. These technologies are already reshaping the BI and analytics segment, and they will significantly impact transactional and operational processing workloads as well.

The traditional RDBMS was designed to support databases much larger than the available main memory. An IMDB, on the other hand, is designed to support databases that fit entirely in main memory, although a number of IMDBs also support databases larger than the available main memory. IMDBs have been around for a while; they were originally used in performance-centric financial applications. Interest in IMDB technology is now reviving, and we are beginning to see it reach a tipping point.

Architectural changes:

Indexing data structure and algorithm: Traditional RDBMSs primarily use B-trees for indexing, whereas IMDBs commonly use T-trees as their primary index structure. B-trees are optimised for cases where the index and data are stored on block storage devices; T-trees are optimised for cases where both are held in main memory.
Data representation: The internal data representation is memory-based and relies on a large address space. Data is referenced primarily through direct memory pointers, so there is no need for the complex memory management algorithms that handle large data in limited memory.
Access: Traditional databases offer client-server access over sockets. With no disk (or other persistent storage device) in the data path, an in-memory database that provided only socket access would find the sockets becoming the bottleneck. Therefore, most IMDBs offer shared-memory access as the primary access method.
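
To make the indexing point above more concrete, here is a minimal Python sketch of a T-tree lookup. It is an illustration only, not how any particular IMDB implements its index: the names TNode, build, search and NODE_CAPACITY are hypothetical, the tree is built once from sorted keys and is read-only, and the insertion and rebalancing logic of a real T-tree is left out. Each node holds a small sorted array of keys plus pointers to a left and a right subtree, so a search descends by comparing against a node's smallest and largest key and then does a binary search inside the bounding node.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional

NODE_CAPACITY = 4  # hypothetical value; real T-tree nodes hold many more keys


@dataclass
class TNode:
    keys: List[int]                   # sorted keys stored inside this node
    left: Optional["TNode"] = None    # subtree whose keys are all < keys[0]
    right: Optional["TNode"] = None   # subtree whose keys are all > keys[-1]


def _build_from_chunks(chunks: List[List[int]]) -> Optional[TNode]:
    """Recursively pick the middle chunk as the node to keep the tree balanced."""
    if not chunks:
        return None
    mid = len(chunks) // 2
    return TNode(
        keys=chunks[mid],
        left=_build_from_chunks(chunks[:mid]),
        right=_build_from_chunks(chunks[mid + 1:]),
    )


def build(sorted_keys: List[int]) -> Optional[TNode]:
    """Build a static T-tree from keys that are already sorted and unique."""
    chunks = [sorted_keys[i:i + NODE_CAPACITY]
              for i in range(0, len(sorted_keys), NODE_CAPACITY)]
    return _build_from_chunks(chunks)


def search(node: Optional[TNode], key: int) -> bool:
    """Return True if key is present in the tree rooted at node."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left           # key is below this node's range
        elif key > node.keys[-1]:
            node = node.right          # key is above this node's range
        else:
            # This is the bounding node: binary search within its key array.
            i = bisect_left(node.keys, key)
            return node.keys[i] == key
    return False
```

For example, after root = build(list(range(0, 100, 3))), calling search(root, 27) finds the key while search(root, 28) does not. Only two comparisons are needed per node on the way down, and the keys inside a node sit next to each other in memory, which is the property that suits this structure to main-memory storage.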

Challenges to In-Memory:

History has shown that there will always be more data than fits in the available main memory. The core problem therefore remains: how does one get the right chunk of data into main memory? For an IMDB, the "right" chunk would be all the data.
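
When the data does not all fit, something has to decide which chunk stays in memory. One widely used policy is least-recently-used (LRU) eviction, sketched below in Python. This is an assumption brought in for illustration and is not taken from the references; the class LRUPageCache, its load_page callback and the notion of a numbered "page" are all hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class LRUPageCache:
    """Keep at most `capacity` pages in memory, evicting the least recently used."""

    def __init__(self, capacity: int, load_page: Callable[[int], bytes]):
        self.capacity = capacity
        self.load_page = load_page      # hypothetical loader for slower storage
        self.pages = OrderedDict()      # page_id -> data, oldest access first
        self.misses = 0                 # number of loads from slower storage

    def get(self, page_id: int) -> bytes:
        if page_id in self.pages:
            # Hit: mark the page as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Miss: fetch the page, then evict the oldest one if over capacity.
        self.misses += 1
        data = self.load_page(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)
        return data
```

With a capacity of two pages, reading pages 1, 2, 1 and then 3 evicts page 2, because page 1 was touched more recently. Every miss is a trip to slower storage, and that is the cost a fully memory-resident database avoids.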

Final Thoughts:

Given the continuing growth of data in operational and transaction processing systems, a new generation of in-memory solutions is required to cater to them. Traditional RDBMSs would have to be re-engineered from scratch to meet this expectation. A tipping point occurs when a technology provides a significant enhancement in service along with several new dimensions.

References:

http://www.benstopford.com/2011/08/14/distributed-storage-phase-change-memory-and-the-rebirth-of-the-in-memory-database/
http://en.wikipedia.org/wiki/In-memory_database
http://en.wikipedia.org/wiki/T-tree
http://www.mcobject.com/in_memory_database
http://www.ibm.com/developerworks/data/library/dmmag/DBMag_2010_Issue1/DBMag_Issue109_solidDB/
http://docs.oracle.com/cd/E13085_01/doc/timesten.1121/e14261/arch.htm

*If you find anything here misleading or incorrect, please point it out.