Twenty years ago, when my infrastructure guy handed me an 80386 Windows 95 PC with a 3½-inch floppy drive and 4 MB of RAM (I don’t remember the HDD capacity – I believe it was around 120 MB or so), I felt like a hero on the street, advertising to everyone the great piece of hardware I had received and how important I was to the manager, etc. :). Over the last five years, In-Memory platforms and SSDs have grown steadily in popularity, and the time is not far away when we stop seeing HDDs altogether.
In today’s world of intense competition and data-driven decision making, businesses often need to make quick decisions while operations are in progress. Complex business rules need to run on the fly, in real time, to deliver that intelligence to end users or to applications consuming it (“Operational Intelligence”). A few examples include credit card fraud detection, promotional offers based on the items in a shopping basket, gaming, and stock trading applications.
Data modelers and architects face a balancing act in creating a data model that caters to both operational and business intelligence applications. The alternative is to create separate data models for these applications and duplicate the data. That increases analytics latency, because the data must be copied into the OLAP data model via ETL/ELT, applying various sanity checks and business rules along the way. In either case, one of the major factors limiting database performance is the I/O bottleneck of traditional HDDs.
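To make the duplication concrete, here is a minimal sketch of that ETL pattern using Python’s built-in sqlite3 module. The table and column names (`orders`, `daily_sales`) and the sanity rule are purely illustrative, not taken from any particular product:

```python
import sqlite3

# Hypothetical OLTP source: one row per order (normalized, write-optimized).
oltp = sqlite3.connect(":memory:")
oltp.execute("CREATE TABLE orders (id INTEGER, order_date TEXT, amount REAL)")
oltp.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2015-06-01", 25.0), (2, "2015-06-01", 40.0),
     (3, "2015-06-02", 15.0), (4, "2015-06-02", -5.0)],  # -5.0 fails the sanity check
)

# Hypothetical OLAP target: a pre-aggregated, read-optimized copy of the same data.
olap = sqlite3.connect(":memory:")
olap.execute("CREATE TABLE daily_sales (order_date TEXT, total REAL)")

# ETL step: extract, apply a business rule (drop non-positive amounts), aggregate, load.
rows = oltp.execute(
    "SELECT order_date, SUM(amount) FROM orders WHERE amount > 0 GROUP BY order_date"
).fetchall()
olap.executemany("INSERT INTO daily_sales VALUES (?, ?)", rows)

totals = dict(olap.execute("SELECT order_date, total FROM daily_sales"))
print(totals)  # the data now exists twice, and the OLAP copy lags behind the source
```

The point of the sketch is the cost: the same facts live in two schemas, and analytics only sees them after the ETL step has run.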
HTAP (Hybrid Transactional/Analytical Processing) built on in-memory technology is today’s answer to this problem. There are at least ten in-memory platforms on the market today, but the more popular ones are SAP HANA, DB2 BLU, Oracle Exalytics, TIBCO ActiveSpaces, Aerospike and Apache Spark. As a side note, big names in the industry are participating in Spark development, as it has a lot to offer and is seen as a potential replacement for MapReduce in the future.
All-In-Memory databases keep the entire dataset in RAM, while partial In-Memory databases use RAM only for hot data, with cold data stored on SSD or HDD. SAP HANA is an example of an All-In-Memory database; the Teradata Intelligent Memory option is an example of a partial In-Memory database.
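The hot/cold split behind partial In-Memory databases can be sketched as a toy LRU tier in Python. Everything here is illustrative (real products use far more sophisticated placement policies); a plain dict stands in for the SSD/HDD tier:

```python
from collections import OrderedDict

class TieredStore:
    """Toy model of a partial in-memory store: hot keys stay in RAM,
    cold keys are evicted to a slower tier (a dict standing in for SSD/HDD)."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # RAM tier, kept in least-recently-used order
        self.cold = {}             # stand-in for the SSD/HDD tier
        self.hot_capacity = hot_capacity

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            cold_key, cold_val = self.hot.popitem(last=False)  # evict least recent
            self.cold[cold_key] = cold_val

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)   # keep hot data hot
            return self.hot[key]
        value = self.cold.pop(key)      # slow-tier read...
        self.put(key, value)            # ...then promote the key back to RAM
        return value

store = TieredStore(hot_capacity=2)
store.put("a", 1); store.put("b", 2); store.put("c", 3)
print(sorted(store.hot), sorted(store.cold))  # "a" has gone cold
```

Reading a cold key promotes it back into RAM and pushes the least recently used hot key out, which is the basic mechanic a partial In-Memory engine automates for you.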
Each In-Memory platform has its own nuances, pros and cons, way of storing data in memory, and database consistency and availability rules, but the overall idea is the same: minimize or avoid the performance bottleneck created by I/O against traditional HDDs, and offer a platform for analytics on both structured and unstructured data. ETL/ELT processing can be eliminated, because operational and analytical applications run on a single system of record.
Is the In-Memory database here to replace the traditional DBMS? No, at least not in the near future. It really depends on the use case. Think about all the sub-second response times we get from OLTP applications built on RDBMSs or on hierarchical databases such as IMS (many of the applications running on IMS have been converted to RDBMSs by now). A few examples: ATM applications, retail transactions, air ticket reservations, etc. These applications have been around for a while, and they don’t necessarily use In-Memory technology. Database design for OLTP revolves around the DML operations required for a single successful transaction, and each transaction touches only the data that’s absolutely required. In contrast, analytical applications typically read and transform huge amounts of data to produce the required answer.
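The two access patterns can be contrasted in a few lines with Python’s sqlite3 module. The `sales` table and its contents are made up for illustration; the point is how many rows each style of query touches:

```python
import sqlite3

# Hypothetical retail table used to contrast the two access patterns.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(i, "east" if i % 2 else "west", float(i)) for i in range(1, 1001)],
)

# OLTP-style: a transaction touches only the one row it absolutely needs,
# via an indexed point lookup.
row = db.execute("SELECT amount FROM sales WHERE id = ?", (42,)).fetchone()

# Analytical-style: a single query reads and transforms the entire table.
summary = db.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM sales GROUP BY region"
).fetchall()

print(row, summary)
```

The point query can stay sub-second on a disk-based engine because it reads one indexed row; the aggregate must scan all 1,000 rows, and at real-world scale it is that full-table I/O that In-Memory platforms are built to absorb.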
As with any other technology decision, there are several factors to take into account before deciding on an In-Memory database.
Whether an In-Memory platform is needed:
- Is our current infrastructure catering to our most complex operational intelligence/analytics and CEP needs?
- Can we quantify the potential business lost, or the cost-saving opportunity, due to the current platform’s inability to meet those needs?
- What is the road map (programs and projects) for the next 3–5 years, and how many of its items fall into the CEP or operational analytics category?
- What are the performance requirements of current and future business applications?
- How scalable is the current DBMS platform?
Factors influencing In-Memory platform selection:
- ACID / CAP Compliance
- Performance Comparison (POCs, CRPs and 3rd party benchmarks)
- Feature Comparison (Internal analysis and 3rd party reports)
- Support for unstructured data
- SQL Support
- Ease of Use
- Ease of Integration
- Licensing Options
- Supported Platforms
- Supported Data Types
- Administration Tools and their features
- Horizontal Scalability
- Fault Tolerance (how does the database ensure data integrity and consistency if we pull the plug?)
- Total Cost of Ownership
- Cost-Benefit Analysis
- Analysis on other less expensive alternatives