The In-Memory Technologies Behind Business Intelligence Software
Understanding the in-memory technologies that are used in Business Intelligence software
By: Elad Israeli
Aug. 30, 2011 10:00 AM
If you follow trends in the business intelligence (BI) space, you'll notice that many analysts, independent bloggers and BI vendors talk about in-memory technology.
There are technical differences that separate one in-memory technology from another, some of which are listed on Boris Evelson's blog.
Some of the items on Boris' list are just as applicable to BI technologies that are not in-memory ('Incremental updates', for example), but one item merits much deeper discussion. Boris calls this characteristic 'Memory Swapping' and describes it as "what the (BI) vendor's approach is for handling models that are larger than what fits into a single memory space."
Understanding Memory Swapping
Obviously, in order to perform calculations on data completely in memory, all the relevant data must reside in memory, i.e., in the computer's RAM. So the questions are: 1) how does the data get there? and 2) how long does it stay there?
These are probably the most important aspects of in-memory technology, as they have great implications on the BI solution as a whole.
Pure In-Memory Technology
QlikView's technology is described as "associative technology." That is a fancy way of saying that QlikView uses a simple tabular data model which is stored entirely in memory. For QlikView, much like any other pure in-memory technology, compression is very important: compressing the data well makes it possible to hold more data inside a fixed amount of RAM.
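To make the point concrete, here is a minimal sketch of how a low-cardinality column compresses under a simple dictionary-plus-run-length scheme. This is illustrative only; real in-memory engines use far more sophisticated encodings, and the function names here are invented for the example.

```python
def rle_dict_encode(values):
    """Encode a list as (dictionary, [(code, run_length), ...])."""
    dictionary = {}   # value -> integer code
    runs = []         # consecutive equal codes collapsed into runs
    for v in values:
        code = dictionary.setdefault(v, len(dictionary))
        if runs and runs[-1][0] == code:
            runs[-1] = (code, runs[-1][1] + 1)
        else:
            runs.append((code, 1))
    return dictionary, runs

def rle_dict_decode(dictionary, runs):
    """Reverse the encoding back to the original list of values."""
    reverse = {code: v for v, code in dictionary.items()}
    out = []
    for code, length in runs:
        out.extend([reverse[code]] * length)
    return out

# A million country codes collapse into just three runs.
column = ["US"] * 500_000 + ["DE"] * 300_000 + ["FR"] * 200_000
dictionary, runs = rle_dict_encode(column)
assert rle_dict_decode(dictionary, runs) == column
print(len(column), "values stored as", len(runs), "runs")
```

Real BI columns are rarely this orderly, but the principle holds: the lower the cardinality and the more repetition, the more data fits in a fixed amount of RAM.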
Pure in-memory technologies which do not compress the data they store in memory are usually quite useless for BI. They either handle amounts of data too small to extract interesting information from, or they break too often.
With or without compression, the fact remains that pure in-memory BI solutions become useless when RAM runs out for the entire data model, even if you're only looking to work with limited portions of it at any one time.
Just-In-Time In-Memory Technology
Note: The term JIT is borrowed from Just-In-Time compilation, which is a method to improve the runtime performance of computer programs.
JIT in-memory technology involves a smart caching engine that loads selected data into RAM and releases it according to usage patterns.
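A caching engine like this can be sketched as a least-recently-used (LRU) cache over disk-resident chunks of data. The class and function names below (`ChunkCache`, `load_chunk`) are invented for illustration and do not correspond to any vendor's actual API.

```python
from collections import OrderedDict

class ChunkCache:
    """LRU cache: load chunks from disk on demand, evict stale ones."""

    def __init__(self, max_chunks, load_chunk):
        self.max_chunks = max_chunks
        self.load_chunk = load_chunk   # disk-read function supplied by caller
        self._cache = OrderedDict()    # chunk_id -> data, oldest first

    def get(self, chunk_id):
        if chunk_id in self._cache:
            self._cache.move_to_end(chunk_id)   # mark as recently used
            return self._cache[chunk_id]
        data = self.load_chunk(chunk_id)        # cache miss: hit the disk
        self._cache[chunk_id] = data
        if len(self._cache) > self.max_chunks:
            self._cache.popitem(last=False)     # evict least recently used
        return data

# Usage: simulate disk reads and watch eviction behaviour.
reads = []
cache = ChunkCache(max_chunks=2,
                   load_chunk=lambda cid: reads.append(cid) or f"data-{cid}")
cache.get("a"); cache.get("b"); cache.get("a")  # second "a" served from RAM
cache.get("c")                                   # evicts "b" (least recent)
cache.get("b")                                   # must re-read from disk
print(reads)  # ['a', 'b', 'c', 'b']
```

The "usage patterns" mentioned above are what the eviction policy encodes: data touched frequently stays in RAM, and data that falls out of use is released to make room.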
This approach has obvious advantages: only the data actually in use occupies RAM at any given time, so the full data model can be far larger than available memory.
However, since JIT In-Memory loads data on demand, an obvious question arises: Won't the disk reads introduce unbearable performance issues?
The answer would be yes, if the data model used is tabular (as they are in RDBMSs such as SQL Server and Oracle, or pure in-memory technologies such as QlikView), but scalable JIT In-Memory solutions rely on a columnar database instead of a tabular database.
Columnar databases store each field's values contiguously on disk, so the engine can read only the particular fields, or parts of fields, that a query actually touches. This fundamental ability is what makes JIT In-Memory so powerful. In fact, the impact of columnar database technology on in-memory technology is so great that many confuse the two.
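The difference between the two layouts can be shown in a few lines. In a row store, summing one field means scanning whole records; in a column store, the same query touches a single contiguous array. The data below is invented for the example and does not represent any specific product's on-disk format.

```python
# Row-oriented layout: each record holds all of its fields together.
rows = [
    {"id": 1, "country": "US", "revenue": 120.0},
    {"id": 2, "country": "DE", "revenue": 80.0},
    {"id": 3, "country": "US", "revenue": 45.5},
]

# Column-oriented layout: one contiguous array per field.
columns = {
    "id":      [r["id"] for r in rows],
    "country": [r["country"] for r in rows],
    "revenue": [r["revenue"] for r in rows],
}

# Row store: summing revenue scans every record, all fields included.
total_row_store = sum(r["revenue"] for r in rows)

# Column store: only the 'revenue' array is read; 'id' and 'country'
# never leave the disk.
total_col_store = sum(columns["revenue"])

assert total_row_store == total_col_store == 245.5
```

With millions of rows and dozens of columns, a query touching two or three fields reads a correspondingly small fraction of the data from disk, which is why the disk reads of a JIT engine need not be a bottleneck.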
The combination of JIT In-Memory technology and a columnar database structure delivers the performance of pure in-memory BI technology with the scalability of disk-based models, and is thus an ideal technological basis for large-scale and/or rapidly-growing BI data stores.
The ElastiCube Chronicles - Business Intelligence Blog