The In-memory Database Solution for Resource-constrained Embedded Systems
Designed for performance
An in-memory database system (IMDS) is optimized for speed and for economical use of its storage medium, which is RAM. The primary reason for adopting an in-memory database is its performance advantage. In rare cases (some embedded systems) the reason is simply that there is no persistent storage, but the overwhelming driver of in-memory database system usage is performance. So, of course, the primary optimization goal of a team designing and developing an in-memory database system is performance. The secondary goal is efficient use of storage space: RAM is neither as inexpensive nor as abundant as persistent media. So, in addition to using as few CPU cycles as possible, an in-memory database system should store data in memory as compactly as possible. Some of this falls out as a natural by-product of being an in-memory database system: an IMDS has no need for a cache, so it also has no need for logic to determine whether a given request can be satisfied from cache, and no need for an LRU algorithm to maintain the cache contents.
But beyond that, an IMDS should be written carefully, with attention to details such as inlining, the choice of if-then-else versus switch, and much more. An IMDS should minimize padding by rearranging data if necessary; padding (to align data on a word boundary, for example) just wastes memory. An IMDS also doesn’t need to keep a redundant copy of indexed data in the index structure (a technique used in persistent database optimization to reduce I/O). This has a knock-on benefit for performance when the index is a tree type such as a b-tree. To find the insertion point in a b-tree, the algorithm has to walk down the tree to find the node that contains the indexed value. A b-tree that is five levels deep will, on average, require the algorithm to walk down to the 3rd level. By keeping redundant data out of the index structure, we keep the index shallower and reduce the average number of levels the algorithm must walk down.
Any application intended for a resource-constrained embedded system should minimize the use of stack. That means that 3rd party libraries, including an embedded database system, that you build into your embedded system should also take care with respect to stack size. The design of eXtremeDB embraces this philosophy.
The amount of stack used is a function of several factors, notably:
1. Function call depth, including direct or indirect recursive functions
2. The number and size of parameters passed to each function on the stack
3. The number and size of local variables declared within each function
How eXtremeDB minimizes stack to reduce memory consumption in embedded systems
The design of eXtremeDB addresses these factors in the following ways:
1. Call depth: selective inlining of frequently used code (this is also a performance optimization, because it avoids the CPU cycles needed for pushing and popping stack frames).
2. Parameters and local variables: eXtremeDB reserves a small amount of the RAM allocated to it by the application for its own internal heap. Certain metadata is kept there rather than passed between functions on the stack, and some temporary variables needed by a function can be allocated from this memory rather than being declared as local variables. N.B. eXtremeDB has its own purpose-optimized memory allocator for this memory; it does not rely on any external memory allocator (such as the C runtime malloc).
3. When the stack is used, references to data are passed rather than the data itself.
Avoiding context switches
Another way to minimize CPU cycle consumption and improve performance is to avoid context switches. One key way in which eXtremeDB does this is through use of a futex-like synchronization primitive (futex = fast userspace mutex). This is a combination of a spinlock in user space and a semaphore. When one of multiple tasks needs to use a common resource, it attempts to acquire a lock, implemented as an atomic integer in user space, in a loop with a predetermined number of iterations (a spinlock). Only on the rare occasions when it cannot do so will it request a semaphore and enter the kernel’s wait queue. Going hand-in-hand with this, operations in eXtremeDB, much like in the operating system, are designed to be very short. Most of the time, the technique avoids the context switch needed to make a kernel call for the semaphore.
When memory became more abundant, and less expensive, in the 2000s and 2010s, more database vendors jumped on the in-memory database system bandwagon. In many cases, they did so simply by using memory as a substitute for disk, with no changes to the internals of the database system itself. That certainly gains some performance advantage, but it doesn’t make the product an “in-memory database system.” Another way to illustrate this: any persistent database system can be used to create a database on a RAM-drive. But it won’t be able to store as much data in a given amount of memory, and it won’t perform nearly as well as a true in-memory database system. Download our white paper on the topic, here.
Learn more about the eXtremeDB in-memory and persistent database management system.
Articles for Professional Developers
- “A McObject Focus—What’s Changing in the Satellite Industry?” SatMagazine
- “Industrial Internet of Things (IIoT) Database Usage in Rail Systems” insight.tech
- “SCADA as You’ve Never Seen It Before,” Nuclear Engineering International
See a list of articles
Webinars for Professional Developers
Watch on-demand webinars, hosted by experts, about proven database management system practices.
What Makes a Database System ‘In-Memory’? In-memory database systems (IMDSs) are held out as the ideal database solution for real-time and embedded systems software. But what makes IMDSs unique compared with caching, RAM-disks, “memory tables,” solid-state disks and other technologies?
Review our list of Webinars