The Essence of an Embedded Database

The term ‘embedded database’ has been around since the mid-1980s. It was coined to mean a database system that is embedded within application code. In other words, the database management system is delivered as one or more libraries of object code that you, the developer, link with your application code (and other libraries) to create an executable. In that sense, the database system functionality is ‘embedded’ within your application code. Hence the name “embedded database.”

Since the late 1990s, embedded database system vendors have been trying to sell their technology to developers of embedded systems. This has created a lot of unfortunate confusion. In the years since, some folks have come to assume that an “embedded database” is, by definition, a database for “embedded systems”, which has led them down a path to frustration and, in some cases, project failure.

Why? Because the vast majority of embedded databases were not written with the unique characteristics of embedded systems in mind: slower CPUs, limited memory, no persistent storage, etc. In fact, many embedded database systems were created in the 1980s, long before anyone considered using a database system in embedded systems (remember that most embedded systems in that era were 8- and 16-bit systems that simply couldn’t address enough memory to permit the use of an embedded database system).

Unfortunately, some embedded database vendors haven’t helped the situation. They have adjusted to changing market conditions by recasting their embedded database products as a solution to the data management needs of embedded systems, even though their technology was not written for embedded systems and, in fact, is not suited to them. These changing market conditions include the rise of open source/dual-license products like MySQL and Berkeley DB, which became dominant players in the line-of-business client/server DBMS and embedded database system markets respectively, and the emergence of free entry-level RDBMS offerings from Oracle and Microsoft (Oracle 10g Express Edition and SQL Server Express Edition, respectively). Faced with these challenges, vendors of proprietary, closed-source, commercial (not free) embedded database products found it increasingly difficult to compete, and sought “green fields” for their products in the embedded systems software market.

As an aside, the media recognized the situation in the early 2000s, and SD Times in particular tried to popularize a new term, “application-specific database.” Unfortunately, the new term didn’t stick, and we are still left with the term ‘embedded database’.

So, back to the subject at hand. What is the essential attribute of an embedded database system? It is exactly what I described in the opening paragraph: the database system functionality is linked with application code and resides in the same address space. This contrasts with a client/server architecture DBMS, in which the database server exists as a standalone executable, accessed by client programs through an inter-process communication (IPC) and/or remote procedure call (RPC) mechanism. (See https://www.mcobject.com/embedded-or-client-server/.)

In short, an embedded database system should exist wholly within your application’s address space and not require communication with any external agent. Anything external is an immediate tip-off that the DBMS is not, in fact, wholly embedded.
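To make the distinction concrete, here is a minimal sketch of what “wholly within your application’s address space” looks like in practice. It uses Python’s standard sqlite3 module purely because it is a widely available in-process database library; it is not eXtremeDB and says nothing about eXtremeDB’s API. The function name and the sensor table are hypothetical, chosen only for illustration.

```python
import sqlite3


def run_embedded_demo():
    """Create, fill, and query a database without leaving this process."""
    # ":memory:" keeps the whole database in this process's memory.
    # No server is started and no socket or pipe is opened.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, for illustration only.
        conn.execute(
            "CREATE TABLE sensor (id INTEGER PRIMARY KEY, reading REAL)"
        )
        conn.executemany(
            "INSERT INTO sensor (id, reading) VALUES (?, ?)",
            [(1, 20.5), (2, 21.0), (3, 19.5)],
        )
        conn.commit()
        # Every call above and below is an ordinary library call made
        # from the application's own code.
        return conn.execute(
            "SELECT id, reading FROM sensor ORDER BY id"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_embedded_demo())
```

Nothing here connects to a host name, port, or named pipe; the database engine is just more code running inside the application. A client/server DBMS would need a connection to a separate server process at the point where this sketch simply opens the database.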

As a former colleague of mine, a VP of marketing, once said to me: “What is the ‘so, what’ of it?” Excellent question. Why should anyone give a hoot?

Perhaps in the market for embedded databases outside of embedded systems, nobody does (though even that is arguable). But in embedded and real-time systems, one significant “so, what?” is performance. The need to communicate with an external program, for any purpose, imposes a performance hit that few real-time/embedded systems can afford. This is true regardless of whether that external program is a lock manager, lock arbiter, deadlock detector, or anything else.
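A rough way to see that cost for yourself is to time the same lookup done two ways: as a plain function call, and as a request/reply round trip over a local socket. The sketch below is only an illustration under stated assumptions. The lookup table, the 4-byte wire format, and all function names are hypothetical, and a thread stands in for the external program to keep the example self-contained (a genuinely separate process would add process context switches on top). Absolute numbers will vary by machine, so none are quoted here.

```python
import socket
import threading
import time

# Hypothetical in-memory "table" standing in for database contents.
TABLE = {key: key * 2 for key in range(1000)}


def lookup_in_process(key):
    """Embedded-style access: a function call in the caller's address space."""
    return TABLE[key]


def recv_exact(conn, size):
    """Read exactly `size` bytes, or return b"" if the peer has closed."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            return b""
        data += chunk
    return data


def serve(conn):
    """External-agent stand-in: answer 4-byte key requests until the peer closes."""
    while True:
        request = recv_exact(conn, 4)
        if not request:
            break
        key = int.from_bytes(request, "big")
        conn.sendall(TABLE[key].to_bytes(4, "big"))
    conn.close()


def lookup_via_ipc(conn, key):
    """Client/server-style access: one request/reply round trip per lookup."""
    conn.sendall(key.to_bytes(4, "big"))
    reply = recv_exact(conn, 4)
    if not reply:
        raise ConnectionError("server closed the connection")
    return int.from_bytes(reply, "big")


def time_lookups(lookup, count):
    """Run `count` lookups; return (sum of results, elapsed seconds)."""
    start = time.perf_counter()
    total = sum(lookup(i % 1000) for i in range(count))
    return total, time.perf_counter() - start


def compare(count=2000):
    """Time the same lookups in-process and over a local socket pair."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve, args=(server,), daemon=True)
    worker.start()
    try:
        direct_total, direct_s = time_lookups(lookup_in_process, count)
        ipc_total, ipc_s = time_lookups(
            lambda key: lookup_via_ipc(client, key), count
        )
    finally:
        client.close()  # the server loop sees end-of-stream and exits
        worker.join(timeout=5)
    return {
        "direct_total": direct_total,
        "ipc_total": ipc_total,
        "direct_seconds": direct_s,
        "ipc_seconds": ipc_s,
    }


if __name__ == "__main__":
    result = compare()
    print("in-process: %.6f s" % result["direct_seconds"])
    print("via socket: %.6f s" % result["ipc_seconds"])
```

Both paths return identical answers; the only difference is that the second one has to cross a communication channel and wait for a reply on every call, which is the overhead the paragraph above describes.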

Another “so, what?” is the introduction of dependencies on external components, notably a communication protocol like TCP/IP. Communication between the application (with the database system embedded within it) and an external component also necessarily increases complexity and fragility and, consequently, the potential need for administration. These dependencies might not be a big deal in line-of-business systems that run on PCs and servers with robust operating systems like Windows, Linux and Solaris, in organizations with an IT staff. But for an unattended embedded system running on a relatively modest CPU, with a simple RTOS and limited network connectivity/bandwidth, they can be a killer.

Since McObject is publishing this webpage, it should be no surprise that eXtremeDB is an embedded database in the true sense. eXtremeDB never requires communication with an external component. In fact, eXtremeDB can run with no external dependencies at all; it does not require the C runtime library, and can run on ‘bare metal’ (i.e., without an operating system). We do offer remote interfaces to eXtremeDB databases through both our native and SQL APIs, and the High Availability and Cluster editions require a communication channel for synchronizing databases and replicating transactions. But these are optional.

If you have demanding performance requirements, limited resources, and/or are developing an embedded system that absolutely, positively must run unattended (i.e., “zero administration”), then carefully consider your choice of embedded database system.