eXtremeDB Kernel Mode Reference Application
The reference application presented here corresponds to the “Kernel mode database integration for high performance applications” presentation and paper by McObject’s Andrei Gorine and Alexander Krivolapov, presented at the Databases Session of Embedded World 2007 in Nuremberg, Germany.
The application implements a rudimentary access control system. It utilizes McObject’s eXtremeDB embedded in-memory database system to create and maintain the access control database in the kernel space. The database keeps file access rules and the runtime provides drivers and user-level applications with high-performance access to the storage.
The application consists of three major components:
- a “database” kernel module, based on eXtremeDB and responsible for storage, that maintains database access logic
- a kernel module that intercepts file system calls and provides a file access authorization mechanism to the system. This module is referred to as a “filter” module
- a user-mode application that implements a user-mode database API
The example code shown in this paper uses UNIX-like notations. The application’s source code (including the database runtime) is available for free download – contact us to obtain the source code.
Figure 2. The application contains three major components.
The database kernel module implements kernel-mode data storage and provides the API to manipulate the data. The module is integrated with the eXtremeDB database runtime, which is responsible for providing “standard” database functionality such as transaction control, data access serialization and locking, search algorithms, etc. Figure 3 shows the data layout using the eXtremeDB Data Definition Language schema notations.
Class File describes a file object that is identified by the file name, the device the file is located on and its inode. The rest of the fields (owner, defaces and acl vector) are used to define file access rules. The database maintains two hash-based indices that facilitate fast data access.
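As a rough illustration of the layout described above, the following plain C sketch models a File record and two hash-based lookups. It is not eXtremeDB DDL and not the API the eXtremeDB schema compiler generates. The field types, the bucket count, the fixed-size ACL array, the hash functions, and the choice of keys (one index on the file name, one on the device and inode pair) are all assumptions made for the example; the source only states that two hash indices exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 64u  /* assumed bucket count */
#define MAX_ACL  4u   /* assumed cap; the text describes a vector */

struct file_rec {
    char     name[64];
    uint32_t device;
    uint64_t inode;
    uint32_t owner;
    uint32_t defaces;               /* field name as given in the text */
    uint32_t acl[MAX_ACL];
    struct file_rec *next_by_name;  /* chain link for the name index */
    struct file_rec *next_by_id;    /* chain link for the (device, inode) index */
};

struct file_db {
    struct file_rec *by_name[NBUCKETS];
    struct file_rec *by_id[NBUCKETS];
};

/* FNV-1a over the file name, reduced to a bucket number. */
static uint32_t hash_name(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h % NBUCKETS;
}

/* Multiplicative hash over the combined (device, inode) key. */
static uint32_t hash_id(uint32_t device, uint64_t inode)
{
    uint64_t h = (inode ^ ((uint64_t)device << 32)) * 0x9E3779B97F4A7C15ull;
    return (uint32_t)(h >> 32) % NBUCKETS;
}

/* Links the record into both indices; no uniqueness check is done. */
static void file_db_insert(struct file_db *db, struct file_rec *rec)
{
    assert(db != NULL && rec != NULL);
    uint32_t bn = hash_name(rec->name);
    uint32_t bi = hash_id(rec->device, rec->inode);
    rec->next_by_name = db->by_name[bn];
    db->by_name[bn] = rec;
    rec->next_by_id = db->by_id[bi];
    db->by_id[bi] = rec;
}

static struct file_rec *file_db_find_by_name(const struct file_db *db,
                                             const char *name)
{
    struct file_rec *r = db->by_name[hash_name(name)];
    while (r != NULL && strcmp(r->name, name) != 0)
        r = r->next_by_name;
    return r;
}

static struct file_rec *file_db_find_by_id(const struct file_db *db,
                                           uint32_t device, uint64_t inode)
{
    struct file_rec *r = db->by_id[hash_id(device, inode)];
    while (r != NULL && (r->device != device || r->inode != inode))
        r = r->next_by_id;
    return r;
}
```

A caller zero-initializes a `struct file_db`, fills in records, and inserts them; either lookup then returns a pointer to the stored record or `NULL`. In the real application the database runtime, not hand-written chains, maintains the indices.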
Because the database can grow large, its memory pool is allocated in virtual memory. Before the pool can be used, its pages are mapped to physical pages and locked (Figures 4 and 5). Once the memory is allocated, the in-memory database is created and accepts connections through the standard database runtime functions.
Figure 5 (Locking virtual memory pages)
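The allocate-then-lock sequence can be tried out in user space. The sketch below is only an analogy: it uses the POSIX/Linux calls `mmap`, `mlock`, `munlock`, and `munmap` rather than the kernel allocation and page-pinning primitives a kernel module would use, and the `db_pool` type and function names are hypothetical. Locking may be refused by the system resource limits, so the sketch records whether it succeeded instead of treating a refusal as fatal.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical descriptor for the database memory pool. */
struct db_pool {
    void  *base;
    size_t size;
    int    locked;  /* 1 if the pages were pinned with mlock() */
};

/* Reserve an anonymous region and try to pin it in physical memory. */
static int db_pool_create(struct db_pool *p, size_t size)
{
    assert(p != NULL && size > 0);
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    p->base = mem;
    p->size = size;
    p->locked = (mlock(mem, size) == 0);
    return 0;
}

/* Unpin (if pinned) and release the region. */
static void db_pool_destroy(struct db_pool *p)
{
    if (p == NULL || p->base == NULL)
        return;
    if (p->locked)
        munlock(p->base, p->size);
    munmap(p->base, p->size);
    p->base = NULL;
    p->size = 0;
    p->locked = 0;
}
```

Once `db_pool_create` returns 0, the region at `base` would be handed to the database runtime as its memory device; that step is specific to eXtremeDB and is not shown here.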
The module exports two types of interfaces: a “direct” API available to other kernel modules and drivers, and an “indirect” API that implements the system call interface to the database. The direct API is not available to user-mode processes, but it is extremely fast because it works only with kernel-space references and eliminates translation between the kernel and user address spaces. To implement the indirect system call API, the module registers a number of I/O controls during its initialization (Figure 6). Regardless of the interface type, these APIs completely hide all database access details from drivers and user-mode applications.
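The relationship between the two interfaces can be sketched in ordinary C: the indirect entry point is a thin dispatcher that copies the request, calls the same direct function a kernel client would call, and copies the result back. Everything named here (the command codes, `struct db_request`, `db_add_file`, `db_find_file`, `db_ioctl`, and the array used as storage) is hypothetical, and `memcpy` stands in for the kernel's `copy_from_user` and `copy_to_user`.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command codes for the indirect API. */
enum db_cmd { DB_CMD_ADD_FILE = 1, DB_CMD_FIND_FILE = 2 };

struct db_request {
    uint32_t device;
    uint64_t inode;
    uint32_t owner;  /* input for add, output for find */
};

#define DB_MAX 16
static struct db_request db_rows[DB_MAX];  /* toy storage, not eXtremeDB */
static int db_count;

/* Direct API: called with kernel-space pointers, no copying. */
static int db_add_file(const struct db_request *r)
{
    if (db_count == DB_MAX)
        return -ENOSPC;
    db_rows[db_count++] = *r;
    return 0;
}

static int db_find_file(struct db_request *r)
{
    for (int i = 0; i < db_count; i++) {
        if (db_rows[i].device == r->device && db_rows[i].inode == r->inode) {
            r->owner = db_rows[i].owner;
            return 0;
        }
    }
    return -ENOENT;
}

/* Indirect API: one entry point that dispatches on the command code. */
static long db_ioctl(unsigned int cmd, void *arg)
{
    struct db_request req;
    long rc;

    if (arg == NULL)
        return -EINVAL;
    memcpy(&req, arg, sizeof req);       /* stands in for copy_from_user */

    switch (cmd) {
    case DB_CMD_ADD_FILE:
        return db_add_file(&req);
    case DB_CMD_FIND_FILE:
        rc = db_find_file(&req);
        if (rc == 0)
            memcpy(arg, &req, sizeof req);  /* stands in for copy_to_user */
        return rc;
    default:
        return -ENOTTY;                  /* conventional "unknown ioctl" */
    }
}
```

The extra copies in `db_ioctl` are the cost the direct API avoids, which is why kernel-side clients such as the filter module call the direct functions instead.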
The filter module intercepts calls to the file system and replaces the standard file access functions with its own, which add an authorization check. The implementation registers the module’s custom file access functions when the module is initialized (Figures 7 and 8). In turn, these custom functions use the direct database access API exposed by the database module to authorize file access (Figure 9).
Figure 7 (module initialization)
Figure 8. Replacement for the open() system call, my_open()
Figure 9 (my_open() example)
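The interception pattern described above amounts to saving a function pointer, installing a replacement, and having the replacement consult the database before delegating to the saved original. The user-space sketch below shows only that pattern. The `file_ops` table, `real_open`, the `db_access_allowed` rule, and the `filter_init`/`filter_exit` names are hypothetical stand-ins, not the kernel structures or the eXtremeDB calls used in the actual module; only the name `my_open` comes from the figures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*open_fn)(const char *path, int uid);

/* Stand-in for the table of file access functions. */
struct file_ops {
    open_fn open;
};

/* Stand-in for the original open: pretends to return a descriptor. */
static int real_open(const char *path, int uid)
{
    (void)path;
    (void)uid;
    return 3;
}

static struct file_ops fs_ops = { real_open };
static open_fn saved_open;  /* original handler, NULL while not hooked */

/* Stand-in for a lookup through the database module's direct API. */
static int db_access_allowed(const char *path, int uid)
{
    return uid == 0 || strcmp(path, "/tmp/public") == 0;
}

/* Replacement handler: authorize first, then delegate. */
static int my_open(const char *path, int uid)
{
    assert(saved_open != NULL);  /* only reachable while hooked */
    if (!db_access_allowed(path, uid))
        return -EACCES;
    return saved_open(path, uid);
}

/* Module initialization: install the hook once. */
static void filter_init(void)
{
    if (saved_open != NULL)
        return;
    saved_open = fs_ops.open;
    fs_ops.open = my_open;
}

/* Module cleanup: restore the original handler. */
static void filter_exit(void)
{
    if (saved_open == NULL)
        return;
    fs_ops.open = saved_open;
    saved_open = NULL;
}
```

After `filter_init()`, a call through `fs_ops.open` for an unauthorized user returns `-EACCES` without reaching `real_open`; after `filter_exit()` the original behavior is back.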
Finally, the user-level application component provides a user-level database access API built on the system call interface exposed by the database module. This API allows user-mode processes (such as administrative applications) to interact with the kernel database. It contains functions that correspond to the I/O controls exposed by the database module (Figures 10, 11.1, 11.2 and 11.3).
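A user-mode stub of this kind is typically a small wrapper around `ioctl(2)`. The following is a minimal sketch, assuming a hypothetical request layout and a hypothetical command code built with the Linux `_IOWR` macro; the real structure and I/O control numbers are defined by the database module and are not reproduced here. The caller is expected to have opened the database device already and to pass its descriptor.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical request layout shared with the kernel module. */
struct find_file_req {
    char     name[64];  /* in: file name to look up */
    uint32_t owner;     /* out: owner stored in the database */
    uint32_t defaces;   /* out: field name as given in the text */
};

/* Hypothetical command code; the real value comes from the module. */
#define DB_IOC_FIND_FILE _IOWR('x', 2, struct find_file_req)

/*
 * Look up a file by name through the kernel database.
 * Returns 0 and fills *out on success, or -1 with errno set.
 */
static int find_file(int db_fd, const char *name, struct find_file_req *out)
{
    struct find_file_req req;

    if (name == NULL || out == NULL || strlen(name) >= sizeof req.name) {
        errno = EINVAL;
        return -1;
    }
    memset(&req, 0, sizeof req);
    strcpy(req.name, name);  /* length checked above */

    if (ioctl(db_fd, DB_IOC_FIND_FILE, &req) == -1)
        return -1;           /* errno set by ioctl */

    *out = req;
    return 0;
}
```

An administrative tool would call `find_file(fd, "/some/path", &result)` and inspect `result.owner` on success. Keeping the `ioctl` details inside the stub means callers never see command codes or request structures.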
With the approach presented in this paper, applications can take advantage of a full set of database features (transaction processing, multi-threaded data access, complex queries using built-in indexes, a convenient data access API, and a high-level data definition language) while retaining the near-zero latency of a kernel-based software component. The memory-only nature of the database eliminates unpredictable disk I/O, and direct pointers to data elements avoid the expensive buffer management and remote procedure calls that can introduce latency.
As a result, the kernel-mode database runtime remains non-intrusive: it does not monopolize system resources, increase interrupt latencies, or noticeably affect overall kernel responsiveness. The query execution path for such a kernel-mode database generally requires just a few CPU instructions. Concurrent access to the kernel data structures and complex search patterns are coordinated by the database runtime, and the kernel-mode database is made available to user-mode applications through a set of public interfaces implemented via system calls.
Figure 10 (user-mode database access API implemented via ioctl)
Figure 11. Example of the user-mode find_file stub accessing a kernel mode database.