Professional Developers Want More (Concurrency) Control
Usually (with rare exception), a database is a shared resource, meaning that it is used concurrently by two or more tasks. This leads us to the topic of concurrency control; i.e. how do we coordinate tasks’ access to the database? This is part-and-parcel of providing a database management system that adheres to the ACID properties.
AAtomic, means all changes within a transaction complete or the database is restored to the pre-commit state.
CConsistency, means that the application of a transaction can only transition the database from one consistent to another consistent state, i.e. all rules, constraints, etc. must be honored.
IIIsolation, means that changes made within an as-yet uncommitted transaction by one task cannot be seen by any other task.
DDurability, means that a committed transaction remains committed even after a crash. Database durability presupposes durability of the media. In other words, a hard disk head crash will compromise database durability, through no fault of the DBMS.
The isolation property is implemented through concurrency control. Broadly speaking, there are two forms of concurrency control: pessimistic and optimistic.
eXtremeDB offers both forms of concurrency control through two different transaction managers: MURSIW and MVCC.
MURSIW – MUtiple Readers, SIngle Writer (pessimistic)
MVCC – Multi-Version Concurrency Control (optimistic)
These are so called because pessimistic concurrency control proactively prevents harm (harm, in this case, being a violation of the isolation property), whereas optimistic concurrency control assumes that no harm will happen, but if it is detected only then will measures be taken.
Pessimistic concurrency control
In operational terms, pessimistic concurrency control is affected with locks. Different vendors can, and do, offer different levels of locks: database locks, table locks, row locks and even column locks in extreme cases. For locking at any level of granularity below the database, things get complicated: there must be either a mechanism to prevent deadlocks, or a means to detect and remediate deadlocks. Invariably, this involves a process called a lock manager or lock arbiter. A deadlock happens when
|Process 1||Process 2|
|Holds a lock on ‘A’||Holds a lock on ‘B’|
|Requests a lock on ‘B’||Requests a lock on ‘A’|
Deadlocks can be prevented by grouping lock requests, i.e. all the locks that will be required within the scope of a transaction are acquired at once. This works for table level locks but is rarely possible for row level locks.
For pessimistic concurrency control with the MURSIW transaction manager, eXtremeDB locks the database for the exclusive use of a task with a READ_WRITE transaction. This eliminates the need for a lock manager/arbiter or to deal with deadlock prevention or detection, and greatly simplifies the transaction manager. Given the speed of an eXtremeDB in-memory database, this coarse level of locking is well-justified: the time (CPU cycles) that would be required for complex lock arbitration would exceed the time a process needs to simply get into the database, do its work, and get out. This calculus also holds up for persistent databases that have a high ratio of READ_ONLY to READ_WRITE transactions, and there are few concurrent READ_WRITE transactions at any given moment, and transactions are short-lived. The definition of “few” and “short-lived” varies for relatively fast SSD compared to relatively slow HDD.
With MURSIW, concurrent access looks something like:
|Process 1||Process 2|
|Epoch 1||Start READ_WRITE|
|Epoch 2||Start READ_WRITE|
|Epoch 3||Read record A||Waiting|
|Epoch 4||Modify field A.b||Waiting|
|Epoch 5||Write record A||Waiting|
|Epoch 7||Start READ_ONLY||Read record A|
|Epoch 8||Waiting||Modify field A.c|
|Epoch 9||Waiting||Modify field A.d|
|Epoch 10||Waiting||Write record A|
|Epoch 12||Start reading…||Start READ_ONLY|
|Epoch 13||Reading||Start reading|
So, the MURSIW transaction manager prevents harm, i.e. the violation of the isolation property, by blocking any task from accessing the database while another task is modifying the database. The other tasks will wait on a spinlock or be put in the operating system’s waiting queue until the transaction manager can schedule their access to the database. In an environment with many tasks that frequently modify the database, having all but one of them blocked at any given moment is undesirable, so optimistic concurrency control could be advantageous.
Optimistic concurrency control
Optimistic concurrency control in eXtremeDB is implemented with the MVCC transaction manager. Again, the optimistic model doesn’t block tasks from concurrent access to the database. This approach optimistically assumes that tasks will not violate each other’s isolation, e.g. that this won’t happen:
|Process 1||Process 2|
|Epoch 1||Read record A||Read record A|
|Epoch 2||Modify field A.b||Modify field A.c|
|Epoch 3||Write record A||Modify field A.d|
|Epoch 4||Write record A|
In this scenario, when Process 2 writes record A back to the database, it reverts field A.b to the value when both Process 1 and 2 read the record, obliterating Process 1’s change to field A.b. This scenario is generically called a conflict.
But we can’t guarantee that this won’t happen, so MVCC needs a mechanism to detect when a conflict would happen if a transaction was allowed to commit. This is done by creating versions of the record in the database, and then checking the version number when the transaction is committed. The above scenario becomes:
|Process 1||Process 2|
|Epoch 1||Start READ_WRITE||Start READ_WRITE|
|Epoch 2||Read record Ao
MVCC gives copy of Ao
|Read record Ao
MVCC gives copy of Ao
|Epoch 3||Modify field Ao.b||Modify field Ao.c|
|Epoch 4||Write record Ao||Modify field Ao.d|
|Epoch 6||MVCC creates A1|
|Epoch 7||Write record Ao|
|Ao is older than A1, CONFLICT!!! (Commit fails)
Retry the transaction from the start
Obviously, there is some overhead involved with MVCC. Versions (copies) have to be made of records and those versions checked prior to allowing the transaction to commit, and in the event of a conflict, the transaction has to be executed again from the beginning, and finally, old versions have to be cleaned up when they’re no longer in use. So, there are assumptions underlying MVCC: First, that the overhead associated with versions is comparable to the overhead of managing locks, deadlock detection, etc. associated with pessimistic concurrency control, that the benefit of not having tasks blocked (i.e. greater concurrency) justifies slightly more complex programming (to handle possible conflicts) and that conflicts will be rare. If conflicts are common and transactions are being re-executed frequently, then MURSIW might be a better alternative.
In the absence of conflicts, MVCC exhibits far superior concurrency.
Use the following link to learn more about MVCC performance.
The following table summarizes the MVCC and MURSIW comparison.
|One or few concurrent writers||Great||Good|
|High ratio of READ_ONLY to READ_WRITE||Great||Good|
|High ratio of READ_WRITE to READ_ONLY||Poor||Great|
|Concurrency/multi-core utilization||Depends on ratio of READ_ONLY to READ_WRITE||Great|
The choice of transaction manager (MURSIW or MVCC) in eXtremeDB is made by linking one library or another. The APIs are identical, so the only necessary change in the application code is that to deal with conflicts in MVCC. So, a developer can write the application assuming that MVCC will be used, and switch between MURSIW and MVCC with a simple change of libraries (or a configuration parameter if you’re using the xSQL server).
Internet of Things
eXtremeDB Advanced Replication Fabric combines advanced, easy to use, development tools with unmatched elastic database scalability, data availability, safety and information security at the edge and for the cloud.
JVC, DirecTV, GoPro and others all discovered that eXtremeDB’s small code size, portable data format and efficient use of compute & storage can reduce component cost while also supporting data-hungry new features.
eXtremeDB’s sophisticated event notification systems, time series data processing and high availability have powered its wide-spread adoption in SCADA, fleet management, smart building automation and other verticals.
White Papers for Professional Developers
McObject is continually researching, testing, improving on, and retesting our software in order to provide our clients with the best possible data management solutions. We invite you to read “Database Persistence, Without the Performance Penalty” and more.
Review our research
Webinars for Professional Developers
Watch to on-demand Webinars, hosted by experts, about proven database management system practices. Watch “Scaling IoT Applications – IoT Panel Discussion Parts 1 and 2“. Or, “Edge Node Database Systems, the Internet of Things’ Hidden Workhorses” and others.
Review our list of Webinars