The Multi-Versioning Concurrency Control (MVCC) transaction manager enhances applications’ database concurrency management options. With the MVCC model, while querying a database each transaction sees a snapshot of committed data, regardless of any in-progress transactions belonging to other tasks. This protects the transaction from viewing inconsistent data that could be caused by other transaction updates on the same set of objects or indexes, thus providing transaction isolation for each transaction. The MVCC manager allows applications to choose how transactions are isolated from each other by setting the transaction isolation level at runtime.
Isolation Level
When the transaction start API is called without specifying an isolation level, the transaction isolation level is set to
MCO_DEFAULT_ISOLATION_LEVEL
, which is by defaultMCO_REPEATABLE_READ
for MVCC.Conflict Management
When
MVCC
is used with other thanMCO_SERIALIZABLE
, thenMCO_READ_WRITE
transactions are executed concurrently. Sometimes concurrent transactions modify the same objects, thus creating transaction conflicts. The transaction manager resolves those conflicts by aborting one of the conflicting transactions and letting the other one commit its updates to the database. When a transaction is aborted, the application receives theMCO_E_CONFLICT
error code. It is the application’s responsibility to manage this possibility with logic similar to the following:do { mco_trans_start( db, MCO_READ_WRITE, MCO_TRANS_FOREGROUND, &t); ...<update database>... rc = mco_trans_commit(t); } while ( rc == MCO_E_CONFLICT );
Note that when MVCC is used, the application must be able to tolerate transaction rollbacks due to conflicts as described above.
If the number of conflicts is too high, it could lead to sharp performance degradation due to the need to retry transactions. When this occurs, the transaction manager temporarily changes the isolation level to
MCO_SERIALIZABLE
. The application can set the conflicts threshold over which the optimistic control is disabled. This is done in C applications by callingmco_trans_optimistic_threshold()
. (This functionality is only available in the C API.)Also in C applications, if the percentage of transactions that have been aborted because of the transaction conflict exceeds the
max_conflicts_percent
, the transaction isolation level is changed toMCO_SERIALIZABLE
fordisable_period
successive transactions.MCO_SERIALIZABLE
permits a singleMCO_READ_WRITE
transaction at a time (eliminating the potential for conflicts), that can run in parallel withMCO_READ_ONLY
transactions. By default the optimistic threshold is set to 100 (which means “never disable optimistic mode no matter how many conflicts occur”).Object Versions and Cleanup
With MVCC every transaction that changes objects in the database spawns versions of those objects. The versions are visible depending on transaction order and active isolation level. When a version becomes not needed by any transaction it should be deleted as it consumes memory. On the other hand, all object versions are included in the indexes defined for that class, so excluding them requires re-balancing which requires a lock on the indexes and a decrease in performance. This object version cleanup procedure is referred to as Garbage Collection.
Garbage Collection Policies
The Garbage Collection (GC) policy can be set a runtime with the C API
mco_trans_set_gc_policy()
. The possible policies are:
MCO_GC_DISABLED
: garbage collection is disabledMCO_GC_SELF_VERSIONS
: each session will perform a cleanup of its own garbage only. This policy can be efficient when each session (connection) works with its own subset of classes (tables)MCO_GC_ON_MODIFICATION
: read-write transactions in any session will trigger garbage collection in all sessions (connections)MCO_GC_ALWAYS
: likeMCO_GC_ON_MODIFICATION
, butREAD_ONLY
transactions will trigger garbage collection as wellThe default behavior is
MCO_GC_ON_MODIFICATION
, which is generally the best practice. This means that, for eachMCO_READ_WRITE
transaction, the eXtremeDB runtime tries to collect all "garbage" immediately on completion of the transaction. But if there are long living transactions, then old object versions will not be removed for a long time. In this case GC at each transaction commit will just waste time and CPU. For this reason it is also possible, and can be optimal, to spawn one or more separate GC threads. (It is possible to specify the number of GC threads in the configuration file when starting xSQL as server.)But consider the case where there are large number of
MCO_READ_ONLY
transactions and rare update (MCO_READ_WRITE
) transactions. The update transaction will not be able to perform cleanup because old versions may still be accessed byMCO_READ_ONLY
transactions which in turn do not perform cleanup because of theMCO_GC_ON_MODIFICATION
policy. This means that all versions will persist until the next portion of updates which will cause degradation of performance. In this case theMCO_GC_ALWAYS
policy is preferable.The
MCO_GC_DISABLED
policy is intended to be used to limit garbage collection to dedicated sessions (connections). In this scenario, the worker sessions will have this policy, whereas a dedicated session (or multiple sessions) with policyMCO_GC_ALWAYS
will collect garbage in separate threads.Special Case for Persistent Databases
A special case is persistent (disk) objects when one or more GC threads use policy
MCO_GC_ALWAYS
. If aMCO_READ_ONLY
transaction tries to perform GC and deletes some version(s) then it actually becomes aMCO_READ_WRITE
transaction and requires a write to the transaction log. This is why the default policy isMCO_GC_ON_MODIFICATION
; so thatMCO_READ_ONLY
transactions do not try to perform GC.Recovering unused data storage space
When the MVCC transaction manager is used, in the case of a crash, a persistent database can contain undeleted old versions and working copies. Their presence will not break the consistency of the database and doesn't prevent the normal working of an application, but does unnecessarily consume space. Detecting these stale object versions requires a complete scan of the database. For this reason the recovery process doesn't perform this function automatically. Instead, the removal of the unused versions is performed explicitly in C/C++ applications by calling the
mco_disk_database_vacuum()
function:MCO_RET mco_disk_database_vacuum(mco_db_h con);Note that
mco_disk_database_vacuum()
requires exclusive access to the database, so no operations can be performed on the database until the vacuum operation is complete and the function has returned control back to the application.Alternatively, the application can enable the repair process by setting the
MCO_DB_MODE_MVCC_AUTO_VACUUM
mode mask in themco_db_params_t
when callingmco_db_open_dev()
.