Distributed Database Management

With eXtremeDB for High Performance Computing

What to choose:  review our chart of distributed database choices and use cases

eXtremeDB for HPC delivers the benefits of distributed database management via distributed query processing, clustering and high availability options.

Distributed Query Processing

eXtremeDB for HPC partitions, or shards, a database and distributes query processing across multiple servers, CPUs and/or CPU cores. Performance is accelerated — dramatically, in some cases — via parallel execution of database operations and by harnessing the capabilities of many host computers rather than just one.
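The scatter-gather pattern behind distributed query processing can be sketched in a few lines of Python. This is only an illustration of the idea, not eXtremeDB's API: the `SHARDS` data, `partial_stats` and `distributed_average` are hypothetical names, and plain in-memory lists stand in for shard servers. Each shard computes a partial result, and a coordinator merges them into one answer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tick data, already split across three shards as (symbol, price) rows.
SHARDS = [
    [("IBM", 101.0), ("AAPL", 189.5)],
    [("IBM", 102.5), ("MSFT", 410.0)],
    [("AAPL", 190.5), ("IBM", 99.5)],
]

def partial_stats(shard, symbol):
    """Run the 'query' on one shard: return (sum, count) of matching prices."""
    prices = [price for sym, price in shard if sym == symbol]
    return sum(prices), len(prices)

def distributed_average(shards, symbol):
    """Scatter the query to every shard, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: partial_stats(s, symbol), shards))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count if count else None

if __name__ == "__main__":
    print(distributed_average(SHARDS, "IBM"))
```

Threads are used here only to show the structure. In a real deployment the partial queries would run on separate server processes or hosts, which is where the parallel speed-up comes from.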

The benefits of distributed query processing are evident in McObject's recent STAC-M3 benchmarks, run with partners including E8 Storage, IBM, and Lucera Financial Infrastructures. In these tests, the eXtremeDB database was partitioned horizontally across up to 128 shards, resulting in record-setting performance managing tick data. Review a summary of the benchmark records.

eXtremeDB offers different distributed database options to address different objectives.  Learn about Sharding with eXtremeDB, or review this table that lists different distributed database uses and options.

Chart of eXtremeDB distributed database use cases and options

Sharding

eXtremeDB offers ultra-fast, elastically scalable data management with sharding. Databases are partitioned (“sharded”), with each partition/shard managed by an instance of the DBMS server. Shards are typically distributed on a storage array (which may be a SAN) – with each server keeping a CPU core busy – or distributed across different physical servers with their own storage systems.
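As a rough illustration of horizontal partitioning, the sketch below routes each row to a shard by hashing its key. Hash-based routing is an assumption made for this example; the text above does not say how eXtremeDB assigns rows to shards, and `shard_for` and `partition` are hypothetical helper names.

```python
import zlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

def partition(rows, shard_count):
    """Split rows (dicts with a 'symbol' key) into shard_count lists."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row["symbol"], shard_count)].append(row)
    return shards

# Hypothetical sample rows.
rows = [
    {"symbol": "IBM", "price": 101.0},
    {"symbol": "AAPL", "price": 189.5},
    {"symbol": "MSFT", "price": 410.0},
    {"symbol": "IBM", "price": 102.5},
]
shards = partition(rows, 4)

if __name__ == "__main__":
    for index, shard in enumerate(shards):
        print(index, shard)
```

Because the hash is stable, every row for a given symbol lands on the same shard, so a query for one symbol only needs to touch one partition.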

Read about using the eXtremeSQL distributed SQL engine in our on-line documentation

Learn more about eXtremeDB distributed query processing

Learn about Sharding for elastic scalability

Distributed query processing chart copyright McObject

High Availability

High availability enables deployment of a master database and one or more synchronized replica databases. Replication is between separate hardware instances and features application-directed fail-over with strategies that include 2-safe (synchronous) and 1-safe (asynchronous). It delivers “five nines” (99.999% uptime) reliability, or better, with eXtremeDB for HPC’s unsurpassed performance. In addition, read-only replicas are available to support distribution/load-balancing of database query/analysis/reporting requirements.
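The difference between the 2-safe and 1-safe strategies can be shown with a small in-memory model. This is a conceptual sketch only: the `Master` and `Replica` classes and the `flush` method are hypothetical and do not reflect the eXtremeDB High Availability API. With 2-safe, a commit returns only after every replica has applied the transaction; with 1-safe, the commit returns immediately and the replicas catch up later.

```python
class Replica:
    """A replica that records every transaction it has applied."""
    def __init__(self):
        self.log = []

    def apply(self, txn):
        self.log.append(txn)

class Master:
    """A master that replicates either synchronously (2-safe) or asynchronously (1-safe)."""
    def __init__(self, replicas, two_safe):
        self.replicas = replicas
        self.two_safe = two_safe
        self.log = []
        self.pending = []  # transactions committed locally but not yet shipped (1-safe only)

    def commit(self, txn):
        self.log.append(txn)
        if self.two_safe:
            # 2-safe: every replica applies the transaction before commit returns.
            for replica in self.replicas:
                replica.apply(txn)
        else:
            # 1-safe: commit returns right away; shipping happens later in flush().
            self.pending.append(txn)

    def flush(self):
        """Ship queued transactions to the replicas (background work in a real system)."""
        for txn in self.pending:
            for replica in self.replicas:
                replica.apply(txn)
        self.pending.clear()

if __name__ == "__main__":
    sync_replica, async_replica = Replica(), Replica()
    Master([sync_replica], two_safe=True).commit("t1")
    lazy_master = Master([async_replica], two_safe=False)
    lazy_master.commit("t1")
    print(sync_replica.log, async_replica.log)  # the 1-safe replica is still empty here
    lazy_master.flush()
    print(async_replica.log)
```

The trade-off is visible in the model: a 2-safe master never acknowledges a transaction the replicas lack, while a 1-safe master commits faster but can lose the transactions still in `pending` if it fails before they are shipped.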

Read about High Availability in our on-line documentation

Learn more about eXtremeDB time-cognizant eager replication

High availability distributed database management

Clustering

In clustered deployments, every eXtremeDB HPC database instance serves as a master. This means that changes to one node are efficiently replicated to others.  It is unique as the first clustering database system to offer an embedded architecture.  The database system runs within the application process at every node, eliminating the need for separate client and server modules.
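A toy model can make the "every instance is a master" idea concrete. The sketch below is not the eXtremeDB Cluster API; `ClusterNode` and `make_cluster` are hypothetical, the nodes live in a single Python process, and conflict handling is left out entirely. It shows only that a write accepted at any node is propagated to all peers, while reads are served from the local copy.

```python
class ClusterNode:
    """One node of a toy multi-master cluster holding a full copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def write(self, key, value):
        # Apply the change locally, then propagate it synchronously to every peer.
        self.data[key] = value
        for peer in self.peers:
            peer.data[key] = value

    def read(self, key):
        # Reads never leave the node: they use the local copy.
        return self.data.get(key)

def make_cluster(names):
    """Create fully connected nodes, one per name."""
    nodes = [ClusterNode(name) for name in names]
    for node in nodes:
        node.peers = [peer for peer in nodes if peer is not node]
    return nodes

if __name__ == "__main__":
    a, b, c = make_cluster(["a", "b", "c"])
    a.write("sensor-1", 20.5)   # a write accepted by node a
    c.write("sensor-2", 18.0)   # a write accepted by node c
    print(b.read("sensor-1"), b.read("sensor-2"))
```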

Read about clustering in our on-line documentation

Learn more about eXtremeDB independently audited speed records.

Learn about other eXtremeDB distributed database options 

Cluster database chart

Distributed Database Options – Contrast and Compare

Which distributed database option best fits your needs?

eXtremeDB is flexible, giving professional developers the tools they need to tailor data management to their requirements. The table below summarizes the primary purpose and characteristics of different distributed database options and objectives (some of which may be combined, e.g. Sharding, Cluster and High Availability).

| | Sharding | High Availability | Cluster | IoT |
| --- | --- | --- | --- | --- |
| Primary purpose | Scalable database applications that require maximum CPU, memory and storage utilization to serve large data sets with a high degree of resource efficiency | Database applications that require five 9s availability and instant switch-over. Supports the distribution of read-only workloads (read load balancing) | Applications that require distributed, cooperative computing and a resilient topology with five 9s availability. Cluster supports distribution of all workloads (read and write load balancing) on modest-sized networks | Data aggregation from a large number of data collection points. Smart data containers to support sporadic connectivity. Advanced server-side analytics for aggregated data |
| Replication | When combined with HA | Master-slave replication. Synchronous, asynchronous | Multi-master replication. Synchronous | On-demand, based on connection state, data modification events, timers, and more |
| Scalability | Elastic, near linear scalability with added shards | Near linear read scalability. Read requests can be distributed across multiple nodes | Near linear read scalability. Overall scalability is a function of the workload (% read-only versus read-write transactions) | Server-side performance can be increased with added cores and sharding |
| Reliability and Fault-tolerance | When combined with HA | Fault tolerant | Fault tolerant | Containers are durable even with sparse connectivity. Server-side can be made reliable through the normal means: clustering and HA |
| Concept and Distribution Topology | A logical database is horizontally partitioned, that is, physically split into multiple (smaller) parts called shards; shards may reside on separate servers to spread the load or on the same server to better utilize multiple CPU cores. eXtremeDB’s SQL engine handles query distribution and presents the distributed database as a single logical database | A single master database receives all modifications (insert/update/delete operations) and replicates transactions to replicas. In the event of a failure, one replica is elected as new master | Multi-master architecture in which each node can apply modifications (insert/update/delete). Each transaction is synchronously propagated to all nodes, keeping copies of the database identical (consistent). Database reads are always local (and fast). Writes are longer, but don’t block the database; high concurrency is achieved through Optimistic Concurrency Control | Push data from IoT Edge to aggregation points (Gateways and/or Servers) for analytics. Push data down to the edge, usually for new device configuration/provisioning. Controlled through push/pull interfaces and/or automatic data exchange between collection points and servers |
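The Cluster column credits Optimistic Concurrency Control for high write concurrency. The general technique can be sketched as follows, assuming a simple per-key version counter; `VersionedStore` and `increment` are hypothetical names, and this is a generic illustration rather than a description of eXtremeDB's internals. Writers take no locks: each one records the version it read, and the commit succeeds only if that version is still current, otherwise the writer retries.

```python
class VersionedStore:
    """Key-value store where each key carries a version used to detect conflicts."""
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        """Return (value, version); a missing key reads as (None, 0)."""
        return self._data.get(key, (None, 0))

    def commit(self, key, new_value, read_version):
        """Write only if nobody else committed since read_version was observed."""
        _, current_version = self._data.get(key, (None, 0))
        if current_version != read_version:
            return False  # conflict: the caller must re-read and retry
        self._data[key] = (new_value, current_version + 1)
        return True

def increment(store, key, retries=3):
    """Read-modify-write loop that retries when validation fails."""
    for _ in range(retries):
        value, version = store.read(key)
        if store.commit(key, (value or 0) + 1, version):
            return True
    return False

if __name__ == "__main__":
    store = VersionedStore()
    _, seen_by_a = store.read("counter")
    _, seen_by_b = store.read("counter")          # both writers saw version 0
    print(store.commit("counter", 1, seen_by_a))  # first commit wins
    print(store.commit("counter", 1, seen_by_b))  # second is rejected as stale
    print(increment(store, "counter"), store.read("counter"))
```

Readers and writers never wait on each other in this scheme; the cost is that a conflicting writer has to redo its work, which is why the approach suits workloads where conflicts are uncommon.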


We want to help with your next project.  Contact us to discuss your distributed database options and objectives.