Faster Time Series Data Management with Pipelining

Pipelining is a programming technique in eXtremeDB for HPC that accelerates the processing of time series data such as market data. It chains the database system’s vector-based statistical functions into an assembly line, with the output of one function becoming the input of the next.

Database management systems (DBMSs) can greatly accelerate the processing of time series data (such as market data) by taking a column-based approach. McObject’s eXtremeDB for HPC database enables column-based handling through its ‘sequence’ data type and its library of vector-based statistical functions. This video presents the technique of pipelining these functions in SQL. Pipelining improves performance in two ways: it maximizes the proportion of relevant market data that is loaded into CPU cache, and it reduces data transfers between CPU cache and main memory.
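The idea of feeding one vector function's output straight into the next can be sketched in plain Python. This is only an illustration of the general technique, not eXtremeDB's implementation or API: the function names (`tiles`, `multiply`, `total`, `vwap`), the tile size, and the sample data are all assumptions made for the example. Each column is processed in small fixed-size tiles, so the intermediate product of price and volume is never built as a full-length column.

```python
from typing import Iterable, Iterator, List

# Hypothetical tile length; a real engine would size tiles to fit CPU cache.
TILE_SIZE = 4


def tiles(column: List[float], size: int = TILE_SIZE) -> Iterator[List[float]]:
    """Yield consecutive fixed-size slices (tiles) of a column."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def multiply(a: Iterable[List[float]], b: Iterable[List[float]]) -> Iterator[List[float]]:
    """Element-wise product of two tile streams, one tile at a time."""
    for tile_a, tile_b in zip(a, b):
        yield [x * y for x, y in zip(tile_a, tile_b)]


def total(stream: Iterable[List[float]]) -> float:
    """Consume a stream of tiles and return the sum of all elements."""
    return sum(sum(tile) for tile in stream)


def vwap(prices: List[float], volumes: List[float]) -> float:
    """Volume-weighted average price, computed as a pipeline of tile streams.

    Only one tile of the price*volume product exists at any moment;
    the full intermediate column is never materialized.
    """
    notional = total(multiply(tiles(prices), tiles(volumes)))
    return notional / total(tiles(volumes))


# Illustrative (made-up) market data.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
volumes = [100.0, 200.0, 300.0, 200.0, 200.0]
result = vwap(prices, volumes)
```

Because the stages are generators, each tile flows through `multiply` and into `total` before the next tile is read, which is the assembly-line behavior the text describes.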

Learn more about pipelining.

Without pipelining (e.g. when using a vector-based language like R), the interim result of each step in a complex calculation must be materialized and written to temporary storage, then transferred back in as input to the next function/step. Such transfers greatly slow down processing: moving data out of the CPU cache and back is 6X – 8X slower than keeping it in cache. (Cache is 3X – 4X faster than RAM, and two transfers are necessary per function/step, one out and one back in, which gives 2 × 3X – 4X = 6X – 8X.)
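For contrast, the non-pipelined style can be sketched as follows, together with the arithmetic behind the 6X – 8X figure. This is a minimal illustration under stated assumptions: the function names (`mean_squared_spread`, `slowdown_range`) and the bid/ask data are hypothetical, and the cost model is simply the one quoted above (a 3X – 4X cache advantage multiplied by two transfers per step), not a measurement.

```python
from typing import List, Tuple


def mean_squared_spread(bid: List[float], ask: List[float]) -> Tuple[float, int]:
    """Compute the mean squared bid/ask spread step by step, without pipelining.

    Every step builds a full-length intermediate list before the next
    step starts. Returns the result and the number of intermediate
    elements that were materialized along the way.
    """
    spread = [a - b for a, b in zip(ask, bid)]   # intermediate 1: full column
    squared = [s * s for s in spread]            # intermediate 2: full column
    materialized = len(spread) + len(squared)
    return sum(squared) / len(squared), materialized


def slowdown_range(cache_speedup: Tuple[int, int] = (3, 4),
                   transfers_per_step: int = 2) -> Tuple[int, int]:
    """Slowdown per step under the model in the text:
    (cache-vs-RAM speed factor) x (transfers per function/step)."""
    low, high = cache_speedup
    return low * transfers_per_step, high * transfers_per_step


# Illustrative (made-up) quotes.
bid = [10.0, 10.0, 10.0, 10.0]
ask = [10.5, 11.0, 10.5, 12.0]
value, materialized = mean_squared_spread(bid, ask)
penalty = slowdown_range()
```

With four quotes and two steps, eight intermediate elements are materialized. On real market data the intermediates are as long as the input columns, which is what forces them out of cache and into main memory.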

Graphic of data management without pipelining, and how latency is created in the process

Copyright 2019 McObject LLC

Learn about the features that make eXtremeDB the lowest-latency time series database.

Read the independently audited STAC research proving the speed of Pipelining.