Watch the Video about Pipelining Data for Faster Time Series Processing
Pipelining is the programming technique in eXtremeDB for HPC that accelerates the processing of time series data by combining the database system’s vector-based statistical functions into assembly lines of processing, with the output of one function becoming input for the next.
Database management systems (DBMSs) can greatly accelerate processing of time series data (such as market data) via a column-based approach. McObject’s eXtremeDB for HPC database enables column-based handling via its ‘sequence’ data type and its rich library of vector-based statistical functions. This video presents the technique of pipelining these functions in SQL, which improves performance by maximizing the proportion of relevant market data that is loaded into CPU cache and by reducing data transfers between CPU cache and main memory.
Without pipelining (e.g. using a vector-based language like R), the interim result of each step in a complex calculation (algorithm) must be materialized in temporary storage, then transferred back in for the next function/step in the calculation. Such transfers greatly slow down processing: cache is 3X – 4X faster than RAM, and two transfers are necessary per function/step (the interim result is written out, then read back in), so moving data out of the CPU cache is 6X – 8X slower than keeping it in cache.
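The difference in data flow can be illustrated with a minimal Python sketch. It is not eXtremeDB code and does not use the product's SQL functions; the function names, the `TILE_SIZE` value, and the choice of a volume-weighted average price as the calculation are all assumptions made for illustration. The first function materializes a full-length interim sequence before the next step runs, while the second carries one small tile of data through every step before moving on, so only running totals survive between tiles.

```python
TILE_SIZE = 128  # hypothetical tile length, assumed small enough to stay in CPU cache


def vwap_materialized(prices, volumes):
    """Each step produces a full-length interim result before the next step runs."""
    # Step 1: the whole price * volume sequence is materialized in memory.
    notional = [p * v for p, v in zip(prices, volumes)]
    # Steps 2 and 3: the full sequences are read back in to be summed.
    return sum(notional) / sum(volumes)


def tiles(seq, size=TILE_SIZE):
    """Yield consecutive slices of seq, each holding at most `size` elements."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def vwap_pipelined(prices, volumes, size=TILE_SIZE):
    """Each tile passes through every step; only running totals are kept."""
    total_notional = 0.0
    total_volume = 0.0
    for p_tile, v_tile in zip(tiles(prices, size), tiles(volumes, size)):
        # The interim product exists only for this tile, then is discarded.
        notional_tile = [p * v for p, v in zip(p_tile, v_tile)]
        total_notional += sum(notional_tile)
        total_volume += sum(v_tile)
    return total_notional / total_volume


if __name__ == "__main__":
    prices = [10.0, 11.0, 12.0, 13.0]
    volumes = [100, 200, 300, 400]
    print(vwap_materialized(prices, volumes))
    print(vwap_pipelined(prices, volumes, size=3))
```

Both functions return the same value; they differ only in how much interim data exists at any moment. In interpreted Python the cache effect itself will not be visible, so the sketch shows the shape of the technique rather than its speedup.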
Copyright 2020 McObject LLC