Digital Asset Research Worked with Data Sleek to Optimize for Speed

Digital Asset Research (DAR) is a specialized provider of ‘clean’ digital asset data, insights, and research for institutional clients. Since 2017, DAR has led the field by rigorously filtering out noisy inputs for flagship clients such as Bloomberg, FTSE Russell, and Wilshire. Each day, DAR processes 250+ million trades to calculate more than 7,000 institutional-quality digital asset prices and deliver a range of product solutions for navigating the cryptoverse. With expertise in traditional finance and the digital asset space, DAR’s success is driven by a commitment to deliver honest data emphasizing accuracy, quality, and transparency.

DAR was dealing with a huge amount of data. They were ingesting about 40 million rows every day, which translated to high daily computation and storage costs. They wanted to investigate cost-saving measures for the data loading process that would preserve the integrity and quality of the data they provided to their clients.

At the time, DAR’s existing setup captured and recorded data for more than 1,000 assets every 15 seconds, pulling from multiple sources simultaneously. To make this process more efficient and cost-effective, they needed a more robust processing system. With the existing setup, they saw frequent outages during high-volume periods and growing database contention between updates and reads.
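To put those figures in perspective, here is a rough back-of-the-envelope calculation using only the numbers quoted above. The assumption that each capture writes exactly one row per asset is ours, for illustration only, and is not a description of DAR's actual schema.

```python
SECONDS_PER_DAY = 24 * 60 * 60

rows_per_day = 40_000_000   # "about 40 million rows" per day, from the text
assets = 1_000              # "more than 1,000 assets", used here as a lower bound
capture_interval_s = 15     # one capture every 15 seconds

# Average sustained ingest rate implied by the daily row count.
avg_rows_per_second = rows_per_day / SECONDS_PER_DAY

# Captures per asset per day, and the rows per day a single feed would
# produce if each capture wrote one row per asset (illustrative assumption).
captures_per_asset_per_day = SECONDS_PER_DAY // capture_interval_s
rows_per_day_single_feed = assets * captures_per_asset_per_day

print(f"average ingest rate: {avg_rows_per_second:.0f} rows/s")
print(f"captures per asset per day: {captures_per_asset_per_day:,}")
print(f"rows per day from one feed: {rows_per_day_single_feed:,}")
```

The average rate is modest on its own. The difficulty described here comes from sustaining it around the clock while the same tables are also being read and archived.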

Data Sleek was initially hired to look at their existing infrastructure running on MySQL. Because of the amount of data being ingested each day, it was difficult to archive the data efficiently without impacting the database server. In other words, we wanted to avoid locking rows, or worse, the tables in the database.
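One common way to limit lock contention while archiving is to move rows in small batches, each in its own short transaction. The sketch below illustrates that general pattern only. It is not the procedure used on this project, it uses Python's built-in sqlite3 as a stand-in for a MySQL server, and the table and column names (prices, prices_archive, id, ts) are hypothetical.

```python
import sqlite3


def archive_in_batches(conn, cutoff, batch_size=500):
    """Move rows with ts < cutoff from prices to prices_archive,
    committing after every batch so each transaction stays short."""
    moved = 0
    while True:
        with conn:  # commits (or rolls back) one batch at a time
            # Find the highest id in the next batch of old rows.
            upper = conn.execute(
                "SELECT MAX(id) FROM ("
                "  SELECT id FROM prices WHERE ts < ? ORDER BY id LIMIT ?"
                ")",
                (cutoff, batch_size),
            ).fetchone()[0]
            if upper is None:  # nothing left to archive
                break
            conn.execute(
                "INSERT INTO prices_archive "
                "SELECT * FROM prices WHERE ts < ? AND id <= ?",
                (cutoff, upper),
            )
            cur = conn.execute(
                "DELETE FROM prices WHERE ts < ? AND id <= ?",
                (cutoff, upper),
            )
            moved += cur.rowcount
    return moved


# Small in-memory demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER PRIMARY KEY, ts INTEGER, price REAL)")
conn.execute("CREATE TABLE prices_archive (id INTEGER PRIMARY KEY, ts INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO prices (ts, price) VALUES (?, ?)",
    [(i, 1.0) for i in range(2500)],
)
conn.commit()

moved = archive_in_batches(conn, cutoff=2000)
remaining = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(moved, remaining)
```

Even with batching, every archive pass competes with live inserts and reads on the same server, which is part of why we went on to evaluate a different engine.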

After further auditing, Data Sleek advised DAR to use a different database engine. We recommended SingleStore because it could handle the load and scale DAR needed. We ran a proof of concept before migrating any data. Based on this demonstration, stakeholders at DAR recognized the benefit of using SingleStore and were pleasantly surprised at how fast the queries were.

The next challenge was to migrate the existing data into SingleStore and coordinate the switch from MySQL to the new database cluster. Because SingleStore supports the MySQL wire protocol, it was a drop-in replacement and everything worked seamlessly. To complete our work on this project, Data Sleek optimized the new table, imported the historical data, and loaded it into the main table.

We imported the data using a pipeline from S3 to SingleStore. Once the S3 pipeline was in place, any data added to the S3 bucket was automatically ingested into SingleStore. We also set up custom queries in Datadog to monitor data quality in SingleStore.
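For readers unfamiliar with SingleStore pipelines, the sketch below builds the kind of CREATE PIPELINE and START PIPELINE statements involved. It is illustrative only: the pipeline name, table name, bucket path, and region are hypothetical, the CSV file format is an assumption, and the credentials are placeholders. It is not the configuration used for DAR.

```python
import json


def build_s3_pipeline_sql(pipeline, table, bucket_path, region):
    """Return the SQL statements that define and start an S3 pipeline.

    All argument values are supplied by the caller; nothing here
    connects to a database or to AWS.
    """
    config = json.dumps({"region": region})
    # Placeholders only: real keys should come from a secrets store.
    credentials = json.dumps({
        "aws_access_key_id": "<ACCESS_KEY_ID>",
        "aws_secret_access_key": "<SECRET_ACCESS_KEY>",
    })
    create = (
        f"CREATE PIPELINE {pipeline} AS\n"
        f"LOAD DATA S3 '{bucket_path}'\n"
        f"CONFIG '{config}'\n"
        f"CREDENTIALS '{credentials}'\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY ','"  # assumes CSV files in the bucket
    )
    start = f"START PIPELINE {pipeline}"
    return [create, start]


# Hypothetical names, for illustration only.
statements = build_s3_pipeline_sql(
    pipeline="trades_from_s3",
    table="trades",
    bucket_path="example-bucket/trades/",
    region="us-east-1",
)
for stmt in statements:
    print(stmt + ";\n")
```

Because SingleStore speaks the MySQL protocol, statements like these can be run from any MySQL-compatible client. Once the pipeline is started, new files under the bucket path are picked up without further intervention.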

Results

After our project concluded, DAR reported that they were very pleased with the outcome of our work. 

The main table, which contains more than 15 billion rows, now handles queries and returns results in less than one second. As a result of our collaboration, DAR lowered their total cost of ownership (TCO) by 50% and expanded their client base by 600%.