Piotr Kołaczkowski

Estimating Benchmark Results Uncertainty

Physicists say that a measurement result given without an error estimate is worthless. This applies to benchmarking as well. We not only want to know how performant a computer program or a system is, but we also want to know if we can trust the performance numbers. This article explains how to compute uncertainty intervals and how to avoid some traps caused by applying commonly known statistical methods without validating their assumptions first.

read more

Scalable Benchmarking with Rust Streams

In the previous post I showed how to use asynchronous Rust to measure throughput and response times of a Cassandra cluster. That approach works pretty well on a developer’s laptop, but it turned out it doesn’t scale to bigger machines. I’ve hit a hard limit around 150k requests per second, and it wouldn’t go faster regardless of the performance of the server. In this post I share a different approach that doesn’t have these scalability problems. I was able to saturate a 24-core single node Cassandra server at 800k read queries per second with a single client machine.

read more

Benchmarking Apache Cassandra with Rust

Performance of a database system depends on many factors: hardware, configuration, database schema, amount of data, workload type, network latency, and many others. Therefore, one typically can’t tell the actual performance of such system without first measuring it. In this blog post I’m describing how to build a benchmarking tool for Apache Cassandra from scratch in Rust and how to avoid many pitfalls. The techniques I show are applicable to any system with an async API.

read more

In Defense of a Switch

Recently I came across a blog post whose author claims, from the perspective of good coding practices, polymorphism is strictly superior to branching. In the post they make general statements about how branching statements lead to unreadable, unmaintainable, inflexible code and how they are a sign of immaturity. However, in my opinion, the topic is much deeper and in this post I try to objectively discuss the reasons for and against branching.

read more

Multiple Thread Pools in Rust

In the previous post, I showed how processing file data in parallel can either boost or hurt performance depending on the workload and device capabilities. Therefore, in complex programs that mix tasks of different types using different physical resources, e.g. CPU, storage (e.g. HDD/SSD) or network I/O, a need may arise to configure parallelism levels differently for each task type. This is typically solved by scheduling tasks of different types on dedicated thread pools. In this post I’m showing how to implement a solution in Rust with Rayon.

read more

Performance Impact of Parallel Disk Access

One of the well-known ways of speeding up a data processing task is partitioning the data into smaller chunks and processing the chunks in parallel. Let’s assume we can partition the task easily, or the input data is already partitioned into separate files which all reside on a single storage device. Let’s also assume the algorithm we run on those data is simple enough so that the computation time is not a bottleneck. How much performance can we gain by reading the files in parallel? Can we lose any?

read more