Saturday, May 25, 2013

Cassandra Stress Test

In this post, I will go through how you can quickly stress test your Cassandra cluster. Before you start tuning Cassandra, you might want to see how well it is performing so far and where it is slowing down. You can certainly write a benchmark tool that inserts some random data, reads it back, and measures performance based on elapsed time. When I was first asked to stress test Cassandra, I was writing pretty much that kind of tool. But partway through I found existing code that stress tests Cassandra and is good enough to start with. It's basically a pom-based (Maven) Java project that uses Hector, a Java client for Cassandra (my project also uses Hector).
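For reference, the home-grown approach mentioned above (insert random data, read it back, time both phases) can be sketched roughly as follows. This is only an illustration, not the stress tool's code: `run_benchmark` and `random_value` are hypothetical names, and a plain Python dict stands in for Cassandra, so you would have to swap in calls to your own client.

```python
import random
import string
import time


def random_value(length=32):
    """Return a random ASCII string to use as a column value."""
    return "".join(random.choices(string.ascii_letters, k=length))


def run_benchmark(store, num_records):
    """Insert random rows into `store`, read them back, and time both phases.

    `store` is a hypothetical stand-in with dict-style access; a real test
    would replace the two loops with calls to a Cassandra client.
    """
    keys = [f"key-{i}" for i in range(num_records)]

    # Write phase: one random value per key.
    start = time.perf_counter()
    for key in keys:
        store[key] = random_value()
    write_seconds = time.perf_counter() - start

    # Read phase: fetch every key and count the rows that come back.
    start = time.perf_counter()
    rows_read = sum(1 for key in keys if store.get(key) is not None)
    read_seconds = time.perf_counter() - start

    return {
        "written": num_records,
        "read": rows_read,
        "write_seconds": write_seconds,
        "read_seconds": read_seconds,
    }


if __name__ == "__main__":
    print(run_benchmark({}, 10_000))
```

The timings from an in-memory dict say nothing about Cassandra itself; the point is only the shape of the measurement loop.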

You can go directly here for more information about how it's written and how to run it:

But if you just want a quick way to run it, follow these steps:

Step#1: Install It

Step#2: Run It:
Here is what the above command does:
  • Inserts (-o insert) 1000000 records (-n) into the column family StressStandard, which has 10 columns (-c)
  • Uses 5 threads (-t) with a batch size of 1000 (-b)
  • So each thread handles 1000000 / 5 = 200000 inserts and, with a batch size of 1000, each thread actually issues 200000 / 1000 = 200 batch inserts.
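The arithmetic in the last bullet can be checked with a few lines of Python. This is just a sketch of the calculation, not part of the stress tool: `plan_workload` is a hypothetical helper, and it assumes the record count divides evenly across threads and batches, as it does in this example.

```python
def plan_workload(total_records, threads, batch_size):
    """Return (inserts per thread, batches per thread).

    Assumes total_records divides evenly by threads, and the per-thread
    share divides evenly by batch_size (true for the numbers in this post).
    """
    inserts_per_thread = total_records // threads
    batches_per_thread = inserts_per_thread // batch_size
    return inserts_per_thread, batches_per_thread


if __name__ == "__main__":
    # Values from the run described above: -n 1000000, -t 5, -b 1000
    print(plan_workload(1_000_000, 5, 1_000))  # prints (200000, 200)
```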
After it inserts the 1000000 records, it shows brief statistics on insertion performance. For the above test, it took around 3 minutes to insert all records (with no optimization), which was 140.87 write requests per second with a bandwidth of 15730.39 kb/sec. You can also test read performance, as well as the performance of some other Hector API operations (rangeslice, multiget, etc.).

I played with this stress tool a lot, later adapted it to my needs (to work with my Cassandra keyspace and column families), and ran it for my own stress test. I highly recommend this stress tool; it will cover most of the basic cases.

Note: For privacy reasons, I had to modify several lines in this post from my original post. So if you find that something is not working or you run into any issues, please do not hesitate to contact me :)

1 comment:

  1. ERROR 23:26:13,301 Could not start connection pool for host
    Exception in thread "main" me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(
    at me.prettyprint.cassandra.service.AbstractCluster.describeKeyspace(
    at com.riptano.cassandra.stress.Stress.initializeCommandRunner(
    at com.riptano.cassandra.stress.Stress.main(