On Zeitgeist optimization

Last weekend I asked myself: “How fast is Zeitgeist, and can we make it even faster?” That turned out to be too general a question, since Zeitgeist has several places where performance matters, so I decided to start with some very basic FindEvents queries.
To get a first impression of how fast some commonly used queries are, I wrote a small benchmarking tool that collects timings and can also produce some nice plots.
The first plot gave me an overview: the speed of these queries varies from a few milliseconds to over half a second. But as you can see, the slower queries all have a red border around their bar, which means that we are not using our SQL indices for them. So the first step of this optimization story was to change these queries so that they use the index they should.
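The benchmarking tool itself is not shown here, but the timing part of such a tool can be sketched in a few lines. This is only a minimal illustration against a plain SQLite database: the `event` table, its columns, and the helper names `build_sample_log` and `time_query` are hypothetical and are not Zeitgeist's real schema or API.

```python
import sqlite3
import time


def build_sample_log(n_events=50000):
    """Build a throwaway in-memory log (hypothetical schema, not Zeitgeist's)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event (id INTEGER PRIMARY KEY, timestamp INTEGER, actor TEXT)"
    )
    conn.executemany(
        "INSERT INTO event (timestamp, actor) VALUES (?, ?)",
        ((i, "app-%d" % (i % 10)) for i in range(1, n_events + 1)),
    )
    conn.commit()
    return conn


def time_query(conn, sql, params=(), repeat=5):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        # Fetch all rows so the whole query is executed, not just prepared.
        conn.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the best of several runs is one common way to reduce noise from caching and other processes; the resulting numbers could then be handed to any plotting library.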
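Whether SQLite uses an index for a given query can be checked with `EXPLAIN QUERY PLAN`. The sketch below assumes a simplified, hypothetical `event` table with an index on `actor`; it is not Zeitgeist's actual schema, and the "slow" query is just one well-known way of defeating an index (wrapping the column in a function), not necessarily what the original queries did.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema for illustration only.
conn.execute(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, timestamp INTEGER, actor TEXT)"
)
conn.execute("CREATE INDEX event_actor ON event (actor)")


def uses_index(conn, sql, params=()):
    """Return True if SQLite's query plan for `sql` mentions an index."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return any(
        "USING INDEX" in row[-1] or "USING COVERING INDEX" in row[-1]
        for row in rows
    )


# Plain comparison on the indexed column: the index can be used.
indexed = "SELECT id, timestamp FROM event WHERE actor = ?"
# Wrapping the column in a function forces a full table scan.
unindexed = "SELECT id, timestamp FROM event WHERE lower(actor) = ?"

print(uses_index(conn, indexed, ("firefox",)))
print(uses_index(conn, unindexed, ("firefox",)))
```

A benchmarking tool could run a check like this for every query and mark the ones without an index, which is roughly what the red borders in the plot express.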

And voilà, since yesterday these queries are several times faster, as you can see in this plot. The yellow series shows the same data as the first plot, and the additional cyan series shows how fast the same queries are after this first optimization step. Pretty impressive.

But we can do even better! Until now I exclusively looked at the class of queries where the timerange argument is “TimeRange.always()”, which is already optimized. So my next question was: “What happens if we do not query over the whole period of time, but only a random interval?” To understand the next plot you have to know that all events in my sample activity log (which contains 50000 events) have a timestamp greater than 0 and lower than 50000, so ‘TimeRange.always()’ and the interval ‘(1, 60000)’ return the same result. The plot is a bit harder to read: each pair of yellow and cyan bars describes the same kind of query, using the same codebase. The only difference is that the yellow bars use a concrete time interval, whereas the cyan ones use the already optimized ‘TimeRange.always()’. Remember, both return the same results. And as you can see, ‘TimeRange.always()’ is up to three times faster!

But I already have a fix. Take a look at this one: the yellow and purple bars are the same as in the last plot, and the cyan series shows the upcoming optimization, which will hopefully land in Zeitgeist soon. Querying a random time interval will then be roughly as fast as querying the complete time period.
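The claim that both kinds of query return the same events on this sample log can be illustrated with a small sketch. It assumes a hypothetical single-table log in SQLite with 50000 events and timestamps between 1 and 49999; `find_event_ids` is a made-up helper, and `time_range=None` merely stands in for ‘TimeRange.always()’. It says nothing about how the actual fix works.

```python
import random
import sqlite3

random.seed(0)  # make the sample log reproducible
conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema for illustration only.
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO event (timestamp) VALUES (?)",
    ((random.randint(1, 49999),) for _ in range(50000)),
)


def find_event_ids(conn, time_range=None):
    """Return event ids, optionally restricted to an inclusive (begin, end) range."""
    if time_range is None:
        # No condition on the timestamp at all: the whole log.
        rows = conn.execute("SELECT id FROM event ORDER BY id")
    else:
        rows = conn.execute(
            "SELECT id FROM event WHERE timestamp >= ? AND timestamp <= ? ORDER BY id",
            time_range,
        )
    return [row[0] for row in rows]


everything = find_event_ids(conn)           # whole log
interval = find_event_ids(conn, (1, 60000))  # interval covering every timestamp
narrow = find_event_ids(conn, (1, 25000))    # interval covering only part of the log

print(everything == interval)
print(len(narrow) < len(everything))
```

Because the interval covers every timestamp in the log, the extra condition only adds work without filtering anything out, which is why a speed difference between the two forms is pure overhead.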

  1. #1 by Seif Lotfy on 18. November 2010 - 3:46

    M to the A to the RKS in da HOUUUUUUUUUSE 😛

  2. #3 by ifrade on 20. November 2010 - 16:16

    So… what did you do to improve the time-limited queries?

    • #4 by thekorn on 21. November 2010 - 13:39

      @Ivan, I’m planning a post right now to explain exactly what I’ve changed so far, and what I would like to improve in the future. This post will answer this in detail, so stay tuned 😉

