TerrariumDB: Enhancing Performance by Optimizing the Qt Library During an OS Migration

In the TerrariumDB team, we always try to use the latest and safest technologies to give our clients the best user experience and efficiency. Sometimes, however, we are limited to software that is compatible with the OS that TerrariumDB runs on. In such cases, the only way to use newer versions of our dependencies is to update the operating system on our clusters. When a new OS is released, most applications run out of the box. However, there are times when applications need to adapt their code to the latest versions of the libraries bundled with the new OS, typically because deprecated functionality was removed or because significant changes broke backward compatibility. The same applies to TerrariumDB, which relies on a few essential libraries bundled with the operating system. These libraries enable features such as time zone handling, text formatting, and so on.

TerrariumDB apps

When the TerrariumDB apps were first compiled on the new target system, nothing indicated any potential issues related to the new operating system. Even without any changes to our codebase, everything seemed smooth and seamless. In most cases, when everything compiles cleanly and all tests pass with flying colors, it can be tempting to view that as a green light to proceed with the migration in production. However, at Synerise we have a different rule, one which I really appreciate. We believe in solid proof that everything is not only functioning as well as before but, in fact, performing even better. Only then do we make changes to our production setup.

So we began intensive testing of our application on the new operating system. The TerrariumDB team embraces an ideology that goes beyond mere local tests with data sets prepared by developers. We are committed to providing concrete evidence that new functionalities work well both in controlled environments and with real user data. We need to remember that working with storage is more than managing user data. We must be prepared to handle the multitude of queries our users run to extract valuable insights from our database, even if these queries are not written in the most optimal way.

We achieve this by using additional setups that run alongside our main TerrariumDB clusters and receive real user data and user queries. This approach provides us with substantial evidence that our software is running smoothly and that newly introduced features are working as expected. When it comes to OS migration, we stick to the same ideology to prove that the new OS does not impact our performance.

Why is that approach to testing so important for us?

As I mentioned before, initial tests may not indicate any performance problems. However, after executing over 0.5 million full-scan user queries, we found that a few of them performed slower on the new setup.

We discovered that a few user queries, which TerrariumDB answered in less than 3 seconds on the old OS, took almost 2 minutes or more on the new OS, even though our internal default computation timeout is set to 2 minutes. Putting quality over quantity, we decided to solve these issues before proceeding with the migration to the new OS on production clusters. This happened in my first few weeks after joining Synerise half a year ago. Right from the start, my new team entrusted me with the task of investigating these performance issues. The challenge I faced was directly linked to the Qt library, which TerrariumDB relies on for time- and date-related operations on data from user queries or data stored within TerrariumDB itself.

You may ask why we didn’t consider switching to a different library that offers specific features for handling time zones, or better yet, why we used this particular library in the first place. There are two reasons:

  1. Our objective is to bring new solutions to the market, so we are focused on creating innovative features rather than spending time on creating another time zone library for our internal use.
  2. The second reason is reliability. The Qt library we use is not exclusive to our organization; it is used by thousands of other software applications, including integral components of the Ubuntu system. This means that it is tested by thousands of developers and end users, which shows that it works well. It's important to note that Qt goes beyond just managing time: it includes different Qt libraries that we use in various parts of our code. That is why it is worth keeping this library in our dependency list.

To begin my investigation

I first aimed to understand the extent of the problem we were facing. I collected all the methods that TerrariumDB uses from the Qt date-time library and used that information to create performance benchmarks with the Google Benchmark framework. At the suggestion of a colleague from my team, I added multi-threaded versions of the performance tests. The main idea was to execute these benchmarks on both the old and the new OS and analyze the resulting data. The second goal was to isolate the Qt calls from the TerrariumDB code, ensuring that all benchmarks were independent of TerrariumDB.
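To make the benchmarking idea concrete, here is a minimal sketch that uses only the C++ standard library instead of Google Benchmark, so that it stays self-contained. The names ns_per_call and relative_speed_percent are hypothetical and do not come from TerrariumDB or Qt; the workload passed in stands for whichever isolated date-time call is being measured.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Runs work(i) `iterations` times on each of `threads` threads and returns the
// elapsed wall-clock time divided by the total number of calls, in nanoseconds.
template <typename Work>
double ns_per_call(Work work, unsigned threads, std::uint64_t iterations) {
    assert(threads > 0 && iterations > 0);
    std::atomic<std::uint64_t> sink{0};  // consumes the results of the workload
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&sink, work, iterations] {
            std::uint64_t local = 0;
            for (std::uint64_t i = 0; i < iterations; ++i) {
                local += static_cast<std::uint64_t>(work(i));
            }
            sink += local;
        });
    }
    for (auto& worker : pool) worker.join();
    const std::chrono::duration<double, std::nano> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() /
           (static_cast<double>(threads) * static_cast<double>(iterations));
}

// Expresses the candidate's speed as a percentage of the baseline's speed:
// 100 means equal, 19 means the candidate reaches 19% of the baseline speed.
inline double relative_speed_percent(double baseline_ns, double candidate_ns) {
    assert(candidate_ns > 0.0);
    return 100.0 * baseline_ns / candidate_ns;
}
```

Running the same workload with 1 thread and with several threads on both systems, then feeding the two timings into relative_speed_percent, gives one percentage per scenario. With Google Benchmark itself, the equivalent is to register a function with the BENCHMARK macro and request a thread count with ->Threads(n).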

The initial results indicated that certain Qt methods were performing significantly slower on the new OS, achieving only 19% of their previous speed in single-threaded benchmarks and as little as 2% in the multi-threaded versions of the same scenarios. This explains why we experienced delays of over 2 minutes on Ubuntu 22.04.

Figure 1 Performance of Qt DateTime methods on Ubuntu 22.04 relative to Ubuntu 20.04, in percent (100% means the performance of Ubuntu 20.04). Each cell represents a separate benchmark result.

Since Qt is one of many libraries that the operating system itself depends on, downgrading it to the previous version on the new OS was not an option. Instead, we needed to identify the issue within the Qt library itself and develop a patch to restore TerrariumDB's performance.

When it comes to the issues we encountered with Qt, it's important to note that they weren't caused by any implementation mistakes. Instead, they originated from attempts to improve performance that didn't align with TerrariumDB's specific needs. In our case, these were changes related to caching some time zones in order to speed up time zone change operations, and features like locale-based interpretation of text strings containing date data. Because of the way TerrariumDB deals with date types, these changes didn't provide any benefits for us.

One of the optimizations that significantly reduced performance was a change in which the time zone transition cache was stored inside the QDateTimePrivate class as a copy of a QVector (more on why it was a copy of a QVector later in this article). This functionality is useful if you frequently manipulate a single QDateTime object and want to avoid fetching zone information from the system for each operation. It was achieved by keeping a QHash cache that stored the previously retrieved zone information. To ensure the integrity of the cache during simultaneous reads and writes, a QMutex was used for protection. In our use case, we create a single QDateTime object from user or database data and perform a single time zone transition before destroying the object, so the cache only causes unnecessary allocation and deallocation of cached time zone entries. Additionally, the mutex slows the process down, as only one thread can fetch the pre-cached data at a given moment. That sounds like a small overhead, but multiplied by the millions of operations that TerrariumDB performs every second, it starts to have a significant effect on computation times.
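The cost pattern described above can be illustrated with a small sketch. It is not Qt's code: std::mutex and std::unordered_map stand in for QMutex and QHash, and compute_offset_seconds and LockedOffsetCache are hypothetical names with toy offset rules. The counters make it visible that, when every lookup uses a fresh key, the cache never hits and each call still pays for a lock and an insertion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Toy stand-in for an expensive zone lookup (not real time zone data): the
// offset is two hours in the second half of each 365-day period, else one.
inline std::int32_t compute_offset_seconds(std::int64_t epoch_seconds) {
    assert(epoch_seconds >= 0);  // the toy rule only covers non-negative input
    const std::int64_t day = (epoch_seconds / 86400) % 365;
    return day >= 182 ? 7200 : 3600;
}

// Hypothetical mutex-protected cache placed in front of the lookup.
class LockedOffsetCache {
public:
    std::int32_t lookup(std::int64_t epoch_seconds) {
        std::lock_guard<std::mutex> guard(mutex_);  // one thread at a time
        ++lookups_;
        const auto found = cache_.find(epoch_seconds);
        if (found != cache_.end()) {
            ++hits_;
            return found->second;
        }
        const std::int32_t offset = compute_offset_seconds(epoch_seconds);
        cache_.emplace(epoch_seconds, offset);  // node allocation on every miss
        return offset;
    }

    std::size_t lookups() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return lookups_;
    }

    std::size_t hits() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return hits_;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::int64_t, std::int32_t> cache_;
    std::size_t lookups_ = 0;
    std::size_t hits_ = 0;
};
```

A caller that looks up the same timestamp repeatedly benefits from the cache, while a caller that creates one value, converts it once, and discards it gets the locking and allocation cost with no hits in return, which is the situation the paragraph above describes.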

Another small change that I made was to replace readInt(), used inside qdatetime.cpp, with a less generic version. Originally, that function was used to convert ISO dates like 2024-01-23 into numbers corresponding to the day, month, and year, based on a substring provided as a QStringView argument. The main performance improvement came from switching from locale-based string-to-integer conversion to non-locale-based conversion. ISO dates contain simple numeric values rather than locale-specific data, so non-locale-based conversion is more efficient. This is a great example of how it is sometimes better to use a specialized version of a functionality over a more generic one.

Figure 2 One of the methods that we need to tune for our needs was
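As an illustration of the locale-independent approach (separate from the Qt patch itself), the sketch below parses a date with std::from_chars from C++17, which never consults the locale. The helper names parse_digits and parse_iso_date and the Date struct are hypothetical, and the code assumes the fixed YYYY-MM-DD layout.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Parses a run of ASCII digits without consulting any locale.
// Returns std::nullopt if the text is empty or contains anything else.
inline std::optional<int> parse_digits(std::string_view text) {
    if (text.empty()) return std::nullopt;
    const char* first = text.data();
    const char* last = first + text.size();
    // from_chars would accept a leading '-', so reject non-digits up front.
    if (*first < '0' || *first > '9') return std::nullopt;
    int value = 0;
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

struct Date {
    int year;
    int month;
    int day;
};

// Splits "YYYY-MM-DD" into its three numeric fields. Only basic range checks
// are done; days per month and leap years are out of scope for this sketch.
inline std::optional<Date> parse_iso_date(std::string_view text) {
    if (text.size() != 10 || text[4] != '-' || text[7] != '-') return std::nullopt;
    const auto year = parse_digits(text.substr(0, 4));
    const auto month = parse_digits(text.substr(5, 2));
    const auto day = parse_digits(text.substr(8, 2));
    if (!year || !month || !day) return std::nullopt;
    if (*month < 1 || *month > 12 || *day < 1 || *day > 31) return std::nullopt;
    return Date{*year, *month, *day};
}
```

Because the input is known to be plain ASCII digits, nothing is lost by skipping locale handling, and the parser avoids the extra work that a generic, locale-aware conversion has to do on every call.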

There were also small issues related to using non-const iterators on objects that use implicit sharing. Qt uses implicit sharing in many of its containers, such as QVector. This means that you can cheaply return a QVector by value from a function, and as long as you call only const methods on that object, no data is copied. This is a great example of the copy-on-write idiom. We found a few places in which calling a non-const method caused an unnecessary copy of a QVector.

Figure 3 This method returns a copy of QVector<QTzTransitionTime>; thanks to implicit sharing, the data is not copied, in line with the copy-on-write idiom.
Figure 4 In this context, the result of tranCache() is a non-const copy of QVector<QTzTransitionTime>, so the non-const last() method is called, causing a copy of the data.
Figure 5 Instead of letting the compiler choose between the const and non-const versions of last(), we call the constLast() method explicitly. In that case, only const methods are invoked on the implicitly shared object, so no copy is made.
Figure 6 Screenshot whose last three lines show that last() was called on QVector<QTzTransitionTime>, detaching the vector and copying its data.
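The detach behaviour can be reproduced without Qt. The sketch below is a deliberately simplified, single-threaded model of an implicitly shared container built on std::shared_ptr; it is not how QVector is implemented. SharedIntVector and ZoneData are hypothetical names, and only last() and constLast() mirror method names that exist on QVector.

```cpp
#include <cassert>
#include <initializer_list>
#include <memory>
#include <vector>

// Simplified copy-on-write container: copies share one buffer until a
// non-const accessor is called on one of them.
class SharedIntVector {
public:
    SharedIntVector(std::initializer_list<int> values)
        : data_(std::make_shared<std::vector<int>>(values)) {}

    // Const access can never modify the buffer, so it never copies.
    const int& constLast() const {
        assert(!data_->empty());
        return data_->back();
    }
    const int& last() const { return constLast(); }

    // Non-const access could be used for writing, so detach first.
    int& last() {
        detach();
        assert(!data_->empty());
        return data_->back();
    }

    bool sharesDataWith(const SharedIntVector& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::vector<int>>(*data_);  // deep copy
        }
    }

    std::shared_ptr<std::vector<int>> data_;
};

// Hypothetical owner that hands out its transitions by value.
struct ZoneData {
    SharedIntVector transitions{1, 2, 3};
    SharedIntVector tranCache() const { return transitions; }  // shares the buffer
};
```

On a non-const object, overload resolution picks the non-const last() even when the caller only reads the value, and that is enough to trigger the deep copy. Calling constLast() states the read-only intent explicitly and keeps the buffer shared.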

In the end...

After applying several patches to the Qt core library source code, the performance problems we had identified on Ubuntu 22.04 were fixed. The final patch not only restored the previous level of performance but, in certain cases, also provided some advantages compared to the older operating system. More details on these improvements can be found below.

Figure 7 Performance of Qt DateTime methods on Ubuntu 22.04 relative to Ubuntu 20.04, expressed as a percentage (100% means the performance of Ubuntu 20.04). Each cell represents a separate benchmark result after applying the final patch to the Qt core sources.

This story highlights several areas where developers should pay special attention. One such area is ensuring the constness of objects to avoid unnecessary copying of data. It’s important for developers to be mindful that not all optimizations work in every case, and in projects like TerrariumDB, it may be necessary to use specially tuned solutions for specific purposes. And last, but not least, we need to keep in mind that there are situations in which we don’t need to call a function that does as many things as readInt() does, especially when the generic function offers no advantage over a specialized one.

Author: Wojciech Czerniecki | C++ Developer in Synerise