In my last post, I gave some background on what a shared-memory cache is, and how KDE already uses one (KPixmapCache) to save memory and make the desktop more efficient. I also noted how the current implementation leaves some things to be desired, and hinted at a new implementation I was working on.
In this second part, I’ll discuss some of the basic design principles of the new class, which I called KSharedDataCache.
Why a new class?
If you didn’t read Part 1, you may be wondering why I don’t just fix the current implementation, KPixmapCache, instead of writing new code. It’s a good question, but the short answer is that, due to the public API exposed by KPixmapCache, it is non-trivial (to say the least :) to improve KPixmapCache and take some of the steps necessary to improve its performance. The penalty for getting it wrong is pretty severe as well: there have probably been hundreds of crash bugs reported against KPixmapCache already.
So someone on IRC gave me an idea: why don’t I just make my improvements in a different class, say a KPixmapCache2, and move the majority of the current KPixmapCache users over to it instead? That sounded like a good option to me, so that’s what I started on, eventually settling on a generic cache layer underneath a slightly more specialized image-handling cache.
KSharedDataCache
KSharedDataCache is a class that manages a cache keyed by QString values and holding QByteArrays, for generality. The cache is held in shared memory, which is accessed across multiple processes based on the cache name (converted internally to a file name).
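To make that concrete, here is roughly how I expect using the cache to look (a sketch; the class isn’t finalized yet, so treat the exact constructor and method signatures as illustrative):

```cpp
#include <QByteArray>
#include <QString>
#include <kshareddatacache.h>

void cacheExample(const QByteArray &pixmapData)
{
    // Open (or create) a roughly 10 MiB cache named "myapp"; the name is
    // converted internally into a file under the user's cache directory.
    KSharedDataCache cache(QString::fromLatin1("myapp"), 10 * 1024 * 1024);

    // Insert an entry; any other process opening a cache named "myapp"
    // will see it.
    cache.insert(QString::fromLatin1("juk_128x128.png"), pixmapData);

    // Look the entry up again, possibly from a completely different process.
    QByteArray data;
    if (cache.find(QString::fromLatin1("juk_128x128.png"), &data)) {
        // data now holds a copy of the cached bytes.
    }
}
```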
The central data structure is the cache itself. Everything needed to insert items, find items, and otherwise manage the cache is kept in the same memory segment, instead of being split across two different files as in KPixmapCache. A very ugly drawing of the layout would look like this:
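```
+--------------+---------------------+-------------+------------------------+
| cache header | entry index table   | page table  | data pages ...         |
| (mutex, cache| (fixed-size, slots  | (page ->    | (key bytes followed by |
|  size, page  |  chosen by hash of  |  owning     |  the actual QByteArray |
|  size, ...)  |  the entry key)     |  entry)     |  data)                 |
+--------------+---------------------+-------------+------------------------+
```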
Starting from the left, we have the header for the shared cache itself. This contains several important pieces of data, including the cache size, the page size (which is adjustable), the number of free pages, and the mutex which protects against concurrent access to the shared data.
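A hypothetical sketch of that header (the real structure differs in names and details):

```cpp
#include <pthread.h>
#include <stdint.h>

// Illustrative only: the fields match the description above, but the
// actual names and layout live in the implementation.
struct CacheHeader {
    pthread_mutex_t lock;   // process-shared mutex guarding everything below
    uint32_t cacheSize;     // total size of the cache
    uint32_t pageSize;      // size of each page in the data area (adjustable)
    uint32_t freePageCount; // number of pages currently unused
};
```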
One note about the mutex: it is used instead of KLockFile, and requires support for process-shared POSIX threading primitives (which XSI-conformant systems are required to provide, but which didn’t show up in Linux/glibc until NPTL, IIRC). As long as your system tells the truth about whether it supports process-shared primitives, KSharedDataCache will still work (even if it can’t use shared memory).
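Setting one of these up means marking a mutex attribute as PTHREAD_PROCESS_SHARED before initializing the mutex in the shared segment; roughly like this (a minimal sketch, with error handling trimmed):

```cpp
#include <pthread.h>

// Initialize a mutex that works across processes, assuming `mutex` points
// into already-mapped shared memory. The setpshared() call is the part
// that requires process-shared primitive support from the system.
bool initSharedMutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        return false;
    }

    bool ok = (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) == 0)
           && (pthread_mutex_init(mutex, &attr) == 0);

    pthread_mutexattr_destroy(&attr);
    return ok;
}
```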
After the cache header comes the entry index table (starting at the first byte that meets alignment requirements, to avoid crashing on non-x86 systems… although I have none to test on!). This is a fixed-size table, its size based on the total cache size and the page size. Entries are placed into the table based on the hash of the entry key, and each entry contains information such as the item size, hash code, use count, time of last access, and the location of its data.
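In struct form, each slot might look something like this (the field names are my own illustration, not necessarily what the code uses):

```cpp
#include <stdint.h>
#include <time.h>

// Hypothetical sketch of one slot in the entry index table.
struct IndexTableEntry {
    uint32_t fileNameHash;  // hash code of the entry's key
    uint32_t totalItemSize; // bytes used by the key plus the data
    uint32_t useCount;      // bumped on every successful lookup
    time_t   lastUsedTime;  // time of last access
    int32_t  firstPage;     // index of the first data page, or -1 if unused
};
```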
Collisions are possible with any hash table. The standard answer to handling them is “chaining”: simply keeping a list of all the entries that share the same hash code. Unfortunately, dynamic memory allocation is much more involved when you’re dealing with a fixed-size block of shared memory, so currently quadratic probing is used to seek out other, hopefully empty, candidate slots. Since this is just a cache, probing is only continued for a small number of attempts.
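The probe itself is simple; something like this sketch (reusing the hypothetical IndexTableEntry above, with an illustrative probe limit):

```cpp
#include <stdint.h>

// On a collision, step quadratically away from the home slot instead of
// chaining. MAX_PROBE_COUNT is illustrative, and deliberately small:
// giving up just means a cache miss, which is always acceptable.
static const unsigned MAX_PROBE_COUNT = 5;

int probeForHash(const IndexTableEntry *table, unsigned tableSize,
                 uint32_t keyHash)
{
    for (unsigned i = 0; i < MAX_PROBE_COUNT; ++i) {
        unsigned position = (keyHash + i * i) % tableSize;
        if (table[position].fileNameHash == keyHash) {
            return static_cast<int>(position); // candidate; key still unchecked
        }
    }
    return -1; // not found; treat as a miss
}
```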
Following the entry table is the page table, which simply records the entry currently using each page in memory. It would be possible to compress the page table by using a bit vector and constructing a full page table only when needed (currently only during defragmentation), but I didn’t have time to implement that.
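In sketch form (again with hypothetical names), the page table is just one integer per page:

```cpp
#include <stdint.h>

// One int32_t per page, holding the index of the owning entry, or -1 for
// a free page. Counting free pages is then a simple linear scan.
unsigned countFreePages(const int32_t *pageTable, unsigned pageCount)
{
    unsigned freePages = 0;
    for (unsigned i = 0; i < pageCount; ++i) {
        if (pageTable[i] < 0) {
            ++freePages;
        }
    }
    return freePages;
}
```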
Finally, the rest of the shared memory is devoted to a paged memory allocation system (this is probably the most suboptimal part of my current implementation, but at least this time it can be fixed later ;). Every entry is stored in this data area, with the key it was stored under followed by the actual QByteArray data.
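The size math for an insert is then just page rounding; something like:

```cpp
// An entry's key and data share its pages, so an insert only needs enough
// whole pages to cover both, rounded up.
unsigned pagesNeeded(unsigned keyLength, unsigned dataLength, unsigned pageSize)
{
    const unsigned totalBytes = keyLength + dataLength;
    return (totalBytes + pageSize - 1) / pageSize; // round up to whole pages
}
```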
Resolving an access for an item with a key of “juk_128x128.png” would work something like this (a code sketch follows the list):
- Lock the cache. If unable to acquire the lock, assume the cache is corrupt, unlink it on disk, and create it all over again.
- Convert the key to a byte array using the UTF-8 encoding, then determine the hash code.
- Use the hash code to find the appropriate entry index. Compare the hash code to the candidate entry’s hash code, and if they don’t match, use quadratic probing to find another candidate. Give up if the entry is not found within several attempts.
- If the hash codes match, read the saved key out of the entry’s data area and make sure the keys also match. If they don’t, go back to quadratic probing to find another candidate.
- If the keys did match, we found our entry. Update the entry’s use count and last access time, and then copy the data out of the page or pages to return to the caller. Since this all happens in shared memory, it should be much, much faster than loading the data from disk (assuming, of course, that the operating system hasn’t paged the shared memory out to disk in the meantime).
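Stitched together in code, the lookup might look roughly like this condensed sketch (reusing the hypothetical CacheHeader, IndexTableEntry, and MAX_PROBE_COUNT sketches from above; generateHash is a stand-in for whatever hash function is actually used, I’m assuming the key is stored NUL-terminated, and the real code does far more validation):

```cpp
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <QByteArray>
#include <QString>

uint32_t generateHash(const QByteArray &key); // hypothetical hash helper

bool findItem(CacheHeader *header, IndexTableEntry *table, unsigned tableSize,
              char *dataArea, const QString &key, QByteArray *result)
{
    // Step 1: lock the cache (the unlink-and-recreate recovery is omitted).
    if (pthread_mutex_lock(&header->lock) != 0) {
        return false;
    }

    // Step 2: hash the UTF-8 form of the key.
    const QByteArray utf8Key = key.toUtf8();
    const uint32_t keyHash = generateHash(utf8Key);

    bool found = false;
    for (unsigned i = 0; i < MAX_PROBE_COUNT && !found; ++i) {
        // Step 3: quadratic probing over the entry index table.
        IndexTableEntry *entry = &table[(keyHash + i * i) % tableSize];
        if (entry->firstPage < 0 || entry->fileNameHash != keyHash) {
            continue;
        }

        // Step 4: the entry's pages start with the saved key; verify it
        // (assuming the key bytes are stored NUL-terminated).
        char *pageData = dataArea + entry->firstPage * header->pageSize;
        if (strcmp(pageData, utf8Key.constData()) != 0) {
            continue; // same hash, different key; keep probing
        }

        // Step 5: a real hit. Update the accounting, then copy the data out.
        entry->useCount++;
        entry->lastUsedTime = time(0);
        const unsigned dataStart = utf8Key.size() + 1; // key bytes + NUL
        *result = QByteArray(pageData + dataStart,
                             static_cast<int>(entry->totalItemSize - dataStart));
        found = true;
    }

    pthread_mutex_unlock(&header->lock);
    return found;
}
```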
You might have noticed that I unlink the cache with reckless abandon in step 1. I actually do this in many more places; the idea is that a corrupt cache can lead to bugs that are very hard for the end user to diagnose and correct, and by definition a cache can be expected to drop entries at any time. The only danger would be tampering with a cache that other processes are currently using in shared memory. By unlinking (and only unlinking) the cache, those processes can continue to use the inode that used to be associated with the file, and the kernel will finish the cleanup once the last of them exits.
Of course I’m up past a thousand words now, so I’ll continue in Part 3, where I’ll discuss how pointers work in shared memory, how initial cache setup is performed, and my attempts at handling cache pressure, defragmentation, etc.