Cache Implementations in C# .NET




One of the most commonly used patterns in software development is caching. It is a simple and, at the same time, very effective concept: reuse the results of operations you have already performed. After a time-consuming operation, we save the result in our cache container. The next time we need that result, we pull it from the cache container instead of performing the expensive operation again.

For example, to get a user's avatar, you may have to query the database. Instead of executing that query on every call, we store the avatar in the cache and pull it from memory every time it is needed.

Caching is great for data that changes infrequently or, ideally, never changes. Data that changes constantly, such as the current time, should not be cached; otherwise you risk getting incorrect results.

Local cache, persistent local cache, and distributed cache


There are 3 types of caches:

  • In-Memory Cache is used when you just need a cache in a single process. When the process dies, the cache dies with it. If you run the same process on multiple servers, each server will have its own separate cache.
  • Persistent in-process Cache is when you back up the cache outside of process memory, for example in a file or in a database. This is more complex than an in-memory cache, but if your process restarts, the cache is not lost. It is best suited for cases where cached items are expensive to obtain and your process tends to restart often.
  • Distributed Cache is when you need a shared cache for multiple machines, usually several servers. The cache is stored in an external service, which means that if one server saves a cache item, other servers can use it too. Services like Redis are great for this (a minimal sketch follows this list).
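
To make the distributed option more concrete, here is a minimal sketch of what it can look like in .NET with the IDistributedCache abstraction and the Redis provider (the Microsoft.Extensions.Caching.StackExchangeRedis NuGet package). The connection string, the key format, and the LoadAvatarFromDatabaseAsync helper are placeholders for this example:

// At startup: register Redis as the distributed cache implementation.
services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost:6379"; // placeholder connection string
});

// In a service: IDistributedCache is shared by all servers using the same Redis.
public class AvatarService
{
    private readonly IDistributedCache _cache;

    public AvatarService(IDistributedCache cache)
    {
        _cache = cache;
    }

    public async Task<byte[]> GetAvatarAsync(string userId)
    {
        var key = "avatar:" + userId;
        var cached = await _cache.GetAsync(key); // null if not cached
        if (cached != null)
            return cached;

        var avatar = await LoadAvatarFromDatabaseAsync(userId); // hypothetical helper
        await _cache.SetAsync(key, avatar, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1)
        });
        return avatar;
    }
}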

We will only talk about the local cache.

Primitive implementation


Let's start by creating a very simple cache implementation in C#:

public class NaiveCache<TItem>
{
    // Holds all cached items for the lifetime of the process.
    Dictionary<object, TItem> _cache = new Dictionary<object, TItem>();
 
    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        if (!_cache.ContainsKey(key))
        {
            // Key is not in the cache, so create the item and store it.
            _cache[key] = createItem();
        }
        return _cache[key];
    }
}

Usage:

var _avatarCache = new NaiveCache<byte[]>();
// ...
var myAvatar = _avatarCache.GetOrCreate(userId, () => _database.GetAvatar(userId));

This simple code solves an important problem. Only the first request for a user's avatar will actually hit the database. The avatar data (a byte[]) returned by the query is then stored in process memory. All subsequent requests for the avatar will pull it from memory, saving time and resources.

But, as with most things in programming, nothing is that simple. The above implementation is not a good solution for a number of reasons. For one thing, it is not thread-safe: exceptions can occur when it is used from multiple threads. Besides that, cached items will stay in memory forever, which is actually very bad.
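
As a side note, the thread-safety issue alone can be patched by swapping the Dictionary for a ConcurrentDictionary. This is just my sketch of that quick fix, not part of the original code, and it still never evicts anything; also note that under a race GetOrAdd may invoke the factory more than once, even though only one result gets stored:

public class ConcurrentNaiveCache<TItem>
{
    // Thread-safe lookups and inserts, but items still live forever.
    private readonly ConcurrentDictionary<object, TItem> _cache = new ConcurrentDictionary<object, TItem>();

    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        // Under contention, createItem may run more than once;
        // only one result is kept in the dictionary.
        return _cache.GetOrAdd(key, _ => createItem());
    }
}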

This is why we should remove items from the cache:

  1. A cache can take up a lot of memory, eventually leading to out-of-memory exceptions and crashes.
  2. High memory consumption can lead to memory pressure (also known as GC pressure). In this state, the garbage collector works much harder than it should, which hurts performance.
  3. The cache may need to be updated when data changes. Our caching infrastructure must support this feature.

To solve these problems, caching frameworks have eviction policies (also known as removal policies): rules for removing items from the cache according to some logic. Common eviction policies include the following:

  • An Absolute Expiration policy removes an item from the cache after a fixed amount of time, no matter what.
  • A Sliding Expiration policy removes an item from the cache if it has not been accessed for a certain period of time. That is, if I set the expiration to 1 minute, the item stays in the cache as long as I use it every 30 seconds. If I don't use it for more than a minute, the item is evicted.
  • A Size Limit policy limits the cache to a maximum size.

Now that we’ve figured out everything we need, let's move on to better solutions.

Better Solutions


To my great disappointment as a blogger, Microsoft has already created a wonderful cache implementation. This deprived me of the pleasure of creating a similar implementation myself, but at least it means less work writing this article.

I will show you Microsoft's solution, how to use it effectively, and then how to improve it for some scenarios.

System.Runtime.Caching / MemoryCache vs Microsoft.Extensions.Caching.Memory


Microsoft has 2 solutions in the form of 2 different NuGet caching packages. Both are great. Per Microsoft's recommendation, it is preferable to use Microsoft.Extensions.Caching.Memory because it integrates better with ASP.NET Core: it plugs easily into ASP.NET Core's dependency injection mechanism (see the sketch below).
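
For reference, here is roughly what that DI integration looks like; a minimal sketch, where the AvatarProvider class and the Database dependency are my own illustration:

// In ConfigureServices: registers a shared IMemoryCache singleton.
services.AddMemoryCache();

// IMemoryCache can then be injected into any service:
public class AvatarProvider
{
    private readonly IMemoryCache _cache;
    private readonly Database _database; // hypothetical data-access class

    public AvatarProvider(IMemoryCache cache, Database database)
    {
        _cache = cache;
        _database = database;
    }

    public byte[] GetAvatar(int userId)
    {
        if (!_cache.TryGetValue(userId, out byte[] avatar))
        {
            avatar = _database.GetAvatar(userId);
            _cache.Set(userId, avatar);
        }
        return avatar;
    }
}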

Here is a simple example with Microsoft.Extensions.Caching.Memory:

public class SimpleMemoryCache<TItem>
{
    private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions());
 
    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        TItem cacheEntry;
        if (!_cache.TryGetValue(key, out cacheEntry)) // Look for the cache key.
        {
            // Key is not in the cache, so create the item.
            cacheEntry = createItem();
            
            // Save the item in the cache.
            _cache.Set(key, cacheEntry);
        }
        return cacheEntry;
    }
}

Usage:

var _avatarCache = new SimpleMemoryCache<byte[]>();
// ...
var myAvatar = _avatarCache.GetOrCreate(userId, () => _database.GetAvatar(userId));

It is very reminiscent of my own NaiveCache, so what has changed? Well, firstly, this implementation is thread-safe: you can safely call it from multiple threads at once.
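
Incidentally, if you don't need a typed wrapper class at all, the package also ships GetOrCreate and GetOrCreateAsync extension methods that do the same thing directly on the cache. A quick sketch (the GetAvatarAsync method on _database is my assumption here); note that, like our wrapper, these extensions do not stop two threads from creating the same item concurrently, which we will get to shortly:

var cache = new MemoryCache(new MemoryCacheOptions());

// Synchronous variant.
byte[] avatar = cache.GetOrCreate(userId, entry => _database.GetAvatar(userId));

// Asynchronous variant.
byte[] avatarAsync = await cache.GetOrCreateAsync(userId, entry => _database.GetAvatarAsync(userId));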

Secondly, MemoryCache supports all the eviction policies we discussed earlier. Here is an example:

MemoryCache with eviction policies:

public class MemoryCacheWithPolicy<TItem>
{
    private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()
    {
        SizeLimit = 1024
    });
 
    public TItem GetOrCreate(object key, Func<TItem> createItem)
    {
        TItem cacheEntry;
        if (!_cache.TryGetValue(key, out cacheEntry)) // Look for the cache key.
        {
            // Key is not in the cache, so create the item.
            cacheEntry = createItem();
 
            var cacheEntryOptions = new MemoryCacheEntryOptions()
                // Entry size, counted against SizeLimit.
                .SetSize(1)
                // Priority when the size limit is reached and entries must be evicted.
                .SetPriority(CacheItemPriority.High)
                // Evict the entry if it has not been accessed within this time span.
                .SetSlidingExpiration(TimeSpan.FromSeconds(2))
                // Evict the entry after this time span, regardless of access.
                .SetAbsoluteExpiration(TimeSpan.FromSeconds(10));
 
            // Save the item in the cache.
            _cache.Set(key, cacheEntry, cacheEntryOptions);
        }
        return cacheEntry;
    }
}

Let's analyze the new elements:

  1. SizeLimit has been added to MemoryCacheOptions. This adds a size-limit policy to our cache container. The cache has no mechanism for measuring the size of entries, so we need to set the size of each cache entry ourselves. Here we set the size to 1 every time with SetSize(1), which means our cache is limited to 1024 items.
  2. Who gets evicted first when the size limit is reached? That is controlled with .SetPriority(CacheItemPriority.High). The available priorities are Low, Normal, High, and NeverRemove.
  3. SetSlidingExpiration(TimeSpan.FromSeconds(2)) sets a sliding expiration of 2 seconds. If an item is not accessed within 2 seconds, it will be removed.
  4. SetAbsoluteExpiration(TimeSpan.FromSeconds(10)) sets an absolute expiration of 10 seconds. The item will be removed within 10 seconds, whether it was accessed or not.

In addition to the options in the example, you can also register a delegate with RegisterPostEvictionCallback that will be called when an item is evicted.
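
For completeness, here is a sketch of registering such a callback; the log line is mine. The delegate receives the key, the value, and the eviction reason (Expired, Capacity, Removed, Replaced, and so on):

var cacheEntryOptions = new MemoryCacheEntryOptions()
    .SetSlidingExpiration(TimeSpan.FromSeconds(2))
    // Called after the entry has been evicted from the cache.
    .RegisterPostEvictionCallback((key, value, reason, state) =>
    {
        Console.WriteLine($"Entry {key} was evicted. Reason: {reason}");
    });

_cache.Set(key, cacheEntry, cacheEntryOptions);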

This is a fairly wide range of features, but we still need to ask whether anything is missing. There are actually a couple of things.

Problems and missing features


There are several important parts missing from this implementation.

  1. While you can set a size limit, the cache does not actually monitor memory pressure. If it did, it could tighten its policies when pressure is high and loosen them when pressure is low.
  2. When multiple threads request the same missing item at the same time, they will all run the creation operation. Suppose, for example, that retrieving an avatar from the database takes 10 seconds. If another request comes in 2 seconds after the first one, it will start a second 10-second retrieval, because the first one has not finished yet (the avatar is not in the cache), and the expensive work is done twice.

Regarding the first problem, GC pressure: it is possible to monitor GC pressure with several methods and heuristics. This post is not about that, but you can read my article "Finding, Fixing, and Preventing Memory Leaks in C# .NET: 8 Best Practices" to learn about some useful techniques.
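
To give a taste of what such a heuristic could look like (my illustration, not from that article): MemoryCache exposes a Compact method that evicts a given percentage of entries, and you could call it when allocated memory crosses a threshold of your choosing:

// A crude sketch: if managed memory exceeds some threshold, evict part of the cache.
// The threshold and percentage are arbitrary values for this example.
long allocatedBytes = GC.GetTotalMemory(forceFullCollection: false);
if (allocatedBytes > 500_000_000) // ~500 MB, placeholder threshold
{
    // Removes 25% of the entries, lower-priority entries first.
    _cache.Compact(0.25);
}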

The second problem is easier to solve. Here is an implementation wrapping MemoryCache that solves it completely:

public class WaitToFinishMemoryCache<TItem>
{
    private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions());
    private ConcurrentDictionary<object, SemaphoreSlim> _locks = new ConcurrentDictionary<object, SemaphoreSlim>();
 
    public async Task<TItem> GetOrCreate(object key, Func<Task<TItem>> createItem)
    {
        TItem cacheEntry;
 
        if (!_cache.TryGetValue(key, out cacheEntry)) // Look for the cache key.
        {
            SemaphoreSlim mylock = _locks.GetOrAdd(key, k => new SemaphoreSlim(1, 1));
 
            await mylock.WaitAsync();
            try
            {
                // Double-check: another thread may have created the item
                // while we were waiting for the lock.
                if (!_cache.TryGetValue(key, out cacheEntry))
                {
                    // Key is not in the cache, so create the item.
                    cacheEntry = await createItem();
                    _cache.Set(key, cacheEntry);
                }
            }
            finally
            {
                mylock.Release();
            }
        }
        return cacheEntry;
    }
}

Usage:

var _avatarCache = new WaitToFinishMemoryCache<byte[]>();
// ...
var myAvatar = await _avatarCache.GetOrCreate(userId, async () => await _database.GetAvatar(userId));

In this implementation, when you try to get an item that is already being created by another thread, you wait until the first thread finishes. Then you get the already-cached item that the other thread created.

Breaking down the code


This implementation locks the creation of an item. The lock is per key. For example, if we are waiting to get Alexey's avatar, we can still get the cached values of Zhenya or Barbara on another thread.

The _locks dictionary stores all the locks. Regular locks do not work with async/await, so we need to use SemaphoreSlim.

There are two checks for whether the value is already cached: if (!_cache.TryGetValue(key, out cacheEntry)). The one inside the lock is what guarantees the item is created only once. The one outside the lock is an optimization.

When to use WaitToFinishMemoryCache


This implementation obviously has some overhead. Let's look at when it is relevant.

Use WaitToFinishMemoryCache when:

  • When the creation time of an item has some cost, and you want to minimize the number of creations as much as possible.
  • When the time to create an item is very long.
  • When an item must be created once for each key.

Do not use WaitToFinishMemoryCache when:

  • There is no danger of multiple threads accessing the same cache item at once.
  • You don't mind creating items more than once. For example, if one extra query to the database won't change much.

Summary


Caching is a very powerful pattern. It is also dangerous and has its pitfalls. Cache too much, and you can cause GC pressure. Cache too little, and you can cause performance problems. Then there is distributed caching, which is a whole new world to explore. This is software development; there is always something new to learn.

I hope you enjoyed this article. If you are interested in memory management, my next article will cover the dangers of GC pressure and techniques to prevent it, so subscribe. Happy coding.


