Some aspects of working with Dictionary when developing loaded applications

To write this short note, I was prompted by several recent interviews for the position of lead developer in our company. Some applicants, as it turned out, are not well versed in what kind of mechanism this is, Dictionary, and how it should be used. I even came across a very radical opinion: they say that the dictionary works very slowly, and due to the fact that when it is immediately placed in the heap of large objects (LOH), it cannot be used in code and it is better to apply queries to regular collections using LINQ filters !


Of course, these are not entirely true statements. A dictionary as a mechanism is very often in demand both in building highly loaded code and in implementing business logic. The developer must imagine how the dictionary is organized, how it works with memory, how expensive it is, how many β€œtraces” it will leave in generations and LOH, causing unnecessary load on the garbage collector; how to initialize it better, in which cases it is better to use a collection of key-value pairs instead, in which a sorted collection, etc.


In this article we will try to deal with some of these issues.


The implementation of the dictionary from Microsoft is based, as everyone knows, on the hash table mechanism. This gives in some scenarios the constant constant complexity O (1) that is unattainable for other algorithms for insert, search, and delete operations. In some scenarios, the refusal to use this mechanism is fraught with significant problems with performance and memory consumption.


An important feature of the architecture is that data storage is organized using two arrays of the same size:


    int[] _buckets = new int[size];
    Entry[] _entries = new Entry[size];

These arrays store useful data and service elements for quick access to them. Arrays are created with a margin, that is, they are almost always empty.


The Entry helper entity that wraps each key-value pair is a significant type:


    private struct Entry
    {
        public uint hashCode;
        public int next;
        public TKey key;
        public TValue value;
    }

, , , _entries. , ConcurrentDictionary, . ( 64 16 8 ) , .


_entries Dictionary 85000 LOH (, , 64 3372 , 1932). , , .


Microsoft _entries , Entry? LOH (, , ). , , .


, . 12 (4 _buckets 4 hashCode next Entry). , , , 12 . ? .


Dictionary . . Dictionary<int, int> ( ). , – ( ). 500 . 13 . 500 .



, . , , , .


, _buckets _entries , «», . LOH, .
, 746 3 .


, Dictionary . , , , .NET Core TrimExcess, .


:


  • , EnsureCapacity. .
  • , 400 10. (80 ). new Dictionary<int, int>(10001000), 190. , , .
  • , Dictionary EnsureCapacity(capacity*), . , _buckets _entries . . , EnsureCapacity(0).
  • , ( ) , , Dictionary: . , . , ( ).
  • TrimExcess. LOH . .

Dictionary. , Dictionary<Key, Value>, :


    public class Key
    {
        public int Item { get; set; }
        public Key(int item)
        {
            Item = item;
        }
        public override bool Equals(object obj)
        {
            return Item.Equals(obj);
        }
        public override int GetHashCode()
        {
            return Item.GetHashCode();
        }
    }
    public class Value
    {
        public int Item { get; set; }
        public Value(int item)
        {
            Item = item;
        }
    }

.



?


:


  • 64 (8 ) (8 ).
  • , , 8 «» .
  • Item Key Value 8 , , Item 4 .

20 76 .


:


  • ( ), .
  • , . .

, . ? , ( 10-20-30 ) . ?


. , Dictionary<int, int>(int count) :


    struct Entry<TValue>
    {
        public int[] Keys;
        public TValue[] Values;
        public Entry(int count)
        {
            Keys = new int[count];
            Values = new TValue[count];
        }
        public bool TryGetValue(int key, out TValue value)
        {
            for (int j = 0; j < Keys.Length; ++j)
            {
                if (Keys[j] == key)
                {
                    value = Values[j];
                    return true;
                }
            }
            value = default(TValue);
            return false;
        }
    }

count – -, .
.



, 18 count=10 134 count=190 ( 50 . ). ( – 29 ), ( 150) , (, ).


, (, , Entry , ).
, (callvirt GetHashCode Equals). – . , , , (generic) . .


SortedList<int, int>.
.



, .


: , . .


In all the examples presented, it should be borne in mind that small data is used in the test configuration, therefore, the overhead for service information is relatively high.
Test projects can be found here .


All Articles