Data structures: a list that can do everything *

* By everything, I mean the relatively quick execution of operations on a single element of an array.

The data structures that implement the list are complete. Everyone has their own advantages and disadvantages. For example, in the Java world - depending on the necessary operations - you can use:

  • add (obj), get (obj), set (index, obj): a basic set of almost all lists, e.g. ArrayList.
  • add (index, obj): tree-like structures, e.g. TreeList from apache common-collections.
  • remove (index): same as above.
  • contains (obj), indexOf (obj): you can use a bunch of ArrayList and HashMap.
  • remove (obj): ... I find it difficult to answer. In some cases, you can get by with a LinkedHashSet. It is solved trivially in the presence of the previous two points, but which structures can both quickly?

When I needed a structure with quick add (obj), get (index), remove (index) and indexOf (obj), google did not give an answer. I did not find any code examples or descriptions of such structures. Maybe I was not looking there, I had to invent it myself. But if someone drops the link in the comments, I will greatly appreciate it.

Perhaps someone realized that you can take a TreeList, which can quickly insert / remove items in the middle of the list and add a HashMap from the object to the index in the TreeList for quick execution of indexOf (obj). And it will be a simple, elegant, but incorrect decision. After all, when adding to the middle or removing from the middle, it will be necessary to recalculate the indices, on average, for half the elements. This will degrade performance to O (n).

Next I will talk about a data structure that can do all of the above. Which performs any operation on one element in O (log (n)) time. Well, almost - for the logarithm is performed in the case when all the objects in the list are different. If the list contains the same objects, then it is possible to sag performance up to O (log (n) ^ 2).

I will warn you right away that I will not paint the code here. It can be quite complicated for the article. But it is, written in Java. Based on the TreeList class from apache common-collections. Pull request already exists, but at the time of writing, the article is not yet poured.

Also, I will not describe well-known algorithms. For example, tree balancing algorithms. For most, it may be sufficient to take for granted the fact that the tree can be kept balanced. This does not affect the understanding of the general idea. Those who want to know more can easily find information. But I will tell you very briefly about some basic things, because without the knowledge of the basics, many key elements cannot be understood.

Links will be at the end.

Why is it necessary


In fact, it’s not so easy to come up with situations where everything is needed directly from the list. It is unlikely that this is some kind of super necessary structure, otherwise everyone would know about it. However, a few examples where such a list could be useful can be given.

I recognize that many of the examples are far-fetched. All or almost everything can be solved in another way.

Caching and compression


My initial task, because of which I began to research the issue. Played with compression of specific data and needed a list for the object cache.

The idea is this: when processing another object, we look for it in the list. If not found, save the object and add it to the top of the list. If found, then we take its index in the list and instead of the object we save only its index, after which we move the object to the top of the list. Thus, objects that occur will often receive small indexes, and objects that occur only once will eventually move to the end of the list and be deleted.

Turn


If instead of the usual FIFO queue, for some tasks, a similar structure is used, then the following operations can be done:

  • Answer the question: how many tasks are in the queue before this task.
  • Remove tasks from the queue.

It’s like in a supermarket. If you came for a chocolate bar, but you see that the line is moving slowly, then maybe the chocolate bar is not so much needed? :)

High score table


Suppose we want to store the time during which players complete a level in a game. There are many players and they all compete, trying to show the minimum time. Player data can be put into an array and sorted by time. Using this structure, you can:

  • Move players higher in the list if they show better results than before.
  • Remove players from the list, for example, in the case of a ban for cheating.
  • Show each player where he is.
  • Show the table of records page by page.
  • Show a sparse table in places, for example, time 1, 2, 3, 5, 10, 20, 50, 100, 1000, 10000 places.

Data structure


The structure is based on a tree with an implicit key. It is on this approach, for example, that TreeList in apache common-collections is based. In order to move on, you need to understand how this structure works.

Implicit Key Tree


A tree consists of nodes (Nodes). Each node contains a link to an object that is stored in the node and 2 links to other nodes: left and right. The topmost node is called the root node. In the simplest case, the node looks something like this:

class Node<T> {
  obj: T
  left: Node<T>
  right: Node<T>
}

In the classical binary tree for each node in the left subtree, all objects are smaller than in the current node, and in the right - large. For instance:

                             [ element: 25 ]
                           /                 \
                          /                   \
          [ element: 14 ]                       [ element: 45 ]
           /          \                           /          \
          /            \                         /            \
[ element: 10 ]    [ element: 22 ]     [ element: 27 ]    [ element: 90 ]
                    /          \                            /
                   /            \                          /
            [ element: 17 ] [ element: 23 ]         [ element: 80 ] 

But for our purpose, such a tree is not suitable. We do not need to store objects sorted, but we need to have access to them by index, as in an array.

How can I put an array in a tree? Let's select an element with index i from the middle of the array. Place the ith element from the array in the root node. 2 subtrees exit the root node. In the left subtree we put half the array with index <i, and in the right one with index> i. How to do it? In the same way: we select an element from the middle in a subarray, put this element in a node, we get 2 more smaller subarrays. And so until we put all the elements of the array in the nodes of the tree.

For example, an array with the elements [“q”, “w”, “e”, “r”, “t”, “y”, “u”] might look like this:

                            [el: r,  size: 7]
                           /        :        \
                          /         :         \
         [el: w, size: 3]           :           [el: y, size: 3]
           /     :    \             :             /    :     \
          /      :     \            :            /     :      \
[el: q, size: 1] : [el: e, size: 1] : [el: t, size: 1] : [el: u, size: 1]
        :        :         :        :         :        :         :
        :        :         :        :         :        :         :
       [q]      [w]       [e]      [r]       [t]      [y]       [u]

Index:  0        1         2        3         4        5         6

The middle element in the “r” array, we put it in the root node. Two subarrays [“q”, “w”, “e”] and [“t”, “y”, “u”] are placed in the left and right subtrees. For this, the central elements are selected from the subarrays, in our case these are “w” and “y”, and they fall into the nodes of the next level. And so on.

In our case, the tree is balanced, the depth of all subtrees is the same. But this does not have to be so.

In the picture above, each node, in addition to the element and links to the left and right nodes, contains the number of elements of the entire subtree. This information must be updated correctly when the tree changes.

Let's see how to find, for example, an element with index = 4 in such a tree.
We start the crawl from the root node (root, in our case with the “r” element). We have 3 options: we are already on the right node, the right node on the left, the right node on the right. In order to understand where to look for the desired element, you need to compare the size of the left subtree (in our case left.size = 3) and the current index (in our case 4). If these 2 numbers are equal, then we found the necessary node and the desired element in it. If the size of the left subtree is larger, then the required node in the left subtree. If it’s less, then you need to look in the right subtree, but you need to reduce the desired index: index = index - left.size - 1.

Since in our case left.size <index, we are looking in the right subtree for the element with the new index 4 - 3 - 1 = 0. Move to the node with the element “y”.

Then we do the same thing that we did in the root node. Compare left.size and index. Since 1> 0, we look in the left subtree, move to the node with the element “t”.

There is no left subtree in this node, and its size is 0. index = left.size, which means we found a node with index 4 and can get the required element “t” from it.

In pseudo code, it looks something like this:

function get(node: Node<T>, index: int): T {
  val leftSize: int = (node.left == null) ? 0 : node.left.size;
  if (leftSize == index) {
    return node.obj;
  } else if (leftSize > index) {
    return get(node.left, index);
  } else {
    return get(node.right, index — leftSize — 1);
  }
}

I tried to describe the key principle of how to put an array in a tree. Such a structure works, of course, slower than the classical array, for O (log (n)) versus O (1). But it has an important advantage: adding an element to the middle or removing from the middle also works for O (log (n)) versus O (n) for the array. Of course, provided that the tree is more or less balanced. There are many algorithms for maintaining a tree in an almost balanced way. For example, red-black tree, AVL tree, Cartesian tree. I will not write down the details of balancing the tree, any algorithm is suitable for us. Let's just assume that the tree is balanced on average and its maximum depth is not much different from the minimum.

Slight optimization


The approach described above, with checking the size of the tree on the left is convenient for perception, but can be done a little more efficiently. In order not to look into the left subtree each time, instead of the size of the tree, one can store in the node its position relative to the position of its parent node. The root node stores an absolute position that matches the size of the left subtree.

                             [el: r, pos: 3]
                           /        :        \
                          /         :         \
         [el: w, pos: -2]           :           [el: y, pos: +2]
           /     :    \             :             /    :     \
          /      :     \            :            /     :      \
[el: q, pos: -1] : [el: e, pos: +1] : [el: t, pos: -1] : [el: u, pos: +1]
        :        :         :        :         :        :         :
        :        :         :        :         :        :         :
       [q]      [w]       [e]      [r]       [t]      [y]       [u]

Index:  0        1         2        3         4        5         6

For example, the root node “r” has position 3. The node “w” has position -2 relative to the parent node or the absolute position 3 + (-2) = 1. Similarly, you can go down one more level, for example, the node “e” has position 3 + (-2) + (+1) = 2. That is, node index is the sum of the positions from the root of the tree to this node.

This optimization, in addition to a faster search for an item in the list, will provide a faster and easier search for the index on the node. But, of course, correctly updating the position when changing the tree has become a little more difficult.

Add Indexing


So, in the tree we can take an element by index, change its value, add elements to the middle and delete. Essentially, we just need to add a quick index search by value, indexOf (obj). Then contains (obj) and remove (obj) will be trivially solved.

But first, let's simplify the task a bit. Let's make a structure that stores only unique elements.

In order to quickly search for something, they usually use a table. In the Java world, tables are called Map; it has 2 main implementations: HashMap and TreeMap. The key to the table will be a link to the object, and the value will be a link to its node:

class IndexedTreeListSet<T> {
  root: Node<T>
  indexMap: Map<T, Node<T>>
}

Those. the structure consists of two parts: the list tree itself and the table with links to objects and nodes of this tree. When updating the tree, the table must also be updated. I will not describe the process in detail. Intuitively, it should be understandable: add a node - put it in the table, delete the node - delete it from the table. In practice, there are nuances with balancing the tree: the algorithm should change links between nodes, and not move objects between nodes. Otherwise, you will have to do a lot of updates in the table and performance will drop.

Ok, we will assume that we can quickly find the node by the element that it contains. So what? We need to find its index, but this cannot be done yet. But we can complicate the node class so that it contains not only links to the left and right nodes, but also to its parent:

class Node<T> {
  obj: T
  left: Node<T>
  right: Node<T>
  parent: Node<T>
  pos: int
}

Of course, updating the tree is a little more complicated, because now we need to carefully update the link to the parent. But now, knowing the node, we can go up the tree and calculate the index of any node. If we used the optimization from the previous chapter, then we just need to calculate the sum of the positions from the current node to the root.

For a list containing unique elements, the problem can be considered solved.

True, we have a small problem. Suppose we call set (index, obj). We can easily replace one element in a node with another, but only if there is no new element in the list yet. And if so, what should I do? Remove excess item from old position and put in new? Or vice versa, first add and then delete? The result may be different. And you can do nothing at all or throw an exception. There is no perfect solution.

Sorting by standard methods such a list, most likely, will not work either. After all, the sorting algorithm will not know about the need for uniqueness of objects and will create duplicates when moving items in the list.

We remove uniqueness


Ok, complicating things further, let us keep the same objects. Obviously, you need to do something with the table. The first idea to store a list of nodes in it does not seem very good: with an increase in the length of the list, performance will deteriorate. Up to O (n) if all list items are the same.

Then let's try to store a sorted tree of nodes in a table instead of a list. Sorted by position in the list.

class IndexedTreeList<T> {
  root: Node<T>
  indexMap: Map<T, TreeSet<Node<T>>>
}

Then insertion / deletion to / from the TreeSet <Node> of size m will occur during log (m) comparisons of the positions of the nodes, and each comparison will occur over log (n) time. The final complexity of inserting or deleting into a similar structure will occur in O (log (n) * (1 + log (m))), where n is the total number of elements in the list and m is the number of elements in the list equal to the inserted / deleted. In the worst case, when all the elements are equal to each other, we obtain the complexity O (log (n) ^ 2).

An attentive reader will probably object: but what about immutability? After all, we cannot change objects if they are table keys? In general, it is. However, for a tree that stores sorted objects in keys, in addition to the standard rules for comparisons, it is sufficient to preserve the invariant: if a <b, then this property should not change over time. This is just our case: if the position of one node is less than the position of another node, then this property will be preserved regardless of how many nodes were added or deleted between them.

Is it possible to make the structure persistent?


Short answer: no, it is impossible. Due to the biconnectedness of the tree, from the root to the leaves and back, we have every tree node connected to each. Persistence cannot be done in this way; you have to recreate the entire structure with any change.

But I have an understanding of how to implement a persistent structure for cases where we do not need to insert elements in the middle of the list. You can add elements to the beginning or end, and you can delete from the middle. The remaining properties are the same.

If you are interested, then I will try to write an article about this structure. Perhaps I even implement it in Java, Kotlin or Scala. But, most likely, it will not be soon.

A few implementation features


Here I want to describe some features that I had to face.
About one of the optimizations about storing the node position in the list, I wrote above. Here the strength of Open Source is manifested: I took the ready-made TreeList code and did not delve into the details of the AVL tree, node rotations, position updates, etc.

Another feature inherited from TreeList is the links to subtrees in tree leaves. Each node stores boolean leftIsPrevious and rightIsNext. These variables indicate the presence or absence of a left / right subtree. If there is no subtree, then in left / right, instead of a link to the subtree, a link to the node that corresponds to the previous or next element is stored. In our example, [“q”, “w”, “e”, “r”, “t”, “y”, “u”] the node “e” is leafy, it has no subtrees. Accordingly, leftIsPrevious and rightIsNext are true, and left and right point to the nodes “w” and “r”, respectively. This approach helps iterate through the list faster. And it interferes with the programming of new features :)

A little bit about working with the table object → node. Ideally, you need to put an element into the table once when adding it to the structure and delete it once from the structure. In practice, I could not achieve this. When you add an item, it is added to the table, everything is as it should. However, when you delete an item, the balancing algorithm sometimes moves items between nodes. The result is two deletions and one record in the table instead of one deletion. This can be fixed if you remove the optimization from leftIsPrevious and rightIsNext. And even get a small performance gain, and not only during removal. In some tests, the increase was 10-20%. But the iteration speed drops significantly, 1.5-2.5 times in my tests. I decided to leave the optimization for now.

In Java, the main types of tables are HashMap and TreeMap. For a table, an object → a node uses HashMap by default. However, you can use TreeMap with a task-specific Comparator. In this case, indexOf (obj) and remove (obj) will search / delete the object that is equal to the specified object according to the Comparator code. For example, we store a list of users, and the comparator compares users only by name. Then we can answer the question “What positions of the list are users with the name 'Napoleon?'”. Or remove all Napoleons from the list :).

The structure does not support null. You can fix it, but there is no feeling that it is necessary.

Regarding the fact that the structure “knows everything”, I, of course, was a little misleading. Of course, when working with single elements, everything is fine and under certain conditions even for the logarithm. However, she does not know some things that other structures can. For example, a Cartesian tree with an implicit key, there were articles about it on the hub It does not know how to quickly do indexOf, but it knows how to make a sublist and concatenate two lists into one for the logarithm (on average, not guaranteed), plus it can be made persistent.

Performance


In Java, performance is usually measured using the jmh framework. Tests were conducted on the 2017 MacBook Pro under Java11.

I compared the performance of the standard ArrayList, TreeList from apache common-collections, and my two classes IndexedTreeList and IndexedTreeListSet in several scenarios. In each scenario, 1000 operations of the same type were performed, so the result should be multiplied by 1000.

Code under the spoiler
@Fork(1)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
public class PerformanceCompare {

    public static final Map<String, Class> CLASSES = Stream.of(TreeList.class, IndexedTreeListSet.class, IndexedTreeList.class,
            ArrayList.class)
            .collect(Collectors.toMap(c -> c.getSimpleName(), c -> c));

    public static final int ITERATIONS = 1000;

    @State(Scope.Benchmark)
    public static class Plan {

        @Param({"10", "100", "1000", "10000", "100000", "1000000"/*, "10000000"*/})
        public int size;

        @Param({"ArrayList", "TreeList", "IndexedTreeList", "IndexedTreeListSet"})
        public String className;

        private Random random;
        private List<Integer> list;

        @Setup
        public void init() throws IllegalAccessException, InstantiationException {
            random = new Random();
            list = (List<Integer>) CLASSES.get(className).newInstance();

            for (int i = 0; i < size; i++) {
                list.add(i);
            }
        }
    }


    @Benchmark
    public void indexOfKnown(Plan plan, Blackhole blackhole) {
        List<Integer> list = plan.list;
        Random random = plan.random;
        int value = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            value = list.indexOf(random.nextInt(plan.size));
        }
        blackhole.consume(value);
    }

    @Benchmark
    public void indexOfUnknown(Plan plan, Blackhole blackhole) {
        List<Integer> list = plan.list;
        Random random = plan.random;
        int value = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            value += list.indexOf(random.nextInt());
        }
        blackhole.consume(value);
    }

    @Benchmark
    public void addRemoveRandom(Plan plan, Blackhole blackhole) {
        List<Integer> list = plan.list;
        Random random = plan.random;
        int value = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            list.add(random.nextInt(list.size() + 1), random.nextInt());
            value += list.remove(random.nextInt(list.size()));
        }
        blackhole.consume(value);
    }

    @Benchmark
    public void get(Plan plan, Blackhole blackhole) {
        List<Integer> list = plan.list;
        Random random = plan.random;
        int value = 0;
        for (int i = 0; i < ITERATIONS; i++) {
            value += list.get(random.nextInt(list.size()));
        }
        blackhole.consume(value);
    }

    @Timeout(time = 1, timeUnit = TimeUnit.MILLISECONDS)
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(PerformanceCompare.class.getSimpleName())
                .forks(1)
//                .jvmArgs("-Xms2048m", "-Xmx2048m", "-XX:MaxDirectMemorySize=512M")
                .build();

        new Runner(opt).run();
    }
}


To begin with, I compared the speed of getting a random item from a list. I’ll warn you right away that in this test the overhead is very significant. Results approaching 100,000 * 1,000 operations per second are severely distorted.

Get test result
PerformanceCompare.get                       ArrayList       10  thrpt    5  79865.412 ± 10145.202  ops/s
PerformanceCompare.get                       ArrayList      100  thrpt    5  81862.243 ±   983.727  ops/s
PerformanceCompare.get                       ArrayList     1000  thrpt    5  81033.507 ±  4540.206  ops/s
PerformanceCompare.get                       ArrayList    10000  thrpt    5  64096.123 ±  1430.361  ops/s
PerformanceCompare.get                       ArrayList   100000  thrpt    5  41289.491 ± 11286.114  ops/s
PerformanceCompare.get                       ArrayList  1000000  thrpt    5   8598.944 ±  2048.461  ops/s
PerformanceCompare.get                        TreeList       10  thrpt    5  33912.275 ±  3754.284  ops/s
PerformanceCompare.get                        TreeList      100  thrpt    5  21346.854 ±   863.588  ops/s
PerformanceCompare.get                        TreeList     1000  thrpt    5  14808.414 ±   508.098  ops/s
PerformanceCompare.get                        TreeList    10000  thrpt    5   8679.384 ±   109.250  ops/s
PerformanceCompare.get                        TreeList   100000  thrpt    5   4605.998 ±  1028.945  ops/s
PerformanceCompare.get                        TreeList  1000000  thrpt    5   2241.381 ±   768.147  ops/s
PerformanceCompare.get                 IndexedTreeList       10  thrpt    5  34054.357 ±  3682.829  ops/s
PerformanceCompare.get                 IndexedTreeList      100  thrpt    5  21934.002 ±  2339.947  ops/s
PerformanceCompare.get                 IndexedTreeList     1000  thrpt    5  14626.691 ±   369.893  ops/s
PerformanceCompare.get                 IndexedTreeList    10000  thrpt    5   7386.863 ±   342.150  ops/s
PerformanceCompare.get                 IndexedTreeList   100000  thrpt    5   4562.126 ±   352.772  ops/s
PerformanceCompare.get                 IndexedTreeList  1000000  thrpt    5   2105.718 ±   702.064  ops/s
PerformanceCompare.get              IndexedTreeListSet       10  thrpt    5  33317.503 ±  2307.829  ops/s
PerformanceCompare.get              IndexedTreeListSet      100  thrpt    5  21247.440 ±  1253.386  ops/s
PerformanceCompare.get              IndexedTreeListSet     1000  thrpt    5  14665.557 ±   487.833  ops/s
PerformanceCompare.get              IndexedTreeListSet    10000  thrpt    5   7667.214 ±    80.093  ops/s
PerformanceCompare.get              IndexedTreeListSet   100000  thrpt    5   3454.023 ±    82.994  ops/s
PerformanceCompare.get              IndexedTreeListSet  1000000  thrpt    5   1768.701 ±    35.878  ops/s


Here, oddly enough, the greatest interest is the standard ArrayList. Theoretically, the speed of getting out of it should be constant and not depend on the number of elements. In practice, the performance first holds around 90,000 * 1000 operations per second (remember the overhead), but with a list length of several thousand items it starts to sag. This is due to the increasingly frequent cache miss: the processor cache does not have the necessary data and more and more often you need to go for data in RAM. With a million elements, the speed of the test is 10 times lower, but in practice, the performance drawdown is even greater.

TreeList, IndexedTreeList and IndexedTreeListSet expectedly show similar results. Expected much slower than ArrayList. Even with a small number of elements, TreeList is several times slower than ArrayList, although the test shows the difference only 2 times.

The next test is addRemoveRandom. Here, in each test, I insert an element into a random position and remove an element from a random position.

AddRemoveRandom test result
PerformanceCompare.addRemoveRandom           ArrayList       10  thrpt    5  12440.764 ±   485.642  ops/s
PerformanceCompare.addRemoveRandom           ArrayList      100  thrpt    5   9880.123 ±   464.014  ops/s
PerformanceCompare.addRemoveRandom           ArrayList     1000  thrpt    5   5288.905 ±  1219.055  ops/s
PerformanceCompare.addRemoveRandom           ArrayList    10000  thrpt    5   1024.942 ±   179.366  ops/s
PerformanceCompare.addRemoveRandom           ArrayList   100000  thrpt    5     91.219 ±    25.380  ops/s
PerformanceCompare.addRemoveRandom           ArrayList  1000000  thrpt    5      5.499 ±     0.400  ops/s
PerformanceCompare.addRemoveRandom            TreeList       10  thrpt    5   6242.607 ±   350.290  ops/s
PerformanceCompare.addRemoveRandom            TreeList      100  thrpt    5   3117.945 ±   116.066  ops/s
PerformanceCompare.addRemoveRandom            TreeList     1000  thrpt    5   1829.778 ±    80.516  ops/s
PerformanceCompare.addRemoveRandom            TreeList    10000  thrpt    5   1230.077 ±    53.381  ops/s
PerformanceCompare.addRemoveRandom            TreeList   100000  thrpt    5    443.571 ±    69.207  ops/s
PerformanceCompare.addRemoveRandom            TreeList  1000000  thrpt    5    308.963 ±    84.077  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList       10  thrpt    5   3556.511 ±   144.596  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList      100  thrpt    5   2120.777 ±    83.848  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList     1000  thrpt    5   1211.112 ±    92.288  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList    10000  thrpt    5    789.458 ±    19.450  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList   100000  thrpt    5    302.989 ±    40.030  ops/s
PerformanceCompare.addRemoveRandom     IndexedTreeList  1000000  thrpt    5    178.822 ±    92.853  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet       10  thrpt    5   4138.007 ±   119.943  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet      100  thrpt    5   2435.803 ±    20.276  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet     1000  thrpt    5   1445.054 ±   276.909  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet    10000  thrpt    5    972.256 ±    19.987  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet   100000  thrpt    5    366.608 ±    94.487  ops/s
PerformanceCompare.addRemoveRandom  IndexedTreeListSet  1000000  thrpt    5    227.677 ±    48.276  ops/s


It could be assumed that ArrayList is faster on small lists. However, the fact that he wins in this test on lists of up to 10,000 elements looks interesting. Apparently, System.arrayCopy is very well optimized and uses all the features of modern processors. Starting at 10,000 items, specialized data structures are starting to win. With 1,000,000 elements, the speed difference is 30-50 times.

IndexedTreeList and IndexedTreeListSet are expected to be slower than TreeList. About 1.5 - 2 times.

The remaining 2 tests indexOfKnown and indexOfUnknown should just demonstrate the main feature of this structure. The difference between the tests is that in one case we are looking for an element that is in the list, and in the other case we are looking for an element that is not in the list.

Test result indexOfKnown and indexOfUnknown
PerformanceCompare.indexOfKnown              ArrayList       10  thrpt    5  41424.356 ±   549.047  ops/s
PerformanceCompare.indexOfKnown              ArrayList      100  thrpt    5  17216.477 ±  1444.744  ops/s
PerformanceCompare.indexOfKnown              ArrayList     1000  thrpt    5   2296.306 ±    76.372  ops/s
PerformanceCompare.indexOfKnown              ArrayList    10000  thrpt    5    233.863 ±    26.926  ops/s
PerformanceCompare.indexOfKnown              ArrayList   100000  thrpt    5     23.208 ±     2.776  ops/s
PerformanceCompare.indexOfKnown              ArrayList  1000000  thrpt    5      0.919 ±     0.455  ops/s
PerformanceCompare.indexOfKnown               TreeList       10  thrpt    5  26740.708 ±  1323.125  ops/s
PerformanceCompare.indexOfKnown               TreeList      100  thrpt    5   5670.923 ±    99.638  ops/s
PerformanceCompare.indexOfKnown               TreeList     1000  thrpt    5    745.408 ±    26.827  ops/s
PerformanceCompare.indexOfKnown               TreeList    10000  thrpt    5     52.288 ±     1.362  ops/s
PerformanceCompare.indexOfKnown               TreeList   100000  thrpt    5      4.224 ±     0.855  ops/s
PerformanceCompare.indexOfKnown               TreeList  1000000  thrpt    5      0.193 ±     0.052  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList       10  thrpt    5  34485.128 ±  1582.703  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList      100  thrpt    5  29209.412 ±  1544.268  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList     1000  thrpt    5  21139.584 ±  1442.867  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList    10000  thrpt    5  12544.306 ±   312.097  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList   100000  thrpt    5   3538.201 ±   272.537  ops/s
PerformanceCompare.indexOfKnown        IndexedTreeList  1000000  thrpt    5   1420.119 ±   538.476  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet       10  thrpt    5  39201.995 ±  1887.065  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet      100  thrpt    5  34204.112 ±  1122.517  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet     1000  thrpt    5  25374.557 ±  1596.746  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet    10000  thrpt    5  14291.317 ±   391.180  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet   100000  thrpt    5   4215.898 ±   283.680  ops/s
PerformanceCompare.indexOfKnown     IndexedTreeListSet  1000000  thrpt    5   1729.100 ±  1260.815  ops/s
PerformanceCompare.indexOfUnknown            ArrayList       10  thrpt    5  59053.313 ±  1845.665  ops/s
PerformanceCompare.indexOfUnknown            ArrayList      100  thrpt    5  10867.572 ±   142.823  ops/s
PerformanceCompare.indexOfUnknown            ArrayList     1000  thrpt    5   1186.583 ±    28.003  ops/s
PerformanceCompare.indexOfUnknown            ArrayList    10000  thrpt    5    120.953 ±     4.146  ops/s
PerformanceCompare.indexOfUnknown            ArrayList   100000  thrpt    5     11.936 ±     0.320  ops/s
PerformanceCompare.indexOfUnknown            ArrayList  1000000  thrpt    5      0.566 ±     0.335  ops/s
PerformanceCompare.indexOfUnknown             TreeList       10  thrpt    5  28134.237 ±  2291.670  ops/s
PerformanceCompare.indexOfUnknown             TreeList      100  thrpt    5   3153.930 ±   158.734  ops/s
PerformanceCompare.indexOfUnknown             TreeList     1000  thrpt    5    322.383 ±    44.245  ops/s
PerformanceCompare.indexOfUnknown             TreeList    10000  thrpt    5     25.674 ±     1.787  ops/s
PerformanceCompare.indexOfUnknown             TreeList   100000  thrpt    5      1.867 ±     0.291  ops/s
PerformanceCompare.indexOfUnknown             TreeList  1000000  thrpt    5      0.093 ±     0.008  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList       10  thrpt    5  66625.126 ±  5232.668  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList      100  thrpt    5  70038.055 ±  5803.848  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList     1000  thrpt    5  63240.467 ±   885.956  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList    10000  thrpt    5  54731.988 ±  3950.150  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList   100000  thrpt    5  22049.476 ±   821.924  ops/s
PerformanceCompare.indexOfUnknown      IndexedTreeList  1000000  thrpt    5   9459.862 ±   804.738  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet       10  thrpt    5  70274.968 ± 15830.355  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet      100  thrpt    5  71017.685 ±  6920.447  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet     1000  thrpt    5  66405.960 ±  1127.231  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet    10000  thrpt    5  57983.963 ±  3276.142  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet   100000  thrpt    5  41277.110 ±  9919.893  ops/s
PerformanceCompare.indexOfUnknown   IndexedTreeListSet  1000000  thrpt    5   9840.185 ±  2159.352  ops/s


Here, ArrayList and TreeList have almost no surprises. With increasing size, the speed decreases almost linearly. The search for an item from a non-list is expected to be 2 times slower than the search for an item from the list, because you need to go through the entire array instead of half on average.

But IndexedTreeList and IndexedTreeListSet here show the expected good result. These data structures show an indexOf execution speed comparable to ArrayList even with 10 elements. With 1000 elements, these structures are 10 times faster, with 1,000,000 faster 1000 times. When searching for an item that is not in the list, they are expected to give better speed than when searching for an item from the list.

What else is interesting to pay attention to is the performance subsidence of IndexedTreeList and IndexedTreeListSet in the indexOfUnknown test. Here the situation is similar to that in the test with ArrayList.get. Theoretically, we should not have gotten a drop in performance, but in practice, because of cache miss, we got it, moreover, significantly.

Instead of a conclusion


I still don’t know if the proposed structure has a novelty or not. On the one hand, the idea is not complicated if you know how the tree works by an implicit key. On the other hand, I have not seen a description of a structure with such properties. And if so, then it makes sense to make the structure more famous, it might be useful to someone.

But even if this is another bike, I tried to make it useful. A pull request in common-collections has been created, but at the time of writing this article is not yet poured. Knowing how slowly everything can happen in open source, I won’t be surprised if the process drags on for months.

Somewhat surprised by the result of comparing the performance of ArrayList and TreeList. Tests showed that TreeList does not make sense to use up to 10,000 elements on the list size. It would be interesting to try b-tree instead of a binary tree. This structure should use memory more carefully and, most likely, work faster. And for it you can adapt the idea with indexing.

In any case, it’s fun to have an instrument in the arsenal that can (almost) do everything with predictable complexity.

References


Original
Pull request project in apache common-collections
Ticket in Jira

All Articles