Weak heap sorting


Of the entire zoo heaps, this structure is perhaps the most unusual. At the same time, the elegant simplicity of the algorithm is quite a match for its amazing eccentricity.

When sorting using a weak heap, there are always fewer comparisons and exchanges than using a regular heap. So yes, a weak pile is stronger than an ordinary pile.
EDISON Software - web-development
This article was written with the support of EDISON.

We are engaged in the creation of embedded software as well as the development of web applications and sites .

We love the theory of algorithms! ;-)

Weak pile


A regular heap is a sorting tree in which any parent is greater (or equal) than any of its descendants. In a weak heap, this requirement is weakened - any parent is greater than (or equal to) any descendant only from its right subtree. In the left subtree, the descendants can be both smaller and larger than the parent, there you are so lucky.


This approach can significantly reduce the cost of maintaining the data set in a heap state. After all, it is necessary to ensure the principle “a descendant is no more than a parent” not for the whole structure, but only for half of it. At the same time, a weak heap, not being a 100% sorting tree, sorts no worse than an ordinary heap, and in some ways even better. Did half the battle - walk boldly!

Since the root of the heap, even a weak one, needs exactly the largest element, the root of the left subtree does not.

Minimizing the number of comparisons



Ronald D. Dutton, a specialist in algorithms and graph theory, presented us with a weak bunch in 1993. A conceptually weak pile is more difficult to understand (but this difficulty is more likely not in complexity, but in extravagance, you have to break your consciousness patterns through the knee) than an ordinary pile, so it has not received much practical distribution. However, when Dutton invented this structure, he not only wanted to practice abstract abstractions, but he pursued a completely pragmatic goal.

There is a theoretical lower limit for estimating the minimum number of comparisons (in those sortings in which these comparisons are widely used):

log n ! = n log n - n / ln 2 + O (log n), where 1 / ln 2 = 1.4426

In sorting by a weak heap, the number of comparisons is minimized and is close enough to the lower limit.

This can be of practical importance if you need to arrange objects whose comparison is expensive, for example, if it comes to sorting long strings.

Juggling descendants


“Left” and “right” in a weak heap is a situational phenomenon. A subtree can be either a left or a right descendant for its parent node - moreover, this “left / right” relation for the subtree and the parent can repeatedly switch from one value to the opposite during the process.

To indicate for the parent who has his right son and who is his left daughter is not simple, but very simple. To do this, you need an additional bitmap (consisting of only 0/1 values) for those nodes that have descendants.

Recall how the index i -th parent element, we define the indices of its left and right descendant in a conventional pile (indices of the array measured from zero):

Left descendant 2 × i + 1
Right descendant 2 × i + 2

In a weak heap, we have a cherry on the cake - a root that has only the right subtree, so we will adjust these formulas for the descendant indices by adding a reverse shift to 1 position:

Left descendant: 2 × i
Right descendant: 2 × i + 1

And finally , needed additional bitmap (call it bIT ), wherein for i -th element noted was whether the exchange places between its left and right subtrees. If the value for an element is 0, then there was no exchange. If the value is 1, then the left and right children go in the opposite order. And the formulas in this case are as follows:

Left descendant: 2 × i + BIT [ i ]
Right descendant: 2 × i + 1 - BIT [ i ]

This is how it looks. Elements whose descendants are located “vice versa” are highlighted in blue. The values ​​in the BIT array for them are 1.


You can check by substituting the parent values i and the corresponding 0/1 from the BIT array in the descendant formulas - the indices of the descendants will turn out as needed.

As you can see, for any parent to swap the left and right subtrees, in the array itself, the group of elements does not need to be moved anywhere. Only the 0/1 value for the parent in the BIT array is switched and that’s it.

Next - a session of magic with its subsequent exposure.

Build a weak pile


Swapping left and right descendants is the main tool for converting into a weak heap a set of data from an array.

In the process of forming the primary weak heap, we need to sort through the elements of the array in the reverse order (starting from the last) and for each of them find up the branch of the first (right) parent, for which it will be the right subtree.

If the element is someone's right descendant , then you do not have to go far. Its immediate parent is what you need:


If the element is someone’s left descendant , then you need to go up a number of levels before you meet the desired grandparent, for which the element is in the right subtree:



Then you need to compare the descendant and the progenitor found somewhere above. And if the descendant is larger than the progenitor, then the following must be done:

  1. If the descendant has his descendants, then swap his left and right subtrees (i.e. switch 0/1 in the BIT array for this element).
  2. Exchange the values ​​of the descendant node and the progenitor node.

Take a look at a specific example. Let’s say the following situation arose:


For the element of the array A [6] = 87, the necessary progenitor A [1] = 76 was found.
The progenitor A [1] is smaller than the element A [6] (76 <87).
Element A [6] has left and right subtrees (marked in shades of green).
You need to swap these subtrees
(that is, for element A [6] in the BIT array, change the value from 0 to 1).
It is also necessary to exchange the values ​​of the elements A [6] and A [1].


After the necessary actions are completed:


For element A [6], the left and right subtrees were exchanged
(that is, in the BIT array for element A [6], the value from 0 was changed to 1).
There was also an exchange of values ​​between A [6] and A [1].


If you go through the array from the end to the beginning and along the way perform this procedure for all elements, you will end up with a weak heap.

Why this strange mechanism works is an explanation closer to the end of the article.

Parsing a weak pile


A heap is a heap if the maximum element is at the root. Using this fact, all heap sortings work the same way. The root (where the maximum is located) exchanges values ​​with the last element of the unsorted part of the array - as a result, the unsorted part of the array decreases, and the sorted part of the array increases. After this exchange, the heap ceases to be a heap, since the current maximum element is no longer in its root. The heap needs to be restored, that is, make the resulting tree a heap again - find another maximum element and move it to the root.

How to recover a regular binary heap, we know - with the help of a sifter . But how to restore a weak heap? To do this, do the following.

From the root we descend down the left descendants (right up to the lowest):


Then we go up the left descendants back up and along the way each left descendant is compared with an element in the heap root. And if the next left descendant is larger than the root, then we do the same as in the previous stage: at the left descendant we swap the subtrees (if he has one) and change the values ​​of the left descendant and root.

As a result, we will restore the weak heap - the maximum element that is in the remaining tree will pop up at its root.

And again, we have this mystical carousel with subtrees that replace each other. So what is the secret to success? Why, if during the exchange of nodes with values, the left and right descendants of the lower node are interchanged, do we end up with a sorted array? You will never guess, although the answer is simple in its genius.

Weak heap sorting :: Weak heap sort


So, the final algorithm:

  • I. :
    • I.1. -.
    • I.2. «» .
    • I.3. .
    • I.4. , :
      • I.4.. ( ⇔ ) , .
      • I.4.. «» .
  • II. , :
    • II.1. .
    • II.2. . .
    • II.3. , . :
      • II.3.. .
      • II.3.. , .
      • II.3.. , , :
        • II.3.c. 1. We swap (left ⇔ right) subtrees with descendants for the node in which the current left descendant is located.
        • II.3.c.2. We change the heap root and the node with the current left child.
    • II.4. At the root of the weak heap is again the maximum element for the remaining unsorted part of the array. We return to paragraph II.1 and repeat the process until all the elements are sorted.


Animation (array indices in my animations start with one):



C ++ Code


At the bottom of the “Links” section, those interested can familiarize themselves with the implementation of this sorting in C ++. Here I give only the part that illustrates the algorithm itself.

#define GETFLAG(r, x) ((r[(x) >> 3] >> ((x) & 7)) & 1)
#define TOGGLEFLAG(r, x) (r[(x) >> 3] ^= 1 << ((x) & 7))

void WeakHeap::WeakHeapMerge(unsigned char *r, int i, int j) {
  if (wheap[i] < wheap[j]) {//""  ?
    //  ,   
    //( "",   "")
    TOGGLEFLAG(r, j);
    //  ""  
    swap(wheap[i], wheap[j]);
  }
}

void WeakHeap::WeakHeapSort() {
  int n = Size();
  if(n > 1) {
		
    int i, j, x, y, Gparent;
    int s = (n + 7) / 8;
    unsigned char * r = new unsigned char [s];
		
    //  ,    
    // "",   ""
    for(i = 0; i < n / 8; ++i) r[i] = 0;
		
    //   
    for(i = n - 1; i > 0; --i) {
      j = i;
      //    , 
      //   ""  
      while ((j & 1) == GETFLAG(r, j >> 1)) j >>= 1;
      //       ""  
      Gparent = j >> 1;
      //  ,   
      //   ""
      WeakHeapMerge(r, Gparent, i);
    }
		
    //      -->
    //  -->    
    for(i = n - 1; i >= 2; --i) {
      //      
      //       
      swap(wheap[0], wheap[i]);
      x = 1;
      //    "" 
      while((y = 2 * x + GETFLAG(r, x)) < i) x = y;
      //  ""     
      //        
      while(x > 0) {
        WeakHeapMerge(r, 0, x);
        x >>= 1;
      }
    }
    //  -   
    //    
    swap(wheap[0], wheap[1]);
    delete[] r;
  }
}

I especially like how the binary tree is traversed easily and naturally using bitwise operations.

Extra memory complexity


It seems like O ( n ) - an additional array is required, in which for nodes with descendants (there are about half of those in the array), the order of the left / right subtrees is fixed.

However, there is an opinion that the sorting complexity here is actually O (1)! For an element, we need only one additional bit (zero / one) to indicate the order of the descendants. If we sort, for example, strings, then it is quite feasible to append this additional bit to the element itself.

Another way to turn O ( n ) into O (1) is to store flags in a whole number. Binary expansion of numbers - a set of zeros and ones responsible for the order of subtrees of all elements of the array. The i- th element of the array corresponds to the i- th bit of the number.

Time complexity


By time, O ( n log n ) is the same as a regular heap. When sorting rows (especially long ones), a weak heap can be faster than a regular heap. But this is if we sort the long lines. If we sort the numbers, then, according to rumors, an ordinary heap is faster to manage.

Full sifting up


At the stage of formation of the initial weak heap, by analogy with the usual heap, the quite obvious idea suggests raising large elements as high as possible. That is, if we exchanged the values ​​of the lower node and its ancestor, then it would be logical to immediately repeat the steps for the ancestor - to find his nearest right ancestor for him and compare (and if it is also necessary to exchange values ​​+ exchange of subtrees). And, if possible, raise a large element to the very root. This is how it looks at the first stage (the actions at the second stage of the algorithm are unchanged):


The time complexity score remains the same.

Binomial pile


All that was up to this point is a deception, an illusion. Of course, we formally perform some manipulations with the binary tree there, change the nodes with values, rearrange the subtrees and all that. However, the algorithm has a double bottom, which we will now look into.

To understand this sort, you need to understand what a weak heap really is.

If we take an array in which the number of elements is a power of two, then the weak heap and binomial heap constructed on its basis are isomorphic.


Do not look at the fact that a weak heap is binary, and binomial is not. In a weak heap, the left and right subtrees are essentially different. The right subtree is a descendant in the classical sense, but the left subtree is rather a “brother”. Although not. The left subtree is not even a “brother”, but a vector of “brothers” with fewer nodes.

However, the weak heap and binomial heap are not 100% the same, although they are the closest relatives. The difference is obvious, if you take an array whose number of elements is not equal to 2 n . Binomial decomposition of such an array will give a connected list of several ideal heaps (the number of nodes in each of them is a certain power of two):


A weak heap in this case will be one imperfect binary tree:



The binomial pile and the weak pile are twin brothers. DNA is the same, although in appearance you can’t tell.

Secret algorithm


Given that a weak heap is a cryptobinomial heap, shuffling subtrees suddenly finds a simple explanation.


With a weak heap, sweep away the pseudobinary tinsel and look at the real relations between the nodes in the binomial heap style. Everything becomes clear.

In fact:

  1. There is no "weakness", it is a full-fledged sorting (non-binary) tree in which the principle "any parent is greater than any of its descendants" is achieved and maintained.
  2. At all stages, we compare the descendants not with their ancestors, but with their immediate parents.
  3. What looks like an exchange of values ​​between a descendant and an ancestor + an exchange of places of subtrees in a descendant - it turns out to be the exchange of the ratio itself (descendant / parent). If the parent node is smaller than the descendant by value, then the parent itself becomes the descendant, and the descendant becomes the parent.

Here is an honest visualization:



In the next series


The next pile that I would like to talk about is my favorite - the Cartesian tree. This is not only a bunch, but also a binary search tree . But then first, in the next article, something interesting needs to be clarified about BST trees. And only then, through the article, and about Cartesian talk.

References


Weak Heap , Binomial Heap / Binomial

Heap C ++ Weak Heap Implementation

Ronald D. Dutton: Personal Page , UCF Website Profile

Weak Heaps and Friends: Recent Developments

The Weak-Heap Data Structure: Variants and Applications

On the Performance of WEAK-HEAPSORT

Adaptive heapsort: Source code

Sergey Kopeliovich - Lecture Hall - Weak pile (from 48:32 to 1:16:06)

Series Articles:




Today’s sorting is added to the AlgoLab application by a weak bunch, who uses it - update the excel file with macros.

In the comments to the cell with the name of the sort, you can specify some settings. If you set siftup = 1, then the sorting will use the full screening up in the first stage (by default siftup = 0).

If you prescribe binomial = 1, then the tree will be a la “binomial heap” (by default binomial = 0, that is, just a weak heap).

All Articles