Periodically, the task arises of selecting related data for a set of keys until the desired total number of records is reached. The most "lifelike" example is displaying the 20 oldest tasks across a list of employees (for example, within one department). A similar query is often needed for various managerial "dashboards" with brief summaries of work areas.
In this article we will look at a PostgreSQL implementation of a "naive" solution to this problem, a "smarter" one, and a rather complex SQL "loop" algorithm with an exit condition based on the data found so far, which can be useful both for general development and for other similar cases.

Let's take the test data set from the previous article. So that the displayed records do not "jump around" whenever the sorted values coincide, we extend the subject index by adding the primary key to it. This immediately makes the index unique and guarantees us an unambiguous sort order:

CREATE INDEX ON task(owner_id, task_date, id);
DROP INDEX task_owner_id_task_date_idx;
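For reference, here is a minimal sketch of the assumed test table; the actual schema and data generation are described in the previous article, and only the columns used by the queries here are shown:

CREATE TABLE task (
  id        serial PRIMARY KEY -- makes the sort order unique
, owner_id  integer            -- the task's performer
, task_date date               -- ordering field; could equally be a timestamp
);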
Write it the way it sounds
First, let's sketch the simplest version of the query, passing the performers' IDs as an array input parameter:

SELECT
*
FROM
task
WHERE
owner_id = ANY('{1,2,4,8,16,32,64,128,256,512}'::integer[])
ORDER BY
task_date, id
LIMIT 20;
[look at explain.tensor.ru]

It's a little sad: we ordered only 20 records, yet the Index Scan returned 960 rows, which then still had to be sorted... Let's try to read a little less.

unnest + ARRAY
The first observation that will help us: if we need only 20 sorted records, then it is enough to read no more than 20 records, sorted in the same order, for each individual key. Conveniently, we already have a suitable index on (owner_id, task_date, id).

We will use the same mechanism for extracting a whole table row and "unfolding it into columns" as in the previous article, and fold the rows into an array with the ARRAY() function:

WITH T AS (
SELECT
unnest(ARRAY(
SELECT
t
FROM
task t
WHERE
owner_id = unnest
ORDER BY
task_date, id
LIMIT 20
)) r
FROM
unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
(r).*
FROM
T
ORDER BY
(r).task_date, (r).id
LIMIT 20;
[look at explain.tensor.ru]

Oh, much better already! 40% faster, and 4.5 times less data had to be read.

Materializing table rows via CTE

Note that on some PostgreSQL versions, if the extracted record is not "materialized" in a CTE but is "unfolded" directly via (...).*, the planner re-evaluates the subquery once per output field, with a separate InitPlan for each of them:
SELECT
((
SELECT
t
FROM
task t
WHERE
owner_id = 1
ORDER BY
task_date, id
LIMIT 1
).*);
Result (cost=4.77..4.78 rows=1 width=16) (actual time=0.063..0.063 rows=1 loops=1)
Buffers: shared hit=16
InitPlan 1 (returns $0)
-> Limit (cost=0.42..1.19 rows=1 width=48) (actual time=0.031..0.032 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using task_owner_id_task_date_id_idx on task t (cost=0.42..387.57 rows=500 width=48) (actual time=0.030..0.030 rows=1 loops=1)
Index Cond: (owner_id = 1)
Buffers: shared hit=4
InitPlan 2 (returns $1)
-> Limit (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.009 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using task_owner_id_task_date_id_idx on task t_1 (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
Index Cond: (owner_id = 1)
Buffers: shared hit=4
InitPlan 3 (returns $2)
-> Limit (cost=0.42..1.19 rows=1 width=48) (actual time=0.008..0.008 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using task_owner_id_task_date_id_idx on task t_2 (cost=0.42..387.57 rows=500 width=48) (actual time=0.008..0.008 rows=1 loops=1)
Index Cond: (owner_id = 1)
Buffers: shared hit=4"
InitPlan 4 (returns $3)
-> Limit (cost=0.42..1.19 rows=1 width=48) (actual time=0.009..0.009 rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using task_owner_id_task_date_id_idx on task t_3 (cost=0.42..387.57 rows=500 width=48) (actual time=0.009..0.009 rows=1 loops=1)
Index Cond: (owner_id = 1)
Buffers: shared hit=4
The same record was "read" from the index 4 times, once per field... Up to PostgreSQL 11, "wrapping" the subquery in a CTE protects against this, since there a CTE is an unconditional fence for the optimizer.
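A minimal sketch of that "wrapping", with the caveat that on PostgreSQL 12+ a plain CTE may be inlined again, so the MATERIALIZED keyword (available since that version) is needed to keep the fence:

WITH T AS ( -- on PostgreSQL 12+: WITH T AS MATERIALIZED (
  SELECT
    (
      SELECT
        t
      FROM
        task t
      WHERE
        owner_id = 1
      ORDER BY
        task_date, id
      LIMIT 1
    ) r
)
SELECT
  (r).*
FROM
  T;

Now the (r).* expansion in the outer query unfolds the already-materialized record, and the inner index scan runs only once.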
Recursive accumulator
In the previous version we read a total of 200 rows for the 20 we needed. No longer 960, but can we read even less?

Let's try to use the knowledge that we need exactly 20 records; that is, we will keep reading data only until the required quantity is reached.

Step 1: the starting list
Obviously, our "target" list of 20 records has to begin with the "very first" record for one of our owner_id keys. Therefore, we first find such a "very first" record for each of the keys and put them into a list, sorted in the order we want: (task_date, id).
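In isolation, this step looks roughly like the following fragment; it is exactly the anchor part of the final query below:

SELECT
  array_agg(r ORDER BY (r).task_date, (r).id) list
FROM
  (
    SELECT
      ( -- the "very first" record for one key
        SELECT
          r
        FROM
          task r
        WHERE
          owner_id = unnest
        ORDER BY
          task_date, id
        LIMIT 1
      ) r
    FROM
      unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
  ) T;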
Step 2: find the “next” entries
Now, if we take the first record from our list and "step" further along the index while keeping the same owner_id key, then every record we find is the next one in the resulting selection; but only until we cross the sort key of the second record in the list. If it turns out that we have "crossed" the second record, the last record read must be put into the list in place of the first one (with the same owner_id), after which the list is sorted anew; the index "step" itself is shown in the fragment below.
Thus, at any moment we have at most one record in the list for each key (if a key's records run out before we "cross over", the first record simply drops out of the list and nothing is added), and the list is always sorted in ascending order of the sort key (task_date, id).
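The "step along the index" for the current first record rv corresponds to this fragment of the final query:

SELECT -- the next record of the same owner, strictly after rv in (task_date, id)
  _r
FROM
  task _r
WHERE
  owner_id = (rv).owner_id AND
  (task_date, id) > ((rv).task_date, (rv).id)
ORDER BY
  task_date, id
LIMIT 1;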
Step 3: filter and “expand” the records
In some rows of our recursive selection the rv record is duplicated: a record first shows up as "the one crossing the boundary of the 2nd entry in the list", and is then substituted as the 1st one in the list. So its first occurrence has to be filtered out.

The scary final query

WITH RECURSIVE T AS (
WITH wrap AS (
WITH T AS (
SELECT
(
SELECT
r
FROM
task r
WHERE
owner_id = unnest
ORDER BY
task_date, id
LIMIT 1
) r
FROM
unnest('{1,2,4,8,16,32,64,128,256,512}'::integer[])
)
SELECT
array_agg(r ORDER BY (r).task_date, (r).id) list
FROM
T
)
SELECT
list
, list[1] rv
, FALSE not_cross
, 0 size
FROM
wrap
UNION ALL
SELECT
CASE
WHEN X._r IS NOT DISTINCT FROM NULL THEN
T.list[2:]
WHEN X.not_cross THEN
T.list
WHEN T.list[2] IS NULL THEN
'{}'
ELSE (
SELECT
coalesce(T.list[2] || array_agg(r ORDER BY (r).task_date, (r).id), '{}')
FROM
unnest(T.list[3:] || X._r) r
)
END
, X._r
, X.not_cross
, T.size + X.not_cross::integer
FROM
T
, LATERAL(
WITH wrap AS (
SELECT
CASE
WHEN NOT T.not_cross
THEN T.list[1]
ELSE (
SELECT
_r
FROM
task _r
WHERE
owner_id = (rv).owner_id AND
(task_date, id) > ((rv).task_date, (rv).id)
ORDER BY
task_date, id
LIMIT 1
)
END _r
)
SELECT
_r
, CASE
WHEN list[2] IS NULL AND _r IS DISTINCT FROM NULL THEN
TRUE
ELSE
coalesce(((_r).task_date, (_r).id) < ((list[2]).task_date, (list[2]).id), FALSE)
END not_cross
FROM
wrap
) X
WHERE
T.size < 20 AND
T.list IS DISTINCT FROM '{}'
)
SELECT
(rv).*
FROM
T
WHERE
not_cross;
[look at explain.tensor.ru]

Thus, we traded about 20% of execution time for 50% fewer data reads. That is, if you have reason to believe that reads may be slow (for example, the data is often not in cache and has to be fetched from disk), this way you can spend noticeably less on reading.

In any case, the execution time came out better than in the "naive" first version. But which of these 3 options to use is up to you.