This article is about PostgreSQL 13 only. The feature freeze took place on April 8: only patches accepted before that date will be included in this version.

It is hard to call this release revolutionary: it contains no sweeping conceptual changes. Important patches such as the SQL/JSON functions and JSON_TABLE, which we had hoped to see in PG12 alongside the JSONPath patch, did not make it in time; a ready-made pluggable table storage engine has not appeared either, only the interface is being finalized. But the list of improvements is still impressive. We have prepared a fairly complete overview of the patches included in the "devil's dozen" (as release 13 is jokingly called).
Changes to SQL Commands
CREATE DATABASE ... LOCALE

The initdb and createdb utilities, as well as the CREATE COLLATION command, have a LOCALE option that sets LC_CTYPE and LC_COLLATE in one go. Now the same option is available in CREATE DATABASE:

CREATE DATABASE db_koi8r TEMPLATE template0
ENCODING 'KOI8R' LOCALE 'ru_RU.KOI8R';
ALTER VIEW ... RENAME COLUMN

A view column can now be renamed with the ALTER VIEW command. Previously this required re-creating the view.

Suppose you forgot to give a column a name:

CREATE VIEW uptime AS SELECT current_timestamp, current_timestamp - pg_postmaster_start_time();
SELECT * FROM uptime;
current_timestamp | ?column?
-------------------------------+-----------------
2020-03-23 15:37:00.088824+03 | 04:18:24.897856
It can be fixed:

ALTER VIEW uptime RENAME COLUMN "?column?" TO uptime;
SELECT * FROM uptime;
current_timestamp | uptime
-------------------------------+-----------------
2020-03-23 15:37:40.726516+03 | 04:19:05.535548
ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION

A generated table column can now be turned into a regular one, that is, its generation expression can be dropped:

CREATE TABLE payments (
id integer PRIMARY KEY,
amount numeric(18,2),
income_tax numeric(18,2) GENERATED ALWAYS AS (amount*0.13) STORED
);
INSERT INTO payments(id, amount) VALUES(1, 42);
\d payments
Table "public.payments"
Column | Type | Collation | Nullable | Default
------------+---------------+-----------+----------+--------------------------------------
id | integer | | not null |
amount | numeric(18,2) | | |
income_tax | numeric(18,2) | | | generated always as ((amount * 0.13))
stored
Indexes:
"payments_pkey" PRIMARY KEY, btree (id)
Later it was decided that income_tax should be set explicitly. Drop the expression:

ALTER TABLE payments ALTER COLUMN income_tax DROP EXPRESSION;
\d payments
Table "public.payments"
Column | Type | Collation | Nullable | Default
------------+---------------+-----------+----------+---------
id | integer | | not null |
amount | numeric(18,2) | | |
income_tax | numeric(18,2) | | |
Indexes:
"payments_pkey" PRIMARY KEY, btree (id)
Of course, the existing data in the column is preserved:

SELECT * FROM payments;
id | amount | income_tax
----+--------+------------
1 | 42.00 | 5.46
DROP DATABASE ... FORCE

If you want to drop a database without waiting for all users to disconnect, you can use the new FORCE option of the DROP DATABASE command.

CREATE DATABASE db;

Connect to the new database:

CREATE EXTENSION IF NOT EXISTS dblink;
SELECT dblink_connect('dbname=db');
dblink_connect
----------------
OK
And now drop it, forcibly terminating open connections (just as pg_terminate_backend would):

DROP DATABASE db WITH (FORCE);
ALTER TYPE ... SET STORAGE

The ALTER TYPE command can now change various properties of base data types, in particular the storage strategy. Previously it could only be set in CREATE TYPE.

For the demonstration we will not create a new base type, but use the existing tsquery. First, create a separate database and connect to it:

CREATE DATABASE db;
\c db
The plain storage strategy is used for the tsquery data type, so table columns of this type get the same strategy:

SELECT typname,typstorage FROM pg_type WHERE typname = 'tsquery';
typname | typstorage
---------+------------
tsquery | p
CREATE TABLE queries (query tsquery);
SELECT attname, attstorage FROM pg_attribute WHERE attrelid = 'queries'::regclass AND attname = 'query';
attname | attstorage
---------+------------
query | p
If you need to use a different strategy for new tables, you can alter the base type:

ALTER TYPE tsquery SET (storage=external);
SELECT typname,typstorage FROM pg_type WHERE typname = 'tsquery';
typname | typstorage
---------+------------
tsquery | e
The storage strategy in new tables will change too:

CREATE TABLE new_queries (query tsquery);
SELECT attname, attstorage FROM pg_attribute WHERE attrelid = 'new_queries'::regclass AND attname = 'query';
attname | attstorage
---------+------------
query | e
Keep in mind that a strategy that involves the use of TOAST cannot be changed back to plain:

ALTER TYPE tsquery SET (storage=plain);
ERROR: cannot change type's storage to PLAIN
That is why the experiments were carried out in a separate database, which can now be dropped without regret.

ALTER STATISTICS ... SET STATISTICS

The CREATE STATISTICS command allows collecting lists of the most common values for selected combinations of table columns. The number of most common values collected is determined by the default_statistics_target parameter. This value can now be changed for a specific statistics object with the command:

ALTER STATISTICS statistics_name SET STATISTICS new_value;
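A sketch of how this might look in practice (the table name `t` and statistics name `s` are made up for illustration):

```sql
CREATE TABLE t(a int, b int);
-- extended statistics: most common value combinations for (a, b)
CREATE STATISTICS s (mcv) ON a, b FROM t;
-- collect up to 500 common combinations for this statistics object only
ALTER STATISTICS s SET STATISTICS 500;
-- rebuild statistics with the new target
ANALYZE t;
```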
FETCH FIRST with the WITH TIES option

As you know, in the SELECT command the SQL-standard FETCH FIRST syntax can be used instead of LIMIT:

SELECT *
FROM (VALUES (1,'1.1'), (2,'2.1'), (2,'2.2'), (3,'3.1')) t(a,b)
ORDER BY a
FETCH FIRST 2 ROWS ONLY;
a | b
---+-----
1 | 1.1
2 | 2.1
(2 rows)
Now FETCH supports the WITH TIES clause, which adds to the output all "tied" rows (rows equal to the ones already selected, considering only the sort condition):

SELECT *
FROM (VALUES (1,'1.1'), (2,'2.1'), (2,'2.2'), (3,'3.1')) t(a,b)
ORDER BY a
FETCH FIRST 2 ROWS WITH TIES;
a | b
---+-----
1 | 1.1
2 | 2.1
2 | 2.2
(3 rows)
Built-in Functions and Data Types
gen_random_uuid

The new gen_random_uuid function returns a version 4 UUID (a random value):

SELECT gen_random_uuid();
gen_random_uuid
--------------------------------------
25e02793-80c0-438c-be07-c94b966c43ab
The function is useful for generating unique UUID values in distributed systems. Previously, the uuid-ossp or pgcrypto extension had to be used.

min_scale and trim_scale for values of type numeric

The min_scale function determines the number of significant digits in the fractional part of a number, and trim_scale discards insignificant zeros:

SELECT n, min_scale(n), trim_scale(n) FROM (VALUES(42.42000)) as t(n);
n | min_scale | trim_scale
----------+-----------+------------
42.42000 | 2 | 42.42
gcd and lcm

An addition to the set of mathematical functions: you can now quickly find the greatest common divisor (gcd) and the least common multiple (lcm):

SELECT gcd(54,24), lcm(54,24);
gcd | lcm
-----+-----
6 | 216
Aggregate functions min and max for the pg_lsn type

The min and max aggregate functions have been added for the pg_lsn data type, which allows queries such as:
SELECT min(restart_lsn) FROM pg_replication_slots;
SELECT min(sent_lsn) FROM pg_stat_replication;
Checking the type modifier of a function's return value

In previous versions, the type modifier of a function's return value was not checked.

Suppose there is a type for storing monetary amounts and a function that returns the amount of income tax:

CREATE TYPE currency AS (
amount numeric(18,2),
code text
);
CREATE FUNCTION income_tax(quantity currency) RETURNS currency
AS 'SELECT quantity.amount * 0.13, quantity.code' LANGUAGE SQL;
Calling the function, we expect two decimal places, but get four. Even an explicit cast after the function call does not help (third column):

SELECT amount, code, amount::numeric(18,2)
FROM income_tax(ROW(42.42, ''))\gx
-[ RECORD 1 ]--
amount | 5.5146
code |
amount | 5.5146
In version 13 the result is correct:

SELECT amount, code, amount::numeric(18,2)
FROM income_tax(ROW(42.42, ''))\gx
-[ RECORD 1 ]
amount | 5.51
code |
amount | 5.51
Localized names in to_date() and to_timestamp()

The to_date and to_timestamp functions have learned to understand localized month and weekday names. Previously only English names could be used. For example, with lc_time set to a Russian locale:

SELECT to_date('вторник, 24 марта 2020', 'TMDay, DD TMMonth YYYY');
to_date
------------
2020-03-24
normalize and IS NORMALIZED

To comply with the SQL standard, the normalize() function has been added for normalizing Unicode strings, along with the IS NORMALIZED predicate for checking whether a string is normalized.

SELECT str, normalize(str, NFC) AS nfc,
str IS NFC NORMALIZED AS is_nfc_normalized,
normalize(str, NFKC) AS nfkc,
str IS NFKC NORMALIZED AS is_nfkc_normalized
FROM (VALUES(U&'\0032\2075')) AS vals(str)\gx
-[ RECORD 1 ]------+---
str | 2⁵
nfc | 2⁵
is_nfc_normalized | t
nfkc | 25
is_nfkc_normalized | f
Read more about Unicode normalization forms in the Unicode documentation.

The xid8 type and the pg_current_xact_id() function for 64-bit transaction IDs

A new data type, xid8, has been added for 64-bit transaction IDs. No, this does not mean that PostgreSQL has switched to 64-bit transaction IDs: everything works exactly as before. But some functions now return the new type: for example, pg_current_xact_id() is recommended instead of the old txid_current(), which returned int8, and so on.

New polymorphic data types: the anycompatible family

The types anycompatible, anycompatiblearray, anycompatiblenonarray and anycompatiblerange have been added. Unlike the types of the anyelement family, the new types allow using not exactly the same, but actually compatible types.

In the following example, the maximum function receives arguments of types integer and numeric, both declared as anycompatible. The return value is cast to the type common to the two:

CREATE FUNCTION maximum(a anycompatible, b anycompatible) RETURNS anycompatible
AS 'SELECT CASE WHEN a > b THEN a ELSE b END' LANGUAGE SQL;
SELECT f, pg_typeof(f) FROM maximum(42, 42.42) f;
f | pg_typeof
-------+-----------
42.42 | numeric
Moreover, the anycompatible- and any- types are two independent sets of types:

CREATE FUNCTION first_elems(a anyarray, b anycompatiblearray)
RETURNS TABLE(a anyelement, b anycompatible) AS $$
SELECT ROW(a[array_lower(a,1)], b[array_lower(b,1)]);
$$ LANGUAGE sql;
SELECT first_elems(ARRAY[1,2,3], ARRAY['a','b','c']) AS str,
first_elems(ARRAY[1.1,2.2,3.3],ARRAY[true,false]) bool;
str | bool
-------+---------
(1,a) | (1.1,t)
Procedural Languages
Transform for type bool in PL/Perl

Just recently a TRANSFORM by Ivan Panchenko (Postgres Professional), bool_plperl, was committed. Postgres passes boolean values to PL/Perl as 't' or 'f', but for Perl 'f' is not a boolean false, just the letter f, which in a boolean context is true. This problem can be solved in different ways (see the mailing list discussion), but creating a TRANSFORM for bool is, according to Tom Lane, the most practical.

Fast execution of simple expressions in PL/pgSQL

Simple expressions (at least those not containing table accesses and not requiring locks) are now executed faster. Previously, in such cases time was unproductively spent calling the planner on each loop iteration.

CREATE OR REPLACE FUNCTION slow_pi() RETURNS double precision AS $$
DECLARE
a double precision := 1;
s double precision := 1;
r double precision := 0;
BEGIN
FOR i IN 1 .. 10000000 LOOP
r := r + s/a; a := a + 2; s := -s;
END LOOP;
RETURN 4*r;
END;
$$ LANGUAGE plpgsql;
Calling slow_pi() in PG12:

SELECT slow_pi();
slow_pi
--------------------
3.1415925535898497
(1 row)
Time: 13060.650 ms (00:13.061)
And now in PG13:

SELECT slow_pi();
slow_pi
--------------------
3.1415925535897915
(1 row)
Time: 2108.464 ms (00:02.108)
Trusted extensions instead of pg_pltemplate

The system catalog is one table smaller: pg_pltemplate has been dropped. It stored the properties of procedural languages needed while executing CREATE LANGUAGE. Now those properties have been moved from pg_pltemplate into the extension scripts of the corresponding languages, and the table itself is gone. But to make this possible, the database owner (without superuser rights) must be able to create a trusted language from an extension script. Indeed, to create, say, plpgsql, the database owner previously had to be a superuser.

It works as follows. A new boolean parameter, trusted, has appeared in the extension control file. If both the trusted and superuser parameters are enabled, the extension can be created not only by a superuser, but also by a user with the CREATE privilege on the current database (and therefore by its owner). When the script of such an extension runs, the rights of the superuser that initialized the cluster are used. The objects created by the extension will belong to that superuser, although the extension itself will be owned by the creating user.

Important consequences of these changes:

- Trusted extensions open the way for third-party developers to create other trusted languages. Currently we are limited to plpgsql, plperl and pltcl.
- pg_pltemplate hard-coded the fact that plpython refers to the second version of the language. Getting rid of pg_pltemplate is a (necessary, though not sufficient) step toward the transition to Python 3.
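For reference, a sketch of the relevant lines of a language extension's control file (the values below mirror the stock plpgsql extension; other lines omitted):

```ini
# <extension>.control
comment = 'PL/pgSQL procedural language'
default_version = '1.0'
superuser = true
trusted = true
```

With both superuser and trusted set, a non-superuser database owner can run CREATE EXTENSION for this language.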
Indices
B-tree deduplication

An important and long-awaited patch (the work on it began back in 2015), written by Anastasia Lubennikova (Postgres Professional) and Peter Geoghegan, was finally committed by Peter. Anastasia managed to talk about it at PGconf India. Postgres has learned to significantly reduce the size of B-tree indexes through deduplication, that is, by saving on duplicate index keys. These indexes were seriously reworked so that deduplication is possible without losing compatibility with previous index versions. The idea was borrowed from the more flexible architecture of GIN indexes (Generalized Inverted Index).

In those indexes, more often than in B-trees, a single key matches a large number of rows. With text processing, for example, the same token usually occurs in several documents, yet it is stored in the index only once. Until recently, B-tree indexes could not do that.

B-tree indexes differ from GIN primarily in their leaf pages. Depending on the number of rows with the same key value, there are options: if the list is small, the page holds a posting list, a list of TIDs (identifiers of the indexed rows); if there are many TIDs, then instead of lists of values new "tree branches" are stored, links to other pages of the posting-list kind or to other branches (these are called a posting tree).

Such a tree structure is similar to a B-tree but differs in essential details: for example, the lists for moving through pages of the same tree level are unidirectional in GIN, not bidirectional. That is one reason why good compatibility of the new deduplicated indexes with old versions was not easy to achieve. The improvements really did take more than three years.
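The per-index on/off switch mentioned below is the deduplicate_items storage parameter (on by default); a sketch with a made-up table and column:

```sql
-- deduplication is enabled by default in PG13
CREATE INDEX idx_dedup ON tickets(flight_id);

-- opt out for a particular index
CREATE INDEX idx_nodedup ON tickets(flight_id) WITH (deduplicate_items = off);
```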
It was also necessary to work out the cleanup mechanism (microvacuum) and other nuances. In performance tests, all indexes to which deduplication is applicable shrank by about a factor of three. Deduplication also helps unique indexes by mitigating index bloat when a table changes at a high rate. The new behavior can be switched on and off at the level of index settings.

Full GIN index scans are avoided where they are not necessary

This patch allows avoiding, in some cases, a full pass through the entire GIN index. Some operations, although supported by the GIN index, are performed by scanning it in full. Take, for example, an index for full-text search over a tsvector column. If the search query has the form "anything except a given word", the whole index has to be read. If the query also contains another condition that does not require a full index scan, the index was previously still scanned in full.

With the new optimization, the more selective condition is used first, so the index pays off, and the results are then rechecked against the other restriction. Compare the number of pages read (Buffers) in version 12:

EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMING OFF)
SELECT * FROM mail_messages WHERE tsv @@ to_tsquery('!vacuum') AND tsv @@ to_tsquery('analyze');
QUERY PLAN
------------------------------------------------------------------
Bitmap Heap Scan on mail_messages (actual rows=5864 loops=1)
Recheck Cond: ((tsv @@ to_tsquery('!vacuum'::text)) AND (tsv @@ to_tsquery('analyze'::text)))
Heap Blocks: exact=5167
Buffers: shared hit=24 read=27405
-> Bitmap Index Scan on mail_messages_tsv_idx (actual rows=5864 loops=1)
Index Cond: ((tsv @@ to_tsquery('!vacuum'::text)) AND (tsv @@ to_tsquery('analyze'::text)))
Buffers: shared hit=24 read=22238
Planning Time: 0.283 ms
Execution Time: 3258.234 ms
with the number of buffers in the new version:

EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMING OFF)
SELECT * FROM mail_messages WHERE tsv @@ to_tsquery('!vacuum') AND tsv @@ to_tsquery('analyze');
QUERY PLAN
---------------------------------------------------------------------------
Bitmap Heap Scan on mail_messages (actual rows=5864 loops=1)
Recheck Cond: ((tsv @@ to_tsquery('!vacuum'::text)) AND (tsv @@ to_tsquery('analyze'::text)))
Heap Blocks: exact=5156
Buffers: shared hit=5179
-> Bitmap Index Scan on mail_messages_tsv_idx (actual rows=5864 loops=1)
Index Cond: ((tsv @@ to_tsquery('!vacuum'::text)) AND (tsv @@ to_tsquery('analyze'::text)))
Buffers: shared hit=23
Planning Time: 0.250 ms
Execution Time: 8.779 ms
A similar situation can be encountered when using trigrams and when checking arrays for containment.

Parameters of operator classes

In PostgreSQL, many index access methods are a "framework" that takes over the high-level implementation of the search algorithm, the work with pages and locks, and the WAL. Binding to specific data types and operators is done via operator classes.

Until now, operator classes could not have parameters. For example, for full-text search a GiST index with the tsvector_ops operator class can be used (more about GiST operator classes here). This operator class uses a signature tree, and the signature length was fixed (124 bytes). Now the length can be specified explicitly, which allows balancing index size against efficiency (the number of hash collisions):

CREATE INDEX ON mail_messages USING gist(tsv tsvector_ops(siglen=1024));
Similar changes have so far been made for the other GiST operator classes that use a signature tree, which covers the hstore, intarray, ltree and pg_trgm extensions.

But the main idea behind this change is the ability to pass a JSONPath expression to a GIN index so that not the whole JSON document is indexed, but only the necessary part of it. In many cases this will radically reduce index sizes. But that work is still to be done. The idea is Oleg Bartunov's; the implementation is by Nikita Glukhov and Alexander Korotkov (all three from Postgres Professional).

The operator <-> (box, point) has been added

The missing operation was added for use in kNN search with GiST and SP-GiST. In PG12, when working with the geometric types point and box, the distance operator <->(point, box) can be used, and GiST and SP-GiST indexes speed up such searches. But the symmetric operator <->(box, point) was not implemented, even though box already understood distances to more complex types, polygons and circles.

CREATE TABLE points(point point);
CREATE TABLE boxes(box box);
INSERT INTO points VALUES('1,2'),('2,1');
INSERT INTO boxes VALUES(box(point(3,4),point(5,6))),(box(point(13,14),point(15,16)));
In PG12:

SELECT p.point, b.box, b.box <-> p.point distance FROM points AS p, boxes AS b ORDER BY distance;
ERROR:  operator does not exist: box <-> point

The other way around, everything is fine:

SELECT p.point, b.box, p.point <-> b.box distance FROM points AS p, boxes AS b ORDER BY distance;
point | box | distance
-------+-----------------+--------------------
(1,2) | (5,6),(3,4) | 2.8284271247461903
(2,1) | (5,6),(3,4) | 3.1622776601683795
(1,2) | (15,16),(13,14) | 16.970562748477143
(2,1) | (15,16),(13,14) | 17.029386365926403
And in PG13:

SELECT p.point, b.box, b.box <-> p.point distance FROM points AS p, boxes AS b ORDER BY distance;
point | box | distance
-------+-----------------+--------------------
(1,2) | (5,6),(3,4) | 2.8284271247461903
(2,1) | (5,6),(3,4) | 3.1622776601683795
(1,2) | (15,16),(13,14) | 16.970562748477143
(2,1) | (15,16),(13,14) | 17.029386365926403
GiST and SP-GiST indexes are accelerated for this operation. Note that if in PG13 you ask:

SELECT count(*) FROM pg_operator WHERE oprname = '<->';
count
-------
28
while the same query in PG12 returns 20 rows: in version 13 the list gained as many as 8 new operators.

JSON
Support for the .datetime() method in jsonpath

This is one of the remaining patches from the large series of JSONPath patches that did not make it into PG12. It is part of the SQL/JSON standard. The problem was that all functions in the JSONPath patch series are immutable, while date comparison takes the current time zone into account, which can change during a session.

In such cases, the existing immutable functions are allowed to raise an error on non-immutable comparisons. In addition, the patch provides functions with the _tz suffix that work correctly in operations involving time zones.

New function: jsonb_set_lax

In general, lax is the non-strict (as opposed to strict) mode for functions working with jsonb. Here, the function remains operational when one of its arguments is NULL. Unlike the strict version, jsonb_set(), it has an extra argument that specifies what to do with NULL. The options are: use_json_null / raise_exception / return_target / delete_key. The options were suggested by interested users.

Some jsonb functions optimized

A lot was optimized, mainly through the efforts of Nikita Glukhov (Postgres Professional). But analyzing every item here is pointless: first, their abundance would inflate an already long article; second, the changes concern the internals, and not every user is interested. So we will only list most of them:

- The JsonbExtractScalar() function was optimized;
- The #>> operator and the jsonb_each_text() and jsonb_array_elements_text() functions were optimized;
- Recognition of the JsonbContainer type in get_jsonb_path_all() was optimized;
- Fetching the first token from the JsonbIterator iterator was replaced with a lightweight JsonbContainerIsXxx() macro;
- More convenient key extraction: findJsonbKeyInObject();
- Storage of the results of findJsonbValueFromContainer() and getIthJsonbValueFromContainer() was optimized;
- get_jsonb_path_all() and JsonbValueAsText were reworked further.
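A quick sketch of the two new features described above (the jsonb_set_lax argument order shown here, target, path, new value, create-if-missing, NULL treatment, is taken from the committed patch):

```sql
-- jsonpath .datetime(): parse a JSON string as a date inside a path expression
SELECT jsonb_path_query('"2020-03-23"', '$.datetime()');

-- jsonb_set_lax: the last argument says what to do with a NULL new value
SELECT jsonb_set_lax('{"a": 1}', '{a}', NULL, true, 'use_json_null'); -- {"a": null}
SELECT jsonb_set_lax('{"a": 1}', '{a}', NULL, true, 'delete_key');    -- {}
```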
The big patches of the SQL/JSON series, JSON_TABLE and the SQL/JSON functions, did not make it into this release either; they are now expected in PG14, continuing what began with JSONPath.

pgbench
The benchmarking utility received a series of improvements: statistics on task execution during the initialization phase, more visual output, the ability to see the code of the built-in scripts, and testing on a partitioned accounts table.

In addition, a \aset command was added. It is similar to \gset, but allows setting variables for several queries sent at once. The following line, sent to the server for execution, sets both variables, one and two:

SELECT 1 AS one \; SELECT 2 AS two \aset
pg_dump

pg_dump has learned to dump the data of foreign tables. The --include-foreign-data parameter specifies a list of foreign servers whose tables' data will be dumped.

Use this with care. It is far from certain that the data should be loaded back into the foreign server. The foreign server may well be unavailable during a restore, or it may only allow reads and not writes.

psql
A series of small patches makes psql more comfortable:

- Improved tab completion for several commands.
- In addition to \echo, which sends a string to STDOUT, the new command \warn sends a string to the standard error stream (STDERR).
- \d now shows information about the TOAST table associated with a table; \d+ shows more.
- \dt+ gained a "Persistence" column showing whether a table is unlogged or permanent.
- On exit from the editor, \e now shows the entered text in the query buffer unless it is sent for execution right away (this also applies to the \g* commands); previously psql did not show it.
- The default prompts PROMPT1 and PROMPT2 now include %x, which shows the transaction status.
- New commands provide more information about access methods: \dAc, \dAf, \dAo, \dAp.
- \g can now take, in parentheses, any options supported by \pset; they act only for the current command.
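For example, the new \g syntax might be used like this (a sketch; the options in parentheses are the usual \pset names):

```sql
-- one-off CSV output; the session's \pset settings are left untouched
SELECT 42 AS answer \g (format=csv)
```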
libpq
Minor changes on the PostgreSQL connection side:

- An inaccuracy in the description of the host and hostaddr parameters, and the resulting inconsistency in the output of psql's \conninfo command, have been fixed.
- If the client certificate key is stored encrypted, the password could previously be entered only interactively. The new sslpassword parameter allows the key to be decrypted non-interactively.
- Two new parameters, sslminprotocolversion and sslmaxprotocolversion, allow restricting the SSL/TLS protocol versions allowed for the connection.
reindexdb
The new --jobs parameter of the reindexdb utility sets the number of database connections in which indexes will be rebuilt simultaneously.

pg_rewind
The utility's limitations are gradually being lifted, and its capabilities are growing. First, pg_rewind can now write recovery information (as pg_basebackup can) and can also run crash recovery and then shut the instance down if it was not stopped cleanly at a checkpoint (previously this had to be done manually).

Second, pg_rewind has learned to work with the WAL archive. After the utility finds the WAL divergence point between the two servers, it must build a list of all pages that need to be copied to the target cluster to eliminate the differences. For that, it needs all WAL files starting from the found point. Previously, if the needed WAL files were not available on the target cluster, the utility could not do its job.

With this patch by Alexey Kondratov (Postgres Professional), pg_rewind can read the missing WAL segments from the WAL archive using the restore_command parameter, if the new -c or --restore-target-wal switch is specified.

pg_waldump
pg_waldump can now decode prepared transaction records.

amcheck
The amcheck extension has learned to better detect corruption in B-tree indexes. Incidentally, server log messages about damaged pages now differ for indexes and tables.

pageinspect
The heap_tuple_infomask_flags function of the pageinspect extension decodes the values of the infomask and infomask2 fields returned by heap_page_items
. It is useful when investigating data corruption.

postgres_fdw

At the user mapping level, a superuser can allow regular users to use a connection without a password:

ALTER USER MAPPING FOR user_name SERVER server_name
OPTIONS (ADD password_required 'false');
Among other things, this is done so that sslkey and sslcert can be used as connection parameters.

adminpack
The adminpack extension has a new function, pg_file_sync. With it you can fsync files written to disk by the server, for example via pg_file_write or COPY TO.

Monitoring
pg_stat_slru
The server's shared memory holds not only the large buffer cache but also a number of other, simpler caches (for example, for transaction status). They use a simple algorithm that evicts the least recently used pages (simple least-recently-used, or SLRU). Until now such caches "just worked", but there was a need to monitor them, primarily so that PostgreSQL core developers can figure out whether anything in them needs changing. The new pg_stat_slru view serves this purpose.

pg_stat_activity
The pg_stat_activity view has a new column, leader_pid. For processes participating in parallel queries it is filled with the pid of the leader process; for the leader process itself, leader_pid equals pid.

The following query shows which queries are currently running in parallel and by which processes:

SELECT query, leader_pid,
array_agg(pid) filter(WHERE leader_pid != pid) AS members
FROM pg_stat_activity
WHERE leader_pid IS NOT NULL
GROUP BY query, leader_pid;
The list of wait events has changed. Two new events were added: BackupWaitWalArchive and RecoveryPause. Two others were given more precise names: RecoveryWalStream -> RecoveryRetrieveRetryInterval and RecoveryWalAll -> RecoveryWalStream.

There are also two new wait events that occur on a replica: RecoveryConflictSnapshot (a conflict with VACUUM, which has removed row versions still needed on the replica) and RecoveryConflictTablespace (a conflict related to dropping a tablespace).

pg_stat_statements
Until now, the pg_stat_statements extension treated queries with and without FOR UPDATE as the same query. Now queries with FOR UPDATE are tracked separately.

The amount of information collected has also grown. From now on, not only execution resources are recorded, but also statistics on generated WAL records. New columns of the view: wal_bytes (volume of generated WAL), wal_records (number of generated records), wal_num_fpw (number of full page writes).

This was made possible by the infrastructure prepared for tracking WAL usage. So EXPLAIN with the new WAL option now shows the volume of generated records:

CREATE TABLE t (id int);
EXPLAIN (ANALYZE, WAL, COSTS OFF, TIMING OFF, SUMMARY OFF)
INSERT INTO t VALUES(1);
QUERY PLAN
--------------------------------------
Insert on t (actual rows=0 loops=1)
WAL: records=1 bytes=59
-> Result (actual rows=1 loops=1)
The auto_explain extension, VACUUM with VERBOSE, and autovacuum also use this infrastructure and will report WAL volumes.

Back to pg_stat_statements. If the new pg_stat_statements.track_planning parameter is enabled, additional planner statistics are recorded for each statement: the number of plan builds, total planning time, minimum and maximum time of a single planning run, as well as the mean and standard deviation.

Accounting for planner resources is also reflected in another patch, unrelated to pg_stat_statements: EXPLAIN with the BUFFERS option will report the number of buffers used during planning:

EXPLAIN (ANALYZE, BUFFERS, COSTS OFF, TIMING OFF)
SELECT * FROM pg_class;
QUERY PLAN
------------------------------------------------
Seq Scan on pg_class (actual rows=386 loops=1)
Buffers: shared hit=9 read=4
Planning Time: 0.782 ms
Buffers: shared hit=103 read=11
Execution Time: 0.219 ms
Logging
- Sampling of logged statements: the new log_min_duration_sample parameter sets a duration threshold above which SQL statements become candidates for logging, and log_statement_sample_rate sets the fraction of those candidates that actually get logged. Statements whose duration exceeds log_min_duration_statement are still always logged: that parameter takes precedence over sampling.
- The earlier log_transaction_sample_rate parameter similarly logs a fraction of all transactions, regardless of duration.
- When a statement fails (with severity of at least log_min_error_statement), the values of its bind parameters can now be logged. The new log_parameter_max_length_on_error parameter controls this: with the default of 0 nothing is logged, -1 logs parameter values in full, and a positive number truncates them to that many bytes. Logging parameters on error adds some overhead to every statement.
- For regular statement logging (log_statement, log_duration), the companion parameter log_parameter_max_length likewise limits the length of logged parameter values; by default they are logged in full.
- The process type (pg_stat_activity.backend_type) can now be written to the server log: a new escape, %b, is provided for the log_line_prefix parameter. And if the log is written in csv format (log_destination=csvlog), the backend_type column is already included there.
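A sketch of how these logging parameters might be combined in postgresql.conf (the values are illustrative only):

```ini
log_min_duration_statement = 5000        # always log statements slower than 5 s
log_min_duration_sample = 500            # sample statements slower than 0.5 s ...
log_statement_sample_rate = 0.01         # ... logging 1% of them
log_parameter_max_length_on_error = 1024 # log bind parameters on error, truncated
log_line_prefix = '%m [%p] %b '          # include the backend type
```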
Progress
The new views pg_stat_progress_analyze and pg_stat_progress_basebackup allow tracking the progress of statistics collection by the ANALYZE command and of backups taken with the pg_basebackup utility, respectively.

Optimization
Evaluation of immutable functions in the FROM clause at planning time

The patch by Alexander Kuzmenkov and Alexander Parfyonov (both from Postgres Professional) helps when the FROM clause contains a call of a function that is actually a constant. In that case, instead of performing a join, the constant value is substituted where needed in the query.

Here is how this happens, with a full-text search query as an example:

EXPLAIN (COSTS OFF)
SELECT subject, ts_rank_cd(tsv, q) AS rank
FROM mail_messages, to_tsquery('english', 'tuple') q
WHERE tsv @@ q
ORDER BY rank DESC;
QUERY PLAN
------------------------------------------------------------------
Sort
Sort Key: (ts_rank_cd(mail_messages.tsv, '''tuple'''::tsquery)) DESC
-> Bitmap Heap Scan on mail_messages
Recheck Cond: (tsv @@ '''tuple'''::tsquery)
-> Bitmap Index Scan on mail_messages_tsv_idx
Index Cond: (tsv @@ '''tuple'''::tsquery)
There is no join: the value 'tuple'::tsquery is substituted into the query already at the planning stage. In version 12 the picture was completely different:

EXPLAIN (COSTS OFF)
SELECT subject, ts_rank_cd(tsv, q) AS rank
FROM mail_messages, to_tsquery('english', 'tuple') q
WHERE tsv @@ q
ORDER BY rank DESC;
QUERY PLAN
-----------------------------------------------------
Sort
Sort Key: (ts_rank_cd(mail_messages.tsv, q.q)) DESC
-> Nested Loop
-> Function Scan on q
-> Bitmap Heap Scan on mail_messages
Recheck Cond: (tsv @@ q.q)
-> Bitmap Index Scan on mail_messages_tsv_idx
Index Cond: (tsv @@ q.q)
Incremental sort

When sorting by multiple keys (k1, k2, k3, ...) is required, the planner can now take advantage of the knowledge that the data is already sorted by a prefix of those keys (for example, by k1 and k2). In that case there is no need to re-sort all the data from scratch: it can be split into consecutive groups with equal k1 and k2 values, and each group sorted by k3.

The whole sort thus breaks down into several smaller consecutive sorts. This reduces the amount of memory needed and allows the first rows to be returned before the entire sort has finished.

For example, in the demo database the tickets table has an index on the ticket_no column. Data read from the index is sorted by ticket_no, so the following query uses an incremental sort:

EXPLAIN (analyze, costs off, timing off)
SELECT * FROM tickets ORDER BY ticket_no, passenger_id;
QUERY PLAN
------------------------------------------------------------------------------
Incremental Sort (actual rows=2949857 loops=1)
Sort Key: ticket_no, passenger_id
Presorted Key: ticket_no
Full-sort Groups: 92184 Sort Method: quicksort Memory: avg=31kB peak=31kB
-> Index Scan using tickets_pkey on tickets (actual rows=2949857 loops=1)
Planning Time: 2.137 ms
Execution Time: 2230.019 ms
Incremental sort can be disabled with the enable_incremental_sort parameter, in which case sorting takes noticeably longer:

SET enable_incremental_sort = off;
EXPLAIN (analyze, costs off, timing off)
SELECT * FROM tickets ORDER BY ticket_no, passenger_id;
QUERY PLAN
-----------------------------------------------------------------------
Gather Merge (actual rows=2949857 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (actual rows=983286 loops=3)
Sort Key: ticket_no, passenger_id
Sort Method: external merge Disk: 107024kB
Worker 0: Sort Method: external merge Disk: 116744kB
Worker 1: Sort Method: external merge Disk: 107200kB
-> Parallel Seq Scan on tickets (actual rows=983286 loops=3)
Planning Time: 3.994 ms
Execution Time: 12291.844 ms
The idea of incremental sort was proposed back in 2013 by Alexander Korotkov (Postgres Professional), and now, seven years later, James Coleman has brought the patch to a state accepted by the community.

TRUNCATE speedup

TRUNCATE scans shared_buffers to evict the table's buffers from shared memory. Previously the scan was performed three times, once for each fork of the table: MAIN (the main data fork), FSM (the free space map) and VM (the visibility map). Now the buffers are scanned only once, and with large shared_buffers settings this gives a tangible gain.

Partial TOAST decompression

When only a slice at (or near) the beginning of a TOASTed value is needed, there is no point in decompressing the whole thing. Compressed TOAST data is now read iteratively: read a chunk, and if it does not contain the required data, decompress it and read further. This was suggested by Google Summer of Code student Binguo Bao, who gives the following example:

CREATE TABLE slicingtest (
id serial PRIMARY KEY,
a text
);
INSERT INTO slicingtest (a) SELECT
repeat('1234567890-=abcdefghijklmnopqrstuvwxyz', 1000000) AS a FROM
generate_series(1,100);
\timing
SELECT sum(length(substr(a, 0, 20))) FROM slicingtest;
Time: 28.123 ms
With the patch it is an order of magnitude faster:

Time: 2.306 ms

Parallel VACUUM

Egor Rogov explains this important step toward parallelization in detail in his article on the subject. In short: Masahiko Sawada's patch allows vacuum to run in parallel. The table itself is still vacuumed by a single (leader) process, but index cleanup can now be handed to background workers, one per index. In manual mode this speeds up vacuuming of large tables with multiple indexes; autovacuum does not use the feature yet.

Autovacuum triggered by inserts

For this patch (also known as the berserk autovacuum) we should thank Darafei Praliaskouski, who proposed a solution to the following problem: autovacuum never visits append-only tables, because they contain no dead row versions. Because of this the visibility map is not kept up to date, which makes index-only scans ineffective; and when vacuum finally does arrive to prevent transaction-ID wraparound, it has a huge amount of work to do at once. Now this situation is fixed: autovacuum is also triggered by inserted rows. Two new server parameters, autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor, have appeared, analogous to the existing ones for modifications (autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor).

Hash aggregation memory management

Hash aggregation may require more memory than the planner expected and than work_mem allows. Previously such a planning error meant that work_mem was simply ignored and memory was allocated in whatever amount the operation needed, up to the arrival of the OOM killer. Now the algorithm can stay within work_mem and, when necessary, spill to temporary files on disk. New parameters control the planner's behavior: enable_groupingsets_hash_disk and enable_hashagg_disk.

Optimized UPDATE for tables with generated columns

In version 12 generated columns were recomputed on every row update, even when the update could not affect them. Now they are recomputed only when it is actually necessary, that is, when their base columns have changed. This optimization can, for example, significantly speed up updates of tables with a generated column of type tsvector, since the to_tsvector() function is quite expensive.

Trigger access to the list of changed columns

A small patch that adds a bitmap of changed columns to the TriggerData structure. Generic trigger functions, such as tsvector_update_trigger() or lo_manage(), can use this information to avoid doing unnecessary work.

Using multiple extended statistics in estimates

In PG12 the planner could not use several extended statistics objects for the same table at once. Imagine, for example, that two extended statistics objects are built on different sets of columns, and a query references columns from both sets. Now the planner has access to all the available information.

Infrastructure for parallel INSERT and COPY (see also this patch)

Parallel query execution in PostgreSQL still works only for read-only queries. Writing in parallel is harder, and one of the difficulties is locking between processes that perform a single task together (members of a common parallel group). The locks of such processes are assumed not to conflict; for example, several of them may hold an exclusive lock on the same table. This demands special care from the kernel developers, but otherwise parallel processes would constantly deadlock one another. There are, however, two exceptions:

- the relation extension lock, which is taken when new pages are added to the end of a data file, and
- page lock, which is used when moving GIN index items from the wait list to the main tree.
(You can read more about them in this article.) Such locks must conflict even between processes of the same parallel group, which is what this patch implements. These locks can never lead to a deadlock, though, so they are excluded from deadlock detection. For users nothing changes in general, but the patch is important: first, it paves the way for parallel INSERT and COPY; second, it removes one of PostgreSQL's bottlenecks under heavy load (as mentioned in the HL++ talk).

Security
EDH primes replaced

The EDH (Ephemeral Diffie-Hellman) primes, which had been taken from the now-defunct SKIP protocol, have been replaced.

initdb: default authentication settings changed

The default access settings for local and network connections set up by initdb have changed. In pg_hba.conf, local connections now use the peer authentication method instead of trust (or md5 where peer is not supported), and network connections use md5. Milder measures were discussed at first (a warning in the documentation), then tougher ones (scram-sha-256); in the end the community settled on peer and md5.

Using explicit_bzero

An important patch. The bzero() and explicit_bzero() OS functions fill the specified memory areas with '\0' bytes (see the Linux man pages, for example). These patches are only the beginning: there are many places where passwords and other sensitive information can linger in memory. Work started with libpq, where the whole .pgpass file could remain in memory after being read; it is now cleared after the connection is closed. And be-secure-common.c now overwrites the SSL passphrase, which could surface in an error message.

"password_protocol" parameter added to libpq

This patch lets libpq control which password transfer protocol is used during a connection: libpq will refuse to authenticate if the protocol is weaker than the one specified. The default is plaintext, meaning any protocol is acceptable.

Mandatory access control for TRUNCATE

This patch enables extensions to apply Mandatory Access Control (MAC) to the TRUNCATE operation, and the sepgsql extension will now check permissions for it. The SELinux Reference Policy and Red Hat-based Linux distributions do not support SELinux checks on db_table {truncate}; in that case sepgsql is used with deny_unknown set to 1, and TRUNCATE fails.

Visibility of the ssl_passphrase_command GUC

A simple but useful patch: the value of the ssl_passphrase_command parameter is now visible only to superusers. The parameter specifies an external command to call when a passphrase is needed to decrypt an SSL file, such as a private key.

Localization
Versioning of libc collation rules

For ICU collations, version numbers are stored in the database. Each time a collation is used (for sorting or character comparison), the stored version is compared with the current version in the OS's ICU library, and a warning is issued on mismatch. This makes it possible to detect that indexes built under since-changed collation rules may be incorrect and should be rebuilt. After the indexes have been rebuilt, the ALTER COLLATION ... REFRESH VERSION command updates the version stored in the database, and the warnings stop.

But that worked only for ICU. Now a version number is also stored for libc collations:

SELECT collname, collversion FROM pg_collation WHERE collname = 'ru_RU';
collname | collversion
----------+-------------
ru_RU | 2.27
This makes it possible to warn when the library changes in the OS, which is very relevant in light of the migration to glibc 2.28, in which many collation rules changed, meaning the corresponding indexes should be rebuilt. But until you move to 2.28, all is calm:

ALTER COLLATION "ru_RU" REFRESH VERSION;
NOTICE: version has not changed
ALTER COLLATION
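Once an OS library upgrade does change a collation version, the warning will keep appearing until the affected indexes are rebuilt and the stored version is refreshed. A possible sketch of that workflow (the index name here is hypothetical):

```sql
-- Rebuild an index that depends on the changed collation...
REINDEX INDEX tickets_passenger_name_idx;  -- hypothetical index on a text column

-- ...then record the new collation version, silencing the warning.
ALTER COLLATION "ru_RU" REFRESH VERSION;
```

Rebuilding first matters: REFRESH VERSION only updates the stored version number, it does not repair indexes built under the old rules.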
Full Text Search
Full-text search for the Greek language

No comment needed.

dict_int learned to handle absolute values

The dict_int dictionary template (and extension) gained the ability to strip the sign from a number.
CREATE EXTENSION dict_int;
SELECT ts_lexize('intdict', '-123');
ts_lexize
-----------
{-123}
(1 row)
ALTER TEXT SEARCH DICTIONARY intdict (ABSVAL = true);
SELECT ts_lexize('intdict', '-123');
ts_lexize
-----------
{123}
(1 row)
That is, this time the absolute value was extracted.

Partitioning

BEFORE row triggers on partitioned tables

In version 12 you could not create row triggers on a partitioned table: on individual partitions, yes, but not on the entire table at once. Now a BEFORE FOR EACH ROW trigger created on a partitioned table is automatically inherited and fires for all partitions, with one condition: if it is an UPDATE trigger, the partition key may be changed in it only within the current partition.

Support for partitioned tables in logical replication

Previously, adding a partitioned table to a publication caused an error:

CREATE PUBLICATION pub FOR TABLE p;
ERROR: "p" is a partitioned table
DETAIL: Adding partitioned tables to publications is not supported.
HINT: You can add the table partitions individually.
Now it works.

Improved partition-wise join

Since version 11 the planner has been able to join partitioned tables partition by partition, but only if their partition boundaries matched exactly. Now the algorithm also works when a partition of one table is entirely contained in a partition of the other, even if their sizes differ (for example, when one table is partitioned by day and the other by month). The new algorithm works for range and list partitioning.

Partition-wise FULL OUTER JOIN

Partition-wise join now also works for full outer joins written with a USING clause.

tableam
In this attractive and promising, but difficult, area there are no radical advances over PostgreSQL 12: no ready-made pluggable storage engines (such as zheap) exist besides heap, but work on the API continues.

A higher level of abstraction for determining table size

Robert Haas rewrote the code, restructuring it into abstract layers so as not to duplicate code in the future. This piece concerns estimate_rel_size, which estimates the size of a table's forks.

Table access methods can use relcache

This patch brings the memory management capabilities of table access methods up to par with those of index access methods.

tableam and TOAST

TOAST is largely designed around heap storage. Therefore, when creating new table access methods, there are two ways to go: either help developers of new methods implement insert, update and delete of TOAST records themselves, or delegate TOAST handling to code that uses the traditional PostgreSQL storage, heap. A series of five patches uses tuple slots to implement insert/update/delete operations and can help those going either way.

fsync
Handling fsync errors in pg_receivewal and pg_recvlogical

The fight with fsync() continues. PostgreSQL assumes that a successful fsync() call means all of the file's data has reached disk, but that is not always true (it depends on the OS) and can lead to data loss. In PG13 it was decided to deal with the pg_receivewal and pg_recvlogical utilities. Their old default behavior was to write fsync errors to the log, re-establish the connection and carry on as if nothing had happened; as a result, WAL could record files as successfully copied that had not actually been flushed to disk correctly. So it is better to abort the utility. The fate of pg_dump, pg_basebackup, pg_rewind and pg_checksums was also discussed, but for now the change is limited to these two.

Protection against incorrect fsync() flags

This patch checks that the flags are set correctly when obtaining a file descriptor for fsync(): directories must be opened read-only, and files for writing or for both.

Backup and replication
Pause during recovery when the recovery target is not reached

If WAL runs out during recovery before the specified recovery_target_time is reached, the server used to finish recovery and switch to normal operation. Now it will not: the recovery process pauses, the fact is reported in the log, and the administrator gets a chance to supply the missing WAL segments and continue recovery.

The ignore_invalid_pages parameter

When the recovery process on a replica encounters a reference to an invalid page in a WAL record, a panic occurs. Enabling the ignore_invalid_pages parameter helps get past it: recovery continues, with possible loss of integrity, loss of data, and other very serious consequences. The parameter is intended for server developers and should be used only when you still need to try to complete recovery and start the replica.

Changing primary_conninfo without a restart

Sergey Kornilov's patch allows the primary_conninfo, primary_slot_name and wal_receiver_create_temp_slot settings to be changed without restarting the server. This, in fact, is why the recovery.conf file was abandoned in release 12.

Backup manifests

pg_basebackup now creates a "manifest": a JSON file with information about the backup it made (file names and sizes, the WAL files required, and checksums for everything). The new pg_validatebackup utility verifies backups against the manifest and also checks, using the pg_waldump utility, that the WAL files needed for recovery are present and correct (this applies only to WAL files inside the backup itself, not in the archive). This makes it possible to detect situations where backup files have been damaged or have disappeared, or where recovery has become impossible because the required log files are missing.

Limiting the WAL kept for replication slots

A replication slot is a convenient but dangerous mechanism: if the client does not read data from the slot in time, the unread WAL records can take up all the space on the server. Now the max_slot_wal_keep_size parameter sets a limit on the maximum disk space that unread data may occupy. If a checkpoint finds the size exceeded, the slot is invalidated and the space is freed.

Windows
Support for Unix sockets on Windows

Unix-domain sockets are supported on Windows 10, although they are disabled there by default.

Documentation
Two new appendixes have been added to the documentation.

After a long discussion, Appendix M. Glossary appeared; it currently contains 101 terms.

The ability to colorize the diagnostic messages of console utilities with the PG_COLOR variable existed before; now it is documented in Appendix N. Color Support. Peter Eisentraut's original intent with this patch was to make colorized output the default, with those who did not want it explicitly setting the NO_COLOR variable. But among those discussing the patch, the opponents of color-coded messages were in the majority, so it was decided simply to document the existing capability. And thus the documentation got a new top-level section.
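For example, assuming PostgreSQL client utilities are installed, colorized diagnostics can be requested explicitly through the environment variable (the supported values are always, auto and never):

```shell
# Force color even when stderr is not a terminal
# (the data directory path is just a placeholder).
PG_COLOR=always pg_ctl status -D /path/to/data

# Let the utility decide based on whether stderr is a terminal.
PG_COLOR=auto createdb demo
```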
That is all about PG13 for now; patches that did not make it in will be considered for PG14. We will keep following the development.