Simple detection of performance problems in PostgreSQL

Is there a really large database anywhere in the world that does not suffer from performance problems from time to time? I bet there aren't that many. Therefore, every DBA (database administrator) responsible for PostgreSQL needs to know how to track down potential performance problems in order to figure out what is actually going on.

Improving PostgreSQL performance: configuration is not everything


Many people think that changing parameters in postgresql.conf is the true path to success. However, this is not always the case. Of course, good database configuration options are often very helpful. In many cases, though, the real problems come from some strange query hidden deep in the application logic. It is even likely that the queries causing the real trouble are not the ones you have noticed. A natural question arises: how can we track down these queries and find out what is actually going on? My favorite tool for this is pg_stat_statements, which in my opinion should always be enabled if you are using PostgreSQL 9.2 or higher (please do not use it in older versions).

Enabling pg_stat_statements


To enable pg_stat_statements on your server, change the following line in postgresql.conf and restart PostgreSQL:

shared_preload_libraries = 'pg_stat_statements'

Once the module has been loaded into the server, PostgreSQL will automatically start collecting information. The good thing is that the overhead of the module is really low (it is basically just “noise”).
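To verify that the module was actually loaded after the restart, a quick check from SQL is enough (assuming pg_stat_statements is the only preloaded library, the output looks like this):

test=# SHOW shared_preload_libraries;
 shared_preload_libraries
---------------------------
 pg_stat_statements
(1 row)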

Then run the following command to create the view needed to access the data:

CREATE EXTENSION pg_stat_statements;

The extension will create a view called pg_stat_statements and make the data easily accessible.

Detecting Slow Queries in PostgreSQL


The easiest way to find the most interesting queries is to sort the output of pg_stat_statements by total_time:

SELECT * FROM pg_stat_statements ORDER BY total_time DESC;

The beauty here is that the most time-consuming types of queries will naturally show up at the top of the list. A good approach is to work through, say, the top ten queries and see what is going on there.

In my opinion, there is no way to tune a system properly without inspecting the most time-consuming queries on the database server.


PostgreSQL in-depth performance analysis


pg_stat_statements can offer a lot more than just the query and the time it took. Here is the definition of the view:

test=# \d pg_stat_statements
                   View "public.pg_stat_statements"
       Column        |       Type       | Collation | Nullable | Default
---------------------+------------------+-----------+----------+---------
 userid              | oid              |           |          |
 dbid                | oid              |           |          |
 queryid             | bigint           |           |          |
 query               | text             |           |          |
 calls               | bigint           |           |          |
 total_time          | double precision |           |          |
 min_time            | double precision |           |          |
 max_time            | double precision |           |          |
 mean_time           | double precision |           |          |
 stddev_time         | double precision |           |          |
 rows                | bigint           |           |          |
 shared_blks_hit     | bigint           |           |          |
 shared_blks_read    | bigint           |           |          |
 shared_blks_dirtied | bigint           |           |          |
 shared_blks_written | bigint           |           |          |
 local_blks_hit      | bigint           |           |          |
 local_blks_read     | bigint           |           |          |
 local_blks_dirtied  | bigint           |           |          |
 local_blks_written  | bigint           |           |          |
 temp_blks_read      | bigint           |           |          |
 temp_blks_written   | bigint           |           |          |
 blk_read_time       | double precision |           |          |
 blk_write_time      | double precision |           |          |

It is also quite useful to look at the stddev_time column. If the standard deviation is large, you can expect some of these queries to be fast and some to be slow, which can lead to a poor user experience.
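To hunt for such erratic queries, one option is to sort by the standard deviation directly. A minimal sketch (the column names follow the view definition above; short_query is just an illustrative alias):

SELECT substring(query, 1, 50) AS short_query,
       calls,
       round(mean_time::numeric, 2) AS mean,
       round(stddev_time::numeric, 2) AS stddev
FROM   pg_stat_statements
ORDER BY stddev_time DESC
LIMIT 10;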

The “rows” column can also be quite informative. Suppose 1,000 calls returned 1,000,000,000 rows: this means each call returned 1 million rows on average. It is easy to see that constantly shipping that much data is not a good thing.
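To spot such queries, you can compute the average number of rows per call yourself; for example (avg_rows is an illustrative alias):

SELECT substring(query, 1, 50) AS short_query,
       calls,
       rows,
       round(rows::numeric / calls, 2) AS avg_rows
FROM   pg_stat_statements
ORDER BY avg_rows DESC
LIMIT 10;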

If you want to check whether a particular type of query shows poor caching behavior, the shared_blks_* columns will be of interest. In short: PostgreSQL can tell you the cache hit rate of every single type of query, provided that pg_stat_statements has been enabled.
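For example, the cache hit rate of each query type can be derived from shared_blks_hit and shared_blks_read; a sketch (nullif guards against division by zero for queries that touched no shared blocks):

SELECT substring(query, 1, 50) AS short_query,
       shared_blks_hit,
       shared_blks_read,
       round(100.0 * shared_blks_hit
             / nullif(shared_blks_hit + shared_blks_read, 0), 2) AS hit_percent
FROM   pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 10;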

It also makes sense to look at the temp_blks_* fields. Whenever PostgreSQL has to go to disk to sort or to materialize data, temporary blocks are needed.
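A quick way to find the statements that spill the most temporary data to disk might look like this:

SELECT substring(query, 1, 50) AS short_query,
       temp_blks_read,
       temp_blks_written
FROM   pg_stat_statements
ORDER BY temp_blks_written DESC
LIMIT 10;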

Finally, there are blk_read_time and blk_write_time. These fields stay at zero unless track_io_timing is enabled. The idea here is to measure the amount of time a certain type of query spends on I/O. This allows you to answer the question of whether your system is I/O-bound or CPU-bound. In most cases, it is a good idea to enable I/O timing, because it gives you vital information.
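On PostgreSQL 9.4 or higher, track_io_timing can be switched on as a superuser without restarting the server (on older versions, set it in postgresql.conf and reload):

ALTER SYSTEM SET track_io_timing = on;  -- requires superuser privileges
SELECT pg_reload_conf();                -- apply the new setting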

Working with Java and Hibernate


pg_stat_statements provides a lot of useful information. However, in some cases the query text can be cut off because of a configuration variable:

test=# SHOW track_activity_query_size;
 track_activity_query_size
----------------------------
 1024
(1 row)

For most applications, 1024 bytes are absolutely sufficient. However, this is often not the case if you are running Hibernate or Java. Hibernate tends to send insanely long queries to the database, so the SQL code may be cut off long before the relevant parts (e.g. the FROM clause, etc.) even start. Therefore, it makes sense to increase track_activity_query_size to a higher value (maybe 32768).
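For example, in postgresql.conf (like shared_preload_libraries, this parameter only takes effect after a restart):

track_activity_query_size = 32768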

Useful Queries for Identifying Bottlenecks in PostgreSQL


There is one query I have found particularly useful in this context; it shows the 20 queries that consume the most time:


test=# SELECT substring(query, 1, 50) AS short_query,
              round(total_time::numeric, 2) AS total_time,
              calls,
              round(mean_time::numeric, 2) AS mean,
              round((100 * total_time / sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM  pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;
           short_query                              | total_time | calls | mean | percentage_cpu
----------------------------------------------------+------------+-------+------+----------------
SELECT name FROM (SELECT pg_catalog.lower(name) A   | 11.85      | 7     | 1.69 | 38.63
DROP SCHEMA IF EXISTS performance_check CASCADE;    | 4.49       | 4     | 1.12 | 14.64
CREATE OR REPLACE FUNCTION performance_check.pg_st  | 2.23       | 4     | 0.56 | 7.27
SELECT pg_catalog.quote_ident(c.relname) FROM pg_c  | 1.78       | 2     | 0.89 | 5.81
SELECT a.attname, +                                 | 1.28       | 1     | 1.28 | 4.18
SELECT substring(query, ?, ?) AS short_query,roun   | 1.18       | 3     | 0.39 | 3.86
CREATE OR REPLACE FUNCTION performance_check.pg_st  | 1.17       | 4     | 0.29 | 3.81
SELECT query FROM pg_stat_activity LIMIT ?;         | 1.17       | 2     | 0.59 | 3.82
CREATE SCHEMA performance_check;                    | 1.01       | 4     | 0.25 | 3.30
SELECT pg_catalog.quote_ident(c.relname) FROM pg_c  | 0.92       | 2     | 0.46 | 3.00
SELECT query FROM performance_check.pg_stat_activi  | 0.74       | 1     | 0.74 | 2.43
SELECT * FROM pg_stat_statements ORDER BY total_ti  | 0.56       | 1     | 0.56 | 1.82
SELECT query FROM pg_stat_statements LIMIT ?;       | 0.45       | 4     | 0.11 | 1.45
GRANT EXECUTE ON FUNCTION performance_check.pg_sta  | 0.35       | 4     | 0.09 | 1.13
SELECT query FROM performance_check.pg_stat_statem  | 0.30       | 1     | 0.30 | 0.96
SELECT query FROM performance_check.pg_stat_activi  | 0.22       | 1     | 0.22 | 0.72
GRANT ALL ON SCHEMA performance_check TO schoenig_  | 0.20       | 3     | 0.07 | 0.66
SELECT query FROM performance_check.pg_stat_statem  | 0.20       | 1     | 0.20 | 0.67
GRANT EXECUTE ON FUNCTION performance_check.pg_sta  | 0.19       | 4     | 0.05 | 0.62
SELECT query FROM performance_check.pg_stat_statem  | 0.17       | 1     | 0.17 | 0.56
(20 rows)

The last column is particularly noteworthy: it shows the percentage of the total time consumed by a single query. This helps you determine whether or not a particular query is relevant to overall performance.
