PostgreSQL Industrial Tuning Approach: Database Experiments. Nikolay Samokhvalov

I suggest that you familiarize yourself with the decoding of the report by Nikolay Samokhvalov "Industrial approach to PostgreSQL tuning: database experiments"


Shared_buffers = 25% - is it a lot or a little? Or just right? How to understand if this recommendation is quite outdated, recommendation in your particular case?


It's time to approach the issue of selecting postgresql.conf parameters "in an adult way." Not with the help of blind "autotuners" or outdated tips from articles and blogs, but based on:


  1. strictly calibrated database experiments performed automatically, in large quantities and under conditions as close as possible to "combat" ones
  2. deep understanding of the features of the DBMS and OS.

Using the Nancy CLI ( https://gitlab.com/postgres.ai/nancy ), we will consider a specific example - the notorious shared_buffers - in different situations, in different projects and try to figure out how to choose the optimal setting for our infrastructure, database and load .



It will be about experiments on databases. This is a story that lasts a little more than six months.



. Postgres 14 . -. Postgres .


RuPostgres Meetup, 2- . 2 000 . RuPostgres.org.


, Highload, , Postgres .



Postgres- 11 .



, Postgres, , 2010- . , DBA, - . , - , .


, . . . . . , , . . , DBA, .



:


  • « » – 8 GB 25 % shared_buffers . shared_buffers .
  • «».


?


  • , . , , Open Source, . . Open Source. , Open Source. , , . .
  • , : .


?



-, DBA instances, . ., — .



. , - , - . , , .



. Pg_stat_statements – . Postgres pgBadger.


. . SELECT * FROM table where «?» «$» Postgres 10. – index scan seq scan. . , index scan. , 90 % , seq scan , Postgres . pg_stat_statements, - .


, «log_min_duration_statement = 0», . . , . - , , , , .


DBA ?



, - . ? , - - instance, . DBA, staging. . . , production. DBA production, . , , , , . . .



  • .
  • .
  • SQL- ( ).
  • ( ).


. Postgres. . Postgres, . , DBA , DBA . . . 10 , .


– . , , .




, . , 1 000 default_statistict_target – . , production.



, , , .



. .


  • – . . - , , production. . instance Google, . . . . Postgres.
  • – . . . .
  • – . .
  • – , , . . . , , . .


. pg_stat_statements. – , . – .



default_statistics_target = 100, =1 000. , . 8 % .



, pgBadger pg_stat_statements. . , - 88 %. . , , . , . .



, «ALTER TABLE … ALTER COLUMN» 100 . , . . , , .




. CI- . .



: , , , . - - . , .


.



-, . production, . , , . .


, , , . . , . Postgres. . production . , , auto_explain.


, , . , , .


Nancy CLI – « »



. . . , . Open Source Nancy CLI. , .



Nancy — Open Source, Gitlab. , . . help .


, . . , . – delete 40 000 000 IO, , , . . . . , - , . production. .



? , . . , MacBook . , . . - instance , , .


EC2 Instance, . . , 500 i3 instance, i3-16-xlarge. 500 64 . 15 . . . , , – 70 %, . . .



Postgres . - 12- .



. :


  • Dump/sql-.
  • – PGDATA . -. , . , Google . production. .
  • , , Postgres - . pgbench. pgbench. «db-pgbench». , scale. , .


:


  • SQL . .
  • . . . . , . pgreplay , Nancy.
  • . , . , . pgbench .


  • SQL - , . . - , , ANALAZE . , . , SQL.
  • . , , , 100 . . , . , , . . pgdata . Postgres , . .


  • , , pgstat***. – pg_stat_statements, pg_stat_kcacke. , . pg_stat_bgwriter pgwriter , checkpoint , . . , shared_buffers , , .
  • Postgres. – .
  • – FlameGraphs.
  • , pgreplay pgbench , . latency TPS. , .
  • .
  • CPU IO. EC2 instance , 100 instances 100 , 10 000 . , instance, - . . . sysbench , , . . , CPU IO .


?



, . , Open Source pgreplay . . , , .


duration timestamp. . – ?



https://gist.github.com/NikolayS/08d9b7b4845371d03e195a8d8df43408


, . , . pg_stat_statements, ( ) , .


. , , , . , . , .


, 802 . , bytes_per sec – 300 kB/s . , , .



! , . «syslog».



syslog, . pgbench, , .



– . 161 000 TPS. syslog – Ubuntu 16.04 37 000 TPS. , . . . , , .



CentOS 7, journald , . ., , 44 TPS.



, . , , . syslog, , .



  • IOPS .
  • .
  • , .


pg_stat_statements. , . . pgbench – «-f».


«-f». «@» , . . . , 10 % , 20 %. , production.



, production? ? . postgres-checkup. Open Source. .


. , . . . , , , . , , health_check. DBA, health_check. . . OKmeter, . Postgres. OKmeter.io – , , . .


, , , , . CPU, IO , . . , , checkpoint, io checkpoint bgwriter . .


, - , - . OKmeter. , . - .


, , , . . production. , observing server, . : , , Postgres, - , production .


Postgres-checkup. -, . , – . , . ? . . ? . , checkup, .


. . , .



«» – 003 Postgres-checkup


. . 003. pg_stat_statements, total_time.


total_time , , , . . . ? . , , . . .


, , . .


, , , , , , 15 . , , .



? . Postgres_checkup : total-time, calls, rows, shared_blks_read . . , . pg_stat_statements , , reset. pg_stat_database , pg_stat_statements . , 1 000 000 , , .



, . , 56 . . total_time . , . . duration. duration, .


total_time per second – . , , . . . , , . . , , , .


-, . – – . , CPU 100 %, , . . .


rows . , .


. shared_buffers shared_buffers. , , . , , , , , .


– . . – . , . 101 . , .


. 8 . . , . .


– . calls. , 1 000 000. , . , , 0,01 %. . . , . – 5 % . . . 5 % – .


total_time . 14 % . – 11 % . .


, . , , , , . . - , - . . 0, . . 20 %, .



. workload. , 80 % 90 %. 10-20 . pgbench. random. , , . Postgres 12 .


80-90 % total_time. «@»? calls, , , - . , . pgbench .



001 002.


001 – . . . , , . . , . 75 % . . 10 , . .


002 – , . . SELECT, INSERT, UPDATE, DELETE. SELECT FOR UPDATE, .


, SELECT – 82 % , – 74 % total_time. . . , .



: « shared_buffers?». , – , throughput , . . . TPS QPS.


. 311 select.



. . . , . production, 100% CPU. , , - , .


, 20 , 50 %. . . . , latency 20%- , . , .



:


  • Database Lab.
  • on demand, - – . , , . . standing.
  • . - , , . Nancy , , , .
  • .
  • Postgres. , . , .
  • , , Postgres-checkup.



! .


.


, . . Nancy , ?


- . . , , .


. ? , . , , . . .


, , . . . . . , . , , . , . . .


. , . , ?


. . . . , . . - , - - . , – .


, , . . !


! ! , . GSP?


. . , . . . run on localhost. instance . , . Getlab , GSP. , Google . ??? instances, . -, 70 % . 5-10 % , , . . . , . , , . Google . – 24 . 5 . , .


! ! checkup. stat_statements?


. . – , : . : total_time calls, . , . : . , .


, - ?


. . ?


, , , . - , ?


, .


. , , . , reset stat_statements , , , stat_statements . , , , .


-.


- .


, , – queryid pg_stat_statements . queryid, , .


, .


id?


.


. . . , , 0 …


, , , , , stat_statemetns .


Pg_stat_statements . , track_utility = on, .


, .


java hibernate, , -. , 50-100 . - . – pg_stat_statements.max .


, , . - . . . . pg_stat_statements.max. , 70 %. , , . reset. . 70, , , .


Yes. The default is 5,000. And a lot of this is enough.


Usually yes.


Video:



PS I’ll add on my own that if Postgres contains confidential data and can’t get into the test environment, then you can use PostgreSQL Anonymizer . The scheme is approximately as follows:



All Articles