Python ORM performance testing using the TPC-C benchmark method

When writing applications in Python, developers often use object-relational mappers (ORMs) to work with databases. Examples include SQLAlchemy, PonyORM, and the ORM included with Django. Performance is an important factor when choosing an ORM.

On Habr, and on the Internet as a whole, you can find more than one ORM performance test. One example of a quality Python ORM benchmark is the Tortoise ORM benchmark (link to the repository). It measures the speed of six ORMs on eleven different types of SQL queries.

In general, the Tortoise benchmark makes it possible to compare query execution speed across ORMs, but I see one problem with this approach to testing. ORMs are often used in web applications, where several users can send different requests at the same time, and I have not found a single benchmark that evaluates ORM performance under such conditions. So I decided to write my own benchmark and use it to compare PonyORM and SQLAlchemy. I took the TPC-C benchmark as a basis.

TPC (the Transaction Processing Performance Council) has been developing data processing benchmarks since 1988. They have long been an industry standard and are used by almost all hardware vendors on a wide range of hardware and software configurations. The main feature of these tests is that they measure performance under heavy load, in conditions as close as possible to real ones.

TPC-C simulates a warehouse network. It includes a combination of five simultaneously executed transactions of various types and complexity. The database consists of nine tables with a large number of records. Performance in the TPC-C test is measured in transactions per minute.

I decided to test two Python ORMs (SQLAlchemy and PonyORM) using the TPC-C testing method, adapted for this task. The purpose of the test is to evaluate the speed of transaction processing when several virtual users access the database at the same time.
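
As a rough illustration of the metric, throughput in transactions per minute can be measured by running a transaction in a loop for a fixed time and scaling the count to one minute. This is a minimal single-worker sketch, not the code of the test itself: `measure_tpm` and `dummy_transaction` are hypothetical names, and the placeholder only sleeps instead of touching a database.

```python
import time


def measure_tpm(transaction, duration_s):
    """Call `transaction` repeatedly for about `duration_s` seconds and
    return the observed rate in transactions per minute."""
    completed = 0
    start = time.perf_counter()
    deadline = start + duration_s
    while time.perf_counter() < deadline:
        transaction()
        completed += 1
    # Use the real elapsed time, since the last call may overrun the deadline.
    elapsed = time.perf_counter() - start
    return completed * 60.0 / elapsed


def dummy_transaction():
    # Hypothetical placeholder for real work such as new_order() or payment().
    time.sleep(0.001)


if __name__ == "__main__":
    print(f"{measure_tpm(dummy_transaction, 0.2):.1f} trans / min")
```

In a test with several virtual users, each worker would run a loop like this one and the per-worker counts would be summed before scaling.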

Test description

The test first creates and populates a database that models a network of warehouses. The database schema looks like this:

The database consists of eight relations:

Warehouse - warehouse
District - warehouse district
Order - order
OrderLine - order line (order item)
Stock - quantity of a certain product in a specific warehouse
Item - item
Customer - customer
History - customer payment history

Five types of transactions are then run against the database, in the following proportions:

new_order - 45%
payment - 43%
order_status - 4%
delivery - 4%
stock_level - 4%

These shares follow the transaction mix of TPC-C.
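
One way to pick the next transaction according to the shares listed above is a weighted random draw. The sketch below is an assumption about how this could be done, not the code of the test: the dictionary and the `pick_transaction` helper are hypothetical, and only the transaction names and percentages come from the list.

```python
import random

# Transaction names and shares (in percent) are taken from the list above.
TRANSACTION_MIX = {
    "new_order": 45,
    "payment": 43,
    "order_status": 4,
    "delivery": 4,
    "stock_level": 4,
}


def pick_transaction(rng=random):
    """Return the name of the next transaction, drawn according to the mix."""
    names = list(TRANSACTION_MIX)
    weights = list(TRANSACTION_MIX.values())
    # random.choices performs a weighted draw with replacement.
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when comparing two ORMs on the same sequence of transactions.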

TPC-C , , ORM, . 64+ , .

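The main differences from the original TPC-C test:
The tables hold fewer records. For example, the Stock relation has 100,000 * W records in TPC-C, where W is the number of warehouses; in this test it has 100 * W
In TPC-C, the Payment transaction can look up a customer either by ID or by last name. This test looks customers up by ID only
There is no NewOrder table. In TPC-C, a new order is added to both the Order and NewOrder tables, and is removed from NewOrder once it is delivered. Here, the Order table instead has a bool attribute “is_o_delivered”, which is False until the order is delivered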

, .

New Order

: id id
id
()
. Item.
, .

Payment

: id id
id
.
1
, ,
.

Order Status

The transaction is served by customer id
The customer and their most recent order are fetched from the database
The order's status (delivered or not) and its order lines are read
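
The Order Status steps can be sketched in plain SQL with the standard `sqlite3` module. This is an illustration under stated assumptions, not the ORM code of the test: the table and column names (`orders`, `order_line`, `customer_id`, `item_id`, `amount`) are hypothetical apart from `is_o_delivered`, and "last order" is taken to mean the order with the highest id.

```python
import sqlite3


def order_status(conn, customer_id):
    """Return (order_id, is_delivered, lines) for the customer's latest order,
    or None if the customer has no orders."""
    order = conn.execute(
        "SELECT id, is_o_delivered FROM orders "
        "WHERE customer_id = ? ORDER BY id DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    if order is None:
        return None
    order_id, is_delivered = order
    lines = conn.execute(
        "SELECT item_id, amount FROM order_line WHERE order_id = ? ORDER BY id",
        (order_id,),
    ).fetchall()
    return order_id, bool(is_delivered), lines


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER, is_o_delivered INTEGER
        );
        CREATE TABLE order_line (
            id INTEGER PRIMARY KEY, order_id INTEGER, item_id INTEGER, amount INTEGER
        );
        INSERT INTO orders VALUES (1, 10, 1), (2, 10, 0), (3, 20, 1);
        INSERT INTO order_line VALUES (1, 1, 500, 2), (2, 2, 501, 1), (3, 2, 502, 4);
        """
    )
    return conn
```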

Delivery

The transaction is served by warehouse id
The warehouse and all of its districts are fetched from the database by warehouse id
For each district, the oldest undelivered order is taken, and its delivery status is set to True
The customers whose orders were delivered in this transaction are fetched from the database, and the delivery counter of each is incremented
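
The Delivery steps above can be sketched with the standard `sqlite3` module. This is a simplified illustration, not the ORM code of the test: the schema (`district`, `orders`, `customer`, `delivery_cnt`) is hypothetical apart from `is_o_delivered`, the "oldest" order is assumed to be the one with the lowest id, and the customer counter is incremented in place rather than after fetching the customer.

```python
import sqlite3


def delivery(conn, warehouse_id):
    """Deliver the oldest undelivered order in each district of a warehouse
    and return the ids of the delivered orders."""
    delivered = []
    with conn:  # one transaction: commit on success, roll back on error
        districts = conn.execute(
            "SELECT id FROM district WHERE warehouse_id = ? ORDER BY id",
            (warehouse_id,),
        ).fetchall()
        for (district_id,) in districts:
            row = conn.execute(
                "SELECT id, customer_id FROM orders "
                "WHERE district_id = ? AND is_o_delivered = 0 "
                "ORDER BY id LIMIT 1",
                (district_id,),
            ).fetchone()
            if row is None:
                continue  # nothing left to deliver in this district
            order_id, customer_id = row
            conn.execute(
                "UPDATE orders SET is_o_delivered = 1 WHERE id = ?", (order_id,)
            )
            conn.execute(
                "UPDATE customer SET delivery_cnt = delivery_cnt + 1 WHERE id = ?",
                (customer_id,),
            )
            delivered.append(order_id)
    return delivered


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE district (id INTEGER PRIMARY KEY, warehouse_id INTEGER);
        CREATE TABLE customer (id INTEGER PRIMARY KEY, delivery_cnt INTEGER DEFAULT 0);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            district_id INTEGER,
            customer_id INTEGER,
            is_o_delivered INTEGER DEFAULT 0
        );
        INSERT INTO district VALUES (1, 1), (2, 1);
        INSERT INTO customer (id) VALUES (10), (20);
        INSERT INTO orders (id, district_id, customer_id) VALUES
            (100, 1, 10), (101, 1, 20), (102, 2, 20);
        """
    )
    return conn
```

Because undelivered orders are found through the boolean flag, each district needs only one indexed lookup per delivery and no separate table of new orders.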

Stock level

The transaction is served by warehouse id
The warehouse is fetched from the database by id
The last 20 orders of this warehouse are fetched from the database
For each item in these orders, the remaining quantity in the warehouse's stock is fetched from the database
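
A possible shape of the Stock Level steps, again with the standard `sqlite3` module. The schema is hypothetical (`orders.warehouse_id`, `order_line`, `stock.quantity`), the "last" orders are assumed to be those with the highest ids, and the initial lookup of the warehouse row is left out for brevity.

```python
import sqlite3


def stock_level(conn, warehouse_id, last_n=20):
    """Return {item_id: quantity} for every item that appears in the
    last `last_n` orders of the given warehouse."""
    order_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM orders WHERE warehouse_id = ? ORDER BY id DESC LIMIT ?",
            (warehouse_id, last_n),
        ).fetchall()
    ]
    levels = {}
    for order_id in order_ids:
        item_rows = conn.execute(
            "SELECT item_id FROM order_line WHERE order_id = ? ORDER BY id",
            (order_id,),
        ).fetchall()
        for (item_id,) in item_rows:
            if item_id in levels:
                continue  # this item was already looked up
            row = conn.execute(
                "SELECT quantity FROM stock WHERE warehouse_id = ? AND item_id = ?",
                (warehouse_id, item_id),
            ).fetchone()
            levels[item_id] = row[0] if row else None
    return levels


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, warehouse_id INTEGER);
        CREATE TABLE order_line (id INTEGER PRIMARY KEY, order_id INTEGER, item_id INTEGER);
        CREATE TABLE stock (warehouse_id INTEGER, item_id INTEGER, quantity INTEGER);
        INSERT INTO orders VALUES (1, 1), (2, 1), (3, 1), (4, 2);
        INSERT INTO order_line VALUES
            (1, 1, 500), (2, 2, 501), (3, 3, 502), (4, 3, 501), (5, 4, 503);
        INSERT INTO stock VALUES (1, 500, 7), (1, 501, 3), (1, 502, 0), (2, 503, 9);
        """
    )
    return conn
```

A transaction like this is read-only but issues many small queries, so it mostly exercises the speed at which the ORM builds and runs SELECT statements.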

Test results

Two ORMs are involved in testing:

SQLAlchemy. Shown on the graphs as the blue line.
PonyORM. Shown on the graphs as the yellow line.

10 2 , . multiprocessing.

—
—

PostgreSQL

First, all transaction types run together, mixed as in TPC-C. Pony comes out ahead.

Average speed:
Pony - 2543 trans / min
SQLAlchemy - 1353.4 trans / min

Next, each ORM is tested on every transaction type separately.

Transaction “New Order”

Average speed:
Pony - 3349.2 trans / min
SQLAlchemy - 1415.3 trans / min

Transaction “Payment”

Average speed:
Pony - 7175.3 trans / min
SQLAlchemy - 4110.6 trans / min

Transaction “Order Status”

Average speed:
Pony - 16645.6 trans / min
SQLAlchemy - 4820.8 trans / min

Transaction “Delivery”

Average speed:
SQLAlchemy - 716.9 trans / min
Pony - 323.5 trans / min

Transaction “Stock Level”

Average speed:
Pony - 677.3 trans / min
SQLAlchemy - 167.9 trans / min

Test Results Analysis

After receiving the results, I analyzed why, in various situations, one ORM is faster than another and came to the following conclusions:

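In 4 of the 5 transactions PonyORM is faster. When generating SQL, PonyORM caches the result of translating a Python expression into SQL, so a repeated query is not translated again, whereas SQLAlchemy generates the SQL each time a query is executed. An example of such a query in PonyORM: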

stocks = select(stock for stock in Stock
                if stock.warehouse == whouse
                and stock.item in items).order_by(Stock.id).for_update()

SQLAlchemy:

stocks = session.query(Stock).filter(
    Stock.warehouse == whouse, Stock.item.in_(items)
).order_by(text("id")).with_for_update()

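SQLAlchemy handles the Delivery transaction faster because it combines the UPDATE statements for several objects into a single command.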

For example, from the SQLAlchemy log:

INFO:sqlalchemy.engine.base.Engine:UPDATE order_line SET delivery_d=%(delivery_d)s WHERE order_line.id = %(order_line_id)s
INFO:sqlalchemy.engine.base.Engine:(
{'delivery_d': datetime.datetime(2020, 4, 6, 14, 33, 6, 922281), 'order_line_id': 316},
{'delivery_d': datetime.datetime(2020, 4, 6, 14, 33, 6, 922272), 'order_line_id': 317},
{'delivery_d': datetime.datetime(2020, 4, 6, 14, 33, 6, 922261))

Pony sends a separate query for each UPDATE:

SELECT "id", "delivery_d", "item", "amount", "order"
FROM "orderline"
WHERE "order" = %(p1)s
{'p1':911}

UPDATE "orderline"
SET "delivery_d" = %(p1)s
WHERE "id" = %(p2)s
  AND "order" = %(p3)s
{'p1':datetime.datetime(2020, 4, 7, 17, 48, 58, 585932), 'p2':5047, 'p3':911}

UPDATE "orderline"
SET "delivery_d" = %(p1)s
WHERE "id" = %(p2)s
  AND "order" = %(p3)s
{'p1':datetime.datetime(2020, 4, 7, 17, 48, 58, 585990), 'p2':5048, 'p3':911}
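
The two logs above differ in how parameters reach the database: one statement with a list of parameter sets versus one statement per row. The same contrast can be shown at the DB-API level with the standard `sqlite3` module. This is only an illustration of the two call patterns, not what either ORM does internally; the helper names are hypothetical, and with an in-process database like SQLite the speed gap will be much smaller than over a network connection to PostgreSQL.

```python
import sqlite3


def make_db(n_lines):
    """Create an in-memory order_line table with ids 1..n_lines."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE order_line (id INTEGER PRIMARY KEY, delivery_d TEXT)")
    conn.executemany(
        "INSERT INTO order_line (id) VALUES (?)",
        [(i,) for i in range(1, n_lines + 1)],
    )
    conn.commit()
    return conn


def update_batched(conn, line_ids, delivered_at):
    """One executemany() call with a list of parameter sets (first log)."""
    conn.executemany(
        "UPDATE order_line SET delivery_d = ? WHERE id = ?",
        [(delivered_at, line_id) for line_id in line_ids],
    )
    conn.commit()


def update_one_by_one(conn, line_ids, delivered_at):
    """One execute() call per row (second log)."""
    for line_id in line_ids:
        conn.execute(
            "UPDATE order_line SET delivery_d = ? WHERE id = ?",
            (delivered_at, line_id),
        )
    conn.commit()
```

Both functions leave the table in the same state; what changes is the number of calls made to the driver, which is where the batched form can save time.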

Based on these results, I can say that Pony is much faster at fetching data from the database, while SQLAlchemy can in some cases produce significantly faster UPDATE queries.

In the future, I plan to test other ORMs (Peewee, Django) in the same way.

References

Test code: repository link
SQLAlchemy: documentation, community
Pony: documentation, community
