A few words about R2DBC and PostgreSQL

Recently, I again see that hype has intensified around reactive programming in general, and reactive work with databases in particular. I have a couple of phrases that I would like to say about this.

image

Let me remind you what happened in previous series. At the end of 2017, Spring 5 came out, in which support for reactive programming appeared. In particular, we are talking about WebFlux . Everyone began to sharply try this technology (a typical hype curve), and then an understanding appeared in the industry that not everyone needed reactivism. Key lessons learned:

  • Reactive programming is more difficult to debug
  • Developing reactive systems is more difficult than classic, blocking ones; if you unintentionally block the thread, you are finished. That is why the project appeared (one of the authors -bsideup )
  • Reactive programming, and in particular elasticity, is needed mainly at high loads a la Netflix. If you have 100 RPS, you should probably write code in a blocking style - it's cheaper and easier

In addition, everyone understood one more thing. In order for reactive programming to really unfold, you need the whole system to be reactive, including your database. The problem is that the database may not even have a reactive, or at least asynchronous API. Well, or such an API may be, but it may not be supported in your driver.

Actually, almost everyone around me is developing PostgreSQL systems when I need a classic relational database. That being said, the default choice. We work with PostgreSQL using the old (over 10 years old) JDBC driver - pgjdbc . This driver is good for everyone. There are practically no bugs. This driver is greatly improved in performance (thanks for this, includingvladimirsitnikov) But pgjdbc has one fatal flaw - there is no support for the asynchronous API .

In recent years, a big race has begun to develop alternative drivers for PostgreSQL, which included the development of an asynchronous API. Even Oracle tried to do this, but already closed development in favor of the Loom project and its Fibers . But then Oracle thought better of it and started doing it again .

There are 3 solutions right now:

  1. https://github.com/jasync-sql/jasync-sql - asynchronous driver for PostgreSQL, written in Kotlin, where Netty is the transport
  2. The solution Vert.x uses is https://github.com/mauricio/postgresql-async . It is noteworthy that the project is archived, and the previous jasync-sql driver rewrites just the same (though, the original is written in Scala, and the new solution is written in Kotlin).
    Update I was corrected in the comments, the client from Vert.x is alive and feels good in benchmarks .
  3. The driver from the Spring team is https://github.com/r2dbc/r2dbc-postgresql . This driver immediately provides a reactive API (and not asynchronous), and Netty is used for transport, as in jasync-sql.

Conceptually, these drivers are built on similar principles. All of them use exclusively existing PostgreSQL features:


Unfortunately, drivers cannot change Wire protocol Postgres (as if Oleg Dokuk would not want to - https://www.youtube.com/watch?v=n9tL2I_Big8 ). That is, Postgres still requires a separate Connection for a SELECT query. In addition, Postgres is also creating a new process for each Connection - https://www.postgresql.org/docs/current/tutorial-arch.html .

However, there is good news. Modifying requests can be sent in batches without waiting for the result (using pipeline). For this case, Connection is not required for each request.

That is, to summarize. Reactive (or asynchronous) PG drivers improve the situation in applications (we now use Netty based Non-blocking IO, not spending thread on a query), but specifically on the PostgreSQL side - things are not so good (let's hope for an improvement in the network protocol and database architecture )

Which driver should you look at more closely? If you, like me, use Spring in your work, then r2dbc-postgresql is probably a good choice. It is important that in the latest versions of their framework, the guys from Pivotal pretty well integrated the driver with the usual things, for example, with Spring Data R2DBC(this is still RC). As a result, we do not have to work with a low-level driver API and the Connection Pool, which reduces the likelihood of making a mistake in such a complex project.

The minimum example (which I have successfully spied in the documentation) using Spring Data R2DBC looks pretty familiar:

1. Configure credentials and connection pool:

spring.r2dbc.username=postgres
spring.r2dbc.password=P4$$W0RddD
spring.r2dbc.url=r2dbc:postgresql://localhost:5432/r2dbc
spring.r2dbc.pool.enabled=true
spring.r2dbc.pool.initial-size=10
spring.r2dbc.pool.max-idle-time=1m
spring.r2dbc.pool.max-size=30
spring.data.r2dbc.repositories.enabled=true

2. Create an entity:

public class Customer {
    @Id
    private Long id;

    private String firstName;

    private String lastName;

    public Customer(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

3. Create the usual Repository:

public interface CustomerRepository extends ReactiveCrudRepository<Customer, Long> {
}

4. Use the repository:

@RestController
@RequestMapping("/")
public class CustomerController {

    private CustomerRepository customerRepository;

    @PostMapping("/customer")
    Mono<Void> create(@RequestBody Publisher<Customer> customerFlux) {
        return customerRepository.saveAll(customerFlux).then();
    }

    @GetMapping("/customer")
    public Flux<Customer> list() {
        return customerRepository.findAll();
    }

    @GetMapping("/customer/{id}")
    public Mono<Customer> findById(@PathVariable Long id) {
        return customerRepository.findById(id);
    }
}

A complete example can be found in this repository - https://github.com/Hixon10/spring-boot-r2dbc-postgresql-example .

Finally, let's talk about the important: do you need to use r2dbc now in your applications? In my opinion - it’s too early. I would wait for a set of technologies (driver, Spring DATA) to be released, and to collect the first shys. I think that in 2021 it will already be possible to look at this driver more closely.

The current state of affairs - the entire reactive system (Spring WebFlux + R2DBC database driver) looks very cool. Literally todayolegchirpublished a digest in which there is a link to an article with benchmarks - https://technology.amis.nl/2020/04/10/spring-blocking-vs-non-blocking-r2dbc-vs-jdbc-and-webflux-vs- web-mvc / . In short - reactivity wins.

All Articles