Production-scale Postgres


#1

I am a Rust beginner with decades of experience in other languages and frameworks. I like some of the basics of Rust I have learned about on a theoretical basis. I would like to try to use Rust for Postgres for simplified coding of Postgres data processing integrated with something like mio for handling I/O completion, etc.

But in PostgreSQL work, I always come back to the libpq C bindings for production-scale db work instead of simplified object frameworks because libpq as of version 9.1 supports single row mode, which is not the default way of processing returned Postgresql results. My table partitions have many billions of rows in them and a significant fraction of that could return when I query them, so using Postgres methods or frameworks that load the result entirely into memory (by not enabling single row mode) before accessing the any result are unacceptable.

See https://www.postgresql.org/docs/current/static/libpq-single-row-mode.html.

Are there examples or documents that might give me a head start to doing this sort of thing in Rust? I am beginning to think I may be starting from scratch to setup the Postgres access the way I need it to behave. If so, is there something else of similar difficulty that has been integrated with mio that I might look at as a starting place. Is there an existing Postgres support integration that would be recommended to enhance for line at a time and full asynchronous support? Etc.


#2

Well, there is https://crates.io/crates/pq-sys which is a binding to libpq, so you could use that if it’s something you’re already comfortable with.

That said, the author @sgrif, might have more advice for you.


#3

Yeah pq-sys is probably the place to go. I’m working on a new driver which will be similarly low level soon as well.

I could probably extract some code from Diesel that wraps the libpq stuff to handle memory management, etc which might make your life easier.


#4

Does Rust’s native postgres driver not solve the problem at hand?

(I’m mostly asking this question because I haven’t used Postgres from Rust yet and I’m curious why, for example, pq-sys is needed.)


#5

The native driver supports something like single-row mode, though I believe via a different mechanism.

Specifically, while in a transaction you can execute a prepared statement lazily, limiting the number of rows returned by the backend at any one time.

The performance characteristics will be a bit different than single row mode since each batch of a lazy query involves a round trip to the database. If you chose the batch size reasonably this hopefully shouldn’t matter though.


#6

Ah, another option if you want to “really” stream (on a per-cell basis) is to perform a COPY query. The postgres-binary-copy crate should do what you need. Example: https://github.com/sfackler/rust-postgres-binary-copy/blob/master/src/lib.rs#L575