Requesting review of some unconventional API choices

I'm the maintainer of the rtrb crate, which provides a wait-free SPSC (single producer, single consumer) ring buffer that's suitable for real-time use case like audio processing.

For the simplest case, it provides very straightforward Producer::push() and Consumer::pop() methods for writing and reading single items (of some type T).

In addition, it also provides methods for writing/reading chunks of multiple items (like, e.g., chunks of f32 audio samples as provided by the sound card). Since the underlying data structure is a circular buffer, those chunks sometimes happen to "wrap around", which means that the beginning of the chunk is stored at the very end of the underlying buffer and the rest of the chunk is stored starting at the very beginning of the buffer. This is natural and expected, but if I know that I'm always using the same chunk size, I can choose the ring buffer length in a way that this "wrap around" never happens within a chunk.

So far so good, there is a RingBuffer::with_chunks(chunks, chunk_size) method which allows me to define a ring buffer with a size that's an integer multiple of my chunk size. However, there are two drawbacks that I don't like about this:

  • Even though I as a user know that my chunks are always the same size, the compiler doesn't know that. The "chunk" methods still have to provide two slices of data, even though I know that the second one is always empty (because there is no "wrap around").
  • The constructor has two unsigned integer arguments, which might be easily confounded.

As a solution to those two issues I've come up with a pull request and I'd like to hear your opinions on that:

https://github.com/mgeier/rtrb/pull/50

I've split the with_chunks() constructor into two parts to make clear what each of the two numbers mean: RingBuffer::with_chunks(...).of_size(...). I guess this is somehow related to the "builder" pattern, but it isn't a typical case, because there are always exactly two arguments in exactly the same order.

So my first question: is this a bad idea?

I could just as well keep the more conventional (?) two-argument form RingBuffer::with_chunks(..., ...). Are there any further suggestions?
In a future version of Rust there might be named arguments, which would help here, but this might never happen. And I would like to have something that works now.

The next question is about the return type of this constructor. Instead of returning a RingBuffer, it returns a pair of producer and consumer structs, which are typically passed on to different threads. What makes this a bit unconventional is the fact that not both sides have to use a fixed chunk size. It's perfectly reasonable to have one side use .push() or .pop() with single elements, while the other side uses fixed chunks. Therefore, there are three possible combinations of return types:

  • (ChunkProducer, Consumer)
  • (Producer, ChunkConsumer)
  • (ChunkProducer, ChunkConsumer)

In order to achieve this, I've created a trait for the return value. This trait has to be public, but its name is not important for the user and therefore it is not documented.

So my second question: is this a bad idea?

As I have currently implemented it in my PR, the concrete return types are inferred from assignments to variables with known types or they can be specified with type annotations. I guess it would also be possible to provide multiple constructors with different names that each return a concrete pair of types. I guess this would be more conventional, but I couldn't come up with good names for those constructor functions and I somehow like the generic return type, but it may be a bit too fancy? Is there an alternative way to solve this?

Finally, to complete the API, there is one constructor for the "simple" case where no fixed chunks are used. This is implemented as RingBuffer::new(capacity) -> (Producer, Consumer). This is definitely unconventional, because I have to appease Clippy with #[allow(clippy::new_ret_no_self)].

So my third question: is this a bad idea?

Clippy would be silent if I chose a different name for the constructor function, any suggestions?

In the current API (before the aforementioned PR), I've worked around the issue by using the signature RingBuffer::new(capacity) -> RingBuffer. The returned RingBuffer then has a .split() method which in turn returns a pair of (Producer, Consumer).

However, with my new PR there are two "types" of RingBuffer: fixed size chunks or not. I don't really want to use two separate structs for that, so the solution would be to never actually return a RingBuffer.

In summary, I have two "constructor" methods on RingBuffer that among them return 4 possible pairs of types, none of which actually is a RingBuffer.

I'm not sure whether I should like this or if I should be disgusted, what is your opinion?

BTW, neither the number of chunks nor the chunk size is supposed to be const. This would be a nice usage for the recently stabilized const generics, but for this project I'm interested in the non-const case.

CC @matklad, I think you may also be interested in this one (it's about the spsc from crossbeam that has split off in a separate crate).

This seems to be a standard pattern in Rust to me. Alternatively, you could use something like

RingBuffer::new(RingBuffersize{chunks: ..., size: ...})

[...]

I'm not too keen on this one. Is it possible to not "overload" the return type and always return a (ChunkedProducer, ChunkedConsumer) pair? On the ChunkedProducer, you can maybe have a unchunk(self) method that returns a Producer and similar for the Consumer. The API can then be used as follows:

let (chunked_producer, chunked_consumer) = RingBuffer::with_chunks(CHUNKSIZE).of_size(SIZE);
let producer = chunked_producer.unchunk();

std::sync::mpsc uses a standalone function to create a producer, consumer pair.

1 Like

Not sure I have any relevant knowledge here, but that what I’d do:

  • remove RingBuffer type from the public API altogether — it seems to be a private implementation detail, the user always uses consumer or producer.
  • remove with_chunks constructor — the thing it does is essentially multiplying the two numbers. Abstracting that behind a builder with private super-trait spends, to my taste, to much language machinery for a trivial thing. with_capacity(n * chunk_size) syntax reads nice enough imo.
  • Add into_chunk_consumer(chunk_size) method to Consumer, that does runtime assertion that he buffer size is divisible by chunk size. I don’t think a mistake here is worth preventing at compile time, but this seems to simplify the API, and also allows for chunks of different sizes.
  • does the chunk needs to be a runtime parameter? If not, then perhaps just using [T; N] as the element will cover the majority of use-cases?
5 Likes

If I understand the situation correctly, you could try introducing a completely new type, maybe called ChunkRingBuffer. It can have a similar API, but some functions are chunk-specific.

Internally, you could reuse the normal ring buffer, and wrap it in a new type pub struct ChunksRingBuffer(RingBuffer);, which could discard the second slice as it knows that it's always empty. That way, nobody can accidentally misuse the existing Ringbuffer when using chunks.

For the constructor I too would recommend using the struct as named parameters approach :slight_smile:

Thanks a lot for the reviews!

I was thinking about that and I mostly like it, but I think it is quite cumbersome to bring rtrb::RingBufferSize into scope.

A variation of this would probably be to use the additional struct as kinda "builder", something like this:

let (p, c) = rtrb::RingBufferBuilder{ chunks: ..., chunk_size: ...}.build();

I'm not sure whether I like this or not ...

And I think it would be good if the "simple" case (with no fixed chunks) would work similarly, but how could that look like?

Yes, this would definitely be possible!

I'm not sure if that's better from a usability perspective, but the implementation would definitely be much simpler. In fact, the ChunkProducer already contains a Producer, so this would be trivial (same for the consumer).

I'm not sure if I would call it unchunk(), because this is only about fixed-size chunks. In general, chunks can still be used.

The same argument could of course be made for ChunkProducer, but I had the feeling that FixedChunkProducer is just a bit too long. Or isn't it?

Yes, we were talking about that during the discussions at Add a bounded SPSC queue · Pull Request #338 · crossbeam-rs/crossbeam · GitHub (there are a few Dropbox Paper links with further details). This would look like this:

let (p, c) = rtrb::ring_buffer(1024);

This could probably be extended to the "fixed chunk" case:

let (p, c) = rtrb::ring_buffer_with_chunks(22).of_size(128);

I think I like this.

That's true, there's currently only one method: RingBuffer::capacity().
I could move this to the Producer/Consumer, it just felt a bit more natural directly on the RingBuffer.

Another thing I wanted to add to RingBuffer in the future is a way to mlock() the used memory, see Ability to "mlock" the buffer? · Issue #3 · mgeier/rtrb · GitHub.

But those are minor points.

Yeah, it's definitely a lot!

Yes, that would definitely be possible. It would also have to check whether the Consumer has already been used to read a non-chunked amount of data.

I was kinda liking the impossibility of a panic, but it would indeed simplify the implementation.

I'm not sure whether it would actually simplify the API and the usage.
After all, now an additional method call would be needed.

But I'll try to implement this to have a closer look.

Yeah, that's not possible with my original suggestion.

It might be rarely used, but I could actually imagine a use case: A disk-reading thread could write into the ring buffer with a fixed chunk size, while the audio thread could read the data with a different fixed chunk size.

There are certainly cases where [T; N] is appropriate, but in my typical use case it's not, because the chunk size is chosen by the end user (e.g. via an "audio block size" setting).

I could definitely do that, but I'm not sure if that's an improvement for the user. Instead of ...

let (p, c) = rtrb::RingbBuffer::with_chunks(10).of_size(128);

... we would use ...

let (p, c) = rtrb::ChunkRingbBuffer::with_chunks(10).of_size(128);

Or did you think of something else?

That's more or less exactly what I'm doing with ChunkConsumer in my PR (see src/fixed_chunks.rs), except that I also have to store the chunk_size.

This makes it indeed impossible to misuse, but as @matklad said above, there might be advantages in allowing panics.


I think I will try to implement some of the suggestions I got here and probably make an alternative PR (or multiple?) to have a more concrete comparison.

In the meantime, I'm still open for further suggestions!

1 Like

OK, I've created an alternative PR:

https://github.com/mgeier/rtrb/pull/52

This gets rid of the with_chunks(...)... constructor.

As mentioned before, I don't really like the panicking behavior, so I created the fallible methods Consumer::try_fixed_chunk_size() and Producer::try_fixed_chunk_size().
A user can (and probably will) use .unwrap() on the result, but this way the decision for possibly panicking is delegated to the user.

Now I think there is no confusion between chunks and chunk_size, because the number of chunks never explicitly appears in the API.

I think this is much better than before, thanks again for the suggestions!

I kept the unconventional RingBuffer::new() behavior (for now), because I kinda like it. I'm still open to changing it if somebody strongly objects, though.

I've renamed ChunkConsumer to FixedChunkConsumer, just to be extra explicit.

What do you think?

1 Like

You're right, I shouldn't have mentioned it.

I think it's also better from a usability perspective: it's easier to understand what's going on and there is no type annotation needed anymore. I've learned the hard way that if I try to shift away too much from the user, it can quickly become too complicated.

That's why I proposed to "unchunk" rather than to "chunk" (but yeah, the naming could be better and then you need two parameters for the constructor).

FYI: jack has the possibility that the buffer size changes while the application is running and for LV2, a varying buffer size is the default. This reduces the number of situations in which a FixedChunk* can be used.

Judging by the docs, I like this one much more.

1 Like

I think you are right. Requiring an additional function call is of course more cognitive load, but if it makes the rest of the API much simpler, it might be well worth it (which I think it is in this case).

Yes, the advantage of unchunk() is that it never panics, but using a Result in my new PR at least moves the responsibility for potentially panicking to the user, which I think is also fine.

And the big advantage of the new PR is that there is only one constructor, which on its own should make the API less confusing.

I know, but have you ever seen an application that actually supports that callback?

I normally just disable it (https://github.com/AudioProcessingFramework/apf/blob/cc889678e45dbad8e962a9f156c40b8baf3574ae/apf/jackclient.h#L376-L383) but I guess most JACK clients are not even aware that the block size can change.

I didn't know that, I've never worked on an LV2 plugin.

There seems to be a way to request a fixed block size (LV2 Buf Size), but they issue a warning: "Note that requiring this feature may severely limit the number of hosts capable of running the plugin"

How common is it that LV2 hosts don't support this?

Even if no audio framework would support fixed block sizes, it might still be useful on the non-audio thread.

But most importantly, I like this feature because I am actually using it myself: GitHub - AudioSceneDescriptionFormat/asdf-rust: Library for loading ASDF files. This library requires a fixed block size. I could of course support arbitrary block sizes, but it would make the implementation much more complicated, and I'm not willing to put in that work right now.

Either way, the whole fixed-size-chunk thing is of course totally optional and I hope it is sufficiently unobtrusive in the docs.

Me too!

1 Like

Ardour doesn't seem to support it. I think very few hosts actually support this, honestly. I haven't really used lv2 myself yet, but as far as I understood, "normal" (i.e.: non-"Control-Voltage" parameters) only change between audio buffers, so having a fixed length audio buffer is hard to combine with sample-accurate parameter changes, I think.

I just wanted to be sure that you actually had a use case for this feature, and ...

... it seems you have a use case. Nice :slight_smile:

I've given all the input I have (and maybe even more than you need), and with my limited understanding, I have no objections to the alternative PR, so ... it's up to you.

1 Like

OK, so I have contemplated this for a while, and I'm starting to accept that the "fixed chunks" use case isn't as important as I initially thought. Thanks to @PieterPenninckx for pointing this out!

I'm still using fixed-size chunks in the project I've mentioned above, but maybe in the future I will re-implement it in a way to allow changing block sizes after all.

So instead of adding features to rtrb, I've now tried to remove some API surface, using a few things I've learned during the discussions here:

https://github.com/mgeier/rtrb/pull/57

None of those changes are revolutionary, but I think in the end they make the API simpler and the docs easier to understand.

What do you think?

Did you consider a mirrored ring buffer implementation (I found slice-deque, vmcircbuf, magic-ring-buffer, and vmap::Ring)? That allows a simpler API still—callers can always use a single slice because wraparound is handled by a VM trick. I was just considering using one in a project of my own...

Thanks @slamb for the suggestion! I've read about this trick, but I haven't really looked into it. As far as I understand, this is not platform-independent, even though it looks like slice-deque is available for quite a few platforms. I like that rtrb has a platform-independent implementation that should work wherever atomics (and the alloc crate) are available.

Also, I would like to stay true to the original implementation in https://github.com/crossbeam-rs/crossbeam/pull/338, which isn't my own.

But I do think it would be interesting to pursue this in a separate project (with a nicer API). And I would definitely be interested in potential performance differences.

Yet another option that would allow contiguous chunks is available at https://crates.io/crates/spsc-bip-buffer. I guess the trade-off here is a slight overhead in CPU.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.