Blog post: Making Slow Rust Code Fast

The raw BSON serializer / deserializer used in bson::to_vec and bson::from_slice was actually written recently (by yours truly), so I think it's more that we haven't had the chance to micro-optimize it yet rather than it being hacky.

More generally speaking, bson was originally a community maintained library, but it's since been transferred to MongoDB the company and these days is being actively maintained by my colleagues and me. So while there is some technical debt in it that needs addressing, it is a fully supported library intended to be production-ready. If you have any ideas for improvements though, we'd love to hear about them on our GitHub issue tracker or Jira project!


Online services serving many concurrent users/requests do spend a lot of time deserializing user requests and serializing responses, since the data is mostly cached in memory. That is why JSON is not considered a good choice for communication between internal services. I have seen this in many production services.


What I was criticizing was the obscene amount of code generated, not that there is an enum somewhere in it. The enum itself, as you say, is very innocent.

That's precisely what shouldn't happen. A non-self-describing format shouldn't be implemented with an inefficiency that is required only for self-describing formats. Simply deserialize each value in order directly into the struct to be returned. No need to have a loop match field IDs to deserialize values into Options that need to be unwrapped in the end.
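To make the contrast concrete, here is a minimal sketch using a made-up fixed-layout binary format (not BSON, and not what serde_derive actually generates). The `Point` struct and the `take`, `encode`, `decode_in_order` and `decode_with_field_loop` names are all hypothetical and exist only for this illustration.

```rust
/// Hypothetical record used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Point {
    pub x: i32,
    pub y: i64,
}

/// Split `N` bytes off the front of `input`, or return `None` if it is too short.
fn take<const N: usize>(input: &mut &[u8]) -> Option<[u8; N]> {
    if input.len() < N {
        return None;
    }
    let (head, rest) = input.split_at(N);
    *input = rest;
    let mut buf = [0u8; N];
    buf.copy_from_slice(head);
    Some(buf)
}

/// Toy encoding: `x` then `y`, both little-endian, no field names or IDs.
pub fn encode(p: &Point) -> Vec<u8> {
    let mut out = Vec::with_capacity(12);
    out.extend_from_slice(&p.x.to_le_bytes());
    out.extend_from_slice(&p.y.to_le_bytes());
    out
}

/// Direct approach: the format fixes the field order, so read `x`, then `y`,
/// and build the struct straight away.
pub fn decode_in_order(mut input: &[u8]) -> Option<Point> {
    let x = i32::from_le_bytes(take::<4>(&mut input)?);
    let y = i64::from_le_bytes(take::<8>(&mut input)?);
    Some(Point { x, y })
}

/// Generic approach, shaped like a visitor written for self-describing formats:
/// every field is dispatched by ID, parked in an `Option`, and unwrapped at the end.
pub fn decode_with_field_loop(mut input: &[u8]) -> Option<Point> {
    let mut x: Option<i32> = None;
    let mut y: Option<i64> = None;
    for field_id in 0..2u8 {
        match field_id {
            0 => x = Some(i32::from_le_bytes(take::<4>(&mut input)?)),
            1 => y = Some(i64::from_le_bytes(take::<8>(&mut input)?)),
            _ => return None,
        }
    }
    Some(Point { x: x?, y: y? })
}
```

Both functions return the same result for the same input; the second one just carries the dispatch loop and the `Option` bookkeeping that a fixed-order format has no use for.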

Which again is because of a pathological case in JSON. BSON doesn't allow non-canonical keys, so it shouldn't need Cow. As can be seen in the code I wrote for the benchmark, field names are matched directly in the input without even deserializing them as string slices first.
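A rough sketch of what matching a key directly in the input can look like (this is not the benchmark code itself; `strip_key` and `read_i32_field` are hypothetical names). It assumes the BSON element layout of a type byte, a NUL-terminated key, then the value, and only handles int32 elements (type `0x10`).

```rust
/// If `input` starts with the NUL-terminated key `name`, return the bytes after it.
/// The key is compared as raw bytes; no `&str` is ever built and nothing is allocated.
fn strip_key<'a>(input: &'a [u8], name: &[u8]) -> Option<&'a [u8]> {
    let rest = input.strip_prefix(name)?;
    match rest.split_first() {
        Some((&0, tail)) => Some(tail),
        _ => None,
    }
}

/// Read one int32 element whose key must be exactly `name`.
/// Returns the value and the remaining input.
pub fn read_i32_field<'a>(input: &'a [u8], name: &[u8]) -> Option<(i32, &'a [u8])> {
    let (&tag, rest) = input.split_first()?;
    if tag != 0x10 {
        return None; // not an int32 element
    }
    let rest = strip_key(rest, name)?;
    if rest.len() < 4 {
        return None;
    }
    let (value, rest) = rest.split_at(4);
    let mut buf = [0u8; 4];
    buf.copy_from_slice(value);
    Some((i32::from_le_bytes(buf), rest))
}
```

Checking for the trailing NUL is what rejects a longer key such as `id` when the expected key is `i`.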

I didn't mean we shouldn't care. I clearly said BSON isn't a text format and doesn't need this additional complexity that was designed for JSON. Besides, serde_derive isn't conformant with all the bad parts of JSON anyway, as it returns an error when encountering duplicate keys in a struct, whereas other parsers would take the last value of each field, happily ignoring errors in anything except the last value. If Serde decides to return an error on duplicate fields, it could just as well return an error on non-canonical field names, thereby taking plenty of complexity out of deserializers.

Obviously I wouldn't complain at Miniserde being tailored for JSON if its sole purpose is to work with JSON. Seems like a superior approach.

That's the point I was making. And this use case is BSON, and only BSON, which doesn't require all the complexity of other formats.

I wasn't thinking of the superficial view of the set of primitive types you're listing. I was thinking of the semantics of structs and errors.

I like that you added JSON and Message Pack for comparison. Maybe also add CBOR?

You can't make such a conclusion when comparing apples to oranges. You have to look into the details.

rmp_serde::to_vec achieves high performance by not serializing field names. You have to use rmp_serde::to_vec_named for a fair comparison.
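To show where the size difference comes from, here is a hand-rolled sketch of the two MessagePack shapes for a struct like `{ i: 1 }`: a one-element array (no field names) and a one-entry map (field name included). It does not call rmp_serde; `encode_as_array` and `encode_as_map` are hypothetical helpers that only cover small positive integers and short keys.

```rust
/// Struct as a one-element MessagePack array: only the value is written.
pub fn encode_as_array(value: u8) -> Vec<u8> {
    assert!(value < 0x80, "sketch only handles positive fixint values");
    // 0x91 = fixarray with 1 element; values below 0x80 encode as themselves.
    vec![0x91, value]
}

/// Struct as a one-entry MessagePack map: the field name is written before the value.
pub fn encode_as_map(name: &str, value: u8) -> Vec<u8> {
    assert!(value < 0x80, "sketch only handles positive fixint values");
    assert!(name.len() < 32, "sketch only handles fixstr keys");
    // 0x81 = fixmap with 1 entry; 0xa0 | len = fixstr header.
    let mut out = vec![0x81, 0xa0 | name.len() as u8];
    out.extend_from_slice(name.as_bytes());
    out.push(value);
    out
}
```

For `{ i: 1 }` the array form is 2 bytes and the map form is 4 bytes, and the gap grows with the length and number of field names.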

Also worth considering (after the important detail above) is that Message Pack is not BSON. In Message Pack:

  • Arrays are prefixed with the number of items, so arrays can be allocated upfront with the correct capacity when deserializing.
  • Byte lengths of arrays and objects are not encoded, so no need to backtrack when serializing.
  • Keys of array elements are not encoded, significantly reducing the number of bytes to serialize and deserialize.
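The three points above can be sketched by encoding the same array of 32-bit integers both ways. This is a simplified illustration, not the bson or rmp_serde implementation: `bson_array_i32` and `msgpack_array_i32` are hypothetical names, and the Message Pack side always uses the 5-byte `int 32` form where a real encoder would pick shorter forms for small values.

```rust
/// BSON array: a document whose keys are "0", "1", ... and whose total byte
/// length comes first, so the length must be patched in after the body is written.
pub fn bson_array_i32(items: &[i32]) -> Vec<u8> {
    let mut out = Vec::new();
    let start = out.len();
    out.extend_from_slice(&[0u8; 4]); // placeholder for the byte length
    for (index, item) in items.iter().enumerate() {
        out.push(0x10); // element type: int32
        out.extend_from_slice(index.to_string().as_bytes()); // key "0", "1", ...
        out.push(0); // key terminator
        out.extend_from_slice(&item.to_le_bytes());
    }
    out.push(0); // document terminator
    // Backtrack: overwrite the placeholder now that the length is known.
    let len = (out.len() - start) as i32;
    out[start..start + 4].copy_from_slice(&len.to_le_bytes());
    out
}

/// Message Pack array: the item count comes first, nothing needs patching,
/// and no per-element keys are written.
pub fn msgpack_array_i32(items: &[i32]) -> Vec<u8> {
    assert!(items.len() < 16, "sketch only handles fixarray");
    let mut out = Vec::with_capacity(1 + 5 * items.len());
    out.push(0x90 | items.len() as u8); // fixarray header with the item count
    for item in items {
        out.push(0xd2); // int 32
        out.extend_from_slice(&item.to_be_bytes());
    }
    out
}
```

For `[1, 2]` the BSON form is 19 bytes and this Message Pack form is 11 bytes; a decoder reading the latter also knows up front that it needs room for exactly two items.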

For further performance improvements in deserializing arrays, they could be allocated in a bump arena. This reduces the number of allocations (and deallocations) for any format, but is particularly helpful in making BSON competitive with Message Pack, because arrays could have a preallocated capacity for a pretty large number of items, and any capacity that is left over after the end of one array can be immediately used for the next array.
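As a rough illustration of the idea, here is a simplified stand-in for a bump arena: one backing `Vec<i32>` shared by all decoded arrays, with each array identified by an index range rather than a pointer. A real bump allocator would hand out references and support mixed types; `ArrayArena` and its methods are hypothetical names.

```rust
use std::ops::Range;

/// All decoded arrays live back to back in one buffer.
pub struct ArrayArena {
    storage: Vec<i32>,
}

impl ArrayArena {
    /// Reserve room for many items once, up front.
    pub fn with_capacity(capacity: usize) -> Self {
        ArrayArena { storage: Vec::with_capacity(capacity) }
    }

    /// Append one array of unknown length (as when decoding BSON, which gives
    /// no item count). Whatever capacity is left over afterwards is
    /// immediately available to the next array.
    pub fn push_array<I: IntoIterator<Item = i32>>(&mut self, items: I) -> Range<usize> {
        let start = self.storage.len();
        self.storage.extend(items);
        start..self.storage.len()
    }

    /// Look up a previously stored array by its range.
    pub fn get(&self, range: Range<usize>) -> &[i32] {
        &self.storage[range]
    }

    /// Current capacity of the shared buffer.
    pub fn capacity(&self) -> usize {
        self.storage.capacity()
    }
}
```

As long as the total number of items stays within the reserved capacity, pushing any number of arrays causes no further allocations, and dropping the arena frees everything at once.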

There's async for that. So what matters depends on what you're measuring.

Thank you for the careful read; this mistake is on me. The new numbers (EDIT: for an older, different benchmark) are less impressive, but there is still a ~2.8x difference even for a simple struct {i: i64} (and potentially i: ObjectId, which was the main reason for the benchmark, as it was an order of magnitude slower).

Bioinformatics is on the heavy side of postprocessing. On my websites and lighter number-crunching projects, bson was really visible, and the resulting performance was worse than (C++,) Python and even one old PHP webpage I rewrote in Rust. It's significantly better now, but the numbers in this thread suggest there is still room for several double-digit improvements.
