I'm playing around with how far I can take zero-copy JSON deserialization.
Imagine I have some JSON with a field whose data (ASCII alphabetic bytes) is already valid as-is.
In that case I can simply borrow the data as a `&[u8]`.
But I also want to normalize the field: uppercase it, strip any surrounding ASCII whitespace, and potentially remove any '-' chars from the middle of the bytes.
The best-case scenario is to mutate the bytes in place, since normalizing either leaves the slice length as is or shrinks it. But I don't see how to deserialize into a `&mut [u8]` using `serde::Deserialize`.
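To make the in-place idea concrete: because the normalization never grows the data, it can be done with a read cursor and a write cursor over a single buffer. This is only a minimal sketch, assuming I somehow had a `&mut [u8]` in hand (which is exactly the part I can't get from serde); `normalize_in_place` is a hypothetical name of mine, not part of any library.

```rust
/// Normalizes `buf` in place: ignores surrounding ASCII whitespace, drops '-',
/// and uppercases ASCII letters. Returns the normalized prefix of `buf`, or
/// `None` if any other byte is found (the buffer may be partially rewritten
/// in that case).
fn normalize_in_place(buf: &mut [u8]) -> Option<&mut [u8]> {
    // Bounds of the content without surrounding ASCII whitespace.
    let start = buf
        .iter()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(buf.len());
    let end = buf
        .iter()
        .rposition(|b| !b.is_ascii_whitespace())
        .map_or(start, |i| i + 1);

    // `write` never overtakes `read`, so no unread byte is overwritten.
    let mut write = 0;
    for read in start..end {
        let b = buf[read];
        if b == b'-' {
            continue;
        }
        if !b.is_ascii_alphabetic() {
            return None;
        }
        buf[write] = b.to_ascii_uppercase();
        write += 1;
    }
    Some(&mut buf[..write])
}
```

The returned slice is the same length as the input or shorter, which is why an in-place mutation would be enough here.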
I'm presuming it's not possible to acquire a `&mut [u8]` from `serde::Deserialize`, so let's concentrate on allocation instead.
I can store the field as a `Cow<'a, [u8]>`, but that will allocate on the heap whenever the field needs normalizing.
Is there a way I can have a `Cow`-like concept that allocates a fixed-size array instead?
That would be the most desirable, given that I know the normalized form of the field will always be a `[u8; 5]`.
I don't know of any concept like a `FixedCow<'a, T, const N: usize>`, but ofc I can roll my own if needed.
Thanks!

    use serde::Deserialize;

    #[derive(serde::Deserialize)]
    struct Thing<'a> {
        #[serde(borrow)]
        field: Field<'a>,
    }

    // `FixedCow` does not exist; it is the hypothetical type I'm asking about.
    struct Field<'a>(FixedCow<'a, [u8], 5>);

    impl<'de: 'a, 'a> serde::Deserialize<'de> for Field<'a> {
        fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
        where
            D: serde::Deserializer<'de>,
        {
            // Borrows straight from the input for the `'de` lifetime.
            let bytes: &'de [u8] = Deserialize::deserialize(deserializer)?;
            // validate and sanitize bytes
            //
            // I would like to mutate the bytes in place here
            // but I don't see a way of deserializing to a `&mut [u8]`
            // The next best thing would be to store it as a sort of `FixedCow<'a, T, const N: usize>`
            Ok(Self(bytes.into()))
        }
    }
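
For reference, here is roughly what I imagine rolling my own would look like. It's a std-only sketch under my own assumptions, not an existing crate: `FixedCow` takes the element type (`u8`) rather than `[u8]` as in the snippet above, and `normalize` is a hypothetical constructor that borrows when the input is already in normalized form and otherwise copies into an inline array.

```rust
use std::ops::Deref;

/// Hypothetical Cow-like type: either borrows a slice from the input or owns
/// an inline `[T; N]`, so the owned case never touches the heap.
#[derive(Debug, Clone)]
enum FixedCow<'a, T, const N: usize> {
    Borrowed(&'a [T]),
    Owned([T; N]),
}

impl<'a, T, const N: usize> Deref for FixedCow<'a, T, N> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        match self {
            FixedCow::Borrowed(slice) => *slice,
            FixedCow::Owned(array) => &array[..],
        }
    }
}

impl<'a, const N: usize> FixedCow<'a, u8, N> {
    /// Returns `None` unless `raw` normalizes to exactly `N` ASCII letters.
    fn normalize(raw: &'a [u8]) -> Option<Self> {
        // Strip surrounding ASCII whitespace without copying.
        let start = raw.iter().position(|b| !b.is_ascii_whitespace())?;
        let end = raw.iter().rposition(|b| !b.is_ascii_whitespace())? + 1;
        let trimmed = &raw[start..end];

        // Already normalized: keep borrowing from the input.
        if trimmed.len() == N && trimmed.iter().all(u8::is_ascii_uppercase) {
            return Some(FixedCow::Borrowed(trimmed));
        }

        // Otherwise copy into the inline array, dropping '-' and uppercasing.
        let mut out = [0u8; N];
        let mut len = 0;
        for &b in trimmed {
            if b == b'-' {
                continue;
            }
            if !b.is_ascii_alphabetic() || len == N {
                return None;
            }
            out[len] = b.to_ascii_uppercase();
            len += 1;
        }
        (len == N).then_some(FixedCow::Owned(out))
    }
}
```

With something like this, `Field::deserialize` could call `FixedCow::<u8, 5>::normalize(bytes)` and map a `None` to a `D::Error`, instead of relying on `bytes.into()`.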