Hi, I am trying to recreate LengthDelimitedCodec in Python so I can pass JSON between a Python application and a Rust application.
My Rust code looks like this:
use futures::prelude::*;
use serde_json::Value;
use tokio::net::UnixListener;
use tokio_serde::formats::*;
use tokio_util::codec::{FramedRead, LengthDelimitedCodec};

#[tokio::main]
pub async fn main() {
    let listener = UnixListener::bind("/tmp/example").unwrap();

    loop {
        let (socket, _) = listener.accept().await.unwrap();

        let length_delimited = FramedRead::new(socket, LengthDelimitedCodec::new());
        let mut deserialized = tokio_serde::SymmetricallyFramed::new(
            length_delimited,
            SymmetricalJson::<Value>::default(),
        );

        tokio::spawn(async move {
            while let Some(msg) = deserialized.try_next().await.unwrap() {
                println!("GOT: {}", msg);
            }
        });
    }
}
LengthDelimitedCodec expects each frame to start with an unsigned 32-bit (u32) length that says how many bytes long our JSON message is. So, for example, "\x00\x00\x00\x0a{\"h\": \"i\"}", when parsed by the code above, yields the JSON value {"h": "i"}.
What is your question? You can do this in Python by first reading four bytes, converting them to an integer, then reading that many bytes, and then parsing the JSON.
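The four steps above could be sketched roughly like this for the reading side. This assumes a connected stream socket and the default big-endian 4-byte header; `recv_exact` and `read_frame` are hypothetical helper names, not part of any library.

```python
import json
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the frame was complete")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(sock: socket.socket):
    """Read one length-prefixed frame and parse its payload as JSON."""
    header = recv_exact(sock, 4)             # step 1: read four bytes
    (length,) = struct.unpack(">I", header)  # step 2: big-endian unsigned 32-bit
    payload = recv_exact(sock, length)       # step 3: read that many bytes
    return json.loads(payload.decode("utf-8"))  # step 4: parse the JSON
```

The loop in `recv_exact` matters because a stream socket is free to deliver a frame in several pieces.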
Sorry if I wasn't specific enough, I guess I want to make sure I understand how the serialization protocol works.
If I understand correctly, I need to get the length of the JSON string, convert that to a u32 number, prepend the hexadecimal length to the string, and then write it to the socket (I need to do this in Python, not Rust)?
Sure, that should be it. You need to make sure to get the endianness right in Python. Rust's LengthDelimitedCodec uses big-endian by default, and there are some resources on this here.
(To be precise, you're not supposed to write it in hexadecimal. That is, in your example, you should not literally be writing the text of the hex escape to the socket; the length goes out as four raw bytes.)
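For the writing side, a minimal sketch might look like the following. It assumes the default codec settings (4-byte big-endian length, no extra header fields) and the /tmp/example socket path from the Rust code above; `encode_frame` and `send_json` are hypothetical names chosen for illustration.

```python
import json
import socket
import struct


def encode_frame(obj) -> bytes:
    """Serialize obj to JSON and prepend its byte length as a big-endian u32."""
    payload = json.dumps(obj).encode("utf-8")
    # ">I" packs an unsigned 32-bit integer into four raw bytes, big-endian.
    return struct.pack(">I", len(payload)) + payload


def send_json(path: str, obj) -> None:
    """Connect to a Unix domain socket and send one framed JSON message."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(encode_frame(obj))
```

With the Rust server running, `send_json("/tmp/example", {"h": "i"})` should make it print the received value.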
Still struggling a bit to figure out what the Python function would look like. I know that struct.pack is what I am looking for, but struct.pack('>I', 100) doesn't return what I would expect.
What does it return and what do you expect it to return?
>>> struct.pack('>I', 100)
b'\x00\x00\x00d'
>>> [hex(x) for x in struct.pack('>I', 100) ]
['0x0', '0x0', '0x0', '0x64']
That looks correct to me. Maybe the letter d in the first representation is what's causing the confusion? (It's how Python displays the byte 0x64, which is the ASCII code for d; it isn't part of the preceding hex escape.)
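A quick way to convince yourself that the d is just a display choice is to compare the packed value against explicit byte values. This is only an illustration of how Python prints bytes objects:

```python
import struct

packed = struct.pack(">I", 100)

# Always four bytes, however the repr chooses to display them.
assert len(packed) == 4

# 0x64 is 100 in decimal, which is also the ASCII code for "d".
assert packed == bytes([0x00, 0x00, 0x00, 0x64])
assert packed[3] == ord("d") == 100

# .hex() shows every byte as two hex digits, which avoids the ambiguity.
print(packed.hex())  # 00000064
```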
How many bytes do you use for the length field? Is it 4 or do you send the actual string "\x00\x00\x00d"?
If I read the comments above correctly, you need to send the length as binary, not as a string. For example, the full payload (length + string) would look like:
>>> import struct
>>> s = "hello".encode('utf-8')
>>> len(s)
5
>>> struct.pack('>I', len(s)) + s
b'\x00\x00\x00\x05hello'
>>> len(struct.pack('>I', len(s)) + s)
9
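One detail worth checking in the example above is that the length is taken after encoding. A small sketch with a made-up non-ASCII string shows why: the prefix has to count bytes, not characters.

```python
import struct

text = "h\u00e9llo"  # "héllo": five characters
payload = text.encode("utf-8")

assert len(text) == 5
assert len(payload) == 6  # é takes two bytes in UTF-8

# The length prefix must describe the encoded payload.
frame = struct.pack(">I", len(payload)) + payload
assert frame[:4] == b"\x00\x00\x00\x06"
```

Using len(text) here would make the prefix one byte short, so the receiver would cut the message off early.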