Confused about byte array indexing


#1

I’m trying to parse an HTTP response. The response can use chunked encoding, so I read the response and check the chunk sizes to split them off from the actual response body. I can apparently read the proper chunk size and then index the Vec accordingly, but when I use that index on the slice, it sometimes doesn’t read everything. Even worse, the missing data differs between runs, even though the underlying response (and the chunk size) is always the same. The response body is already complete, since the chunk is properly terminated, so it shouldn’t be related to HTTP directly. How can I explain and fix this?
I tried so many different things, but I really didn’t find a way…

I asked on IRC and they mentioned that some special characters take up more than one byte, but I can’t really see why this matters, and as far as I know the chunk size should account for this.
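To make the IRC remark concrete: a chunk size counts bytes, not characters, so the two only disagree once the bytes have been turned into a string. The following is a minimal sketch; `byte_and_char_len` is a hypothetical helper, not something from the original source.

```rust
/// Returns (length in bytes, length in characters) of a string.
/// `byte_and_char_len` is a hypothetical helper for illustration only.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len` counts UTF-8 bytes; `chars().count()` counts Unicode scalar values.
    (s.len(), s.chars().count())
}
```

For "héllo" the two numbers differ, because "é" is encoded as two bytes in UTF-8. As long as the indexing happens on the `Vec<u8>` and not on a `String`, the chunk size can be used directly.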

This is my source

The relevant functions are, I think, join_chunks and get_chunk_size.
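The linked source is not reproduced here, so as a point of comparison, this is a rough sketch of what joining chunks involves when it is done purely on bytes. The names `find_crlf` and `decode_chunked` are hypothetical and are not the functions from the original code. The sketch assumes the whole body is already in memory and ignores trailers.

```rust
/// Finds the first CRLF at or after `start` and returns the index of the '\r'.
fn find_crlf(data: &[u8], start: usize) -> Option<usize> {
    data.get(start..)?
        .windows(2)
        .position(|w| w == b"\r\n")
        .map(|p| p + start)
}

/// Decodes a complete chunked body into the plain payload.
/// Returns None if the input is malformed or incomplete.
fn decode_chunked(body: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut pos = 0;
    loop {
        // The chunk size is a hexadecimal number terminated by CRLF.
        let line_end = find_crlf(body, pos)?;
        let size_line = std::str::from_utf8(&body[pos..line_end]).ok()?;
        // Anything after ';' is a chunk extension and is ignored here.
        let size_hex = size_line.split(';').next()?.trim();
        let size = usize::from_str_radix(size_hex, 16).ok()?;
        let data_start = line_end + 2;
        if size == 0 {
            // A zero-sized chunk ends the body; trailers are not parsed.
            return Some(out);
        }
        // The size counts bytes, so slice the byte buffer, not a String.
        let data_end = data_start.checked_add(size)?;
        out.extend_from_slice(body.get(data_start..data_end)?);
        // The data of every chunk is followed by its own CRLF.
        if body.get(data_end..data_end + 2)? != b"\r\n" {
            return None;
        }
        pos = data_end + 2;
    }
}
```

Using `get` instead of direct indexing means that a buffer which is shorter than the chunk size claims yields `None` rather than a panic.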

When you run it, you should see that it prints the complete response and then “left Chunk”, which is the part that is not read for some reason and that differs between runs.

The get_body_start_index function could also be wrong, but I’m always getting the right chunk size, so I assumed that an error there could not account for such a large difference in the grabbed chunk.


#2

You should be debugging this yourself.
My cursory read (could be way off from anything useful):
.and_then(|(stream, vec, _)| {
Here read_len is bound to _, so it gets ignored.

let mut content_length = 0;
This seems to always be 0 at the later if, but I have not spent time figuring out what it is used for.

vec.extend_from_slice(&vec2);
Again, read_len is not taken into account.


#3

I already did, to no avail. Please tell me how to debug. As far as I know all variables are the same, but sometimes I get a bigger string and sometimes a smaller one from the same chunk size.

The socket read length should not be relevant, as I can print the complete response from the Vec with all chunks in it, before I actually try to look at it and remove the chunk info.

Content-Length is irrelevant in this context. I’m specifically looking at chunked data.


#4

I’m afraid debugging is too big a topic to cover in a forum post.

read does not understand what a Vec is. All it knows about is the fixed-size slice it is given access to. It does not guarantee that it will fill the slice, so it returns read_len (as your code calls it) instead. The Vec that is returned is not truncated, since read has no knowledge of it.
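As a small illustration of that point, the sketch below uses the blocking `std::io::Read` trait as a stand-in for whatever asynchronous read the original code uses, which is an assumption. `read_all` is a hypothetical helper; the part that matters is slicing the buffer with the returned length before appending.

```rust
use std::io::{self, Read};

/// Reads everything from `reader` into a Vec, keeping only the bytes
/// that each `read` call actually produced.
/// `read_all` is a hypothetical helper for illustration only.
fn read_all<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Deliberately small buffer so that several reads are needed.
    let mut buf = [0u8; 8];
    loop {
        // `read` may fill only part of `buf`; it reports how much it wrote.
        let read_len = reader.read(&mut buf)?;
        if read_len == 0 {
            break; // end of input
        }
        // Append only the filled part; the rest of `buf` is stale or zero.
        out.extend_from_slice(&buf[..read_len]);
    }
    Ok(out)
}
```

Appending `&buf` instead of `&buf[..read_len]` would copy the unfilled tail of the buffer into the Vec as well.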


#5

But is read relevant here? I print the whole slice/Vec and the string is complete. When I check the read length, it does display a lower number than the total string length. But why is the string shown complete? There has to be some logic issue…

jonh said something about truncate, so I now call truncate with the number of bytes read. Apparently it works now. Thanks. I don’t understand why, though. Maybe I used to append 0 bytes (u8) to the original Vec with extend_from_slice, and maybe the string converted with String::from_utf8_lossy() made these characters invisible in the terminal, so I didn’t realize they were there. Could that be the reason?
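That guess can be checked with a small experiment. The sketch below assumes a zero-initialized 16-byte buffer, which may differ from the original code, and all three function names are hypothetical. A 0 byte is valid UTF-8, so String::from_utf8_lossy() keeps it as the NUL character instead of replacing it, and most terminals print nothing for it.

```rust
/// Simulates the bug: the whole 16-byte buffer is appended although only
/// `payload` was read. Assumes `payload` is at most 16 bytes long.
fn append_whole_buffer(payload: &[u8]) -> Vec<u8> {
    let mut buf = [0u8; 16];
    buf[..payload.len()].copy_from_slice(payload);
    let mut vec = Vec::new();
    vec.extend_from_slice(&buf); // ignores the read length
    vec
}

/// Counts the 0 bytes hiding in a byte buffer.
fn count_nul(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| b == 0).count()
}

/// Formats the lossy string with `{:?}`, which escapes control characters
/// so that NUL shows up as `\0` instead of being invisible.
fn debug_view(bytes: &[u8]) -> String {
    format!("{:?}", String::from_utf8_lossy(bytes))
}
```

Printing with `{:?}` instead of `{}` while debugging makes the stray bytes visible, and comparing the Vec length with the sum of the read lengths shows how many of them were appended. Those extra bytes shift every later index, which would explain why the left-over part changed with the way the reads happened to be split.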