Confused about byte array indexing

junkrat06 · April 18, 2018, 12:28pm

I'm trying to parse an HTTP response. The response can use chunked encoding, so I read the response and check the chunk sizes to split the chunk info off the actual response body. Apparently I can read the proper chunk size and then try to index the Vec accordingly. But when I use the index on the slice, it sometimes doesn't read/"index" everything, and even worse, the missing data differs between runs, even though the underlying response is always the same (as is the chunk size). The response body is already complete, as the chunk is properly terminated, so it shouldn't be related to HTTP directly. How can I explain and fix this?
I tried so many different things, but I really didn't find a way...

I asked on IRC and they mentioned something about some special characters taking up more bytes, but I can't really comprehend why this matters, and afaik the chunk size should account for this.
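For illustration, the IRC remark is presumably about UTF-8, where a non-ASCII character occupies more than one byte. A minimal sketch of the difference follows; byte_and_char_len is a hypothetical helper, not something from the linked source. Since HTTP chunk sizes count bytes (octets), indexing a Vec<u8> by the chunk size is fine, and the mismatch would only matter if the data were indexed by characters instead.

```rust
/// Returns (length in bytes, length in chars) of a string.
/// Hypothetical helper, for illustration only.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    // `len` counts UTF-8 bytes; `chars().count()` counts Unicode scalar values.
    (s.len(), s.chars().count())
}
```

For example, byte_and_char_len("héllo") gives (6, 5), because é takes two bytes in UTF-8.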

This is my source

The relevant functions are I think: join_chunks and get_chunk_size

When you run it, you should see that it prints the complete response and then left Chunk (which is the part that is not read for some reason, and which differs between runs).

The get_body_start_index function could also be wrong, but I'm always getting the right chunk size, so I assumed that an error there could not account for such a large difference in the grabbed chunk.
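For reference, the approach described in this post (read a chunk size, slice off that many bytes, repeat) can be sketched roughly as below. This is not the linked source: decode_chunked and find are hypothetical names, the input is assumed to be the complete body that follows the header block, and trailers after the last chunk are ignored.

```rust
/// Finds the first occurrence of `needle` in `haystack` (needle must be non-empty).
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Decodes a complete chunked body into the plain payload.
/// Returns None if the input is malformed or incomplete.
/// Assumption: `body` starts right after the blank line that ends the headers.
fn decode_chunked(mut body: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    loop {
        // Each chunk starts with a hexadecimal size line terminated by CRLF.
        let line_end = find(body, b"\r\n")?;
        let size_line = std::str::from_utf8(&body[..line_end]).ok()?;
        // Anything after ';' is a chunk extension and is ignored here.
        let size_hex = size_line.split(';').next()?.trim();
        let size = usize::from_str_radix(size_hex, 16).ok()?;
        body = &body[line_end + 2..];

        // A zero-sized chunk marks the end of the body.
        if size == 0 {
            return Some(out);
        }

        // The chunk data is exactly `size` bytes, followed by CRLF.
        if body.len() < size.checked_add(2)? {
            return None;
        }
        out.extend_from_slice(&body[..size]);
        if !body[size..].starts_with(b"\r\n") {
            return None;
        }
        body = &body[size + 2..];
    }
}
```

Note that this only works if every byte in the buffer really came from the socket; any extra bytes inside the buffer shift all later offsets.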

jonh · April 18, 2018, 1:25pm

You should be debugging this yourself.
My cursory read (could be way off from anything useful):
.and_then(|(stream, vec, _)| {
The _ here means read_len is being ignored.

let mut content_length = 0;
This seems to always be 0 at the later if, but I haven't spent time figuring out what it is used for.

vec.extend_from_slice(&vec2);
Again, read_len is not taken into account.

junkrat06 · April 18, 2018, 2:18pm

I already did, to no avail. Please tell me how to debug. Afaik all variables are the same, but sometimes I get a bigger string and sometimes a smaller one from the same chunk size.

The socket read length should not be relevant, as I can print the complete response from the Vec with all chunks in it, before I actually try to look at it and remove the chunk info.

Content-Length is irrelevant in this context. I'm specifically looking at chunked data.

jonh · April 18, 2018, 2:55pm

I'm afraid debugging is too big a topic to cover in a forum post.

read does not understand what a Vec is. All it knows about is the fixed-size slice it is given access to. It does not guarantee that it will fill the slice, so it instead returns the read_len (as your code calls it). The Vec returned is not truncated, since read does not have that knowledge.
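A minimal sketch of the point above, using the blocking std::io::Read trait rather than the futures-based read the code in this thread appears to use (that substitution, the helper name read_all, and the buffer size are assumptions for illustration). Only the first read_len bytes of the buffer are appended after each call:

```rust
use std::io::Read;

/// Reads everything from `reader` into a Vec, keeping only the bytes
/// that each `read` call actually filled in.
fn read_all<R: Read>(mut reader: R) -> std::io::Result<Vec<u8>> {
    let mut out = Vec::new();
    // Deliberately small buffer so that several reads are needed.
    let mut buf = [0u8; 8];
    loop {
        // `read` may fill less than the whole buffer; it reports how much.
        let read_len = reader.read(&mut buf)?;
        if read_len == 0 {
            break; // end of input
        }
        // Append only the filled part, not the whole buffer.
        out.extend_from_slice(&buf[..read_len]);
    }
    Ok(out)
}
```

Appending the whole of buf instead of &buf[..read_len] would also append whatever the unfilled tail of the buffer happens to contain.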

junkrat06 · April 18, 2018, 5:33pm

But is read relevant here? I print the whole slice/Vec and the string is complete. When I check the read length, it does display a lower number than the total string length. But why is it shown complete in the string? There has to be some logic issue...

jonh said something about truncate, so I now call truncate on the read bytes. Apparently it works now. Thanks. I don't understand why, though. Maybe I used to append 0 (u8) bytes to the original Vec with extend_from_slice, and maybe the string converted with String::from_utf8_lossy() made these chars invisible to the terminal, so I didn't realize they were there. Could that be the reason?
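That explanation is plausible. A NUL byte is valid UTF-8, so String::from_utf8_lossy keeps it as U+0000 rather than replacing it, and most terminals render nothing for it. The amount of padding would also depend on how many bytes each read happened to return, which would fit the missing data differing between runs. A minimal sketch, using an in-memory byte slice as a stand-in for the socket (both helper names are hypothetical):

```rust
use std::io::Read;

/// Mimics the suspected bug: returns the whole buffer and ignores the read length.
fn read_once_untruncated<R: Read>(mut reader: R, buf_size: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; buf_size];
    let _read_len = reader.read(&mut buf)?;
    Ok(buf) // the unfilled tail is still full of zero bytes
}

/// The fix described in the post: truncate the buffer to what was actually read.
fn read_once_truncated<R: Read>(mut reader: R, buf_size: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; buf_size];
    let read_len = reader.read(&mut buf)?;
    buf.truncate(read_len);
    Ok(buf)
}
```

With a 3-byte input and an 8-byte buffer, the untruncated version returns 8 bytes (the data plus five zero bytes) and the truncated one returns 3. The zero bytes count towards every index computed on the Vec, so a chunk size applied to the padded data cuts the chunk short.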
