Trying to split a file into an arbitrary number of bytes and write that section to disk

Hey everyone,

As the title says, I'm trying to write a program that lets me give it any file, read it in as bytes, choose an arbitrary contiguous run of bytes from anywhere in that file, then write that run to disk. The program as currently written still has some old code in it from my work earlier today, like the loop in the begin_write_loop() function. I know that doesn't work as is.

For the moment, all I'm trying to do is take the first half of the file from 0 to half the file's length (file length / 2) and I'm having trouble.

First, I'm not even sure what the best way to do this is. The code I have so far is here: Trying to split a file arbitrarily. - Pastebin.com

I'm specifically looking for help with the get_file_slices() and write_file_part() functions. I think if I can get those working the way I want, it'll solve my question.

The current iteration of the get_file_slices() function came primarily from this question on Stack Overflow: https://stackoverflow.com/questions/75957959/split-file-in-arbitrary-n-byte-slices

I'm a little confused why that answer has a vector of vectors. If I don't try to write the file out and just print the contents of file_slices then I do get a list containing 8 byte sequences of the whole file so it does seem to work to a degree. When I try to then go write the file, the write command needs it to be a &[u8] type which makes sense. I'm having trouble converting a Vec<Vec<u8>> to a &[u8].

I tried to flatten the temp_bytes in write_file_part() and then collect them but I get an error:

a value of type `&[u8]` cannot be built from an iterator over elements of type `&u8`
the trait `FromIterator<&u8>` is not implemented for `&[u8]`

So...it's reading the Vec<Vec<u8>> as a &[u8] already? I feel like that's not correct so I must be misunderstanding the error.
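A note on that error, as far as it can be read from the message alone: the type named after "cannot be built" is the target of collect(), so the compiler is being asked to build a `&[u8]` out of `&u8` items. A `&[u8]` only borrows bytes that something else owns, so it cannot be the result of collect(). A minimal sketch of the usual way around this, assuming the chunks look like the file_slices value described above (the function names here are made up for illustration):

```rust
/// Joins a list of byte chunks into one contiguous, owned buffer.
/// `chunks` stands in for the Vec<Vec<u8>> described in the post.
fn flatten_chunks(chunks: &[Vec<u8>]) -> Vec<u8> {
    // `concat` copies every inner Vec, in order, into a single Vec<u8>.
    chunks.concat()
}

/// Same result via iterators: `copied()` turns the `&u8` items into `u8`,
/// and `collect` builds an owned Vec<u8> rather than a borrowed slice.
fn flatten_chunks_iter(chunks: &[Vec<u8>]) -> Vec<u8> {
    chunks.iter().flatten().copied().collect()
}
```

Once the bytes live in a `Vec<u8>` called, say, `joined`, passing `&joined` to a write call gives it the `&[u8]` it expects.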

You have an arbitrary file, want to skip the first n bytes, and then write the next m bytes to disk? With arbitrary n and m, with the restriction that n + m is not larger than the initial file size? Is that what you want to do? Or do you intend to create portions of the initial file, as we did in the old days, to fit a large file onto multiple floppy disks?

[EDIT]

For the first case, you could try https://stackoverflow.com/questions/68694399/most-idiomatic-way-to-read-a-range-of-bytes-from-a-file

More like the second scenario...kind of. Ultimately what I'd like to do is this kind of scenario:

Let's say the whole file is 1000 bytes long (index 0-999). I want to be able to take any sequence within that range, pull out those corresponding bytes, and write them to a file. So I could say, give me bytes 143-592 and as long as it's within the 1000 bytes range, I want to take that and write to a separate file. Then later go, ok great, I now want bytes 439-674 and have it pull the bytes from the original file.

I'm a long way from that total functionality since I just got started with the project. So for now, all I want to do is take the total file, split the bytes in half, and write one half to disk. Whether that's [0..n] or [n..999] doesn't really matter to me at the moment. I figure once I have one half working, I can get the other half working more easily and then iterate from there.
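For the "first half" step on its own, a minimal sketch could read the whole file into memory and slice it. This assumes the file is small enough to fit in memory; write_first_half and its parameters are hypothetical names, not anything from the pastebin code:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads all of `src` into memory and writes its first half to `dst`.
/// Returns the number of bytes written.
fn write_first_half(src: &Path, dst: &Path) -> io::Result<usize> {
    // `fs::read` returns the entire file as a Vec<u8>.
    let bytes = fs::read(src)?;
    // Integer division: a 1000-byte file gives 500, a 999-byte file gives 499.
    let half = bytes.len() / 2;
    // `&bytes[..half]` is a `&[u8]` borrowing the first `half` bytes.
    fs::write(dst, &bytes[..half])?;
    Ok(half)
}
```

Swapping `&bytes[..half]` for `&bytes[half..]` would give the second half instead.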

Hopefully this makes sense.

EDIT:
The thread you posted is interesting, though. While their application isn't exactly what I'm looking for, the seek() functionality might be. I'll look into that.

Here's something you could adapt to your needs. Note that in its current form, it's not an error to read beyond the end of the file. You could check for that if you wanted.
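One way such a range copy could look, as a minimal sketch and not necessarily the code this reply refers to: seek to the start offset, cap the reader with take(), and copy into the output file. copy_range and its parameters are hypothetical names chosen for illustration.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Copies up to `len` bytes of `src`, starting at byte `offset`, into `dst`.
/// Returns how many bytes were actually copied.
fn copy_range(src: &Path, dst: &Path, offset: u64, len: u64) -> io::Result<u64> {
    let mut input = File::open(src)?;
    // Move the read position to the first byte of the requested range.
    input.seek(SeekFrom::Start(offset))?;
    let mut output = File::create(dst)?;
    // `take(len)` limits the reader to `len` bytes. If the file ends first,
    // `io::copy` simply stops early, so a short range is not an error here.
    let copied = io::copy(&mut input.take(len), &mut output)?;
    Ok(copied)
}
```

With this shape, bytes 143-592 inclusive would be `copy_range(src, dst, 143, 450)`. To treat a range that runs past the end of the file as an error, the caller could compare the returned count with `len`.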

Oh that looks interesting. Thank you! I'll take a closer look!