As you can see in the image below, I already have the fix for this issue, but I would like to check if I am missing something (or if I have found a bug!) with set_len.
In the left pane, set_len removes the contents of a file; in the right pane, OpenOptions recreates the file with truncate(true). What I have found is that set_len not only leaves null chars at the start of the file on each write, but their number doubles every time. If I don't delete the file, it eventually becomes too big to open. (See the results after running the program a number of times in the second image.)
Can someone help explain what I am seeing? Thanks in advance.
// The set_len(0) call below leaves null chars at the start of thing.txt
use std::env;
use std::fs::OpenOptions;
use std::io::*;

fn main() {
    let arguments: Vec<String> = env::args().collect();
    let mut big_file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open("thing.txt")
        .expect("make the file fail");
    let mut content = String::new();
    big_file
        .read_to_string(&mut content)
        .expect("could not read detail of file");
    println!("{}", content);
    let mut detail: Vec<String> = content
        .split_whitespace()
        .map(|s| s.to_string())
        .collect();
    for element in arguments.iter().skip(1) {
        detail.push(element.to_string());
    }
    big_file
        .set_len(0)
        .expect("Leaves null chars at start of file");
    for element in detail {
        println!("{}", element);
        writeln!(big_file, "{}", element).expect("what am I missing?");
    }
}
From the documentation for File::set_len: "The file’s cursor isn’t changed. In particular, if the cursor was at the end and the file is shrunk using this operation, the cursor will now be past the end."
What happens if you seek to the beginning after truncation?
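The quoted behaviour can be reproduced in isolation. Below is a minimal sketch, not the original program: the helper name `truncate_without_rewind` and the temp-file name are made up for illustration. It writes some bytes, calls set_len(0) without seeking, and writes again. On the common platforms I would expect the cursor to still report the old length, and the second write to land after a run of zero bytes. In the program from the question, read_to_string leaves the cursor at the old end of the file, and since '\0' is not whitespace, split_whitespace keeps the zero run attached to the first word, so it gets written back on the next run. That would explain the doubling.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Seek, Write};
use std::path::Path;

/// Writes `first`, truncates with set_len(0) WITHOUT seeking, then writes `second`.
/// Returns the cursor position right after truncation and the final file bytes.
/// (Hypothetical helper, for illustration only.)
fn truncate_without_rewind(
    path: &Path,
    first: &[u8],
    second: &[u8],
) -> io::Result<(u64, Vec<u8>)> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true) // start from an empty file so the demo is repeatable
        .open(path)?;
    file.write_all(first)?; // cursor is now at first.len()
    file.set_len(0)?; // file length is 0, cursor is unchanged
    let pos = file.stream_position()?;
    file.write_all(second)?; // written at the old cursor position
    drop(file);
    Ok((pos, fs::read(path)?))
}

fn main() -> io::Result<()> {
    // Assumed scratch location; any writable path would do.
    let path = std::env::temp_dir().join("set_len_cursor_demo.txt");
    let (pos, bytes) = truncate_without_rewind(&path, b"hello\n", b"hi\n")?;
    println!("cursor after set_len(0): {}", pos);
    println!("file bytes: {:?}", bytes);
    fs::remove_file(&path)?;
    Ok(())
}
```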
I was not aware that the cursor could stay at a position of, say, 1234 after set_len(0) reduced the file length to 0. I would have expected set_len(x) to move the cursor to x if it was currently greater than x.
Added the following:
    big_file
        .rewind()
        .expect("Cannot set cursor to start of file");
This doesn't remove the null chars already at the start of the file, but it no longer creates any new ones.
If I delete the file and re-run the program, it creates and maintains the file without nulls.
Success!
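For anyone landing here later, this is roughly what the read, truncate, rewind, write sequence looks like as one function. It is only a sketch of the approach described in this thread: `append_words` and the temp-file name are hypothetical, and errors are returned instead of using expect.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Read, Seek, Write};
use std::path::Path;

/// Reads the whitespace-separated words in `path`, appends `extra`,
/// and rewrites the file in place through the same handle.
/// Returns the file contents after the rewrite.
/// (Hypothetical helper, for illustration only.)
fn append_words(path: &Path, extra: &[&str]) -> io::Result<String> {
    let mut file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(false)
        .open(path)?;

    let mut content = String::new();
    file.read_to_string(&mut content)?; // leaves the cursor at the end of the file

    let mut words: Vec<String> = content.split_whitespace().map(str::to_owned).collect();
    words.extend(extra.iter().map(|s| s.to_string()));

    file.set_len(0)?; // shrink the file; the cursor does not move
    file.rewind()?; // move the cursor back to offset 0 before writing
    for word in &words {
        writeln!(file, "{}", word)?;
    }
    drop(file);

    fs::read_to_string(path)
}

fn main() -> io::Result<()> {
    // Assumed scratch location; any writable path would do.
    let path = std::env::temp_dir().join("set_len_rewind_demo.txt");
    let _ = fs::remove_file(&path); // start clean, ignore "not found"
    append_words(&path, &["a", "b"])?;
    let contents = append_words(&path, &["c"])?;
    print!("{}", contents);
    fs::remove_file(&path)?;
    Ok(())
}
```

The order of set_len and rewind should not matter here, since neither one affects what the other does; the only requirement is that both happen before the first write.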
I'm going to stick with the truncate solution as I have an operation that reduces the text length, and I don't want a file with trailing nulls, but rewind() does seem useful for other cases.
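The truncate variant was only shown in the screenshot, so the following is a guess at its general shape rather than the code from the post. The helper name `rewrite_file` and the temp-file name are assumptions. The idea is to open a fresh handle with truncate(true) for writing, so the cursor starts at 0 and any old contents are dropped, even when the new text is shorter.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Replaces the contents of `path` with one word per line.
/// Opening with truncate(true) empties the file and starts the cursor at 0.
/// (Hypothetical helper, for illustration only.)
fn rewrite_file(path: &Path, words: &[&str]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    for word in words {
        writeln!(file, "{}", word)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Assumed scratch location; any writable path would do.
    let path = std::env::temp_dir().join("truncate_reopen_demo.txt");
    rewrite_file(&path, &["alpha", "beta", "gamma"])?;
    rewrite_file(&path, &["x"])?; // shorter than before; nothing should be left over
    print!("{}", fs::read_to_string(&path)?);
    fs::remove_file(&path)?;
    Ok(())
}
```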
For POSIX-like systems (and for Windows, which emulates them), seeking past the end of the file is the standard method to create a "sparse" file. Truncating a file to shorter than the position of some handle and then writing through the handle has the same effect. Sparse files are pretty niche, so it's not surprising you hadn't encountered them.
On most FSes, the zeroes you're seeing are generated by the FS at read time, and are not stored to disk. That's what makes the file "sparse" in the first place: only the data that has been written is actually stored.
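To see the seek-past-the-end effect directly, here is a small sketch. `write_at_offset` and the temp-file name are hypothetical. It seeks to an offset in an empty file, writes a few bytes, and reads the whole file back. The gap should read as zeros. Whether the gap is actually left unallocated on disk depends on the filesystem, and this example does not check that.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Creates an empty file, seeks to `offset`, writes `data` there,
/// and returns the full contents of the file as read back.
/// (Hypothetical helper, for illustration only.)
fn write_at_offset(path: &Path, offset: u64, data: &[u8]) -> io::Result<Vec<u8>> {
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    file.seek(SeekFrom::Start(offset))?; // cursor is now past the end of the empty file
    file.write_all(data)?; // extends the file; the gap reads as zeros
    drop(file);
    fs::read(path)
}

fn main() -> io::Result<()> {
    // Assumed scratch location; any writable path would do.
    let path = std::env::temp_dir().join("sparse_gap_demo.bin");
    let bytes = write_at_offset(&path, 4096, b"end")?;
    let zeros = bytes.iter().take_while(|&&b| b == 0).count();
    println!("file length: {}, leading zero bytes: {}", bytes.len(), zeros);
    fs::remove_file(&path)?;
    Ok(())
}
```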