Why does serializing a Struct to a String result in much faster write speeds?

I wrote a small CLI program that parses an XML file into a collection of Structs and then serializes and writes the Structs to .json files. I noticed that if I first convert the Struct into a String using serde_json::to_string and then write that String to the file using serde_json::to_writer, that part of the program runs more than 10x faster than if I write the Struct directly with serde_json::to_writer.

Here is the struct along with two implementations of the function:

#[derive(Debug, Serialize, Deserialize, Clone)]
#[allow(non_snake_case)]
struct Scan {
    num: usize,
    scanType: String,
    centroided: bool,
    msLevel: u8,
    peaksCount: u64,
    polarity: String,
    retentionTime: String,
    lowMz: f64,
    highMz: f64,
    basePeakMz: f64,
    basePeakIntensity: u64,
    totIonCurrent: u64,
    msInstrumentID: String,
    peaks: Option<Peaks>,
}

#[derive(Debug, Serialize, Deserialize, Clone)]
#[allow(non_snake_case)]
struct Peaks {
    compressionType: String,
    compressedLen: usize,
    precision: u8,
    byteOrder: String,
    contentType: String,
    data: Vec<f64>,
}
// fast
fn scan_to_json_string(scan: &Scan) -> std::result::Result<(),serde_json::Error>{
    let mut file = File::create(format!("./temp/{}.json",scan.num)).expect("could not create file");
    let j = serde_json::to_string(&scan)?;
    serde_json::to_writer(&mut file, &j)?;
    Ok(())
    }

// slow
fn scan_to_json(scan: &Scan) -> std::result::Result<(),serde_json::Error>{
    let mut file = File::create(format!("./temp/{}.json",scan.num)).expect("could not create file");
    serde_json::to_writer(&mut file, &scan)?;
    Ok(())
    }

Try wrapping the File in a BufWriter. I suspect the difference will vanish.

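The intermediate String acts as a buffer, much like BufWriter does: the whole document is built in memory and then handed to the File in one piece, instead of the serializer writing to the unbuffered File piece by piece.

Note also that the two versions do not emit equivalent JSON. The fast version serializes the String j a second time, so the file contains a single JSON string literal (with the inner quotes escaped) rather than a JSON object.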
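To make the buffering effect concrete, here is a minimal std-only sketch. CountingWriter, emit_fragments, unbuffered_calls and buffered_calls are hypothetical names made up for this illustration. The sketch assumes that a streaming serializer emits its output as many small pieces, and that each write call on a real File costs roughly one system call.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink that only counts how often `write` is called.
/// It stands in for a `File`, where each call would reach the OS.
struct CountingWriter {
    calls: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Emit `n` numbers and `n` commas as separate small writes,
/// roughly the way a streaming serializer would.
fn emit_fragments<W: Write>(out: &mut W, n: usize) -> io::Result<()> {
    for i in 0..n {
        out.write_all(i.to_string().as_bytes())?;
        out.write_all(b",")?;
    }
    Ok(())
}

/// Number of `write` calls that reach the sink without buffering.
fn unbuffered_calls(n: usize) -> io::Result<usize> {
    let mut sink = CountingWriter { calls: 0 };
    emit_fragments(&mut sink, n)?;
    Ok(sink.calls)
}

/// Number of `write` calls that reach the sink through a `BufWriter`.
fn buffered_calls(n: usize) -> io::Result<usize> {
    let mut sink = CountingWriter { calls: 0 };
    {
        let mut buffered = BufWriter::new(&mut sink);
        emit_fragments(&mut buffered, n)?;
        buffered.flush()?;
    }
    Ok(sink.calls)
}
```

With n = 1000, the unbuffered path reaches the sink once per fragment, while the buffered path only reaches it when the internal buffer fills up or is flushed. The intermediate String in the fast version has the same effect, at the cost of holding the whole document in memory.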

Thank you for your reply. I will try using a BufWriter.

EDIT: Putting the file handle inside a BufWriter resulted in a significant speedup. Thanks!

Updated version:

fn scan_to_json(scan: &Scan) -> std::result::Result<(),serde_json::Error>{
    let mut file = File::create(format!("./temp/{}.json",scan.num)).expect("could not create file");
    let mut buffer = BufWriter::new(file);
    serde_json::to_writer(&mut buffer, &scan)?;
    Ok(())
    }
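One detail worth adding: when a BufWriter is dropped it still tries to write out its buffer, but any error from that final write is ignored. Below is a minimal std-only sketch of flushing explicitly so the error is returned to the caller. The function name write_buffered and its signature are hypothetical and only serve this illustration.

```rust
use std::io::{self, BufWriter, Write};

/// Write `payload` through a `BufWriter` and flush explicitly, so that
/// an I/O error is returned instead of being discarded on drop.
fn write_buffered<W: Write>(sink: W, payload: &[u8]) -> io::Result<W> {
    let mut buffered = BufWriter::new(sink);
    buffered.write_all(payload)?;
    // Without this call, the remaining bytes are written when `buffered`
    // is dropped, and a failure at that point goes unnoticed.
    buffered.flush()?;
    // Hand the sink back; the buffer is already empty at this point.
    buffered.into_inner().map_err(|e| e.into_error())
}
```

In scan_to_json the same idea would amount to calling flush on the BufWriter before returning. Since flush returns a std::io::Error, the function's error type would need to accommodate it.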

Side note: you should run cargo fmt. Also, rather than letting camelCase slip into your codebase, use serde's rename attributes to convert between snake_case and camelCase automatically.

https://serde.rs/container-attrs.html