Pretty printing XML?

Hi,
I have a buffer in the form of a Vec<u8> containing XML.
What is the easiest/best way to pretty print it to stdout (or to a String)?

I've had a look at xml-rs but I can't get my head around it...cry...

Cheers, Toby

It really depends on the XML in question. Some XML maps easily to serde using crates like serde-xml-rs and quick-xml; other kinds do not (it has been a while, but ISTR mixed content being problematic). For those, I've generally parsed by hand with roxmltree, but it is definitely much easier to try serde first, falling back to an XML-specific parser only if you must.

Edit: You can perhaps disregard the above; I wasn't thinking about the case where you are just pretty printing. In that case it is perhaps easier to use an XML-specific parser than serde. And I don't think roxmltree will do it, since it is read-only.

Something like this seems to work for xml-rs:

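use xml::{reader::ParserConfig, writer::EmitterConfig};

fn format_xml(src: &[u8]) -> Result<String, xml::reader::Error> {
    let mut dest = Vec::new();
    // Drop the source's own whitespace-only text so the writer controls layout,
    // and keep comments as events so they survive the round trip.
    let reader = ParserConfig::new()
        .trim_whitespace(true)
        .ignore_comments(false)
        .create_reader(src);
    // Indent the output; leave empty elements and comment padding as they were.
    let mut writer = EmitterConfig::new()
        .perform_indent(true)
        .normalize_empty_elements(false)
        .autopad_comments(false)
        .create_writer(&mut dest);
    for event in reader {
        // Reader events without a writer counterpart yield None and are skipped.
        if let Some(event) = event?.as_writer_event() {
            writer.write(event).unwrap();
        }
    }
    Ok(String::from_utf8(dest).unwrap())
}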

You can adjust the EmitterConfig to change the output formatting.

Wow, fantastic!
But now I get a compile error:

36 | fn format_xml(src: &[u8]) -> Result<String, xml::reader::Error> {
   |                              ^^^^^^         ------------------ help: remove this generic argument
   |                              |
   |                              expected 1 generic argument
   |
note: type alias defined here, with 1 generic parameter: `T`
  --> /home/tobbe/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/error.rs:45:10
   |
45 | pub type Result<T> = result::Result<T, Error>;
   |          ^^^^^^ -

So I tried to remove the xml::reader::Error as suggested by the compiler, but then I get:

36 | fn format_xml(src: &[u8]) -> Result<String> {
   |                              -------------- expected `std::io::Error` because of this
...
48 |         if let Some(event) = event?.as_writer_event() {
   |                                   ^ the trait `From<xml::reader::Error>` is not implemented for `std::io::Error`

Hm...tricky, how do I solve all this?

This means you have a "use std::io::Result;" somewhere in scope, which shadows the two-parameter Result from the prelude. Do you really need it?

Well, I'm running inside a thread which returns Result<()>. It gets the (not pretty-printed) XML on a crossbeam channel, then writes it out to stdout, like this:

pub fn write_loop(args: Args, write_rx: Receiver<Vec<u8>>) -> Result<()> {
    let outfile = &args.outfile;
    let mut writer: Box<dyn Write> = if !outfile.is_empty() {
        Box::new(BufWriter::new(File::create(outfile)?))
    } else {
        Box::new(BufWriter::new(io::stdout()))
    };

    loop {
        let buffer = write_rx.recv().unwrap_or_default();
        if buffer.is_empty() {
            break;
        }

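        // Keep the String alive in a binding; calling as_bytes() on the
        // temporary would borrow from a value that is dropped immediately.
        let xml = format_xml(&buffer).unwrap_or_default();

        if let Err(e) = writer.write_all(xml.as_bytes()) {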
            if e.kind() == ErrorKind::BrokenPipe {
                // "stop the program cleanly"
                return Ok(());
            }
            return Err(e);
        }
    }
    Ok(())
}

So I thought I should try to hook in pretty printing of the XML before writing it to stdout.

Or you can use std::io::Result as IoResult, so that this function returns IoResult<()>, and the default Result is untouched.
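A minimal sketch of that renaming, using only the standard library. ParseError, format_doc and write_doc are hypothetical names made up for illustration; ParseError merely stands in for a parser error such as xml::reader::Error. The point is that once the io alias is imported under another name, Result with two type arguments compiles again:

```rust
// Import the io alias under a different name so that the prelude's
// two-parameter Result<T, E> is not shadowed in this module.
use std::io::{Result as IoResult, Write};

// Hypothetical stand-in for a parser error such as xml::reader::Error.
#[derive(Debug)]
struct ParseError(String);

// The prelude Result is still available, so two type arguments are accepted.
fn format_doc(src: &[u8]) -> Result<String, ParseError> {
    let text = std::str::from_utf8(src).map_err(|e| ParseError(e.to_string()))?;
    Ok(text.trim().to_string())
}

// Functions that can only fail with an I/O error use the renamed alias.
fn write_doc<W: Write>(out: &mut W, doc: &str) -> IoResult<()> {
    out.write_all(doc.as_bytes())
}

fn main() {
    let doc = format_doc(b"  <a/>  ").expect("input is valid UTF-8");
    let mut out = Vec::new();
    write_doc(&mut out, &doc).expect("writing to a Vec does not fail");
    println!("{}", String::from_utf8_lossy(&out));
}
```

Note that this only fixes the "expected 1 generic argument" error. Using ? on a parser error inside a function that returns IoResult still needs a conversion into std::io::Error.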

Hm...so you mean that write_loop should return IoResult<()>?
That still gives me the same error as in my first example...me scratching head...

Could you share a self-contained example (by stubbing out everything except the problematic part with todo!())? It looks like your function could in fact return two kinds of errors, not one, and this should be reflected in its signature, but I want to be sure.
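One way to reflect two kinds of errors in a signature is a small enum with From impls, so that ? converts each source error automatically. This is only a sketch under assumptions: PrettyError and format_and_write are hypothetical names, and std::str::Utf8Error stands in for the parser's error type so the example needs nothing beyond the standard library.

```rust
use std::fmt;
use std::io::{self, Write};
use std::str::Utf8Error;

// Hypothetical error type covering both failure modes of a format-then-write step.
#[derive(Debug)]
enum PrettyError {
    // Stand-in for a parser error such as xml::reader::Error.
    Parse(Utf8Error),
    Io(io::Error),
}

impl fmt::Display for PrettyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PrettyError::Parse(e) => write!(f, "parse error: {}", e),
            PrettyError::Io(e) => write!(f, "I/O error: {}", e),
        }
    }
}

impl std::error::Error for PrettyError {}

// These impls are what let the ? operator convert each source error.
impl From<Utf8Error> for PrettyError {
    fn from(e: Utf8Error) -> Self {
        PrettyError::Parse(e)
    }
}

impl From<io::Error> for PrettyError {
    fn from(e: io::Error) -> Self {
        PrettyError::Io(e)
    }
}

fn format_and_write<W: Write>(out: &mut W, buf: &[u8]) -> Result<(), PrettyError> {
    let text = std::str::from_utf8(buf)?; // Utf8Error becomes PrettyError::Parse
    out.write_all(text.trim().as_bytes())?; // io::Error becomes PrettyError::Io
    Ok(())
}

fn main() {
    let mut out = Vec::new();
    match format_and_write(&mut out, &[0xff, 0xfe]) {
        Ok(()) => println!("written"),
        Err(e) => eprintln!("{}", e),
    }
}
```

The caller can then match on the variant, for example to treat a broken pipe differently from malformed input.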

Sure, the code can be found here:

I've modified the function so you can use it like this:

if let Err(error) = to_writer_pretty(&mut writer, &buffer) { ... }

Which, I feel, works better in the context you provided.

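Here it is:

use xml::{reader::ParserConfig, writer::EmitterConfig};

pub fn to_writer_pretty<W>(writer: &mut W, buf: &[u8]) -> std::io::Result<usize>
where
    W: std::io::Write,
{
    let reader = ParserConfig::new()
        .trim_whitespace(true)
        .ignore_comments(false)
        .create_reader(buf);
    // Shadow the raw writer with an XML emitter that writes into it directly.
    let mut writer = EmitterConfig::new()
        .perform_indent(true)
        .normalize_empty_elements(false)
        .autopad_comments(false)
        .create_writer(writer);
    for event in reader {
        // Both reader and writer errors are wrapped into std::io::Error.
        if let Some(event) = event.map_err(to_io)?.as_writer_event() {
            writer.write(event).map_err(to_io)?;
        }
    }
    // Note: this is the number of input bytes consumed, not bytes written.
    Ok(buf.len())
}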

fn to_io<E>(e: E) -> std::io::Error
where
    E: Into<Box<dyn std::error::Error + Send + Sync>>,
{
    std::io::Error::new(std::io::ErrorKind::Other, e)
}

It just avoids some string conversion and vec allocation, otherwise it's the same thing.

I'm using io::Result as a catch-all, which is not best practice but often works fairly well.
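To illustrate the trade-off of the catch-all approach, here is a standard-library-only sketch. parse_len is a hypothetical function invented for the example, and ParseIntError plays the role of the non-I/O error. The signature no longer says which error can occur, but the original error is still reachable at run time through get_ref and downcast_ref:

```rust
use std::io;
use std::num::ParseIntError;

// Same shape as the to_io helper above: wrap any boxable error in an io::Error.
fn to_io<E>(e: E) -> io::Error
where
    E: Into<Box<dyn std::error::Error + Send + Sync>>,
{
    io::Error::new(io::ErrorKind::Other, e)
}

// Hypothetical function whose real failure mode is not I/O at all,
// but which reports it through io::Result anyway.
fn parse_len(s: &str) -> io::Result<usize> {
    s.trim().parse::<usize>().map_err(to_io)
}

fn main() {
    let err = parse_len("not a number").unwrap_err();
    // The concrete type is erased from the signature, yet can be recovered
    // by downcasting the boxed inner error.
    let inner = err
        .get_ref()
        .and_then(|e| e.downcast_ref::<ParseIntError>());
    println!(
        "kind = {:?}, inner is ParseIntError = {}",
        err.kind(),
        inner.is_some()
    );
}
```

Callers that need to distinguish the cases have to downcast like this, which is the main cost compared with a dedicated error enum.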

Thanks! That worked!
Nice "trick" to deal with that error, I'll keep it in mind for future use :)
And thanks to everyone who replied, I'm pleasantly surprised with the quick and
kind responses I've got in this forum.
Cheers, Toby
