Which crate should I use to parse and modify HTML?

Hello everyone,

I am looking for a crate that lets me create HTML elements from scratch, parse HTML strings, and work with a DOM representation that I can modify (e.g. search for, insert, remove, and update elements).
There are a couple of crates that do one or the other, but none of them does all of the above.
The crates I looked into are:

  • causal-agent/scraper
  • skubalj/build_html
  • yoshuawuyts/html
  • y21/tl

Can someone recommend some crates?

kuchikiki (a fork of the now-unmaintained kuchiki) is what I use.

Thanks for the reply. It seems to be good for parsing and DOM modifications, but what about constructing HTML fragments (other than by parsing a string)?
Is there a way to construct elements nicely, similar to how the build_html crate does it?
Unfortunately, the docs of the crate you mentioned aren't that good; there aren't many examples.

html5ever is good; it's the HTML parser developed as part of the Servo browser engine project.

How would I use it to construct HTML fragments from scratch, though?

Huh, that could be a problem. Go make a crate yourself, or Graydon Hoare will be disappointed in you.

Unfortunately, I do not have the time for that. Hence I am looking for an existing solution. Again, I need:

  • Parsing
  • DOM Manipulation
  • HTML Fragment creation from scratch.

You made Graydon Hoare mad. Anyway, I'll try to find something, and if I can't (although this could be a bit ambitious) I could try to make something.

This could be helpful.

I looked into this one a bit, but I don't think creating fragments is very intuitive, at least at first glance. Also, there aren't many examples; I only found one unit test, and it doesn't show much.

As an example, the following crate looks really good (at least judging by how it's used): yoshuawuyts/html (type-safe HTML support for Rust)

Unfortunately, DOM modification is somewhat limited. For example, removing or updating individual segments of the DOM isn't really possible. Otherwise, that crate would be perfect.
I couldn't find anything similar to it, either.

"Nicely" is a bit of a stretch, but here's an example from my crate that uses it:

        let element = NodeRef::new_element(
            QualName::new(None, ns!(html), local_name!("a")),
            vec![
                (
                    ExpandedName::new(ns!(), local_name!("href")),
                    Attribute {
                        prefix: None,
                        value: "https://en.wikipedia.org/".to_string(),
                    },
                ),
                (
                    ExpandedName::new(ns!(), local_name!("rel")),
                    Attribute {
                        prefix: None,
                        value: "mw:WikiLink".to_string(),
                    },
                ),
            ],
        );

I don't think it would be too hard to write wrapper functions to hide some of the boilerplate.
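To illustrate the wrapper idea, here is a minimal, std-only sketch. The `Attribute`, `Element`, and `Node` types below are simplified stand-ins that only mimic the shape of a verbose constructor; they are not kuchikiki's or html5ever's real API, and the helpers `el`, `text`, and `render` are hypothetical names. A real wrapper would presumably forward to `NodeRef::new_element` instead.

```rust
// Stand-in types (assumptions for illustration, not a real crate's API).
#[derive(Debug, Clone, PartialEq)]
struct Attribute {
    prefix: Option<String>,
    value: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Element {
    name: String,
    attributes: Vec<(String, Attribute)>,
    children: Vec<Node>,
}

#[derive(Debug, Clone, PartialEq)]
enum Node {
    Element(Element),
    Text(String),
}

/// Hypothetical wrapper: builds an element from plain string pairs so the
/// caller never has to spell out the `Attribute` struct by hand.
fn el(name: &str, attrs: &[(&str, &str)], children: Vec<Node>) -> Node {
    Node::Element(Element {
        name: name.to_string(),
        attributes: attrs
            .iter()
            .map(|(key, value)| {
                let attr = Attribute { prefix: None, value: value.to_string() };
                (key.to_string(), attr)
            })
            .collect(),
        children,
    })
}

/// Hypothetical wrapper for text nodes.
fn text(content: &str) -> Node {
    Node::Text(content.to_string())
}

/// Serializes a node to a string. No escaping is done, to keep the sketch short.
fn render(node: &Node) -> String {
    match node {
        Node::Text(content) => content.clone(),
        Node::Element(element) => {
            let mut out = format!("<{}", element.name);
            for (key, attr) in &element.attributes {
                // Prepend "prefix:" only when the attribute has a prefix.
                let full_key = match &attr.prefix {
                    Some(prefix) => format!("{}:{}", prefix, key),
                    None => key.clone(),
                };
                out.push_str(&format!(" {}=\"{}\"", full_key, attr.value));
            }
            out.push('>');
            for child in &element.children {
                out.push_str(&render(child));
            }
            out.push_str(&format!("</{}>", element.name));
            out
        }
    }
}

fn main() {
    let link = el(
        "a",
        &[("href", "https://en.wikipedia.org/"), ("rel", "mw:WikiLink")],
        vec![text("Wikipedia")],
    );
    println!("{}", render(&link));
}
```

With helpers like these, the call site shrinks to one `el(...)` expression per element, which is closer to the builder style of build_html while the underlying tree stays editable.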

First, I'm very new to Rust, so I'm unable to help.
Second, I think I'm looking for the same thing: something like jsoup in the Java world.

The following is a new crate for creating HTML elements from scratch using Rust structs: https://crates.io/crates/html_compile