Rust Remove HTML tags from a String

wangyiran125 · November 28, 2019, 7:25am

I am crawling HTML, but I want only the content, with all HTML tags removed.
For example, given

My cat is very grumpy.

I want to get the content: My cat is very grumpy. I have searched many HTML parsing libraries, but none of them satisfies me. How can I do that? Thanks for any suggestions!

tafia · November 28, 2019, 7:28am

Have you tried using an xml parsing library where you just get the Text (quick-xml) / Characters (xml-rs) events?
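
To illustrate the "keep only the text" idea without pulling in any crate, here is a minimal std-only sketch. The function name strip_tags is hypothetical, and this is a naive scanner rather than a real parser: it assumes every `<` opens a tag and the next `>` closes it, so it does not decode entities like `&amp;`, does not skip the contents of script or style elements, and will be confused by a `>` inside an attribute value. A proper parser such as quick-xml or xml-rs would likely be more robust on real pages.

```rust
/// Returns `html` with everything between `<` and `>` removed.
///
/// Assumption: naive character scan, not an HTML parser. Entities are not
/// decoded and the contents of <script>/<style> are kept as plain text.
fn strip_tags(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    // True while we are between a '<' and its closing '>'.
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            // A '<' always starts (or continues) a tag.
            '<' => in_tag = true,
            // A '>' ends a tag only if one is currently open.
            '>' if in_tag => in_tag = false,
            // Outside a tag: keep the character (including a stray '>').
            _ if !in_tag => text.push(ch),
            // Inside a tag: drop the character.
            _ => {}
        }
    }
    text
}
```

Calling strip_tags("<p>My cat is <b>very</b> grumpy.</p>") returns the string My cat is very grumpy.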

wangyiran125 · November 28, 2019, 7:43am

Not yet. I tried using reqwest to get the text of the response, together with select = "0.4.3", but I could not find a method that returns the content without HTML tags.
I just want a method like the one jsoup provides in Java:

Document document = Jsoup.connect(url).get();
String content = document.body().text();

Thank you very much for your suggestion!

dunnock · November 28, 2019, 10:00am

@wangyiran125 Please see this similar question; I think you may find the answers useful: Recommendations for HTML parsing

wangyiran125 · November 29, 2019, 8:38am

thank you very much~
