I am crawling HTML pages, but I only want the text content, with all HTML tags removed.
For example, given an element like
My cat is very grumpy.
I want to get just the content: My cat is very grumpy. I have looked at many HTML parsing libraries, but none of them satisfies me. How can I do that? Thanks for any suggestions!
Have you tried using an XML parsing library where you just get the Text (quick-xml) / Characters (xml-rs) events?
Not yet. I tried using reqwest to get the text from the response, and I used select = "0.4.3", but I could not find a method that returns the content without the HTML tags.
I just want a method like the one jsoup offers in Java:
Document document = Jsoup.connect(url).get();
String content = document.body().text();
Thank you very much for your suggestion!
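For illustration only, here is a rough sketch of the behaviour being asked for, written with nothing but the Rust standard library. It is not jsoup's algorithm and not the API of reqwest, select, quick-xml, or xml-rs. The function name `strip_tags` and the sample markup are hypothetical. It assumes simple, well-formed markup: it does not decode entities such as `&amp;`, and it does not skip comments or the bodies of script and style elements. A real HTML parser is the safer choice for crawled pages.

```rust
/// Naive tag stripper (hypothetical helper, not from any library).
/// Copies every character that lies outside `<...>` and drops the rest.
fn strip_tags(html: &str) -> String {
    let mut out = String::with_capacity(html.len());
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            // A '<' starts a tag; everything up to the next '>' is skipped.
            '<' => in_tag = true,
            '>' if in_tag => {
                in_tag = false;
                // Put a space where the tag was, so text from adjacent
                // elements does not run together. Caveat: this also splits
                // a word that has an inline tag in the middle of it.
                out.push(' ');
            }
            // Ordinary text outside a tag is kept as-is.
            _ if !in_tag => out.push(c),
            // Characters inside a tag (names, attributes) are dropped.
            _ => {}
        }
    }
    // Collapse runs of whitespace and trim both ends.
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn main() {
    // Assumed sample input; in a crawler this would be the response body.
    let html = "<html><body><p>My cat is <b>very</b> grumpy.</p></body></html>";
    println!("{}", strip_tags(html)); // prints "My cat is very grumpy."
}
```

The string passed to `strip_tags` could be the body text already fetched with reqwest, as described above.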
@wangyiran125 Please see this similar question; I think you may find the answers useful: Recommendations for HTML parsing
Thank you very much~