I'm building a web scraper, and so far I'm able to fetch the HTML and print it. The output contains a lot of HTML tags as well. I found a way to remove all the tags and print only the text, but I need the output formatted.
For example:
some text in p block
some text
I can ignore the text output and the href links, but I want to print the list items the way they appear on the website.
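A minimal sketch of the kind of list output described above, using only the Rust standard library. The `list_items` and `strip_tags` helpers and the sample HTML are hypothetical, made up for illustration. The naive string scan stands in for a real HTML parser and will not cope with nested lists or malformed markup.

```rust
/// Hypothetical helper: collects the text of each `<li>...</li>` element.
/// Naive string scan for illustration only; not a real HTML parser.
/// Caveat: the `<li` search would also match tags such as `<link>`.
fn list_items(html: &str) -> Vec<String> {
    let mut items = Vec::new();
    let mut rest = html;
    while let Some(start) = rest.find("<li") {
        // Skip to the end of the opening tag so attributes are ignored.
        let after_open = &rest[start..];
        let Some(tag_end) = after_open.find('>') else { break };
        let body = &after_open[tag_end + 1..];
        // Everything up to the next closing tag is treated as the item.
        let Some(close) = body.find("</li>") else { break };
        items.push(strip_tags(&body[..close]).trim().to_string());
        rest = &body[close + "</li>".len()..];
    }
    items
}

/// Drops everything between `<` and `>`, keeping only the text.
fn strip_tags(fragment: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in fragment.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out
}

fn main() {
    // Made-up sample input, not taken from any real page.
    let html = "<p>some text in p block</p>\
                <ul><li>first</li><li><a href=\"#\">second</a></li></ul>";
    // Print one list item per line, prefixed with a dash.
    for item in list_items(html) {
        println!("- {item}");
    }
}
```

With the sample input above, this prints `- first` and `- second` on separate lines, while the paragraph text and the link markup are dropped.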
I found htmlq on crates.io and it seems to do what I need, but I could not find a way to use it directly within my program. It is a CLI tool installed using cargo, and all the examples pipe curl output to it.
Is there any other crate that can do the same things as htmlq? Is there a way I can use htmlq directly in my program?