I wrote a very small crate which provides a writer-type for markup-like languages, e.g. HTML, XML and JSON, and also the writers itself. It is a first small version to be extended in the future. For my personal training and to find in the world of open source, I would like to publish this, but not without a review in several point of views.
Is there an important code structuring issue, which should be improved?
Are there any beginner-coding-issues I can improve?
Is the toml-file ok (keywords, categories etc.) or should it be extended?
Is there a certain type of licence which could be added and is it recommended?
Thank you for any kind of suggestion for improvement
You need a license if you want people to be able to use your crate freely and legally. The convention here seems to be MIT/Apache but it's really your (political) choice. (you can also choose GPL, AGPL, LGPL, BSD, public domain, etc.)
Public methods take &str then convert them to String. If the caller was forced to pass a reference to a String, this may lead to more allocations than needed. It may be better to ask for a String directly in this case. (but maybe not in certain cases... the more flexible would be a trait accepting both)
Maybe you should use format! instead of &["foo".to_string() + &bar].concat(). I'm not sure it's more efficient, but it sure is clearer.
I'm seeing some .len() > 0, which should have been detected by clippy. Have you run cargo clippy?
On line 200, you remove the end of a String using multiple pop in a loop. There are String methods to do that more efficiently in only one call.
Globally, each time you create a String or Vec, wonder if it's really necessary, or if &str or slice is enough. For example when comparing strings, it's cheaper to do &s == "blah" than s == "blah".to_string().
You don't escape anything, so you'll get invalid strings with "realistic" inputs. It's not clear what the w prefix means on the all your methods. Every key and every value are not necesserarily strings, you shouldn't expect the user to convert everything to strings.
In general I don't find this API particularly ergonomic. Just looking at a test (kudos for having tests):
Yes, I forgot to use clippy and now I did it and implemented all of its suggestions
About the way of string concatenation I somewhere read an article, but it is not clear to me, which way would be the best
The line 200 thing - I reimplemented it much more beautiful and easier to read
And I added now the default license, like you recommended, MIT+Apache2.0 - but I don't have any feeling about the differences betwenn several open-source licences...
Is there a certain trait, that forces to implement using both &str and String?
I tried to separate methods, which are writing to the content-String (edit the content of desire) and those methods, which only modify the setup of the writer... Would you recommend, to avoid this and just remove all prefixes?
What are kudos? And do you have an idea, how to implement it much better?
let mut wr = HTMLWriter::new();
wr.w_open_element("div><script>alert('hello');</script");
// with or without calling close_element
println!("{}", wr.core.content);
This will print <div><script>alert('hello');</script> which browsers will happily digest and run. Think also about quotation marks and many other "special" characters.
In general in Rust we really like APIs that can't be misused, where you can't forget to call methods like close (look at the File API for example).
Make an independant configuration struct (generally called a "builder") to build your writer, that builder should guarantee a valid state.
Kudos are praise, appreciation for something
In general, your "writers" should build up a tree of objects and only when requested serialize that tree into a string (or into a Writer). You may be interested to look at the internals of serde and serde_json.
You're probably coming from JavaScript and read one of the several articles that claim that concatenating many strings using ["foo", "bar", "qux"].concat() is faster than repeatedly adding them. But:
if you have two strings, then nobody cares, and clarity trumps performance.
it's simply not true in Rust, because String::add() takes the LHS by value, so it's not quadratic.
There are many ways to do this, but the two main ones is to take arguments that implements either Into<String> or ToString.
The Into<String> variant mainly accepts String and &str (but also some variants such as boxes and cows). It takes something that is already a string (as in a sequence of bytes that is a utf8-encoded text) and you can convert it (by the foo.into() method) to a heap-allocated String.
The ToString variant is more general, it takes anything that is "formattable" (anyting that implements Display) and you can format it into a String with the foo.to_string() method.
Note that in many cases (e.g. if you need to do some escaping before storing the result in a String, or if you only store it by copying it into a longer String (or buffer)), it can be both more efficient and a nicer api to just take the argument as an &str anyway.