Continuing the discussion from (Wrong) Direction of Rust:
This may very well happen. The upside is that, from the multiple implementations, a portable and efficient version can be picked (or created with insight gleaned from the existing impls) and put into std. This boils down to: it's always easier to add to std than to remove or change something already in it.
As mentioned, the thinness of std is a contentious point - there are valid arguments on both sides. But look at Java - even it, despite a massive stdlib, has large auxiliary external libs that are widely used - the nume…
Read in std::io is what doesn't really belong there, but I suspect it was added as a convenience. An implementation that doesn't have UTF-8 strings internally can return an error for that method, but that method ought not to be there in the first place.
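Assuming the method at issue is `Read::read_to_string` (the post's link appears to point at the `Read` trait), a minimal sketch of the behaviour being objected to: the read fails with an I/O error when the bytes are not valid UTF-8.

```rust
use std::io::{Cursor, Read};

fn main() {
    // 0x80 is a lone continuation byte, so this input is not valid UTF-8.
    let mut reader = Cursor::new(vec![b'f', b'o', 0x80]);
    let mut s = String::new();
    // read_to_string must validate UTF-8, so this read returns Err.
    let err = reader.read_to_string(&mut s).unwrap_err();
    println!("refused non-UTF-8 input: {err}");
}
```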
Branched to: Rust compared to C# on all except string comparison
Have you never written a parser?
What happens when you read a stream of bytes that's actually UTF-16 encoded?
You get a stream of 16-bit code units. Not bytes.
Then if you wish to parse this further with a lexer, you'll get a stream of tokens, typically 32-bit integers. Not bytes.
Not everything is a byte; that's why we have strongly typed languages.
Not everything that streams large chunks of contiguous data around is a POSIX file handle and returns 32-bit integer I/O error codes.
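To make the UTF-16 point concrete, here is a minimal sketch (the function name is illustrative, not from any particular crate) of turning a little-endian UTF-16 byte stream into 16-bit code units and then into a Rust `String`:

```rust
/// Decode a UTF-16LE byte stream: pair the bytes into 16-bit code units,
/// then let the standard library handle surrogate pairs and validation.
fn utf16le_to_string(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None; // truncated final code unit
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}

fn main() {
    // "héllo" encoded as UTF-16LE, then round-tripped back.
    let bytes: Vec<u8> = "héllo".encode_utf16().flat_map(u16::to_le_bytes).collect();
    assert_eq!(utf16le_to_string(&bytes).as_deref(), Some("héllo"));
}
```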
In my min…
How do you plan to represent UTF-8 in such an approach?
Your posts seem to be written under the assumption of fixed-size encodings. I understand that you come from the Windows world, but Rust has made a conscious decision to use UTF-8 as the main string encoding, and supporting all other kinds of encodings in std will just lead to bloat. And if I understood your proposal correctly, it will result in needless complexity in a lot of the code.
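A quick illustration of why a fixed-width assumption does not hold for UTF-8: scalar values occupy anywhere from one to four bytes.

```rust
fn main() {
    // UTF-8 is a variable-width encoding: each char takes 1 to 4 bytes.
    for ch in ['a', 'é', '€', '🦀'] {
        println!("{ch:?} occupies {} byte(s) in UTF-8", ch.len_utf8());
    }
}
```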
Do we need a Windows-oriented ecosystem of crates? Yes, of course. Rust p…
The relevant quotes from the first post are shown below. This topic derives from (Wrong) Direction of Rust.
peter_bertok:
If a gun were pointed at my head and I were forced to pick one thing in Rust that I feel is a total failure and that stops me from using it, it would be its strings. They are mutable by default, a concrete type instead of a trait, and there are way too many variants of them, not even including char arrays and the like that turn up in interop. I feel like this is a catastrophic design mistake. C++ got this wrong; Java and C# got it right by having one immutable string type.
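For context, the "variants" being referred to include at least these standard-library types, each modelling a different ownership or encoding contract (a rough sketch, not an exhaustive list):

```rust
use std::ffi::{CString, OsString};
use std::path::PathBuf;

fn main() {
    let owned: String = String::from("growable, heap-allocated UTF-8");
    let borrowed: &str = "borrowed UTF-8 slice";
    let c_style: CString = CString::new("NUL-terminated, no interior NULs").unwrap();
    let os: OsString = OsString::from("platform-native encoding");
    let path: PathBuf = PathBuf::from("a thin wrapper around OsString");
    println!("{owned}\n{borrowed}\n{c_style:?}\n{os:?}\n{path:?}");
}
```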
Vitalyd:
Automatic dereferencing is just one example in this category. Someone got lazy and decided they didn't like typing asterisks all over the place and added an ill-thought-out layer of magic for their own convenience that is now a landmine for even the most trivial manual refactorings. Combined with the overuse of macros, I just don't see how the language will ever get IDE support equivalent to what Java had 15 years ago, let alone today.
I suspect you’re referring to deref coercions rather than auto deref; the latter is completely useful and obviates lots of noise that would ensue otherwise. Deref coercions, however, can appear magical in the beginning, particularly when coupled with heavy type inference use. They’re less magical with some experience, and in fact a useful feature.
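A small example of the coercion in question: a `&String` is accepted where a `&str` is expected because `String` implements `Deref<Target = str>`, so no explicit conversion appears at the call site.

```rust
fn takes_str(s: &str) -> usize {
    s.len()
}

fn main() {
    let owned = String::from("hello");
    // Deref coercion turns &String into &str here; no `.as_str()` needed.
    assert_eq!(takes_str(&owned), 5);
}
```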
There is a good reason for that. For example, Rust uses UTF-8 for its own strings, NUL chars are allowed, and strings are not NUL-terminated. One can't happily call C functions using them, because C expects a different kind of string. Windows also doesn't work with UTF-8 natively; it expects UTF-16. So all these string types simply make explicit that they are internally incompatible with each other.
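For instance, converting a Rust string into a C-compatible one is fallible precisely because of the interior-NUL difference:

```rust
use std::ffi::CString;

fn main() {
    // C strings are NUL-terminated and may not contain interior NULs,
    // so the conversion from a Rust string has to be checked.
    assert!(CString::new("hello").is_ok());
    assert!(CString::new("he\0llo").is_err());
}
```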
In C# there is the same problem, just upside down. It uses UTF-16 internally, as does Windows, but at the same time it lacks UTF-8 strings; one can only have them as byte arrays. One practically can't have a "one-size-fits-all" string type.
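By contrast, Rust keeps the byte-array/string boundary explicit: a `&[u8]` only becomes a `&str` after UTF-8 validation (a minimal sketch):

```rust
fn main() {
    // The four bytes below are the UTF-8 encoding of '🦀' (U+1F980).
    let bytes: &[u8] = &[0xF0, 0x9F, 0xA6, 0x80];
    let s = std::str::from_utf8(bytes).expect("valid UTF-8");
    assert_eq!(s, "🦀");
}
```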