Do you feel Rust is too verbose for simple things?

To get an owned string from a literal I have to do:

String::from("");

To slice an already owned string and get an owned string I have to do:

s[0..1].to_owned();

Sometimes a common method returns a Cow and I have to call cow.into_owned().
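For reference, here is a minimal sketch of the three conversions mentioned so far. The helper strip_spaces is hypothetical and only exists to produce a Cow. Note that on a Cow<str> it is into_owned() that yields a String; to_owned() would only clone the Cow itself.

```rust
use std::borrow::Cow;

/// Hypothetical helper: borrows the input when nothing has to change,
/// and allocates a new String only when spaces must be removed.
fn strip_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', ""))
    } else {
        Cow::Borrowed(input)
    }
}

/// Collects the three conversions into owned Strings.
fn conversions() -> (String, String, String) {
    // Literal (&'static str) to String.
    let from_literal = String::from("foo");

    // Slice of an owned String (&str) back to an owned String.
    let owned = String::from("foo bar");
    let from_slice = owned[0..1].to_owned();

    // Cow<str> to String; allocates only if the Cow was borrowed.
    let from_cow = strip_spaces("a b").into_owned();

    (from_literal, from_slice, from_cow)
}
```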

Because there are multiple built-in string types (the unsized str and String), some functions have to add a generic type parameter like this:

fn takes_string<S: AsRef<str>>(s: S) {
    // s.as_ref().to_owned(); gives String
}

Out-of-bounds access in strings causes a panic instead of returning a default value, unlike in EcmaScript and, in some ways, Python. In EcmaScript, s.substr(0, 2) == 'fo' always works without throwing.

A lot of crates mix these things, making the code unproductive. And the generics I described above appear very often.

I wonder if there are crates for inspecting Rust code so I can make my own dialect without too much effort and resolve the issues I described. Any ideas?

1 Like

Unproductive in what way? You’ve listed a bunch of observations about how some Rust code works, but you haven’t yet connected the dots for us— What do these examples have in common that you find problematic, and how does combining them exacerbate the situation?

6 Likes

Well...

  • clap crate's App structure has 2 lifetime parameters. There's been some time since I last used clap though.
  • The standard library returns &str in many places, where String would be preferred. This includes the result of slicing a String.
  • Iterators seem complex to implement. In EcmaScript you can simply define a generator *[Symbol.iterator]() {} using the yield operator. Some folks here helped me with defining iter() (in "Delegate iteration to field Vec"), but I still didn't fully get it. In reality I haven't been able to make a specific type "iterable"; I was just able to provide an iter() method.
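On the last point, one common way to make a type usable in a for loop is to implement IntoIterator and delegate to the inner Vec. The sketch below assumes a hypothetical Playlist type that wraps a Vec<String>; the names are illustrative and not taken from the linked thread.

```rust
/// Hypothetical container type wrapping a Vec.
struct Playlist {
    songs: Vec<String>,
}

// Borrowing iteration: enables `for song in &playlist`.
impl<'a> IntoIterator for &'a Playlist {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        // Delegate to the slice iterator of the inner Vec.
        self.songs.iter()
    }
}

// Consuming iteration: enables `for song in playlist`.
impl IntoIterator for Playlist {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        // Delegate to the owning iterator of the inner Vec.
        self.songs.into_iter()
    }
}

/// Sums the byte lengths of all titles using a plain for loop.
fn total_len(playlist: &Playlist) -> usize {
    let mut total = 0;
    for song in playlist {
        total += song.len();
    }
    total
}
```

With both impls in place, an iter() method becomes optional sugar; the for loop works on the type directly.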

OK, it's hard to list, because most of the time I end up wrapping some standard library functionality in my own crates, but I feel pushed to use both string types because of the standard library. I've just considered a new idea: create my own built-in objects for Rust and put them in a prelude. However, the problem is that "" still returns &str, and that leads to verbose code such as String::from(""). include_str!() returns &str too!

Do you feel Rust is too verbose for simple things?

No. On the contrary – thanks to its high-level abstractions and the reasonably-sized/reasonably-complete standard library, I find myself rewriting and re-inventing half-wrong versions of commonly needed low-level algorithms and data structures a lot less. I can thus focus on the interesting things, the "business logic", instead.

The fact that you have to sometimes do a .to_owned() to get from borrowed to owned data is… just a technical necessity, and typing an extra method call is nothing compared to e.g. having to write Yet Another Concretely-Typed Hashmap™ if I were to code in C, for example.

Once you get better acquainted with the language, you will care less and less about low-level syntax and be able to focus on higher-level elegance much more.

11 Likes

This is, in fact, a trade-off between performance and convenience. Returning String unconditionally means unconditionally allocating and copying, and in many cases that's unnecessary - so Rust chooses to make the performance cost explicitly opt-in, instead of making us pay it even when it doesn't improve anything.

9 Likes

@hydroper1 I think it would help if you consider what would be lost if the verbosity you're talking about were resolved. Just as one very blatant example:

If all routines returning a &str were changed to return a String, then how would you perform said operations without incurring an allocation? You couldn't. The only way would be to duplicate all of the operations such that there is one for returning String and one for returning &str. Now the API surface has been doubled. Would you prefer that to status quo?

The key bit here is that &str is more flexible than String. So if you can return &str, it's generally a good idea to do so. Why? Because it gives freedom to the caller to either continue using &str if they can, or write a little bit of code to convert it to a String.
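A small sketch of that freedom, assuming a hypothetical first_word helper: the function returns a borrowed &str, one caller uses it as-is, and another opts into the allocation.

```rust
/// Hypothetical helper: returns the first word, borrowed from the input.
/// No allocation happens here.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// A caller that never needs ownership compares the borrowed slice directly.
fn starts_with_greeting(text: &str) -> bool {
    first_word(text) == "hello"
}

/// A caller that does need ownership pays for the allocation explicitly.
fn owned_first_word(text: &str) -> String {
    first_word(text).to_owned()
}
```

If first_word returned String instead, starts_with_greeting would allocate on every call for no benefit.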

There are perhaps other resolutions, but I just considered the most obvious one here.

The main lesson here is that things don't exist in a vacuum. Most things are relative. It's really important that you measure both what you want but also what you have to give up to get it. (And perhaps different ways of achieving what you want have different costs. Exploring that space is what I personally think of as "craft" or "art.")

23 Likes

Thanks for the replies! I agree that having separate string types allows for some special optimizations. But there are times this causes code to look unnecessarily complex. Like I said, having fn f<S: AsRef<str>>(s: S); is annoying. In my case I just want to use an interned string type and done (like in any high-level language). Unfortunately I'm not sure if it'd be possible for a Rust edition to exist where "" and include_str!(...) can return the string type you want.

I understand that supporting a unique built-in interned string type is problematic, but at least in my case it'd be good.

Frankly, it looks like you just want to write ECMAScript in Rust. Java is not "use an interned string type and done", C++ is not "use an interned string type and done", and lots of other languages are not "use an interned string type and done" either.

And if you want to write ECMAScript in Rust then the question is: why? It's perfectly fine to use ECMAScript to write ECMAScript!

Sure, but it wouldn't be good in my case, and it's unclear why Rust has to be modified to suit your needs and not mine. I don't think ECMAScript compilers will disappear any time soon… why do you want to use Rust to write ECMAScript?

3 Likes

Current EcmaScript tooling ran out of memory for me once (bundling etc.). Also, I tried to write a compiler for an EcmaScript dialect in JS and its symbol solver took minutes to run! That's why I wanted to use Rust. I can still try creating my own EcmaScript dialect, though.

When I tried to implement the symbol solver in ActionScript once, it was pretty fast, but my dialect is far from done and I have to redevelop its symbol solver.

1 Like

Rust does not build in an interned string type but it does have a refcounted string type: Arc<str>. You can convert any &str or String into it with just .into(). (Of course, this is a tradeoff with other considerations, like the ability to mutate the string.)
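A minimal sketch of that conversion, using hypothetical helper names. Cloning an Arc<str> only bumps a reference count, but this is not interning: two separate conversions of equal text still produce two allocations.

```rust
use std::sync::Arc;

/// Converts borrowed text into a reference-counted string.
fn share(text: &str) -> Arc<str> {
    text.into()
}

/// Converts an owned String into a reference-counted string.
fn share_owned(text: String) -> Arc<str> {
    text.into()
}

/// Hands out `n` cheap clones that all point at the same allocation.
fn fan_out(name: &Arc<str>, n: usize) -> Vec<Arc<str>> {
    (0..n).map(|_| Arc::clone(name)).collect()
}
```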

That's been simplified in clap 4.0: you use either only &'static str or String, and either way there are no lifetime parameters.

Most Rust code does not bother to be generic over AsRef — it just takes &str or String and requires the caller to convert as applicable. No, it's not zero characters, but it fades into the background after a while of practice.

4 Likes

I was aware of Rc, Arc and Weak. But they don't perform interning. Anyway, my complaint is mostly about not being able to directly use my own string type.

You can do that, except you cannot turn string literals into your type. And I know of exactly zero languages that would give you that ability, so Rust is not unique there.

The closest thing that I know of in any language is something like C++, where you can define a way to interpret a suffix after a string literal.

I guess making "Hello, world!"s into String like with C++ would be nice, but it's not a priority.

1 Like

There have been some rumblings about this over the years, but I don't think anything has gotten too far. And I don't think anything has been done recently? But I could be wrong. See:

Haskell has it, but I believe you have to opt into it: 6.9.7. Overloaded string literals — Glasgow Haskell Compiler 9.9.20231128 User's Guide

4 Likes

It's still not a literal of your own type (e.g. you cannot use it to select an overloaded function), but yeah, it comes close.

You can simulate something similar by providing From<&'static str> for your type and always using T: Into<MyString>. This would be similar, technically, but wouldn't reduce busywork, because Rust never calls these conversion methods implicitly (which IMO is a good thing).
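Roughly, that simulation could look like the sketch below. MyString here is a hypothetical newtype over String, standing in for whatever custom (for example interned) string type you have; greet is an illustrative function.

```rust
/// Hypothetical custom string type; a real one might intern its contents.
#[derive(Debug, Clone, PartialEq)]
struct MyString(String);

impl From<&'static str> for MyString {
    fn from(s: &'static str) -> Self {
        MyString(s.to_owned())
    }
}

/// Accepts anything convertible into MyString, including string literals.
fn greet<T: Into<MyString>>(name: T) -> MyString {
    // The conversion is still an explicit call; it just moved into the callee.
    let name: MyString = name.into();
    MyString(format!("Hello, {}!", name.0))
}
```

Call sites can then write greet("world"), but every function has to carry the Into<MyString> bound and call .into() itself, which is the busywork mentioned above.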

1 Like

Rust's default string types are all about being building blocks with very predictable performance consequences. If you want something more streamlined, you might try something from a crate that tries to make smarter automatic choices.

A quick look finds FlexStr — Rust text processing library // Lib.rs, which looks like it tries to use the best of three different internal versions, so you can always just pass it and it'll be fine, with only minimal overhead.

5 Likes

By the way, it is never necessary, and rarely even desirable (imo), to write a function signature like fn takes_string<S: AsRef<str>>(s: S). fn takes_string(s: &str) does the job without losing any flexibility. Moreover, if you're turning it into an owned string inside the function, like in your example, you definitely don't want an AsRef bound; a better option is fn takes_string(s: String). I wanted to mention this since "have to add" makes it sound like it's necessary, but it's just a small convenience for the users of the API.
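To illustrate, here is a sketch of the two non-generic signatures with hypothetical function names. The conversions move to the call site, where any allocation is visible.

```rust
/// Read-only access: a plain &str parameter accepts literals and borrowed Strings.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

/// The function keeps the text, so it asks for an owned String up front.
fn remember(log: &mut Vec<String>, entry: String) {
    log.push(entry);
}

/// Hypothetical caller showing what each call site has to write.
fn demo() -> (String, String, Vec<String>) {
    let owned = String::from("abc");

    let a = shout("abc"); // a literal is already a &str
    let b = shout(&owned); // &String coerces to &str

    let mut log = Vec::new();
    remember(&mut log, owned); // moves the String, no copy
    remember(&mut log, "xyz".to_owned()); // explicit allocation at the call site
    (a, b, log)
}
```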

7 Likes

So you want to use Rust because it's more performant, but you'd prefer that the language do the slow thing by default (e.g. allocate Strings everywhere)?

7 Likes

I updated my post to say that I was able to implement a high-performance symbol solver (aka. type checker) in ActionScript. ActionScript is like EcmaScript, except it has runtime-concrete static typing. So "string interning" is not a slow thing. (I'm not sure how the ActionScript VM works though, compared to JavaScript's V8.)

So, yes, Rust can be performant while using a main interned string type, considering I was able to achieve such a thing in ActionScript.

My problem with ActionScript is that it doesn't support destructuring patterns, asynchronous functions, generators, type inference and so on. Their compiler is also buggy when it comes to control flow.

1 Like

It's a mixed bag.

Some things are incredibly concise, e.g. iterator chains in Rust can do a lot in a few lines, often even more compact than JavaScript. Error handling with ? beats if err != nil { return err; }.
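As a small illustration of both points, here is a sketch with a hypothetical sum_even function: an iterator chain parses a comma-separated list, and ? returns early on the first parse error.

```rust
use std::num::ParseIntError;

/// Parses comma-separated integers and sums the even ones.
/// The first element that fails to parse aborts the whole call via `?`.
fn sum_even(input: &str) -> Result<i32, ParseIntError> {
    let numbers = input
        .split(',')
        .map(|part| part.trim().parse::<i32>())
        // Collecting into Result stops at the first Err.
        .collect::<Result<Vec<_>, _>>()?;

    Ok(numbers.into_iter().filter(|n| n % 2 == 0).sum())
}
```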

Lack of custom literals is a drag. Not only are they missing for String, but also for CStr and for numeric types like NoisyFloat.

But in general I don't mind. Rust is also quite locally explicit with its syntax, and that makes sense for a language that gives low-level control. I'd rather type .into() here and there than have an opposite problem of hunting invisible conversions or allocations that may be undesirable in some contexts.

19 Likes

Out-of-bounds access in strings causes a panic instead of returning a default value

This can be solved by using str.get(0..2) instead of str[0..2]. The substring will be wrapped in an Option and, if it's an invalid substring, it will return None instead of panicking.
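A short sketch of this, with hypothetical helper names. Note that get also returns None when the range does not fall on a UTF-8 character boundary, so the second helper counts characters instead of bytes to get a clamping prefix closer to what s.substr(0, n) does (JavaScript counts UTF-16 code units, so the match is only approximate).

```rust
/// Non-panicking version of `&s[0..2] == "fo"`.
fn has_prefix_fo(s: &str) -> bool {
    s.get(0..2) == Some("fo")
}

/// Returns at most the first `n` characters, clamping instead of panicking.
fn prefix_chars(s: &str, n: usize) -> &str {
    // The byte index of the character at position `n` marks the end of the prefix.
    match s.char_indices().nth(n) {
        Some((idx, _)) => &s[..idx],
        None => s, // fewer than n + 1 characters: return the whole string
    }
}
```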

4 Likes