My first rust project (launcher for Python scripts)

pfmoore · February 7, 2021, 2:18pm

I've been looking at Rust for a while, and I thought I'd try to get some "proper" experience by converting one of my existing projects to Rust. The project is a Windows "launcher" for Python scripts: it looks for a script alongside the executable, finds an appropriate Python interpreter, and runs the script with it.

It's not a particularly complex program, but it's a pain in C because of all the filename manipulations (as well as the usual hassles of memory management, etc).

The code is here: https://github.com/pfmoore/pylaunch2

I was pleasantly surprised at how easy it was to get the program working in Rust, and while it took me a little while to work out a good structure (I want two executables, one a console-mode Windows executable and one a GUI-mode executable), I got something working super fast. I'm now looking at how to make it more "idiomatic", and sort out any rough edges where I could be doing things better.

I used the anyhow crate for errors, mainly because it gave me nice errors with very little effort. I'd be fine with hearing about how I could improve this, but honestly, it's good enough for me as it is.

Particular things I am unhappy with:

Having to hard-code the project name in use pylaunch::{...} statements in the binaries. I expected to use a relative name there, but I couldn't get it to work.
The whole Config struct. It feels messy, with all the 'static lifetime annotations. But I really wanted to avoid unnecessary copies of strings (yes, I know, premature optimisation...) as everything here is entirely static - it's basically just encapsulating the "magic values" that distinguish the GUI and the console versions of the code. Someday I may add the ability to read a "config" from a file, but right now I have no need for that, and I'd like to prioritise the basic case of static values.

And obviously, hints on any general style issues where someone with more experience would write things differently would be much appreciated.

eko · February 7, 2021, 2:44pm

I think you can use generic lifetimes for your config struct instead of 'static:

pub struct Config<'a> {
    pub exe_name: &'a str,
    pub launcher_name: &'a str,
    pub lib_location: &'a str,
    pub env_locs: &'a [&'a str],
    pub extensions: &'a [&'a str],
}
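As a rough sketch of how that could be used: string literals are &'static str, so the existing constants still fit, and the same struct can also borrow from strings built at runtime. The field values and the describe helper below are placeholders made up for illustration, not taken from pylaunch.

```rust
pub struct Config<'a> {
    pub exe_name: &'a str,
    pub launcher_name: &'a str,
    pub lib_location: &'a str,
    pub env_locs: &'a [&'a str],
    pub extensions: &'a [&'a str],
}

// String literals are `&'static str`, so a constant is simply `Config<'static>`.
// Every value here is a placeholder, not pylaunch's real setting.
pub const CONSOLE: Config<'static> = Config {
    exe_name: "python.exe",
    launcher_name: "launcher",
    lib_location: "lib",
    env_locs: &["env"],
    extensions: &["py"],
};

// Hypothetical helper that accepts a Config with any lifetime.
pub fn describe(cfg: &Config<'_>) -> String {
    format!("{} ({} extension(s))", cfg.exe_name, cfg.extensions.len())
}

fn main() {
    println!("{}", describe(&CONSOLE));

    // The same struct can borrow from data created at runtime.
    let exe = String::from("pythonw.exe");
    let exts = ["pyw"];
    let gui = Config {
        exe_name: &exe,
        launcher_name: "launcher",
        lib_location: "lib",
        env_locs: &[],
        extensions: &exts,
    };
    println!("{}", describe(&gui));
}
```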

Hyeonu · February 7, 2021, 4:01pm

Or let it own its contents? I don't think saving a few allocations in the config struct would make an observable performance difference.

pub struct Config {
    pub exe_name: String,
    pub launcher_name: String,
    pub lib_location: String,
    pub env_locs: Vec<String>,
    pub extensions: Vec<String>,
}

droundy · February 7, 2021, 4:25pm

I'd make all your functions methods on Config, and in your binaries I'd put the Config directly into main.

pfmoore · February 7, 2021, 4:42pm

Thanks. I'm sure you're right regarding performance, but does that mean that using static string constants directly is not idiomatic in Rust? Or am I reading too much into your comment?

I guess I'm trying to get a feel for where Rust fits on the convenience/performance scale. It's definitely far more convenient than C (it feels as high level as Python, to be honest), but I'm not sure how much performance I'm trading for that convenience. I know I lose the ability to work in OS-native wchar_t, taking the hit of conversions to and from UTF-8 for the convenience of a nice string type. What I'm not sure about is whether I also have to stop caring about memory allocations (which is nice, as I don't like managing memory, but I also don't like wondering whether I can trust the runtime; I'd rather know).

That sounds like an interesting alternative approach, but I'm not sure I follow what you mean. Could you expand a bit?

Rustaceous · February 7, 2021, 5:01pm

Ownership is good. Your library would prevent a program from doing anything other than keeping Config's values around for the whole program, even if it did not want to. Those fields are hard-coded to the static lifetime, which is usually only done for const values or when instantiating top-level types that need a borrowed value in one of their fields for some reason. Avoiding ownership by explicitly setting the static lifetime seems like an anti-pattern. Avoiding lifetime generics in a type by forcing the static lifetime is definitely an anti-pattern. At the very least you can let the user decide which lifetime to use, but Rust programs are supposed to have places where the important values are owned.

Config also doesn't implement Clone, so you're not going to accidentally duplicate it somewhere in its methods. You're also using Config as a const, so that instance is implicitly in the static lifetime.

Libraries should be flexible, programs should do only what they need to do. Your program seems good. I would just make the library more flexible by using ownership to be sure that you get the practice for writing flexible Rust code.

Aside from that, it would be more idiomatic to have the functions that use Config in the Config impl, making it clear that they are coupled to that type. It's purely a matter of syntax, replacing "&Config" with "&self".
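To make the "&Config" to "&self" change concrete, here is a minimal sketch. The script_names function is invented for illustration and is not one of the functions in the pylaunch repository; the Config here is a cut-down owned version.

```rust
pub struct Config {
    pub exe_name: String,
    pub extensions: Vec<String>,
}

// Free-function style: the coupling to Config is only visible in the signature.
pub fn script_names_free(config: &Config, stem: &str) -> Vec<String> {
    config
        .extensions
        .iter()
        .map(|ext| format!("{}.{}", stem, ext))
        .collect()
}

impl Config {
    // Method style: same body, with `&self` in place of `config: &Config`.
    pub fn script_names(&self, stem: &str) -> Vec<String> {
        self.extensions
            .iter()
            .map(|ext| format!("{}.{}", stem, ext))
            .collect()
    }
}

fn main() {
    let cfg = Config {
        exe_name: String::from("python.exe"),
        extensions: vec![String::from("py"), String::from("pyw")],
    };
    // Both calls do the same work; only the call syntax differs.
    println!("{:?}", script_names_free(&cfg, "app"));
    println!("{:?}", cfg.script_names("app"));
}
```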

2e71828 · February 7, 2021, 5:03pm

&'static references are extremely limited. They have to either be baked into the executable binary or be produced by a memory-leaking function. Usually, structures and function arguments will be generic over the strings’ lifetimes instead. This allows the caller to produce the strings however is most appropriate to the wider problem. The borrow checker will ensure that the structure is destroyed before its internal references expire.

Using owned Strings instead of references makes the object more flexible, as it’s no longer tied to the stack frame responsible for the original string data. @eko’s version, for example, won’t let you write a function that calculates some of the data and then returns a Config object: That data is stored in a local variable and is destroyed when the function exits, which would leave dangling references inside the Config struct.
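A small sketch of that difference, assuming a cut-down owned Config and a hypothetical build_config function: the computed string is moved into the struct, so returning it is fine. The borrowed equivalent is left as a comment because the borrow checker rejects it.

```rust
pub struct Config {
    pub exe_name: String,
    pub extensions: Vec<String>,
}

// Hypothetical constructor: computes a value locally, then hands ownership
// of it to the returned Config.
pub fn build_config(gui: bool) -> Config {
    let exe_name = format!("python{}.exe", if gui { "w" } else { "" });
    Config {
        exe_name, // moved into the struct, so nothing dangles
        extensions: vec![String::from("py")],
    }
}

// With a borrowed struct such as `struct Borrowed<'a> { exe_name: &'a str }`,
// the same function would not compile, because `exe_name` is dropped when the
// function returns:
//
// fn build_borrowed<'a>(gui: bool) -> Borrowed<'a> {
//     let exe_name = format!("python{}.exe", if gui { "w" } else { "" });
//     Borrowed { exe_name: &exe_name } // rejected: references a local variable
// }

fn main() {
    let cfg = build_config(true);
    println!("{} {:?}", cfg.exe_name, cfg.extensions);
}
```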

Hyeonu · February 8, 2021, 8:36am

Static strings are good for non-configurable values. But when I see a struct named Config, I would expect I can fill it with some runtime values parsed from the cli params or the config file.

Michael-F-Bryan · February 8, 2021, 9:04am

I would say it's more that by using a String you allow strings that are created at runtime as well as string literals.

You may also want to store this config in a file, meaning the strings won't be available until runtime. That means you'd need to either give the Config a lifetime (e.g. exe_name: &'src str) or use heap allocated Strings.

2e71828 · February 8, 2021, 9:11am

Also note that, in this situation, the lifetime option will (mostly) prevent the Config<'src> from leaving the function that loads the config file: The file contents need to remain on the stack somewhere so that Config's references remain valid.

pfmoore · February 8, 2021, 10:16am

Thanks for the clarification. As I said originally, this isn't a goal right now, although it may be in future. By calling the struct "Config" maybe I give the impression it's more likely than I expect - a better term may be "magic constants". If it weren't for the fact that the GUI and console versions had different values, I'd have just hard coded them (or used consts).

I think what I'm struggling with here is not so much "performance" as the fact that it doesn't seem possible to use hard-coded static strings as seamlessly as dynamic Strings when all you want is read-only behaviour. Maybe that's just my C background showing through, though. I'll see how things go and think about this a bit more later.

Michael-F-Bryan · February 8, 2021, 10:34am

I think the part that's causing a bit of friction is that C would normally track whether a string is dynamic or not by either runtime state (e.g. a boolean flag) or just "knowing" that the string came from a string literal. On the other hand, Rust prefers to encode a lot of that stuff into the type system so a dynamically created string (String) has a different type to a static string (&'static str) and you can't assign one to the other without some sort of explicit conversion.

You could also use a std::borrow::Cow<'a, str> to hold either a Cow::Borrowed(&'static str) string literal or a Cow::Owned(String) dynamic string. That's about as close as you'll get to the C version while also making sure memory is freed properly.
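A minimal sketch of the Cow approach, assuming a cut-down Config with two fields. The function names console_defaults and with_exe are made up for the example.

```rust
use std::borrow::Cow;

pub struct Config<'a> {
    pub exe_name: Cow<'a, str>,
    pub extensions: Vec<Cow<'a, str>>,
}

// Built entirely from literals: no string data is copied.
pub fn console_defaults() -> Config<'static> {
    Config {
        exe_name: Cow::Borrowed("python.exe"),
        extensions: vec![Cow::Borrowed("py")],
    }
}

// One field replaced by a runtime String; the rest stay borrowed.
pub fn with_exe(name: String) -> Config<'static> {
    Config {
        exe_name: Cow::Owned(name),
        ..console_defaults()
    }
}

fn main() {
    let a = console_defaults();
    let b = with_exe(String::from("pythonw.exe"));
    // Cow<str> derefs to &str, so reading code does not care which variant it is.
    println!("{} / {}", a.exe_name, b.exe_name);
}
```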

pfmoore · February 8, 2021, 10:52am

Cool, that sounds like way more complexity than I want or need for an app this simple, so I'll stop obsessing about this now.

The key things for me are (1) it's part of how the type system ensures things get allocated and freed correctly, so it's a good thing that it's strict (losing track of what's static and what's allocated on the heap is a nasty problem in C, so not having that issue is worth the cost) and (2) if it ever really matters, there are ways to handle it safely.

By the way, when initialising a String field, am I right that I need to use String::from("whatever")? I'm mildly sad that it looks clumsy compared to a straight "whatever", but I'm guessing that being explicit about the allocation/copy is the point here, is that correct?

Michael-F-Bryan · February 8, 2021, 11:01am

There are a couple mechanisms which will do a &str -> String conversion.

The general From and Into conversion traits (i.e. String::from("whatever") or "whatever".into())
The ToString trait which explicitly converts something to a string ("whatever".to_string()) and is implemented for &str and any type implementing Display as a shorthand for format!("{}", thing)
The ToOwned trait for converting from a borrowed type to its owned equivalent ("whatever".to_owned())

I'll normally use either "whatever".into() (less typing) or String::from("whatever") (more explicit about the destination type), but all mechanisms call into the same code (String::from()) so have identical performance characteristics.
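For reference, the four spellings side by side. The four_ways function is just a throwaway wrapper for the example.

```rust
// Returns the same &str converted with each of the mechanisms listed above.
pub fn four_ways(s: &str) -> [String; 4] {
    [
        String::from(s), // From
        s.into(),        // Into (target type inferred from the array type)
        s.to_string(),   // ToString
        s.to_owned(),    // ToOwned
    ]
}

fn main() {
    let all = four_ways("whatever");
    // All four produce equal Strings.
    assert!(all.iter().all(|x| x == "whatever"));

    // ToString also works for any Display type, not just &str.
    let n = 42.to_string();
    println!("{:?} {}", all, n);
}
```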

pfmoore · February 8, 2021, 7:12pm

Thanks to everyone for the helpful and interesting responses. My code definitely looks a lot nicer now, and I feel like I understand the logic behind the changes, and have learned a bunch of stuff.

I'm still a little bit sad about the Vec<String> initializers - vec![".venv/Scripts".into(), "python".into(), "embedded".into()] feels a bit verbose and obscures the important parts (the strings) a bit. But I do understand why it's like it is.

I saw on StackOverflow a suggestion of using a dedicated macro

macro_rules! vec_of_strings {
    ($($x:expr),*) => (vec![$($x.to_string()),*]);
}

Is that a reasonable thing to do, or would it be over-complicating things? My instinct is that if it was in the standard library, like vec!, I'd use it, but having to include the code inline tips it over the line into "too clever". But I've not really looked at macros, so I don't have a feel for good practices there.
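For comparison, this is roughly what the call site would look like with that macro. The `$(,)?` part, which allows a trailing comma, is a small addition of mine and not in the StackOverflow version; env_locs is a hypothetical function name.

```rust
macro_rules! vec_of_strings {
    // `$(,)?` additionally accepts a trailing comma (an addition for this sketch).
    ($($x:expr),* $(,)?) => (vec![$($x.to_string()),*]);
}

pub fn env_locs() -> Vec<String> {
    vec_of_strings![".venv/Scripts", "python", "embedded",]
}

fn main() {
    println!("{:?}", env_locs());
}
```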

One other thing I might look at later, is whether I could make a generic version of Config that handles String and &str cleanly. That would be a nice practical exercise for learning a bit more about generics.
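A first rough idea of what that might look like, assuming a cut-down Config that is generic over any type that can be viewed as a str. The has_extension method is a hypothetical example, not existing pylaunch code.

```rust
pub struct Config<S> {
    pub exe_name: S,
    pub extensions: Vec<S>,
}

impl<S: AsRef<str>> Config<S> {
    // Works the same whether S is &str or String.
    pub fn has_extension(&self, ext: &str) -> bool {
        self.extensions.iter().any(|e| e.as_ref() == ext)
    }
}

fn main() {
    // Static values: S = &'static str, no allocation for the strings.
    let fixed = Config {
        exe_name: "python.exe",
        extensions: vec!["py", "pyw"],
    };
    // Runtime values: S = String.
    let loaded = Config {
        exe_name: String::from("python.exe"),
        extensions: vec![String::from("py")],
    };
    println!("{} {}", fixed.has_extension("pyw"), loaded.has_extension("pyw"));
}
```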

Thanks again everyone for the help

2e71828 · February 8, 2021, 7:25pm

Instead of a macro, I'd use an extension trait:

trait RefToVec<T> {
    fn as_vec(&self)->Vec<T>;
}

impl RefToVec<String> for [&'_ str] {
    fn as_vec(&self)->Vec<String> {
        self.iter().copied().map(Into::into).collect()
    }
}

fn main() {
    let v: Vec<String> = ["hello", "world", "goodbye"].as_vec();
    dbg!(v);
}

pfmoore · February 8, 2021, 8:37pm

Ooh, that looks cool! I'll do some research into this - I can read the code and see what it does, but I'd like to make sure I understand the mechanism behind it. Thanks for this.

So many neat features to explore

pfmoore · February 19, 2021, 10:25am

I've now reached the point where I want to read (at least some of) the config values from a file. So I added a bit of code using serde, and I was really impressed - it's so easy to do, and it basically worked first time (unlike most of my code).

But one thing I want to do is allow the user to specify some of the values with defaults for the others. That's fine, serde has #[serde(default="function_name")] which does that. But I want the defaults to depend on whether it's the "gui" or the "console" version of the program.

So what I have is Config defined in lib.rs with a #[serde(default="default_exe")] annotation on the exe_name attribute, and fn default_exe() -> String { "python.exe".into() } in my pylaunch.rs main program.

But that doesn't work - the compile fails with

error[E0425]: cannot find function `default_exe` in this scope
  --> src\lib.rs:11:21
   |
11 |     #[serde(default="default_exe")]
   |                     ^^^^^^^^^^^^^ not found in this scope

That makes sense, insofar as lib.rs and pylaunch.rs are different files, and hence I guess different scopes. (I've read "Managing Growing Projects with Packages, Crates, and Modules" in the Rust book, but I must admit it confused me a bit and my experiments didn't match my expectations, so I suspect I'm misunderstanding something.)

So how would I do what I want? Define the Config struct in the common library code, but have the function that provides the default be specific to the executable? Any suggestions or explanations would be appreciated

alice · February 19, 2021, 12:16pm

You can specify the full path to the function, e.g.

#[serde(default = "crate::pylaunch::default_exe")]

pfmoore · February 19, 2021, 1:26pm

Thanks. But when I did that I got

error[E0433]: failed to resolve: maybe a missing crate `pylaunch`?
  --> src\lib.rs:11:21
   |
11 |     #[serde(default="crate::pylaunch::default_exe")]
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ maybe a missing crate `pylaunch`?

Even making the default_exe function in pylaunch.rs "pub" didn't help...

And I'm not entirely sure why I'd need to. My crate is called pylaunch in Cargo.toml, and both lib.rs and pylaunch.rs should be part of that crate, so I thought they would be able to use internal names without qualification?

I've "fixed" this sort of issue before by randomly adding "use" statements and "pub" declarations until stuff works, but I don't feel like I really understand what's going on - and it's at times like these that it shows
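A note on the likely cause, assuming pylaunch.rs is a binary target next to lib.rs: in a Cargo package, the library and each binary are compiled as separate crates. The binary depends on the library (which is why it needs use pylaunch::{...}), but the library cannot see anything defined in the binary, and crate:: inside lib.rs refers to the library itself. One way around it is to have the binary pass its defaults into the library. The sketch below shows that idea without serde; Defaults, FileValues and resolve are all invented names, FileValues stands in for whatever is parsed from the file, and "pythonw.exe" is a placeholder for the GUI default.

```rust
// --- would live in src/lib.rs (the library crate) ---

/// Per-binary defaults, passed in by whichever executable is running.
pub struct Defaults {
    pub exe_name: &'static str,
}

/// Stand-in for the values parsed from the file; `None` means "not set".
pub struct FileValues {
    pub exe_name: Option<String>,
}

pub struct Config {
    pub exe_name: String,
}

impl Config {
    /// Fill in anything the file left unset from the caller's defaults.
    pub fn resolve(file: FileValues, defaults: &Defaults) -> Config {
        Config {
            exe_name: file
                .exe_name
                .unwrap_or_else(|| defaults.exe_name.to_string()),
        }
    }
}

// --- each binary would define its own constant ---
pub const CONSOLE_DEFAULTS: Defaults = Defaults { exe_name: "python.exe" };
pub const GUI_DEFAULTS: Defaults = Defaults { exe_name: "pythonw.exe" };

fn main() {
    let from_file = FileValues { exe_name: None };
    let cfg = Config::resolve(from_file, &CONSOLE_DEFAULTS);
    println!("{}", cfg.exe_name);
}
```

With serde, a struct of Option fields should be able to play the FileValues role, since a missing Option field deserializes to None without needing a default function.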
