Storing Strings In A Crate Library?


#1

I'm creating a simple passphrase generator and would like to make it available to other programs via crates.io. Passphrases are made up of a series of words, and currently I'm storing those words in vectors. That worked in the context of a Rust Playground demo, but Cargo won't let me build a lib.rs file that has let in it. Not sure how to fix that…

Ideally the crate stores the bank of words (ideally a very large one) so that you can use it to generate passphrases in an application without having to copy-paste a huge vector. It would also be cool if users could substitute their own word banks, either for other languages or to add more entropy to the passphrase by using very obscure words.

Would a hashmap or other data structure work for this? Thanks in advance :slight_smile:


#2

Could you share the working playground example? From your description it looks like you were trying to use let outside any function, which is not allowed, since static objects are handled differently.


#3

Yeah! Just click run: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=9803042f76739a9aa570e14ddb3a38ae


#4

You could always try something like this:
pub static PHRASES: &'static [&'static str] = &["alpha", "beta", "gamma"];
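
To tie this back to the original question, here is a rough sketch of how a crate might ship a built-in bank while still accepting a caller's own. The names DEFAULT_WORDS and passphrase_from_indices are made up for illustration, and the caller passes the indices in so the sketch stays std-only. A real generator would draw them from a cryptographically secure random source.

```rust
/// Hypothetical built-in word bank, embedded in the compiled binary.
pub static DEFAULT_WORDS: &[&str] = &["alpha", "beta", "gamma"];

/// Joins the words selected by `indices` with `sep`.
/// Returns `None` if any index is out of range for `bank`.
/// (Hypothetical helper; index selection is left to the caller.)
pub fn passphrase_from_indices(bank: &[&str], indices: &[usize], sep: &str) -> Option<String> {
    // Collecting into Option<Vec<_>> stops at the first out-of-range index.
    let words: Option<Vec<&str>> = indices.iter().map(|&i| bank.get(i).copied()).collect();
    words.map(|w| w.join(sep))
}

fn main() {
    // Using the built-in bank.
    println!("{:?}", passphrase_from_indices(DEFAULT_WORDS, &[2, 0, 1], "-"));

    // Using a caller-supplied bank, e.g. for another language.
    let custom = ["uno", "dos", "tres"];
    println!("{:?}", passphrase_from_indices(&custom, &[0, 2], " "));
}
```

Because the function takes a plain &[&str], it works with the static above, with a local array, or with a slice borrowed from a Vec.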


#5

Wow that worked! Thank you :slight_smile:

Do you by any chance know where I could learn more about why that worked?


#6

static is the syntax for introducing a global variable, usually immutable, which can be accessed from anywhere. let only ever introduces local variables, which is why it only works inside functions. statics are initialized at compile time (unless you use the lazy_static crate), and an ordinary static can only be accessed by immutable reference. A static mut can be modified, but only in unsafe code.

See https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#accessing-or-modifying-a-mutable-static-variable for a tiny bit more information? I swear there used to be more information about static and const in the book…
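
A small sketch of the difference, assuming a made-up static named WORDS and a made-up function first_word: the static sits at module level and every function can read it, while let only compiles inside a function body.

```rust
// Module level: `static` (and `const`) items are allowed here.
pub static WORDS: &[&str] = &["alpha", "beta", "gamma"];

// The next line would not compile if uncommented, because `let` is a
// statement and statements are only valid inside a function body:
// let words = vec!["alpha", "beta", "gamma"];

/// Any function can read the static through a shared reference.
pub fn first_word() -> &'static str {
    WORDS[0]
}

fn main() {
    // `let` introduces a local variable that lives only inside `main`.
    let local = vec!["delta", "epsilon"];
    println!("{} {}", first_word(), local.len());
}
```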


#7

https://doc.rust-lang.org/reference/items/static-items.html


#8

Constants are here: https://doc.rust-lang.org/book/ch03-01-variables-and-mutability.html#differences-between-variables-and-constants :slight_smile:


#9

Thanks for all the help everyone :slight_smile:

@camden-smallwood I still don’t get why we declare &'static twice:

pub static PHRASES: &'static [&'static str] = &["alpha", "beta", "gamma"];

All the examples of Rust code I’ve come across before declare the type once, as shown on the Static items page:

static mut? IDENTIFIER : Type = Expression ;

#10

Ah, it’s because of the following:

pub static PHRASES

is the first part of the declaration, meaning that the variable itself is static, and therefore global, then

PHRASES: &'static

makes the data it refers to 'static as well, meaning that the slice itself will be embedded into the binary. The elements of the slice are string literals like "abc", which are also embedded into the binary, so each one is a &str, or more accurately a &'static str. The result is a 'static reference to a slice of &'static str values.

To directly address your question:

static mut? IDENTIFIER : Type = Expression ;

matches your declaration; it’s just that Type is &'static [&'static str] and Expression is &["alpha", "beta", "gamma"]


#11

Either pub static or pub const would work for this declaration, because I don’t think the reference itself will be changing to another slice… Anywho, let’s take a look at the type declaration: &'static [&'static str]

Basically read left-to-right: it’s a reference with a 'static lifetime to a slice of references to str, which also have a 'static lifetime.

The first 'static lifetime means the data behind the slice reference lives throughout the whole lifetime of the program. The second 'static lifetime means the same for the data behind each str reference in the slice.
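
Here is a quick sketch of what each lifetime buys you in practice. The helper names all and word are invented for this example. The outer 'static lets a function hand out the whole slice, and the inner one lets a single word be moved into a thread, which requires 'static data.

```rust
use std::thread;

pub static PHRASES: &'static [&'static str] = &["alpha", "beta", "gamma"];

/// Outer `'static`: the slice itself is valid for the whole program.
fn all() -> &'static [&'static str] {
    PHRASES
}

/// Inner `'static`: a single word is valid for the whole program too,
/// independent of any borrow of the slice.
fn word(i: usize) -> Option<&'static str> {
    PHRASES.get(i).copied()
}

fn main() {
    let w = word(1).expect("index 1 exists in the three-word bank");

    // `thread::spawn` only accepts closures that capture `'static` data,
    // so this compiles because `w` is a `&'static str`.
    let handle = thread::spawn(move || w.len());
    let len = handle.join().unwrap();

    println!("{} words, picked {:?} with length {}", all().len(), w, len);
}
```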


#12

In this particular case, it still matches. PHRASES is the identifier, &'static [&'static str] is the type, and &["alpha", "beta", "gamma"] is the expression.

As for why you need &'static [&'static str], not just [&'static str] - it’s because [T] is not a sized type. It’s an array of some length, but that length is not part of the type, so it is necessarily stored in the pointer to the array. [T] can never live on the stack, and it can never be the direct value of a variable (or static) because it is “unsized” and has no fixed size.

You can do a fixed size static array like this, without the indirection:

pub static PHRASES: [&'static str; 3] = ["alpha", "beta", "gamma"];

This is just not as generally useful since then the size is part of the type, changing that size is technically a breaking change, and you have to keep updating the 3 to be whatever the size actually is.

So, instead, we use &'static [&'static str]. This is a pointer to an array - Rust automatically puts the ["alpha", "beta", "gamma"] in one section of static memory, then the & in &["alpha", ...] grabs a pointer to that memory, grabs the length (3) to make it a fat pointer, and sticks that fat pointer (containing the pointer and length) in your static variable PHRASES.


For the &'static str inside the array - it’s a similar situation. str is unsized, but unlike arrays there are no sized variants. Rust automatically puts string literals into program memory and they just come out as pointers to that memory - static references with a length, referencing some string in memory. That &'static is in your type signature because each string is that type, &'static str.

Hope that makes sense and gives the rationale for the double &'static?
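
If it helps, the sized versus unsized distinction can be poked at with std::mem::size_of. This is only a sketch: AS_SLICE, AS_ARRAY and count are names I made up, and the "twice a usize" size of a slice reference is how current compilers lay it out rather than something to rely on.

```rust
use std::mem::size_of;

pub static AS_SLICE: &[&str] = &["alpha", "beta", "gamma"];
pub static AS_ARRAY: [&str; 3] = ["alpha", "beta", "gamma"];

/// Accepts either form, because `&[&str; 3]` coerces to `&[&str]`.
fn count(words: &[&str]) -> usize {
    words.len()
}

fn main() {
    // A slice reference is a fat pointer: an address plus a length.
    // (Currently two machine words; treat that as an observation.)
    println!("&[&str] is {} bytes", size_of::<&[&str]>());
    println!("usize   is {} bytes", size_of::<usize>());

    // The fixed-size array stores its three `&str` values inline,
    // so its size is known at compile time.
    println!("[&str; 3] is {} bytes", size_of::<[&str; 3]>());

    // Both statics can be passed to the same function.
    println!("{} {}", count(AS_SLICE), count(&AS_ARRAY));
}
```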


#13

@daboross Thanks! I’m including the vector of strings in an API so that someone can generate passphrases from those strings without having to load or manage a library of words. With that in mind, I think it would make more sense to use a fixed-size static array, since that vector should never change. This would make the program run faster because it would be accessing the stack rather than a pointer to the heap, right?


#14

You don’t need to say 'static when defining references in statics, so either of these is fine:

pub static PHRASES: &[&str] = &["alpha", "beta", "gamma"];
// or 
pub static PHRASES: [&str; 3] = ["alpha", "beta", "gamma"];

#15

@KrishnaSannasi Thanks!


This does not define a fixed length for the array, so does it create a pointer to the heap?

pub static PHRASES: &[&str] = &["alpha", "beta", "gamma"];

And does this allocate a fixed array on the stack?

pub static PHRASES: [&str; 3] = ["alpha", "beta", "gamma"];

#16

Neither one allocates anywhere; both are stored in the program binary. This is how all statics are handled. The pointers are pointers into the binary.


I’m not sure about this, but I think this is how it is


#17

Your static variable will still be in static memory, neither on the stack nor on the heap - but it’s true that it will be a direct variable in static memory rather than a pointer to another place in static memory. I don’t think the extra indirection will matter too much for performance, though - the compiler should be able to optimize it out.

This does not define a fixed length for the array, so does it create a pointer to the heap?

pub static PHRASES: &[&str] = &["alpha", "beta", "gamma"];

PHRASES here is in static memory - a place loaded along with program code. In particular, it is a pointer to another place in static memory which stores the array.

And does this allocate a fixed array on the stack?

pub static PHRASES: [&str; 3] = ["alpha", "beta", "gamma"];

PHRASES here is also in static memory, but unlike the example above, the array is stored there directly: it’s just a block of memory holding three &str values (each one a pointer plus a length).

As mentioned above, I don’t think one or the other will be better for performance. If you access it in a very tight loop the extra indirection might hurt, but even then I wouldn’t consider it important unless profiling found that to be the case.

And besides the above, if fast access matters more than other things, there’s also const:

pub const PHRASES_CONST: [&str; 3] = ["alpha", "beta", "gamma"];

This will insert PHRASES_CONST onto the stack wherever it’s used - the array is not stored separately in static memory (well, the strings in it are, but not the array itself). It acts as though ["alpha", "beta", "gamma"] were written instead of PHRASES_CONST wherever it’s used. I wouldn’t recommend this method because it means your array will be duplicated every time a different function uses it, and it will be created on the stack every time the function is called, but it’s another alternative if profiling does show you this is a problem.

I’ve also brought it up because I think you might have been thinking static acts like const does?

Hope the distinction between those and further explanation helps.
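
To make the static versus const difference concrete, here is a small sketch. The helper static_ref is invented for this example. A static is one object with one address no matter where you refer to it, while each use of a const behaves like writing the array literal out again.

```rust
pub static PHRASES: [&str; 3] = ["alpha", "beta", "gamma"];
pub const PHRASES_CONST: [&str; 3] = ["alpha", "beta", "gamma"];

/// Returns a reference to the one and only `PHRASES` object.
fn static_ref() -> &'static [&'static str; 3] {
    &PHRASES
}

fn main() {
    // Every mention of a static refers to the same memory location.
    assert!(std::ptr::eq(static_ref(), &PHRASES));

    // Using a const is like pasting the literal in place, so this `let`
    // creates a fresh local array that can be changed freely.
    let mut copy = PHRASES_CONST;
    copy[0] = "omega";

    // The constant itself is unaffected by changes to the copy.
    println!("{} {}", copy[0], PHRASES_CONST[0]);
}
```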