Hello! It's always exciting to see a new person interested in Rust.
I'm having a hard time figuring out how the code example you gave would fit into a bigger program. The purpose of that for loop is particularly confusing! I wouldn't recommend this kind of problem to a beginner who is just trying to learn the language.
However, I think I understand what it is supposed to do. I don't think it would help you if I just rewrote it in Rust for you, so I will walk you through it, instead!
First of all, we have to create a new struct. The struct you gave doesn't match the struct that the code appears to be expecting, so we will design a struct that will work for this code, and then try to translate the code into something that will work in Rust.
We should probably name it PackChar just to stay consistent with the C++ version. That kind of struct name is perfectly allowed in Rust, and is in fact the usual style for type names (each word is capitalized; this is often called "upper camel case" or "Pascal case"), so we will stick with it. Now, for the fields.
Rust does not have a type actually named unsigned int, but it does have something very close! Since unsigned int is an unsigned, 32-bit integer, at least on all modern machines, u32 is the equivalent in Rust. The struct that is actually supposed to work with the GetPacksChar function probably takes an int instead of an unsigned int, because that is the type of i when we get to this line:
CurPackPnt->IntVal=i;
However, since we never use negative numbers here, unsigned works just fine.
While Rust technically does have an equivalent of the char* type (raw pointers, which exist to support interaction with C code), we will use something a little better for our Rust code, because it handles all the nasty pointer-stuff for us: String. This is similar to std::string in C++. The way we use it will be quite a bit different from the way we would use char*, but it will be so much better.
So our struct definition should look like this:
struct PackChar {
    int_val: u32,
    buffer: String,
}
I used the field names from the function code because they looked cleaner. Notice that I changed IntVal to int_val. This form is the preferred style in Rust; it is called "snake case".
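In case it helps to see the struct in use, here is a small sketch of creating a value and reading its fields. The make_example helper and the values in it are made up for illustration; they are not part of your original code.

```rust
// Same fields as the struct above. Debug is derived only so the value can be printed.
#[derive(Debug)]
struct PackChar {
    int_val: u32,
    buffer: String,
}

// Hypothetical helper (not in the original code): builds one PackChar by hand.
fn make_example() -> PackChar {
    PackChar {
        int_val: 7,
        buffer: String::from("abcdefgHi7"),
    }
}
```

Fields are read with a dot, just like in C++: make_example().int_val gives back the 7 we stored.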
The next thing we need to do is come up with a new signature for our GetPacksChar function, because Rust won't accept void GetPacksChar(int size, PackChar** DpArrPnt). So let's take a second and think about what this function is supposed to do.
It takes an argument, int size, that appears to be the number of PackChar values to create, since we use it later to allocate in CoTaskMemAlloc(size * sizeof( PackChar )); and then as the stopping point for our for loop. That will work for us. However, since Rust doesn't have a type named int, we have to pick an alternative again. Let's just go with u32 to be consistent.
The PackChar** DpArrPnt argument is very interesting. It looks like we're taking a pointer to a pointer, probably because the caller wants our function to update that pointer so it can use it after our function returns. This pointer is probably meant to point to several PackChar values allocated in a row; this is reinforced by us allocating size * sizeof( PackChar ) and assigning the result to this out-pointer. Calling this function probably looks something like this:
PackChar* MyPacksChar;
GetPacksChar(10, &MyPacksChar);
However, in Rust, taking a pointer to return a value in is an anti-pattern. This means it's the opposite of what we actually should be doing, because the language gives us better tools to solve this problem. Additionally, we really do not want to deal with raw pointers like this, because it gives us a lot of chances to screw up.
Instead of taking an out-pointer like this and returning void, we will return an instance of a type that handles a "row of the same type" situation like this for us: Vec. The C++ equivalent would be std::vector.
So our final function signature should look something like this:
fn get_packs_char(size: u32) -> Vec<PackChar>
Notice that I changed the function name to snake case. This is the preferred style in Rust here, too.
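To show what "return a Vec instead of filling an out-pointer" looks like from the caller's side, here is a rough sketch. Both first_squares and caller are stand-in names I invented; the real function body comes later in this post.

```rust
// Hypothetical stand-in for a function that produces several values at once.
fn first_squares(count: u32) -> Vec<u32> {
    (0..count).map(|n| n * n).collect()
}

// Hypothetical caller: no pointer is passed in, the Vec is simply returned
// and the caller owns it from then on.
fn caller() -> usize {
    let squares = first_squares(10);
    squares.len()
}
```

Compare this with the C++ call: there is nothing to declare up front and nothing to pass by address.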
Now, we look at the guts of the function. That first line looks to be pointless (count is never used), so let's just ignore it.
TmpStrSize and its interaction with dummyStringDataObject is a little scary. If someone accidentally made TmpStrSize equal to 11, then this line:
dummyStringDataObject[TmpStrSize-1] = '0' + i % (126 - '0');
would write past the bounds of dummyStringDataObject and change something totally unrelated! This is very, very dangerous! In fact, the Heartbleed bug was caused by a closely related mistake: the code trusted a length supplied by the user and read that many bytes back, so an attacker could make the OpenSSL library read out memory located after the actual data, just by giving a number much larger than its real size!
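For comparison, here is a small sketch of what safe Rust does when an index is out of range. The byte_at helper is a name I made up for this example.

```rust
// Hypothetical helper: returns the byte at `index`, or None if the index
// is past the end of the text. No unrelated memory is ever touched.
fn byte_at(text: &str, index: usize) -> Option<u8> {
    text.as_bytes().get(index).copied()
}
```

With a 10-byte string, byte_at(text, 9) yields the last byte and byte_at(text, 10) yields None. Plain indexing such as text.as_bytes()[10] would stop the program with a panic instead of silently reading or writing past the end.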
However, because Rust is super awesome, we don't have to do bad stuff like this just to make our programs work. In fact, we won't be doing anything like this at all. Instead, we'll use the format!() macro, which works a lot like C's sprintf() function, to create the String right when we need it, which will also eliminate our strdup() call.
So, without further ado, let's look at our updated function and then go down line-by-line:
fn get_packs_char(size: u32) -> Vec<PackChar> {
    use std::char;
    let mut out_vec = Vec::new();
    for i in 0 .. size {
        let int_0 = '0' as u32;
        let last_char_val = int_0 + i % (126 - int_0);
        let last_char = char::from_u32(last_char_val).unwrap();
        let buffer = format!("abcdefgHi{}", last_char);
        let pack_char = PackChar {
            int_val: i,
            buffer: buffer,
        };
        out_vec.push(pack_char);
    }
    out_vec
}
That's pretty concise, if I do say so myself! It's a few lines longer than the original version but I wanted to make it pretty easy to read. Let's go down line-by-line, shall we?
We can skip the function signature since we already talked about it earlier.
use std::char;
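We need this import to get the char::from_u32() function. (On newer Rust versions the same function can also be reached directly through the char type, so the import is harmless but no longer strictly required.) Imports are usually done at the top of the file, but putting it here is fine for our purposes.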
let mut out_vec = Vec::new();
This line is pretty simple. It just says, "Create a mutable variable called out_vec, and initialize it to a new (empty) Vec." We don't pre-allocate like we did in the C++ version, because Rust won't let us access uninitialized memory in safe code. (Vec does have the with_capacity() function, which does pre-allocate, but it still won't let you access that memory, so it doesn't mean much to us. It's just there for optimization, mostly.) The Vec will adjust its allocation as we add elements to it, all behind the scenes so we don't have to deal with it.
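Here is a quick sketch of the with_capacity() point, in case you're curious. The collect_evens function is invented for this example; the asserts inside it just spell out what the Vec looks like right after pre-allocating.

```rust
// Hypothetical example: pre-allocates room for `count` numbers, then pushes them.
fn collect_evens(count: usize) -> Vec<u32> {
    let mut evens = Vec::with_capacity(count);
    // Space is reserved, but the Vec is still empty: there is nothing to read yet.
    assert!(evens.is_empty());
    assert!(evens.capacity() >= count);
    for n in 0..count as u32 {
        evens.push(n * 2);
    }
    evens
}
```

Swapping Vec::with_capacity(count) for Vec::new() gives the same result; only the number of reallocations behind the scenes may differ.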
for i in 0 .. size {
This is Rust's take on a for loop. Since Rust is based heavily around iterators, the for loop takes them directly: 0 .. size creates an iterator that will yield integers in order from 0 up to, but not including, size. This is roughly equivalent to for(int i = 0; i < size; i++), but it's much more powerful because it lets you loop through collections with the same construct.
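A small sketch of both uses of for, with function names I made up: one loop counts through a range like the C++ loop does, and the other walks a collection with no index at all.

```rust
// Sums 0, 1, ..., limit - 1 using a range, like the C-style counting loop.
fn sum_below(limit: u32) -> u32 {
    let mut total = 0;
    for i in 0..limit {
        total += i;
    }
    total
}

// The same `for` construct walks a collection directly, no index needed.
fn total_length(words: &[&str]) -> usize {
    let mut total = 0;
    for word in words {
        total += word.len();
    }
    total
}
```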
let int_0 = '0' as u32;
let last_char_val = int_0 + i % (126 - int_0);
This is the same arithmetic we see in the first line of the for loop in the C++ version. I just translated it to Rust and cleaned it up a bit. Since char does not directly support arithmetic, we have to convert '0' to a u32 first, and I figured it would be cleaner if we used an immutable variable instead of doing the conversion twice.
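If you want to poke at that arithmetic on its own, here is a sketch that pulls it out into a function. The name last_char_val is borrowed from the variable above; wrapping it in a function is my own addition.

```rust
// Mirrors the two lines above for a given loop counter `i`.
fn last_char_val(i: u32) -> u32 {
    let int_0 = '0' as u32; // '0' is code 48
    int_0 + i % (126 - int_0) // 126 - 48 = 78, so the remainder is 0..=77
}
```

Since the remainder is always between 0 and 77, the result always lands between 48 and 125, and it wraps back to 48 when i reaches 78.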
let last_char = char::from_u32(last_char_val).unwrap();
This call converts last_char_val, which is a u32, to a char. However, not every u32 can be safely converted to a char (because of Unicode screwiness: surrogate values and anything above 0x10FFFF are not valid characters), so this function does a checked conversion, returning Option<char>, which in Rust means a value that may or may not be present (kind of like pointers in C++ that can be NULL, but much safer to work with and not necessarily involving pointers). The .unwrap() call just says, "Convert this Option<char> to a char. If it's not available, then just quit; I don't want to deal with that case."
Since our arithmetic guarantees that last_char_val will be in a valid range for converting to char, this error should never happen. Rust just forces us to acknowledge the possibility. This might seem annoying in this case, but it's very beneficial overall, because programmers are generally horrible at remembering to handle edge cases.
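If you would rather handle the missing case than quit, a match on the Option is one way to do it. This is only a sketch: the function name and the '?' fallback are my own choices, and it calls char::from_u32 through the char type, which assumes a reasonably recent Rust version.

```rust
// Hypothetical helper: converts a code point to a char, substituting '?'
// when the value is not a valid char, instead of calling unwrap().
fn to_char_or_question_mark(value: u32) -> char {
    match char::from_u32(value) {
        Some(c) => c,
        None => '?',
    }
}
```

For example, 65 converts to 'A', while 0xD800 (a surrogate value) falls into the None branch.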
let buffer = format!("abcdefgHi{}", last_char);
This line combines the first line of the for loop in the C++ version with the strdup() call. The format!() is a macro invocation that basically says, "Take this string, "abcdefgHi{}", and replace the {} in it with the value of last_char. Then, give me a new String as a result." This is much safer than manually replacing the last character in the string and then copying it. Since we never deal with indices, we can't accidentally go out of bounds.
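Here is that one line on its own as a sketch, wrapped in a function I named make_buffer so you can try it with different characters.

```rust
// Builds the buffer text for one entry; `last_char` fills the `{}` placeholder.
fn make_buffer(last_char: char) -> String {
    format!("abcdefgHi{}", last_char)
}
```

Every call produces a brand-new String, so there is no shared scratch buffer to overwrite by accident.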
let pack_char = PackChar {
    int_val: i,
    buffer: buffer,
};
out_vec.push(pack_char);
Since Vec won't allow us to access uninitialized memory like we could with PackChar** in C++, we have to create the struct on the stack first before adding it to the vector. out_vec.push() simply adds pack_char as the last element of the vector. Again, this is much safer than in C++ because we're not using indices directly and potentially accessing uninitialized memory.
out_vec
This is an implicit return. Since almost everything is an expression in Rust, the value of a function is simply its final expression, written without a trailing semicolon. In Java or C++ we would have to write something explicit like
return out_vec;
though you can do this in Rust if you want to return from an earlier point in the function.
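A tiny sketch of both styles side by side, using two functions I made up: one relies on the final expression, the other uses an explicit return to leave early.

```rust
// The final expression (no semicolon) is the function's value.
fn double(n: u32) -> u32 {
    n * 2
}

// An explicit `return` is handy for leaving the function early.
fn first_even(values: &[u32]) -> Option<u32> {
    for &v in values {
        if v % 2 == 0 {
            return Some(v);
        }
    }
    None
}
```

Note that putting a semicolon after n * 2 would turn it into a statement, and the compiler would complain that the function no longer returns a u32.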
And that's it! The best part is, you don't have to do any of the cleanup yourself! The returned Vec<PackChar> will deallocate when the caller stops using it, and it will also make sure every String inside every PackChar is deallocated as well. You don't have to worry about any of that. Rust takes care of it all for you.
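If you'd like to see that cleanup actually happen, here is a sketch that counts it. The Noisy type and count_drops function are invented purely for this demonstration; Drop is Rust's rough equivalent of a C++ destructor.

```rust
use std::cell::Cell;

// Hypothetical type that records, in a shared counter, when it is cleaned up.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Puts `count` values in a Vec, lets the Vec go out of scope,
// and reports how many element cleanups ran.
fn count_drops(count: u32) -> u32 {
    let drops = Cell::new(0);
    {
        let mut items = Vec::new();
        for _ in 0..count {
            items.push(Noisy { drops: &drops });
        }
    } // `items` goes out of scope here; each element is dropped along with it.
    drops.get()
}
```

Nothing in count_drops frees anything by hand, yet every element is cleaned up exactly once when the Vec goes away. The same thing happens to the String inside each PackChar.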
As for tutorials, have you read The Rust Programming Language book yet? It's the official guide to Rust, maintained by the Rust team and community. From what I can tell, it doesn't assume too much about the reader. It especially shouldn't assume any systems programming experience, because a lot of our community has come from higher-level languages like Ruby.
If you do find yourself struggling with a particular section, please let someone know! We need to make sure that our official literature on the language is easy enough to comprehend for most people. You can open an issue on the Rust GitHub if you spot something that you think someone should take a look at.
It sounds like English isn't your first language. That's okay! Rust has a thriving multilingual community. There might be a translation in your native language, if you think that would be easier to understand. If not, perhaps you can help start on one? Sometimes the best way that you can help yourself learn something is to try explaining it to others.
If you have IRC, I do recommend getting on #rust and #rust-beginners on the Moznet IRC. Those are the best places to get immediate responses to your questions, though they're not always active. (Just please don't try pasting code snippets in IRC; it's way too much for the chat.) There might be a Rust channel for speakers of your language as well, or you can ask if you can start one!
Note that to join #rust you will need to register a nickname; we've had some problems with spammers recently and have had to restrict the channels to registered users only. This may have been lifted since I last got on IRC, but if you find yourself unable to join, this is probably the reason why.
If nothing else, you can always come back here! That's what this forum is for.
Was this helpful at all, or was it just far too much to read?