So, I have an error in my code and I have a hard time understanding mutability and borrowing. I'm kinda new to Rust, so excuse my ignorance.
I would like to tokenize a given text with a given charset. I have created a Playground example so that you can test it live.
It seems buffer is borrowed immutably in the for loop. That's oddly counter-intuitive to me, since I have already declared it mut above the for loop. Why does it borrow the mutable variable as immutable?
Also, the compiler complains about a borrow, not a move. I thought curly braces create a new scope, so a move occurs when I don't write &buffer. Why does it complain about a borrow when I do not pass a reference to buffer with an ampersand (aka &buffer)?
I'm also new to Rust (and not a native English speaker), so I will do my best to help here.
The major problem here is that buffer only lives as long as the function. This is the third error:
error[E0515]: cannot return value referencing local variable `buffer`
buffer is owned by the function, and you push a string slice that is a reference into the heap memory allocated by buffer.
When you exit the function, buffer is dropped and its string data is freed.
If the compiler allowed this, we would have a use-after-free if a push is done on line 17.
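To make both complaints concrete, here is a minimal sketch. It is not the original Playground code; the function name borrow_demo and the variable contents are made up for illustration. It shows that a method call like as_str borrows buffer even without an explicit &buffer, and where the compiler would reject a mutation or a return.

```rust
// Hypothetical reduction of the problem, not the original Playground code.
fn borrow_demo() -> usize {
    let mut buffer = String::from("abc");
    let mut tokens: Vec<&str> = Vec::new();

    // `as_str` takes `&self`, so this call borrows `buffer` immutably
    // even though no `&buffer` is written out. `mut` on the binding only
    // permits mutable borrows; it does not make every borrow mutable.
    tokens.push(buffer.as_str());

    // While `tokens` is still in use, mutating `buffer` is rejected:
    // buffer.clear(); // error[E0502] if uncommented

    let count = tokens.len(); // last use of `tokens`, so the borrow ends here

    // Returning `tokens` instead of `count` would be error[E0515]:
    // the slices point into `buffer`, which is dropped when the function exits.
    buffer.clear(); // allowed again, nothing borrows `buffer` anymore
    count
}

fn main() {
    println!("{}", borrow_demo()); // prints 1
}
```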
Thanks for the quick reply. Now that I look at it, I see the problem.
So, according to this assumption, cloning the buffer will get rid of the problem. Let me try that. I will edit this post.
Edit 1: It does not seem to be a simple cloning problem, since the clone itself also lives inside tokenize_dirty and cannot leave it as a reference. Will try something else and provide an edit again.
If you're constructing the strings within your function, you'll need to return Vec<String>, not Vec<&str>. Those strings have to be stored somewhere, and if you don't return them, they'll just get dropped.
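A rough sketch of the Vec<String> variant, under some assumptions: the original tokenize_dirty isn't shown here, so the name tokenize_owned, the parameter name delimiters, and the idea that the charset lists separator characters are all mine. The point is only that each finished buffer is moved into the vector, so nothing refers to a local when the function returns.

```rust
// Hypothetical owned-token version; assumes `delimiters` lists separator chars.
fn tokenize_owned(text: &str, delimiters: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut buffer = String::new();

    for ch in text.chars() {
        if delimiters.contains(ch) {
            if !buffer.is_empty() {
                // `mem::take` moves the String out and leaves an empty one
                // behind, so the vector owns the token and no borrow remains.
                tokens.push(std::mem::take(&mut buffer));
            }
        } else {
            buffer.push(ch);
        }
    }

    // Flush the last token if the text did not end with a delimiter.
    if !buffer.is_empty() {
        tokens.push(buffer);
    }
    tokens
}

fn main() {
    println!("{:?}", tokenize_owned("a,b c", ", ")); // prints ["a", "b", "c"]
}
```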
If you want to keep returning Vec<&str> instead, you'll need to return references into the original text, and avoid allocating any new strings - the function signature for that would look like this:
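The snippet from the original post isn't reproduced here, so this is a sketch of what such a signature could look like, with a minimal body. The name tokenize, the parameter name delimiters, and the assumption that the charset lists separator characters are illustrative. The lifetime 'a ties the returned slices to text and to nothing else.

```rust
// Hypothetical slice-returning version; every token borrows from `text`.
fn tokenize<'a>(text: &'a str, delimiters: &str) -> Vec<&'a str> {
    let mut tokens = Vec::new();
    // Byte index where the current token started, if we are inside one.
    let mut start: Option<usize> = None;

    for (i, ch) in text.char_indices() {
        if delimiters.contains(ch) {
            // A delimiter closes the current token, if any.
            if let Some(s) = start.take() {
                tokens.push(&text[s..i]);
            }
        } else if start.is_none() {
            start = Some(i);
        }
    }

    // Close a token that runs to the end of the text.
    if let Some(s) = start {
        tokens.push(&text[s..]);
    }
    tokens
}

fn main() {
    println!("{:?}", tokenize("hello, world", ", ")); // prints ["hello", "world"]
}
```

No String is allocated here; the only allocation is the vector that holds the slices.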
That's right, but returning Strings allocates more memory than needed.
Maybe String isn't needed here at all; another algorithm that works only with &str can keep the lifetimes consistent.
The returned vector of string slices must have the same lifetime as the string slice text here.
I think the algorithm should be rethought to use only &str.
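One possible way to do that with only &str is to lean on the standard library's str::split. This is a sketch under my own assumptions (the name tokenize_split, the parameter delimiters, and treating the charset as separator characters are not from the original code). The main function also checks the lifetime point: the tokens stay valid after the delimiter string is gone, because they borrow only from text.

```rust
// Hypothetical &str-only version built on `str::split`.
fn tokenize_split<'a>(text: &'a str, delimiters: &str) -> Vec<&'a str> {
    text.split(|c: char| delimiters.contains(c))
        // Adjacent delimiters produce empty pieces; drop them.
        .filter(|piece| !piece.is_empty())
        .collect()
}

fn main() {
    let text = String::from("a-b_c");
    let tokens = {
        let delimiters = String::from("-_");
        tokenize_split(&text, &delimiters)
    }; // `delimiters` is dropped here; `tokens` only borrows from `text`
    println!("{:?}", tokens); // prints ["a", "b", "c"]
}
```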
I was sitting in front of the computer trying to think of a way to allocate a &str, and then I saw that you just use slices of the original text all along. (Damn, why didn't I think of that?)