I would like to replace strings in a template conditionally. However, the following naive approach fails due to lifetime issues. What's the best way to go (without doing an unnecessary extra allocation)?
use regex::{Captures, Regex};
fn main() {
let s1 = "Hello World!";
let s2 = Regex::new("World").unwrap().replace_all(s1, "Universe");
assert_eq!(s2, "Hello Universe!");
let s3 = Regex::new("World").unwrap().replace_all(s1, |caps: &Captures| {
let universal: bool = false; // actually some more complex computation
if universal {
"Universe"
} else {
&caps[0] // don't replace
}
});
assert_eq!(s3, "Hello World!"); // `universal` is false here, so nothing is replaced
}
Compiling playground v0.0.1 (/playground)
error: lifetime may not live long enough
--> src/main.rs:12:13
|
7 | let s3 = Regex::new("World").unwrap().replace_all(s1, |caps: &Captures| {
| - - return type of closure is &'2 str
| |
| let's call the lifetime of this reference `'1`
...
12 | &caps[0] // don't replace
| ^^^^^^^^ returning this value requires that `'1` must outlive `'2`
error: lifetime may not live long enough
--> src/main.rs:12:13
|
7 | let s3 = Regex::new("World").unwrap().replace_all(s1, |caps: &Captures| {
| ---- - return type of closure is &'2 str
| |
| has type `&regex::Captures<'3>`
...
12 | &caps[0] // don't replace
| ^^^^^^^^ returning this value requires that `'3` must outlive `'2`
error: could not compile `playground` (bin "playground") due to 2 previous errors
// required as defined
impl<F, T> Replacer for F
where
F: FnMut(&Captures<'_>) -> T, // T has no lifetime bound
T: AsRef<str>,
// your closure is
FnMut(&'s Captures<'_>) -> &'s str
where T has a lifetime bound with the input
But if T has no lifetime bound, why is it required to live longer than the argument passed to the closure? T doesn't have a 'static bound either, right?
The closure has to work for all input lifetimes, and for all such lifetimes, return the same type T. So it's impossible for T to capture an input lifetime.
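To make that concrete, here is a minimal std-only sketch. `call_twice` and `lengths` are hypothetical names, not part of `regex`; the bound only has the same shape as the `Replacer` impl quoted above. Because `T` is a single type picked by the caller, the generic function is free to hold on to the results after the arguments are gone, which is why `T` cannot borrow from them.

```rust
/// Hypothetical function with a bound shaped like the `Replacer` blanket impl:
/// `F` must accept a reference of any lifetime and always return the same `T`.
pub fn call_twice<F, T>(mut f: F) -> (T, T)
where
    F: FnMut(&String) -> T,
{
    // Each argument is a temporary that is dropped at the end of its statement...
    let first = f(&String::from("a"));
    let second = f(&String::from("bc"));
    // ...yet both results are returned afterwards, so `T` cannot borrow from them.
    (first, second)
}

/// A return type that does not depend on the argument's lifetime is accepted.
pub fn lengths() -> (usize, usize) {
    call_twice(|s: &String| s.len())
}

// By contrast, this is rejected with "lifetime may not live long enough":
// call_twice(|s: &String| s.as_str());
```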
In general, returning a reference derived from an argument is perfectly fine. The issue is that a generic function with a bound like this requires the return type to outlive the argument; however, this is not mentioned anywhere in the error message.
Note: the error differs between closures and fn items, so if you rewrite it as an fn item, you'll see a clearer error message saying where the lifetime requirement is introduced.
Update: the error is the same after fixing the higher-order closure; see the great answer provided by @quinedot below.
I was thinking about the strict outliveness again: the pattern in OP is that for FnMut(&T) -> R, R should strictly outlive &T for any lifetime on &T.
But there is a subtlety: how come a generic function call like ["1", "2"].iter().map(|x| *x).map(|x: &str| x) works? Note that Iterator::map doesn't require the generic return type of F to strictly outlive the argument. The pattern now becomes: for FnMut(T) -> R, strict outliveness doesn't apply. I.e.:
fn use_it<R, F: FnOnce(&Outer) -> R>(_val: F) {}
use_it(|outer| &outer.field); // error: lifetime may not live long enough
fn use_it<T, R, F: FnOnce(T) -> R>(_val: F) {}
use_it(|outer: &Outer| &outer.field); // works
Of course, I could define MyReplace in such a way that it contains the variablesVec. But I wonder if this is really worth it. In practice, I might just do .to_string() and not worry about the extra allocation. But I would like to understand how I generally do a "conditional replacement" idiomatically. I guess the answer is: it depends?
Yes. If that works, then you should do that absolutely. But you established a requirement in your initial post that you not do that. I figured you had done some benchmarking to rule it out as an option.
The key here is that by using a Captures for this, you're already asking the regex engine to do a bunch of extra work---including an allocation for Captures---on your behalf. So cloning the String is probably not going to do much to your runtime.
I would say the idiomatic approach is to just call .to_string() and be done with it.
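As a sketch of that advice that runs without the `regex` crate: `Caps`, `replace_with` and `greet` below are hypothetical stand-ins (for `Captures`, `replace_all` and the original `main`), kept only to reproduce the shape of the bound. Returning an owned `String` makes the closure's return type independent of the argument's lifetime, so the conditional replacement compiles.

```rust
/// Hypothetical stand-in for `regex::Captures`: just the matched text.
pub struct Caps<'h> {
    pub whole: &'h str,
}

/// Hypothetical stand-in for `Regex::replace_all`, matching a literal needle.
/// The bound mirrors the `Replacer` impl: one `T` for every argument lifetime.
pub fn replace_with<F, T>(haystack: &str, needle: &str, mut rep: F) -> String
where
    F: FnMut(&Caps<'_>) -> T,
    T: AsRef<str>,
{
    let mut out = String::with_capacity(haystack.len());
    let mut last = 0;
    for (start, whole) in haystack.match_indices(needle) {
        out.push_str(&haystack[last..start]); // text before the match
        out.push_str(rep(&Caps { whole }).as_ref()); // replacement text
        last = start + whole.len();
    }
    out.push_str(&haystack[last..]); // text after the last match
    out
}

/// Conditional replacement: both branches return an owned `String`.
pub fn greet(universal: bool) -> String {
    replace_with("Hello World!", "World", |caps: &Caps| {
        if universal {
            "Universe".to_string()
        } else {
            caps.whole.to_string() // keep the match, at the cost of one copy
        }
    })
}
```

With the real crate, the same idea would be returning `"Universe".to_string()` and `caps[0].to_string()` from the closure passed to `replace_all`.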
But if you need something to be as fast as possible and don't need capture groups, then I'd suggest just writing your own replacement routine. It's not that much code, and it's not that complex either given the existence of Regex::find_iter.
The Replacer trait tries to give you a way to write some common cases with a couple tricks for optimization (like Replacer::no_expansion). But it is by design not going to work for every use case mostly because I don't know how to design a replacement API that works well for all use cases. (This is a lot trickier than it sounds, because even if you think you know how to do it, how much more complicated have you made it for the simple use cases that cover 99% of what people need? Because if you've made that harder, then that's a design I would reject. And at some point, you have to pop up a level and consider the complication of the API versus the code you're actually saving someone from writing. You could probably write a simple version of replace_all that isn't generic in about 5 minutes.)
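A rough sketch of what such a hand-written routine might look like. To stay runnable without the `regex` crate, it walks literal matches with `str::match_indices`; with a `Regex`, the loop would presumably iterate `Regex::find_iter` and use each match's start, end and text instead. `replace_if` and its `decide` callback are hypothetical names. Because the routine is your own, the callback can simply return `None` to keep a match, and nothing is allocated unless at least one replacement actually happens.

```rust
use std::borrow::Cow;

/// Replace each occurrence of `needle` for which `decide` returns `Some(..)`.
/// Returns `Cow::Borrowed` (no allocation) when nothing was replaced.
pub fn replace_if<'h, F>(haystack: &'h str, needle: &str, mut decide: F) -> Cow<'h, str>
where
    F: FnMut(&'h str) -> Option<&'h str>,
{
    let mut out: Option<String> = None; // allocated lazily, on first replacement
    let mut last = 0; // end of the last replaced match
    for (start, matched) in haystack.match_indices(needle) {
        if let Some(rep) = decide(matched) {
            let buf = out.get_or_insert_with(|| String::with_capacity(haystack.len()));
            buf.push_str(&haystack[last..start]); // includes any skipped matches
            buf.push_str(rep);
            last = start + matched.len();
        }
    }
    match out {
        None => Cow::Borrowed(haystack),
        Some(mut buf) => {
            buf.push_str(&haystack[last..]);
            Cow::Owned(buf)
        }
    }
}
```

Note that the callback's lifetime `'h` is fixed by the caller rather than higher-ranked, which is what sidesteps the problem from the original post.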
And here's an example that R doesn't have to outlive (any possible) &T to satisfy a FnMut(&T) -> R bound.
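One way such an example could look (a sketch; `apply` and `demo` are hypothetical names): the closure returns a borrow of a local `String`, so `R` is not `'static`, while the argument passed in is a `&'static i32`. `R` therefore does not outlive every possible `&T`; it only has to be independent of the argument's lifetime.

```rust
/// Hypothetical helper with the bound under discussion.
pub fn apply<T, R, F: FnMut(&T) -> R>(arg: &T, mut f: F) -> R {
    f(arg)
}

pub fn demo() -> usize {
    static NUM: i32 = 7;
    let local = String::from("borrowed");
    // `R` is a `&str` borrowed from `local`, which is shorter-lived than the
    // `&'static i32` argument, and the bound is still satisfied.
    let r: &str = apply(&NUM, |_n: &i32| local.as_str());
    r.len()
}
```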
It's true that in the latter case, R could be borrowing from T, but I don't know that I'd call it a lending pattern per se, as neither the inputs nor the outputs can differ by lifetime, say. If there are lifetimes involved, they're fixed.
The last question becomes what does FnMut(&T) -> R mean on the input and output?
The one thing I can tell now is that FnMut(&T) -> R doesn't allow R to borrow from &T (R can borrow from T, though).
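A small sketch of the "R can borrow from T" case, using hypothetical names `apply_ref` and `first_word`. Here `T` is itself a reference type, so the closure receives a reference to a reference and can copy the inner one out.

```rust
/// Hypothetical helper with the same `FnMut(&T) -> R` bound.
pub fn apply_ref<T, R, F: FnMut(&T) -> R>(arg: &T, mut f: F) -> R {
    f(arg)
}

pub fn first_word(text: &str) -> &str {
    // `T` is `&str` (with the lifetime of `text`), so the closure gets `&&str`.
    // Copying the inner reference out yields an `R` that borrows from the data
    // `T` points to, not from the short-lived outer `&T`.
    let whole: &str = apply_ref(&text, |inner| *inner);
    whole.split(' ').next().unwrap_or("")
}
```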
I wonder: Are Rust's closures not powerful enough to allow me to do what I like to do? (Assuming a different interface of regex.)
It seems like with a lot of effort, it's possible to let Rust do what I intended to do. We can demand that a closure returns a type that "depends" on a specific lifetime:
/// Type that "captures" a lifetime
pub trait CaptureLt {
type WithLt<'a>;
}
fn use_it<R, F>(_val: F)
where
R: CaptureLt,
F: for<'a> FnOnce(&'a Outer) -> R::WithLt<'a>,
{}
However, type inference fails, so using this complex interface is like:
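// Minimal definitions so the snippet compiles (assumed, not from the post):
struct Inner;
struct Outer { field: Inner }
fn f(outer: &Outer) -> &Inner { &outer.field }
fn main() {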
// We need this to aid type inference in regard
// to the return type of the closure or function:
struct TyCon;
impl CaptureLt for TyCon {
type WithLt<'a> = &'a Inner;
}
use_it::<TyCon, _>(|outer: &Outer| &outer.field );
use_it::<TyCon, _>(f);
}
Note that this works on stable and doesn't require #![feature(closure_lifetime_binder)].
I wonder if Rust's type system could be refined/replaced to avoid this sort of trouble. This also reminds me of the necessity to box certain futures, solely due to issues with lifetimes.
Anyway, these are just hypothetical thoughts. Back to the original problem:
for<'any> F: FnOnce<(&'any Outer,)>,
for<'any> <F as FnOnce<(&'any Outer,)>>::Output: AsRef<str>,
But you can't use FnOnce(&Outer) -> R for this because you're forced to name the return type and we don't have generic type construction parameters (or higher-kinded types or whatever). So then on stable you make your own trait with some blanket implementations etc... but probably run into heaps of inference issues and/or normalization issues.
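A minimal sketch of the "own trait with a blanket implementation" idea on stable, under the assumption that `Outer` is a simple struct. The names `Lend`, `lend`, `use_it` and `name_of` are hypothetical. The return type is never named by the caller; it is picked per lifetime through the associated type. The sketch is only exercised with an fn item; closures that borrow from the argument may well run into the inference issues mentioned above.

```rust
pub struct Outer {
    pub name: String,
}

/// Hypothetical helper trait: "callable with `&'a Outer`, returning some
/// `AsRef<str>` type that is allowed to depend on `'a`".
pub trait Lend<'a> {
    type Out: AsRef<str>;
    fn lend(self, outer: &'a Outer) -> Self::Out;
}

/// Blanket impl for anything callable with `&'a Outer`.
impl<'a, F, R> Lend<'a> for F
where
    F: FnOnce(&'a Outer) -> R,
    R: AsRef<str>,
{
    type Out = R;
    fn lend(self, outer: &'a Outer) -> R {
        self(outer)
    }
}

/// Accepts callables whose output may borrow from the argument.
pub fn use_it<F>(f: F) -> usize
where
    F: for<'a> Lend<'a>,
{
    let outer = Outer { name: String::from("abc") };
    let out = f.lend(&outer); // `out` may borrow from the local `outer`
    let s: &str = out.as_ref();
    s.len()
}

/// An fn item returning a borrow derived from its argument.
pub fn name_of(outer: &Outer) -> &str {
    &outer.name
}
```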
So it's sometimes possible, but not always practical, and almost always at a cost.