Conditional regex replacement

Indeed, we could make our own trait with blanket implementation (aka trait alias) to work around this (just hypothetically exploring this idea):

pub trait GenericFnOnce1Arg<Arg>
where
    Self: FnOnce(Arg) -> <Self as GenericFnOnce1Arg<Arg>>::Output
{
    type Output;
}

impl<T: ?Sized, Arg, Ret> GenericFnOnce1Arg<Arg> for T
where T: FnOnce(Arg) -> Ret,
{
    type Output = Ret;
}

Now we can use a bound like that:

fn use_it<F>(_val: F)
where
    F: for<'a> GenericFnOnce1Arg<&'a Outer>,
{}

Here, the return type does not need to be named (but could have additional bounds added, like AsRef<str>, if desired [1]).

Now, using the nightly #![feature(closure_lifetime_binder), we can use this API:

fn f(o: &Outer) -> &Inner {
    &o.field
}

fn main() {
    use_it(for<'a> |outer: &'a Outer| -> &'a Inner { &outer.field });
    use_it(f);
}

(Playground)


Update: Combining all these ideas, and getting back to the original problem, we could do:

#![feature(closure_lifetime_binder)]

struct MyReplacer<F>(F);

impl<F> Replacer for MyReplacer<F>
where
    F: for<'a> GenericFnMut1Arg<&'a Captures<'a>>,
    for<'a> <F as GenericFnMut1Arg<&'a Captures<'a>>>::Output: AsRef<str>,
{
    fn replace_append(&mut self, caps: &Captures<'_>, dst: &mut String) {
        dst.push_str((self.0)(caps).as_ref())
    }
}

fn main() {
    let s1 = "Hello World!";
    let s2 = Regex::new("World").unwrap().replace_all(s1, "Universe");
    assert_eq!(s2, "Hello Universe!");
    let s3 = Regex::new("World").unwrap().replace_all(
        s1,
        MyReplacer(
            for<'a> |caps: &'a Captures<'a>| -> &'a str {
                let universal: bool = false; // actually some more complex computation
                if universal {
                    "Universe"
                } else {
                    &caps[0] // don't replace
                }
            }
        )
    );
    assert_eq!(s3, "Hello World!");
}

(Playground)

I don't want to say it's a good or idiomatic way to go. But it's curious that it's possible to solve this issue in that way (on Nightly, and being willing to use a newtype wrapper to adjust regex's API)!


  1. (Playground) ↩︎

1 Like

I have been experimenting more with this, and tested if the regex API could provide a more generic implementation for regex::Replacer, and it seems it can:

diff --git a/src/regex/string.rs b/src/regex/string.rs
index 65a7674..4aa8a75 100644
--- a/src/regex/string.rs
+++ b/src/regex/string.rs
@@ -2371,6 +2371,22 @@ impl<'c, 'h> ExactSizeIterator for SubCaptureMatches<'c, 'h> {}
 
 impl<'c, 'h> core::iter::FusedIterator for SubCaptureMatches<'c, 'h> {}
 
+/// Trait alias for `FnMut` with one argument, which allows adding a bound
+/// without specifying the closure's return type.
+pub trait GenericFnMut1Arg<Arg>
+where
+    Self: FnMut(Arg) -> <Self as GenericFnMut1Arg<Arg>>::Output
+{
+    /// Return type of the closure.
+    type Output;
+}
+
+impl<T: ?Sized, Arg, Ret> GenericFnMut1Arg<Arg> for T
+where T: FnMut(Arg) -> Ret,
+{
+    type Output = Ret;
+}
+
 /// A trait for types that can be used to replace matches in a haystack.
 ///
 /// In general, users of this crate shouldn't need to implement this trait,
@@ -2501,10 +2517,10 @@ fn no_expansion(&mut self) -> Option<Cow<'_, str>> {
     }
 }
 
-impl<F, T> Replacer for F
+impl<F> Replacer for F
 where
-    F: FnMut(&Captures<'_>) -> T,
-    T: AsRef<str>,
+    F: for<'a> GenericFnMut1Arg<&'a Captures<'a>>,
+    for<'a> <F as GenericFnMut1Arg<&'a Captures<'a>>>::Output: AsRef<str>,
 {
     fn replace_append(&mut self, caps: &Captures<'_>, dst: &mut String) {
         dst.push_str((*self)(caps).as_ref());

(GitHub)

It passes regex's tests, and will allow to run this code on Nightly:

#![feature(closure_lifetime_binder)]

use regex::{Captures, Regex};

fn main() {
    let s1 = "Hello World!";
    let s2 = Regex::new("World").unwrap().replace_all(s1, "Universe");
    assert_eq!(s2, "Hello Universe!");
    let s3 = Regex::new("World").unwrap().replace_all(
        s1,
        for<'a> |caps: &'a Captures<'a>| -> &'a str {
            let universal: bool = false; // actually some more complex computation
            if universal {
                "Universe"
            } else {
                &caps[0] // don't replace
            }
        }
    );
    assert_eq!(s3, "Hello World!");
}

I don't think it would be a breaking change. However, the extra trait alias is somewhat ugly in the generated documentation.

What do you think?

Whether this is interesting for regex, I feel like it's a general pattern that might come in handy in other scenarios.


For future reference, I created a draft PR: More generic impl of Replacer for closures #1048.

In this context, I also found: the Replacer trait should be parameterized over a lifetime #777.


Update:

By implementing this less generically, we can actually get a quite clean API:

/// If a closure implements this for all `'a`, then it also implements
/// [`Replacer`].
pub trait ReplacerClosure<'a>
where
    Self: FnMut(&'a Captures<'a>) -> <Self as ReplacerClosure>::Output,
{
    /// Return type of the closure (may depend on lifetime `'a`).
    type Output: AsRef<str>;
}

impl<'a, F: ?Sized, O> ReplacerClosure<'a> for F
where
    F: FnMut(&'a Captures<'a>) -> O,
    O: AsRef<str>,
{
    type Output = O;
}

impl<F: for<'a> ReplacerClosure<'a>> Replacer for F {
    fn replace_append(&mut self, caps: &Captures<'_>, dst: &mut String) {
        dst.push_str((*self)(caps).as_ref());
    }
}

(GitHub)

The only thing that's a bit ugly is the redundant associated type "Output".

1 Like

Can you explain how it works?

I'll try to respond later in the discussion of the draft PR #1048, where I just added some more notes.

To workaround it, quinedot has provided a way in stable Rust[1], the coerce helper function that describes lifetime connections. Rust Playground


  1. you'll see it if you've clicked some example link ↩︎

1 Like

Thanks! I updated the draft PR and commented.

1 Like

I hear your warning and I'm curious if my above linked PR draft #1048 for regex is also subject to these "heaps of inference issues and/or normalization issues". I tried to hide all implementation details in a private module and believe there should be no issues (but happy to be taught otherwise if there is an issue).

In particular, I would like to know if you see any "cost" that I didn't think of yet.

Update: One issue might be: Why can't I use lifetime bounds in HRTBs?

The good things about the version I looked at is

  • inference related
    • There was only one implementation on the workaround trait
    • And the inputs were generic on lifetimes but not anything else
    • So whatever inference is possible is probably as good as it gets
  • coherence related
    • the Fn traits can't be manually implemented on stable, making it less of a breaking coherence change on stable to widen the generic implementation

And I think most of the bad things I've thought of are already being talked about in that thread (requiring lifetimes to be the same in the version I looked at, inability to ergonomically assert an inferred bound so you can split lifetimes and still be higher ranked (and if there is a way to do that on stable it will involve even more "type shenanigans"); inference breakage is hard to predict and test; necessity of explaining somewhere what your sealed trait actually does since the list of implementations is no longer sufficient to tell what implements Replacer; ...)

1 Like

Thanks a lot for your assessment.

That's interesting. Does that mean if it will be possible to implment Fn traits manually on stable on day, then the PR would become breaking in some cases? Maybe that depends on future coherence rules on that matter?

To understand this, I created one crate with this code:

pub trait Trt {}

impl<T: ?Sized + Clone> Trt for T {}

And another crate implementing this trait as well as ToOwned for a struct S that it also defines:

struct S;

impl ToOwned for S {
    type Owned = S;
    fn to_owned(&self) -> S {
        S
    }
}

impl first_crate::Trt for S {}

Coherence rules allow me to implement ToOwned as well as Trt for S. But if I now widen the implementation in the first crate:

-impl<T: ?Sized + Clone> Trt for T {}
+impl<T: ?Sized + ToOwned> Trt for T {}

Then I get for my second crate:

error[E0119]: conflicting implementations of trait `Trt` for type `S`
  --> src/main.rs:10:1
   |
10 | impl first_crate::Trt for S {}
   | ^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: conflicting implementation in crate `b`:
           - impl<T> Trt for T
             where T: ToOwned, T: ?Sized;

For more information about this error, try `rustc --explain E0119`.

I guess this falls under loosening generic bounds being a minor breaking semver change (or rather: becoming a breaking semver change if manual implementation of FnMut would be made possible on stable)?

This would be (at least) a (weak) a pro-argument for the PR, I guess, because it might be non-breaking today, but could become breaking tomorrow.

I'm not sure what you mean. Note that F: for<'a> &'a Captures<'a> is a more loose bound than F: for<'a, 'b> &'a Captures<'b>. The latter bound is stricter.

The "requirement" is that it's necessary to use the less-strict bound if you want the return type of closures to depend on the type parameter 'b of Captures<'b> (problems pointed out there). But it's not a "requirement" for the user of the API. Not sure which of those you meant.

In an ideal world, do you think closures passed to Regex::replace_all should fulfill the stricter bound (F: for<'a, 'b> &'a Captures<'b>) for tidiness, even if it's not necessary (because Captures<'b> is covariant over 'b and the caller of the closure can always coerce Captures<'b> to Captures<'a> if b: 'a)? I feel like this is an aesthetical question without any real-life impact, but I stumble upon this question in other contexts too:

Do I use two lifetimes or one lifetime when (co)variance allows me to combine two lifetimes into one lifetime.

Gotta run, so this will be an ultra abbreviated reply until later, but to give you something to play with real quick...

It's already a (minor) breaking change because I could have something like

impl<T> MyTrait for T: Replacer {}
impl MyTrait for Box<dyn for<'a> FnMut(&'a Context<'a>) -> &'a str> {}

And your loosened bound will cause it to fail. But this is less likely then people implementing traits for their own types probably (and they can't implement FnMut for their own types).

(If you try this you'll get a forward-compatibility warning, but if you read the issue, the current plan is to continue excepting anything that gets the warning.)

I don't understand what you mean (yet). If I try this, I get a hard error (even with the current regex):

use regex::{Captures, Replacer};

trait MyTrait {}

impl<T: ?Sized + Replacer> MyTrait for T {}
impl MyTrait for Box<dyn for<'a> FnMut(&'a Captures<'a>) -> &'a str> {}

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0119]: conflicting implementations of trait `MyTrait` for type `Box<(dyn for<'a> FnMut(&'a regex::Captures<'a>) -> &'a str + 'static)>`
 --> src/lib.rs:6:1
  |
5 | impl<T: ?Sized + Replacer> MyTrait for T {}
  | ---------------------------------------- first implementation here
6 | impl MyTrait for Box<dyn for<'a> FnMut(&'a Captures<'a>) -> &'a str> {}
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ conflicting implementation for `Box<(dyn for<'a> FnMut(&'a regex::Captures<'a>) -> &'a str + 'static)>`

For more information about this error, try `rustc --explain E0119`.
error: could not compile `playground` (lib) due to previous error

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.