I'm trying to write WebAssembly bindings for the gcode crate and the main Parser type borrows from the original source string. For soundness reasons it's not possible for a type with a lifetime to be exposed from WebAssembly so I need to come up with some sort of "creative" solution...
This is what I've arrived at. Is it a sound use of Pin<Box<str>> and ManuallyDrop?
use wasm_bindgen::prelude::wasm_bindgen;
use std::{mem::ManuallyDrop, pin::Pin};
#[wasm_bindgen]
extern {
pub type JavaScriptCallbacks;
}
#[wasm_bindgen]
pub struct Parser {
text: Pin<Box<str>>,
inner: ManuallyDrop<gcode::Parser<'static, JavaScriptCallbacks>>,
}
#[wasm_bindgen]
impl Parser {
#[wasm_bindgen(constructor)]
pub fn new(text: String, callbacks: JavaScriptCallbacks) -> Parser {
let text: Pin<Box<str>> = text.into_boxed_str().into();
// SAFETY: Because gcode::Parser contains a reference to the text string
// it needs a lifetime, however it's not sound to expose a struct with
// a lifetime to JavaScript (JavaScript doesn't have a borrow checker).
//
// To work around this we turn the string into a `Box<Pin<str>>` (to
// statically ensure pointers to our string will never change) then
// take a reference to it and "extend" the reference lifetime to
// 'static.
//
// The order that `text` and `inner` are destroyed in isn't really
// defined, so we use `ManuallyDrop` to ensure the `gcode::Parser` is
// destroyed first. That way we don't get the situation where `text` is
// destroyed and our `inner` parser is left with dangling references.
unsafe {
let text_ptr: *const str = &*text;
let static_str: &'static str = &*text_ptr;
let inner = ManuallyDrop::new(gcode::Parser::new(static_str, callbacks));
Parser { text, inner }
}
}
}
impl Drop for Parser {
fn drop(&mut self) {
// SAFETY: This is the only place `inner` gets destroyed, and the field
// can never be touch after `Parser::drop()` is called.
unsafe {
ManuallyDrop::drop(&mut self.inner);
}
// the text will be destroyed somewhere after here because Rust's drop
// glue destroys fields after the containing type is destroyed.
}
}
In your code Pin is just playing a "linting" role, which can be good to express intent, but can also bad in that it may inspire a false sense of safety within the privacy boundary.
The ManuallyDrop usage is nice! It's indeed better to express drop order explicitely by using it rather than by relying on fields drop order (struct Parser { inner: Parser..., text: ... } would be sound but struct Parser { text: ..., inner: Parser... } wouldn't: that situation is too implicit and thus fragile to refactoring).
XY suggestion: what about defining Parser<Str : AsRef<str>> (for which your current Parser<'input> is just the subset of Str = &'input str) ?
Then you could "just" have in this case a Parser<Box<str>>
other (sub)option: using a Cow<'lt, str> internally (just .into() the ref), and have a new API that creates a Parser<'static> by taking a String.
I personally only resort to this self-borrowing-with-fake-lifetime unsafe constructs when the lifetime-infected type (here, Parser), comes from an external crate I do not control, and otherwise try to make the lifetime-infect type "go full impl 'lt + Borrow" genericity-wise to avoid the unsafe
I think that's the case here. I've created a shim crate which imports gcode and will export WebAssembly bindings. The #[wasm_bindgen] attribute doesn't allow exported types to have lifetimes (for soundness reasons) or type parameters (WebAssembly doesn't do compile-time generics), hence these unsafe shenanigans.
The gcode::Parser is also infected with another type, you just can't see it because there's a default value. This type contains a bunch of associated types for determining how data should be stored (e.g. Vec<T> when compiling with std or ArrayVec<[T; 42]> for embedded platforms) so for readability reasons I'm not overly keen to go down the Parser<Str: AsRef<str>> path.
Thanks, I think I actually got suckered into that false sense of security
A Box<T> is always Unpin so using Pin<Box<str>> here actually made me think it was okay, even though it's perfectly fine to change the text field or swap out the Box<str> it is meant to be pinning and leave Parser with dangling references. I'm going to change the field to Box<str> (not String because that tells people it's okay to mutate the string) and add a nice # Safety to the text field's docs.
Thanks for the RFC link! I wasn't 100% sure on the guarantees around drop order so decided to err on the side of caution and write the drop code manually.