Can I use `setjmp` in Rust?


#1

I’m working on some FFI bindings to the nvml (also known as libpmem or pmem.io) family of C libraries - specifically, libpmemobj. This library makes heavy use of macros which hide the use of setjmp for transactions which persist ‘objects’ (C structs). I’m slowly unwrapping these and trying to translate them into Rust code. My understanding of using setjmp (apart from don’t) is that any locally defined variables used need to be volatile in C-speak. Rust doesn’t have the equivalent - it uses explicit loads and stores. I’m nervous of potential gotchas when wrapping this code:-

  1. Should I use a function wrapping the core logic which is marked as #[inline(never)] to isolate the setjmp code?
  2. Do any function arguments passed to such an isolating function also need to be treated as if they were volatile?
  3. Is it safe for a panic! to cross the boundary of a setjmp?
  4. This code needs to perform well, as object persistence transactions could easily be in an application’s hot path; in some of the use cases I am envisaging, it could be the majority of the work (reading data at 10Gb/sec+ and persisting). Is passing Fn() pointers the best way to achieve this? Would a macro with blocks be better? Is there any way of telling the compiler that the work arm of the match is the most likely branch?

The code I have at the moment (a bit long, but I think it makes the question clearer) is below. It assumes the answers are:-

  1. Yes
  2. No
  3. No
  4. N/A

/// Please note that work() may not ever be called - in which case, the next logic called is onAbort()
pub fn persistentObjectTransaction<Committed: Sized, Aborted: Sized, W: Fn() -> c_int, C: Fn() -> Committed, A: Fn() -> Aborted>(pop: *mut PMEMobjpool, work: W, onCommit: C, onAbort: A) -> Result<Committed, Aborted>
{
	// Must be used as a function, to prevent the volatile restrictions of setjmp leaking out
	#[inline(never)]
	fn internal
	<
		Committed: Sized,
		Aborted: Sized,
		// work() returns an OS error number; 0 means commit, anything else aborts
		W: Fn() -> c_int,
		C: Fn() -> Committed,
		A: Fn() -> Aborted
	>
	(
		pop: *mut PMEMobjpool,
		work: W,
		onCommit: C,
		onAbort: A,
		panicPayload: &mut Option<Box<dyn Any + Send + 'static>>,
		functionResult: &mut Option<Result<Committed, Aborted>>
	)
	{
		const panicOsErrorNumber: c_int = 2;
		let mut txSetJmpEnvironment: jmp_buf;
		
		// != 0 if returning from longjmp()
		if setjmp(txSetJmpEnvironment) != 0
		{
			//setErrorNumber(pmemobj_tx_errno());
		}
		else
		{
			pmemobj_tx_begin(pop, txSetJmpEnvironment, TX_PARAM_NONE, TX_PARAM_NONE);
			// let osErrorNumber = pmemobj_tx_begin(pop, txSetJmpEnvironment, TX_PARAM_NONE, TX_PARAM_NONE);
			// if unlikely(osErrorNumber != 0)
			// {
			// 	setErrorNumber(osErrorNumber);
			// }
		}

		let mut stage: pobj_tx_stage;
		while
		{
			stage = pmemobj_tx_stage();
			stage != pobj_tx_stage::TX_STAGE_NONE
		}
		{
			match stage
			{
				pobj_tx_stage::TX_STAGE_WORK =>
				{
					match catch_unwind(AssertUnwindSafe(|| work()))
					{
						Ok(someOsErrorNumberForAbort) =>
						{
							if likely(someOsErrorNumberForAbort == 0)
							{
								pmemobj_tx_commit();
							}
							else
							{
								pmemobj_tx_abort(someOsErrorNumberForAbort);
							}
						},
						Err(payload) =>
						{
							pmemobj_tx_abort(panicOsErrorNumber);
							*panicPayload = Some(payload);
						},
					};
				
					pmemobj_tx_process();
				},
			
				pobj_tx_stage::TX_STAGE_ONCOMMIT =>
				{
					match catch_unwind(AssertUnwindSafe(|| onCommit()))
					{
						Ok(result) =>
						{
							*functionResult = Some(Ok(result))
						},
					
						Err(payload) =>
						{
							if panicPayload.is_none()
							{
								*panicPayload = Some(payload)
							}
						}
					};
				
					pmemobj_tx_process();
				},
			
				pobj_tx_stage::TX_STAGE_ONABORT =>
				{
					match catch_unwind(AssertUnwindSafe(|| onAbort()))
					{
						Ok(result) =>
						{
							*functionResult = Some(Err(result))
						},
					
						Err(payload) =>
						{
							if panicPayload.is_none()
							{
								*panicPayload = Some(payload)
							}
						}
					};
				
					pmemobj_tx_process();
				},
			
				pobj_tx_stage::TX_STAGE_FINALLY =>
				{
					pmemobj_tx_process();
				},
			
				_ =>
				{
					pmemobj_tx_process();
				},
			}
		}
		
		pmemobj_tx_end();
		// let ifAbortedTheTransactionErrorNumber = pmemobj_tx_end();
		// if unlikely(ifAbortedTheTransactionErrorNumber != 0)
		// {
		// 	setErrorNumber(ifAbortedTheTransactionErrorNumber);
		// }
	}

	let mut panicPayload = None;
	let mut functionResult = None;
	
	internal(pop, work, onCommit, onAbort, &mut panicPayload, &mut functionResult);
	
	if let Some(payload) = panicPayload
	{
		resume_unwind(payload);
	}
	
	functionResult.unwrap()
}

On a side-note, objects that are persist-able will need to be defined using structs in Rust that are #[repr(C)]. Is there a marker trait that I can rely on to enforce this for anything passed to a generic fn persist<T: ReprIsC>(value: T) function?

Many thanks if you’ve read this far.


#2

Drop checking and borrow checking know nothing about longjmp control flows, and your code will exhibit undefined behaviour if you longjmp across, into, or out of a Rust stack frame that has any non-trivial ownership. (Box is non-trivial.)

You might be interested in this thread from a couple of months ago - it’s even possible you can get in touch with the author to find out how they eventually handled that project.


#3

Thank you very much for the insight and pointers to more info. It’s looking like it might be better to do any work inside a transaction purely as C.


#4

You shouldn’t use setjmp/longjmp for your bindings :wink:
Look at how the transactions are handled in the libpmemobj C++ language integration. Each C tx_ function is wrapped in a C++ equivalent that simply throws an exception when there’s an error. That exception is caught around the lambda function that forms the transaction.

In my opinion, the idiomatic way for handling transaction aborts would be very similar, but instead of throwing an exception, you would simply return a Result from a function. You could also think about simulating the try! macro so that each transactional function also propagates the Result up.
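To make the Result-based idea concrete, here is a minimal sketch of what such a wrapper might look like. Everything in it is hypothetical: `Transaction`, `TxError`, `alloc` and `transaction` are illustrative names, not part of libpmemobj or any existing binding, and the places where the real begin/commit/abort calls would go are only marked with comments. The `?` operator plays the role of the try! macro.

```rust
/// Hypothetical error type: the errno-style code a transactional call failed with.
#[derive(Debug, PartialEq)]
pub struct TxError(pub i32);

/// Hypothetical handle for an open transaction. A real binding would hold
/// whatever state the wrapped C calls need.
pub struct Transaction {
    _private: (),
}

impl Transaction {
    /// Stand-in for a wrapped transactional C call. Instead of jumping out on
    /// failure, it reports the failure as a `Result`.
    pub fn alloc(&mut self, size: usize) -> Result<usize, TxError> {
        if size == 0 {
            return Err(TxError(22)); // 22 is EINVAL on Linux; illustrative only
        }
        Ok(size)
    }
}

/// Runs `work` inside a (simulated) transaction: `Ok` means commit, `Err` means abort.
pub fn transaction<T, F>(work: F) -> Result<T, TxError>
where
    F: FnOnce(&mut Transaction) -> Result<T, TxError>,
{
    // A real binding would begin the transaction here.
    let mut tx = Transaction { _private: () };
    match work(&mut tx) {
        Ok(value) => {
            // A real binding would commit here.
            Ok(value)
        }
        Err(error) => {
            // A real binding would abort here.
            Err(error)
        }
    }
}

fn main() {
    // Every step succeeds, so the closure's value is returned.
    let committed = transaction(|tx| {
        let a = tx.alloc(64)?;
        let b = tx.alloc(128)?;
        Ok(a + b)
    });
    assert_eq!(committed, Ok(192));

    // The second step fails; `?` returns early and the error propagates up.
    let aborted: Result<usize, TxError> = transaction(|tx| {
        tx.alloc(64)?;
        tx.alloc(0)?;
        Ok(0)
    });
    assert_eq!(aborted, Err(TxError(22)));
}
```

Because an abort is just an early return, every destructor in the closure runs normally, which is exactly what longjmp would skip.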


#5

Just like in C++ (when anything that doesn’t come from C sits in the call stack between setjmp and longjmp), the behaviour is undefined when there is a Rust function between setjmp and longjmp, even if setjmp itself is called by a function compiled by a C compiler. You may be able to get away with it, but the code will be prone to breaking.

Instead, you may want to keep this part of the FFI entirely on the C side. Write code to handle all longjmp/setjmp interactions on the C side, and do the FFI through a C->Rust conversion module written in C. Don’t let longjmp cross Rust.
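As a rough illustration of the Rust half of such a split, the sketch below passes a closure to a shim through a plain C callback and an opaque pointer. It assumes a hypothetical C function that owns setjmp and the begin/commit/abort calls and guarantees that no longjmp ever passes through the callback. Here that shim is replaced by a pure-Rust stand-in (`tx_run_stand_in`) so the example runs on its own; `run_in_transaction`, `trampoline` and `PANIC_CODE` are made-up names as well.

```rust
use std::os::raw::{c_int, c_void};
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Callback signature the hypothetical C shim would accept.
type TxCallback = unsafe extern "C" fn(data: *mut c_void) -> c_int;

/// Arbitrary code reported when the Rust closure panics (an assumption).
const PANIC_CODE: c_int = -1;

/// Pure-Rust stand-in for the hypothetical C shim. The real one would be
/// written in C and would keep setjmp/longjmp entirely inside C frames.
extern "C" fn tx_run_stand_in(callback: TxCallback, data: *mut c_void) -> c_int {
    // Safety: `data` is the pointer handed over by `run_in_transaction`.
    unsafe { callback(data) }
}

/// Trampoline called from the shim: recovers the closure and makes sure a
/// panic is turned into an error code instead of unwinding into C.
unsafe extern "C" fn trampoline<F: FnMut() -> c_int>(data: *mut c_void) -> c_int {
    // Safety: `data` points to a live `F` for the duration of the call.
    let closure: &mut F = unsafe { &mut *(data as *mut F) };
    match catch_unwind(AssertUnwindSafe(|| (*closure)())) {
        Ok(code) => code,
        Err(_) => PANIC_CODE,
    }
}

/// Runs `work` through the shim; 0 means success, anything else is an error code.
pub fn run_in_transaction<F: FnMut() -> c_int>(mut work: F) -> Result<(), c_int> {
    let data = &mut work as *mut F as *mut c_void;
    let code = tx_run_stand_in(trampoline::<F>, data);
    if code == 0 {
        Ok(())
    } else {
        Err(code)
    }
}

fn main() {
    let mut calls = 0;
    let ok = run_in_transaction(|| {
        calls += 1;
        0
    });
    assert_eq!(ok, Ok(()));
    assert_eq!(calls, 1);

    // A non-zero return code is reported as an error.
    assert_eq!(run_in_transaction(|| 5), Err(5));

    // A panic is caught at the boundary and mapped to PANIC_CODE.
    let panicked = run_in_transaction(|| -> c_int { panic!("boom") });
    assert_eq!(panicked, Err(PANIC_CODE));
}
```

The callback only ever returns normally, so the C side sees an ordinary function call and can decide for itself whether to commit or abort based on the returned code.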


#6

@xfix, @pbalcer Thank you very much for your replies. I’m sorry I didn’t spot them earlier. They’re very useful, and confirm what I’m thinking - that this transaction code all belongs in C. That seems a bit sad. Hopefully one day the Rust compiler will learn what to do with setjmp… but it is almost certainly a special case with a lot of work.