Implementing an API in Rust

I'd like to make sure the following can be implemented via unsafe. I'm quite sure the answer is an unqualified "yes". I'm at the design/analysis phase and I want to make sure I'm not asserting something to upstream that isn't true.
Given the following typedefs

  Data type             Interpretation in typical 32-bit C
  t_stat                status code, int
  int32, uint32         signed int, unsigned int
  REG *sim_PC           pointer to simulated program counter

and structures

  DEVICE                device definition structure

and how they're used

Interface                                   Function
char sim_name[]                             simulator name string
int32 sim_emax                              maximum number of words in an instruction or data item
DEVICE *sim_devices[]                       table of pointers to simulated devices, NULL terminated
const char *sim_stop_messages[SCPE_BASE]    table of pointers to error messages
t_stat sim_load (…)                         binary loader subroutine
t_stat sim_instr (void)                     instruction execution subroutine
void (*sim_vm_fprint_addr) (…)              pointer to address output routine

I'm not looking for answers, I just want to make sure I'm not getting out over my skis asserting this API can be implemented in Rust.

Thanks!

It sounds like you want to expose this API via C FFI. Is that correct?


Yes. Those are the search terms that led me to

Thanks!

To finish the post - I'll begin by implementing shim functions in Rust and integrating them into upstream's CI process. If I can do that, I'll proceed with the implementation. There should be no changes to upstream's C code. The upstream C runtime must be unaware it's calling Rust routines.

In general, anything you can implement in C can also be implemented in Rust.

Some questions you generally want to ask when designing an API...

  • Where is all my state stored?
  • How will this state be initialized and cleaned up?
  • Do I want there to be only one global instance of the simulator running at a time, or do I want people to create multiple simulators that are independent of each other?
  • What's the story around thread safety? If you initialize the simulator, is it okay to access it from multiple threads (i.e. is your API Sync), or must it stay on the thread that it was created on (most GUI widgets are !Send, for example)?
  • How much context do I want to give people when an error occurs? An error code is better than a boolean success/fail, but "Load failed: invalid config" is a lot less useful than "Load failed: the version field must be an integer".

I don't really know enough about your use case to give more specific feedback, although I know if I were designing a simulator-like API I would want to hide my state behind some opaque Simulator * that the caller needs to pass around. That way the resource management is explicit and you don't get into the situation where the simulator is effectively a big singleton implemented via unsynchronised static mut variables.
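To make the opaque-handle idea concrete, here is a minimal sketch. The names simulator_new, simulator_step, simulator_pc and simulator_free, and the single pc field, are made up for illustration and are not part of the API in the question. In a real library each function would also be exported with #[no_mangle] (spelled #[unsafe(no_mangle)] in edition 2024).

```rust
use std::os::raw::c_int;

/// Hypothetical simulator state; C callers only ever see a pointer to it.
pub struct Simulator {
    pc: u32,
}

/// Allocates a simulator and hands ownership to the caller.
pub extern "C" fn simulator_new() -> *mut Simulator {
    Box::into_raw(Box::new(Simulator { pc: 0 }))
}

/// Pretends to execute one instruction. Returns 0 on success, -1 on a null handle.
///
/// # Safety
/// `sim` must be null or a live pointer obtained from `simulator_new`.
pub unsafe extern "C" fn simulator_step(sim: *mut Simulator) -> c_int {
    match unsafe { sim.as_mut() } {
        Some(sim) => {
            sim.pc = sim.pc.wrapping_add(1);
            0
        }
        None => -1,
    }
}

/// Returns the current program counter, or 0 for a null handle.
///
/// # Safety
/// `sim` must be null or a live pointer obtained from `simulator_new`.
pub unsafe extern "C" fn simulator_pc(sim: *const Simulator) -> u32 {
    unsafe { sim.as_ref() }.map_or(0, |s| s.pc)
}

/// Frees a simulator. Passing null is a no-op.
///
/// # Safety
/// `sim` must be null or a pointer from `simulator_new` that has not been freed yet.
pub unsafe extern "C" fn simulator_free(sim: *mut Simulator) {
    if !sim.is_null() {
        drop(unsafe { Box::from_raw(sim) });
    }
}
```

The caller creates a handle, passes it to every call, and frees it exactly once, so initialization and cleanup are visible in the calling code rather than hidden in globals.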


Hi, thanks for your thoughtful reply. I don't get to design the API. By "implement", I mean I must write Rust code that follows the existing API contract.

In general, you should not expect to encounter any marked difficulty in exposing extern "C" functions from rust code that conform to a particular ABI. (notice I've said ABI rather than API, as this appears to be what you're really asking about). Interoperability with other languages through FFI is part of rust's design.

If you need to define functions that take variadic arguments, you might need to enable a nightly feature (c_variadic) for the ... syntax. Calling variadic C functions from Rust works on stable.

Ah, sorry, I misread your original post.

You should be able to implement those functions in Rust. The fact that it's exposing global variables like sim_devices instead of putting them behind a level of indirection (e.g. a getter function) could be annoying because then it's hard to test/mock your code, but it's still quite doable.

Looking at the bits you've shown us so far, a literal translation would look something like this:

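use std::os::raw::c_char;

#[repr(i32)]
pub enum t_stat {
    ok = 0,
}

// Field layouts must mirror upstream's C definitions.
#[repr(C)]
pub struct REG { ... }
#[repr(C)]
pub struct DEVICE { ... }

// C declares `REG *sim_PC`: a global pointer variable, not a type.
#[no_mangle]
pub static mut sim_PC: *mut REG = std::ptr::null_mut();

// C declares `char sim_name[]`: the symbol is the array itself (an empty
// string here), not a pointer to one.
#[no_mangle]
pub static mut sim_name: [c_char; 1] = [0];
#[no_mangle]
pub static mut sim_emax: i32 = ...;
// `DEVICE *sim_devices[]` is likewise an array, NULL terminated.
#[no_mangle]
pub static mut sim_devices: [*mut DEVICE; 1] = [std::ptr::null_mut()];
#[no_mangle]
pub static mut sim_vm_fprint_addr: unsafe extern "C" fn() = default_fprint;

const SCPE_BASE: usize = 1;
// C strings must be NUL terminated, hence the explicit `\0`.
#[no_mangle]
pub static mut sim_stop_messages: [*const c_char; SCPE_BASE] =
    [b"STOP\0".as_ptr() as *const c_char];

#[no_mangle]
pub unsafe extern "C" fn sim_load() -> t_stat {
    todo!()
}

#[no_mangle]
pub unsafe extern "C" fn sim_instr() -> t_stat {
    todo!()
}

unsafe extern "C" fn default_fprint() {
    // Some no-op that we can use as a default
}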

(playground)

There'll be a fair amount of unsafe involved because you are locked into an API that is fundamentally not thread-safe, meaning you'll only really be able to leverage Rust's safety guarantees for the internals, but it's still quite feasible to implement in Rust.

You should be able to implement those functions in Rust. The fact that it's exposing global variables like sim_devices instead of putting them behind a level of indirection (e.g. a getter function) could be annoying because then it's hard to test/mock your code, but it's still quite doable.

Agreed. I was thinking of a compile-time parser for such files. I really don't want to/can't copy/pasta them into this sub-system.

There'll be a fair amount of unsafe ...

Agreed. The fun part will be discovering invariants. I'm sort of/kind of looking forward to that analysis. The fun part will be convincing upstream that this is OK. I might not be able to do that. It's a conversation that has not yet started.

... you are locked into an API that is fundamentally not thread-safe

Agreed, I think. I'm not entirely sure. I'm interested in seeing if threads are possible during the sim_instr() call, maybe using co-routines. It's a complex question in the context of this project, in that I can't rule out threading up front. It's not required for the MVP. It's just one more reason to try Rust as opposed to C. I'm not entirely unfamiliar w/ Rust. I have a grand total of 1 program in production.

Thank you for the playground.

ABI vs. API

Thanks for the link. The routines I'm presenting don't describe

  • Processor instruction set,
  • Sizes and layouts of basic data types that the processor can directly access
  • Calling convention
    ... etc

The upshot is that I think it's an API

Variadic args

Possibly. I haven't looked at specific implementations. Thanks for the heads-up.

Yeah, that could be interesting.

In theory, if you can immediately switch from static variables to &mut references inside sim_instr(), then you should be able to lean on the borrow checker and Send/Sync like normal Rust code as long as the caller doesn't do anything funny from another thread (which they shouldn't because the lack of thread-safety is part of the API).

The devil is always in the details though, and we'd need to write some code to see if it's actually feasible.
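As a rough sketch of that "switch to &mut at the boundary" idea: the Machine struct, its fields, and the name sim_instr_sketch are placeholders I made up, and the i32 return value stands in for t_stat. The only unsafe code is the one line that turns the global into a reference.

```rust
/// Hypothetical machine state; upstream's real globals would replace these fields.
pub struct Machine {
    pub pc: u32,
    pub acc: i32,
}

/// Stand-in for the C-visible global state.
static mut MACHINE: Machine = Machine { pc: 0, acc: 0 };

/// Safe core: it only sees `&mut Machine`, so the borrow checker applies as usual.
pub fn step(m: &mut Machine) -> i32 {
    m.acc = m.acc.wrapping_add(1);
    m.pc = m.pc.wrapping_add(1);
    0 // stand-in for a success status code
}

/// FFI entry point: the only place that touches the global directly.
///
/// # Safety
/// The caller must not call this concurrently or re-entrantly, because that
/// would create two live `&mut Machine` references to the same global.
pub unsafe extern "C" fn sim_instr_sketch() -> i32 {
    let machine = unsafe { &mut *std::ptr::addr_of_mut!(MACHINE) };
    step(machine)
}
```

Everything below step() can then be tested against a locally constructed Machine without touching the global at all.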

Hm, so it looks like ABI doesn't quite mean what I thought it did, either.

But what I meant is that you seem to be interested in binary compatibility: making sure arguments are read from the right registers and places on the stack, so that the Rust code can correctly accept the data structures passed in from the C side, and that the Rust code can define data structures whose fields are in all the places where the C code expects them.

These are things that you should have no difficulty with.
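For the "fields in all the correct places" part, #[repr(C)] is the tool. Here is a small sketch with a made-up struct (RegSketch is not upstream's real REG, and its assumed C counterpart is given in the comment), plus a compile-time check that the size and alignment come out as expected.

```rust
use std::mem::{align_of, size_of};
use std::os::raw::c_char;

/// Hypothetical register descriptor, NOT upstream's real REG layout.
/// Assumed C counterpart:
///   struct reg_sketch { const char *name; uint32_t width; uint32_t offset; };
#[repr(C)]
pub struct RegSketch {
    pub name: *const c_char,
    pub width: u32,
    pub offset: u32,
}

// Compile-time layout check: one pointer followed by two 32-bit fields,
// with no padding on common 32-bit and 64-bit targets.
const _: () = assert!(size_of::<RegSketch>() == size_of::<*const c_char>() + 8);
const _: () = assert!(align_of::<RegSketch>() == align_of::<*const c_char>());

/// Reads a field through a raw pointer, the way a function called from C would.
///
/// # Safety
/// `reg` must be null or point to a valid `RegSketch`.
pub unsafe extern "C" fn reg_width(reg: *const RegSketch) -> u32 {
    if reg.is_null() {
        0
    } else {
        unsafe { (*reg).width }
    }
}
```

For a large existing header, generating these definitions with a tool such as bindgen is usually less error-prone than transcribing them by hand.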


Really, the point of the correction was that the term "API" is generally understood to refer to the higher level concepts behind the functions provided by the library, which includes restrictions on how the caller is allowed to use it, and invariants that it upholds. This is why some answers were a bit fuzzier.

There's a number of tricks you may need to be aware of for writing correct extern C functions in rust. E.g. if the calling code is allowed to give aliasing pointers, you may need to accept &Cell<T> instead of &mut T. You probably want to wrap your function bodies in std::panic::catch_unwind.
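A small sketch of both tricks, using made-up function names and status codes (checked_divide, add_into, STAT_OK and STAT_INTERNAL_ERROR are placeholders, not upstream's). The first function turns a panic into an error code instead of letting it unwind towards the C caller; the second tolerates aliasing pointers by viewing them as &Cell<i32>.

```rust
use std::cell::Cell;
use std::panic::catch_unwind;

/// Hypothetical status codes; real values would come from upstream's headers.
pub const STAT_OK: i32 = 0;
pub const STAT_INTERNAL_ERROR: i32 = -1;

/// Safe core that may panic (for example, on division by zero).
fn divide(a: i32, b: i32) -> i32 {
    a / b
}

/// Converts a panic in the core into an error code for the C caller.
///
/// # Safety
/// `out` must be null or point to a writable `i32`.
pub unsafe extern "C" fn checked_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return STAT_INTERNAL_ERROR;
    }
    match catch_unwind(|| divide(a, b)) {
        Ok(value) => {
            unsafe { *out = value };
            STAT_OK
        }
        // `out` is left untouched when the core panicked.
        Err(_) => STAT_INTERNAL_ERROR,
    }
}

/// Adds `*src` into `*dst`. The two pointers are allowed to alias, so they are
/// viewed as `&Cell<i32>` instead of `&mut i32`.
///
/// # Safety
/// Both pointers must be non-null and point to valid `i32` values.
pub unsafe extern "C" fn add_into(dst: *mut i32, src: *mut i32) {
    let dst = unsafe { &*(dst as *const Cell<i32>) };
    let src = unsafe { &*(src as *const Cell<i32>) };
    dst.set(dst.get().wrapping_add(src.get()));
}
```

Note that catch_unwind only helps when the crate is built with the default panic=unwind strategy; with panic=abort the process terminates instead.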

Even without thread safety, if the API includes callback functions there could still be issues of re-entrancy.

These are things that you should have no difficulty with.

In theory, yes. Practical experience? Not so much. Thank you for the compliment!

Even without thread safety, if the API includes callback functions there could still be issues of re-entrancy.

The controller uses callbacks, but not co-routines. I'm pretty sure that because C doesn't have native co-routines (but for assembly language that intervenes in an architecture-specific way), such logic isn't an issue. But I have to admit that I don't really know right now. It is a concern.

Re-entrancy shouldn't be a problem. Just make sure all access to the simulator goes through a guard which sets/clears a flag and will return some sort of "you can't touch the simulator while it's running" error code if a re-entrant call is detected.

You could even tie it in with the way you manage simulator state.

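use std::sync::atomic::{AtomicBool, Ordering};

// C declares `DEVICE *sim_devices[]`, so the exported symbol is the
// NULL-terminated array itself (empty here), not a pointer to one.
#[no_mangle]
pub static mut sim_devices: [*mut DEVICE; 1] = [std::ptr::null_mut()];

pub struct Simulator<'a> {
  sim_devices: &'a mut [*mut DEVICE], // or &'a [RefCell<DEVICE>] or whatever
  // ... other simulator state
}

static SIMULATOR_ACQUIRED: AtomicBool = AtomicBool::new(false);

impl<'a> Simulator<'a> {
  pub fn acquire() -> Result<Simulator<'a>, ConcurrencyError> {
    // compare_exchange returns Ok only if the flag was false and is now true.
    let acquired = SIMULATOR_ACQUIRED
      .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
      .is_ok();

    if acquired {
      // Safety: the flag ensures no other Simulator is borrowing the global.
      let devices: &'a mut [*mut DEVICE] =
        unsafe { &mut *std::ptr::addr_of_mut!(sim_devices) };
      Ok(Simulator { sim_devices: devices })
    } else {
      Err(ConcurrencyError)
    }
  }
}

impl<'a> Drop for Simulator<'a> {
  fn drop(&mut self) {
    // Clear the flag so the next call to acquire() can succeed.
    SIMULATOR_ACQUIRED.store(false, Ordering::SeqCst);
  }
}

pub struct ConcurrencyError;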
btw, has anyone else noticed how deprecating compare_and_swap() in favour of a compare_exchange() with two separate orderings makes compare-and-swap operations both less ergonomic and easier to screw up the memory orderings?

I understand the reasoning behind the change, but this feels like one of those things where being technically more correct actually makes the API worse.

Anyways, rant over.

In this case, while you can technically lock up the simulator by leaking the Simulator (e.g. with std::mem::forget) so that its destructor never runs and the SIMULATOR_ACQUIRED flag is never cleared, that shouldn't happen in practice because you'll be saving it to a local variable on the stack and all future methods would take &mut self.

You would then use the Simulator like this:

#[no_mangle]
pub unsafe extern "C" fn sim_instr() -> t_stat {
  let mut sim = match Simulator::acquire() {
    Ok(s) => s,
    Err(_) => return t_stat::err_concurrent_access,
  };

  match sim.execute_next_instruction() {
    Ok(_) => t_stat::ok,
    Err(e) => e.into(),
  }
}

Nice! Thank you for the hint!
