My first Rust project: bindings for the Unicorn emulator. Comments welcome!


#1

Hi fellow rustaceans,

As my first real Rust project I have written bindings for the Unicorn emulator (http://www.unicorn-engine.org/). You can find the project at https://github.com/ekse/unicorn-rs. It’s still a work in progress, but it covers most of the features. There are build instructions in the README.md; I have only tested on Ubuntu Linux so far. I would be really glad to get feedback on the code. The C API is fairly easy to understand by looking at https://github.com/unicorn-engine/unicorn/blob/master/include/unicorn/unicorn.h.

I would especially like your help on the following design limitation. I had to define regid in reg_read() and reg_write() as an i32 instead of a Register type as I couldn’t find a way in Rust to have a generic type that would support the different types of registers (RegisterX86, RegisterARM, RegisterMIPS). This is inconvenient as the registers have to be explicitly converted with as i32 when calling these functions. I tried using a Register trait and impl it for the register types, but the compiler complains that Register is not Sized. Does anyone know of a better way to do this?

I also had to do some transmute tomfoolery to support different callback signatures when calling uc_hook_add(), as well as in mem_regions() to deal with the pointer to an array of uc_mem_region. If some of this stuff is making you cringe please tell me so that I can improve it.

One of the things on my todo list is to check the value of hook_type in add_code_hook() and add_mem_hook() to make sure it is valid in that context.
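
A rough sketch of what that check could look like, assuming a single hook type enum shared by both functions. The names `HookType`, `HookError`, `check_code_hook` and `check_mem_hook` are made up for illustration and are not the current API of the bindings:

```rust
/// Hook kinds, loosely modelled on Unicorn's UC_HOOK_* constants.
/// The variant list is illustrative and incomplete.
#[derive(Clone, Copy, Debug, PartialEq)]
enum HookType {
    Code,
    Block,
    MemRead,
    MemWrite,
}

/// Hypothetical error returned when a hook type does not fit the call.
#[derive(Debug, PartialEq)]
enum HookError {
    InvalidHookType(HookType),
}

/// Accepts only the hook types that make sense for a code hook.
fn check_code_hook(hook_type: HookType) -> Result<(), HookError> {
    match hook_type {
        HookType::Code | HookType::Block => Ok(()),
        other => Err(HookError::InvalidHookType(other)),
    }
}

/// Accepts only the hook types that make sense for a memory hook.
fn check_mem_hook(hook_type: HookType) -> Result<(), HookError> {
    match hook_type {
        HookType::MemRead | HookType::MemWrite => Ok(()),
        other => Err(HookError::InvalidHookType(other)),
    }
}

fn main() {
    println!("{:?}", check_code_hook(HookType::Code));
    println!("{:?}", check_mem_hook(HookType::Block));
}
```

Splitting the enum in two (one for code hooks, one for memory hooks) would move the same check to compile time, at the cost of a slightly larger API.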

Thanks in advance,

Sébastien


#2

Great to see this being worked on! I’m pretty sure you didn’t need my suggestion :wink:

Unicorn is a great tool, soon to become even better once it’s been rebased on QEMU 2.5.

And yeah, definitely use the GitHub source - the official page points to a completely outdated version.

Lastly, let me offer another project suggestion: porting Usercorn to Rust could serve as a very interesting comparison vs. Go.


#3

Thanks PeteVine! I had not seen that reddit thread, I’m glad I answered your wish :slight_smile:

That’s a good point on the github version, I updated the README and added a check in the bindings to make sure the unicorn API is at version 1.0.

Thanks for pointing out usercorn. I was thinking about writing some kind of executable loading utility, which is what this project seems to be doing. I will look at it a bit more and see if it makes sense to write a Rust port of it. I might do it just for the sake of learning more about Rust.


#4

Other languages have got their bindings bundled with Unicorn (Haskell got added recently) - have you considered opening a PR in their repo and asking for some design suggestions?


#5

I have let them know about my bindings and they are listed on the website. I can ask them if they are interested in packaging the bindings once I feel confident about the design. The bindings are now on crates.io as unicorn, so they are easy to use in a project.


#6

Great, in case you ever decide to carry unicorn yourself, be sure to distinguish the names. Right now the binding compiles as unicorn.

Back to the original question, I wonder if @jschievink saw it…


#7

Sorry, I’m not sure which question you mean :sweat_smile:


#8

Here is my puzzle for you :slight_smile:

I would especially like your help on the following design limitation. I had to define regid in reg_read() and reg_write() as an i32 instead of a Register type as I couldn’t find a way in Rust to have a generic type that would support the different types of registers (RegisterX86, RegisterARM, RegisterMIPS). This is inconvenient as the registers have to be explicitly converted with as i32 when calling these functions. I tried using a Register trait and impl it for the register types, but the compiler complains that Register is not Sized. Does anyone know of a better way to do this?


#9

Ah, of course, the only question there was :smiley:

I’m not sure about the error, maybe you needed to make the function generic over T: Register and instead you tried to pass a Register by value?
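
To illustrate what I mean, here is a minimal sketch. The names (`Register`, `RegisterX86`, `RegisterARM`, `reg_id`) and the register numbers are made up for the example and are not taken from the bindings or the Unicorn headers:

```rust
/// Common interface for the per-architecture register enums (illustrative).
trait Register {
    fn to_i32(&self) -> i32;
}

#[allow(dead_code)]
enum RegisterX86 {
    Eax,
    Ebx,
}

#[allow(dead_code)]
enum RegisterARM {
    R0,
    R1,
}

impl Register for RegisterX86 {
    fn to_i32(&self) -> i32 {
        // Placeholder numbers, not the real Unicorn register IDs.
        match *self {
            RegisterX86::Eax => 0,
            RegisterX86::Ebx => 1,
        }
    }
}

impl Register for RegisterARM {
    fn to_i32(&self) -> i32 {
        match *self {
            RegisterARM::R0 => 0,
            RegisterARM::R1 => 1,
        }
    }
}

// `fn reg_id(reg: Register)` would take the trait itself by value, which the
// compiler rejects because a bare trait is not a sized type. Making the
// function generic over `T: Register` gives it a concrete, sized `T` instead.
fn reg_id<T: Register>(reg: T) -> i32 {
    reg.to_i32()
}

fn main() {
    println!("{}", reg_id(RegisterX86::Ebx));
    println!("{}", reg_id(RegisterARM::R0));
}
```

This compiles for any register type, but it also means nothing stops you from passing an ARM register to an x86 emulator.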

Regarding the design, I guess a rustic way to do it would be this:

trait Register {
    fn to_i32(&self) -> i32;
}

trait Cpu {
    type Reg: Register;
    
    fn reg_read(&self, reg: Self::Reg) -> u64;
    fn reg_write(&mut self, reg: Self::Reg, val: u64);
}

enum X86Register {
    Eax,
    Ebx,
    //...
}

impl Register for X86Register {
    fn to_i32(&self) -> i32 {
        match *self {
            X86Register::Eax => 0,
            X86Register::Ebx => 1,
            //...
        }
    }
}

struct X86Cpu;

impl Cpu for X86Cpu {
    type Reg = X86Register;
    
    fn reg_read(&self, reg: Self::Reg) -> u64 { unimplemented!() }
    fn reg_write(&mut self, reg: Self::Reg, val: u64) { unimplemented!() }
}

This will only allow using correct registers for the target CPU. I’m really not sure how well this would map to Unicorn (I’ve never heard of it until now), but seems like it’d be worth a shot if you don’t mind having a larger wrapper.
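
As a hedged usage sketch of the same idea, here is a self-contained version with a fake CPU that keeps its registers in a `HashMap` instead of calling into Unicorn. `FakeX86Cpu` and `ArmRegister` are invented for the example:

```rust
use std::collections::HashMap;

trait Register {
    fn to_i32(&self) -> i32;
}

trait Cpu {
    type Reg: Register;

    fn reg_read(&self, reg: Self::Reg) -> u64;
    fn reg_write(&mut self, reg: Self::Reg, val: u64);
}

enum X86Register {
    Eax,
    Ebx,
}

// Only here to show the compile-time rejection in `main`.
#[allow(dead_code)]
enum ArmRegister {
    R0,
}

impl Register for X86Register {
    fn to_i32(&self) -> i32 {
        // Placeholder numbers, not the real Unicorn register IDs.
        match *self {
            X86Register::Eax => 0,
            X86Register::Ebx => 1,
        }
    }
}

/// Stand-in for a real emulator: registers live in a map keyed by ID.
struct FakeX86Cpu {
    regs: HashMap<i32, u64>,
}

impl Cpu for FakeX86Cpu {
    type Reg = X86Register;

    fn reg_read(&self, reg: Self::Reg) -> u64 {
        // Registers that were never written read as zero.
        self.regs.get(&reg.to_i32()).copied().unwrap_or(0)
    }

    fn reg_write(&mut self, reg: Self::Reg, val: u64) {
        self.regs.insert(reg.to_i32(), val);
    }
}

fn main() {
    let mut cpu = FakeX86Cpu { regs: HashMap::new() };
    cpu.reg_write(X86Register::Eax, 42);
    println!("eax = {}", cpu.reg_read(X86Register::Eax));
    println!("ebx = {}", cpu.reg_read(X86Register::Ebx));

    // Uncommenting the next line fails to compile, because `FakeX86Cpu`
    // only accepts `X86Register`:
    // cpu.reg_write(ArmRegister::R0, 1);
}
```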

(hmm, now I’m wondering why @PeteVine directed the question at me - seems like a standard Rust question, and I’m really not known for coming up with good APIs)


#10

@jschievink Even though Unicorn is new, it’s still based on QEMU (the CPU emulation part is all that’s left of it) so you might be familiar with the codebase, plus I was under the impression emulators were your forte :slight_smile:


#11

Hehe, they definitely are, although I’ve never worked with QEMU


#12

Thanks, I’ll try that, it makes a lot of sense! By the way, you can cast X86Register to an i32 if you declare it as Copy; this way we don’t have to do the match for every value.

#[derive(Clone,Copy)]
enum X86Register {
    Eax,
    Ebx,
}

impl Register for X86Register {
    fn to_i32(&self) -> i32 {
        *self as i32
    }
}

Working example: https://play.rust-lang.org/?gist=2248316eb422873a04dad6b085047a46&version=stable&backtrace=0
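
One thing to keep in mind is that the cast returns the enum's discriminant, so for the bindings each variant presumably needs an explicit value that matches the C header. A small sketch of that, where the numbers 5 and 9 are placeholders I picked for the example and not the values from the Unicorn headers:

```rust
trait Register {
    fn to_i32(&self) -> i32;
}

// `repr(i32)` fixes the discriminant type, and the explicit values control
// what the cast returns. The values below are placeholders.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum X86Register {
    Eax = 5,
    Ebx = 9,
}

impl Register for X86Register {
    fn to_i32(&self) -> i32 {
        // `Copy` lets us copy the value out of the reference before casting.
        *self as i32
    }
}

fn main() {
    println!("{:?} -> {}", X86Register::Eax, X86Register::Eax.to_i32());
    println!("{:?} -> {}", X86Register::Ebx, X86Register::Ebx.to_i32());
}
```

Without explicit values the discriminants start at 0 and count up in declaration order.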


#13

@jschievink I updated the code to implement your suggestion. After fighting a bit with the compiler, I was able to implement the wrapper functions directly on the Cpu trait, which cuts down a lot on duplicated code.

The API is now much better: the casts for the registers are gone and, as a bonus, we can’t use the wrong register types, as you mentioned.

use unicorn::{Cpu, CpuX86, uc_handle};

fn main() {
    let x86_code32 : Vec<u8> = vec![0x41, 0x4a]; // INC ecx; DEC edx

    let mut emu = CpuX86::new(unicorn::Mode::MODE_32).expect("failed to instantiate emulator");
    emu.mem_map(0x1000, 0x4000, unicorn::PROT_ALL); 
    emu.mem_write(0x1000, &x86_code32); 
    emu.reg_write_i32(unicorn::RegisterX86::ECX, -10);
    emu.reg_write_i32(unicorn::RegisterX86::EDX, -50);

    emu.emu_start(0x1000, (0x1000 + x86_code32.len()) as u64, 10 * unicorn::SECOND_SCALE, 1000);
    assert_eq!(emu.reg_read_i32(unicorn::RegisterX86::ECX), Ok(-9));
    assert_eq!(emu.reg_read_i32(unicorn::RegisterX86::EDX), Ok(-51));
}
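
For anyone curious how putting the wrappers on the trait avoids duplication, here is a simplified sketch of the pattern using default methods. It is not the actual code from the bindings: `raw_reg_write`, `raw_reg_read` and `FakeCpuX86` are made-up names, the fake CPU stores registers in a `HashMap` instead of calling Unicorn, and error handling is left out:

```rust
use std::collections::HashMap;

trait Register: Copy {
    fn to_i32(&self) -> i32;
}

trait Cpu {
    type Reg: Register;

    // Required: the only methods each CPU type has to provide (hypothetical).
    fn raw_reg_write(&mut self, regid: i32, value: u64);
    fn raw_reg_read(&self, regid: i32) -> u64;

    // Provided: written once here, inherited by every CPU type.
    fn reg_write_i32(&mut self, reg: Self::Reg, value: i32) {
        // Keep the 32-bit pattern and zero-extend it to 64 bits.
        self.raw_reg_write(reg.to_i32(), value as u32 as u64);
    }

    fn reg_read_i32(&self, reg: Self::Reg) -> i32 {
        // Truncate back to the low 32 bits.
        self.raw_reg_read(reg.to_i32()) as i32
    }
}

#[derive(Clone, Copy)]
enum RegisterX86 {
    Ecx,
    Edx,
}

impl Register for RegisterX86 {
    fn to_i32(&self) -> i32 {
        *self as i32
    }
}

/// Stand-in for the real emulator handle.
struct FakeCpuX86 {
    regs: HashMap<i32, u64>,
}

impl Cpu for FakeCpuX86 {
    type Reg = RegisterX86;

    fn raw_reg_write(&mut self, regid: i32, value: u64) {
        self.regs.insert(regid, value);
    }

    fn raw_reg_read(&self, regid: i32) -> u64 {
        self.regs.get(&regid).copied().unwrap_or(0)
    }
}

fn main() {
    let mut emu = FakeCpuX86 { regs: HashMap::new() };
    emu.reg_write_i32(RegisterX86::Ecx, -10);
    emu.reg_write_i32(RegisterX86::Edx, -50);
    println!("ecx = {}", emu.reg_read_i32(RegisterX86::Ecx));
    println!("edx = {}", emu.reg_read_i32(RegisterX86::Edx));
}
```

Adding another architecture then only means a new register enum and an impl of the two raw methods.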


#14

Looks great!