At their core, VMs are really just a loop of "fetch and decode an instruction, then execute that instruction." Because of that, a good start is to handle just the decoding first by writing a disassembler. Together with an assembler (it looks like they provide you one?), this lets you validate the decoding half on its own: reassemble your disassembler's output and check that the result is byte-identical to the original, or at least sensible. (This does depend on the VM and its instructions being somewhat amenable to it, though; otherwise it's hard to tell which bytes are instructions to decode and which are just data. Use your best judgement!)
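To make the round-trip idea concrete, here is a minimal sketch. The instruction set, opcodes and encoding are all invented for illustration (they are not from your VM), and `assemble` stands in for whatever assembler you were given.

```rust
/// Hypothetical three-instruction ISA, invented purely for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Instr {
    Ld { dst: u8, addr: u8 }, // encoded as 0x01 dst addr
    Add { dst: u8, src: u8 }, // encoded as 0x02 dst src
    Halt,                     // encoded as 0xFF
}

/// Decode one instruction from the start of `bytes`.
/// Returns the instruction and how many bytes it occupied.
pub fn decode(bytes: &[u8]) -> Option<(Instr, usize)> {
    match bytes {
        [0x01, dst, addr, ..] => Some((Instr::Ld { dst: *dst, addr: *addr }, 3)),
        [0x02, dst, src, ..] => Some((Instr::Add { dst: *dst, src: *src }, 3)),
        [0xFF, ..] => Some((Instr::Halt, 1)),
        _ => None, // unknown opcode or truncated operands
    }
}

/// Encode one instruction back into bytes (the inverse of `decode`).
pub fn encode(instr: Instr) -> Vec<u8> {
    match instr {
        Instr::Ld { dst, addr } => vec![0x01, dst, addr],
        Instr::Add { dst, src } => vec![0x02, dst, src],
        Instr::Halt => vec![0xFF],
    }
}

/// Disassemble a whole program, or give up at the first undecodable byte.
pub fn disassemble(mut bytes: &[u8]) -> Option<Vec<Instr>> {
    let mut out = Vec::new();
    while !bytes.is_empty() {
        let (instr, len) = decode(bytes)?;
        out.push(instr);
        bytes = &bytes[len..];
    }
    Some(out)
}

/// Stand-in for the provided assembler.
pub fn assemble(instrs: &[Instr]) -> Vec<u8> {
    instrs.iter().flat_map(|&i| encode(i)).collect()
}

/// The round-trip check: disassemble, reassemble, compare bytes.
pub fn round_trips(program: &[u8]) -> bool {
    match disassemble(program) {
        Some(listing) => assemble(&listing) == program,
        None => false,
    }
}
```

If `round_trips` fails on a real program, that usually means either a decoding bug or that you've wandered into data rather than code.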
Then execution is generally pretty simple, depending on exactly what machine you're emulating and to what level: match on the decoded instruction and do what it says. A lot of the time this is something like ld y, x for "load a byte from address x into register y", which you can just write as something like self.registers[y] = self.memory[x], or add y, x for "add register x to register y", which is self.registers[y] += self.registers[x], for example. (Be sure you get the arguments the right way around; this isn't consistent across different machines!)
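As a rough sketch of what that loop can look like, assuming a made-up machine with four byte-wide registers, 256 bytes of memory, and invented opcodes (0x01 for ld, 0x02 for add, anything else halts):

```rust
/// Hypothetical machine; sizes and opcodes are assumptions for illustration.
pub struct Vm {
    pub registers: [u8; 4],
    pub memory: [u8; 256],
    pub pc: u8,
    pub halted: bool,
}

impl Vm {
    /// Load `program` at address 0 (anything past 256 bytes is ignored).
    pub fn new(program: &[u8]) -> Self {
        let mut memory = [0u8; 256];
        let n = program.len().min(memory.len());
        memory[..n].copy_from_slice(&program[..n]);
        Vm { registers: [0; 4], memory, pc: 0, halted: false }
    }

    /// Read the byte at `pc` and advance `pc`, wrapping at the end of memory.
    fn fetch(&mut self) -> u8 {
        let byte = self.memory[self.pc as usize];
        self.pc = self.pc.wrapping_add(1);
        byte
    }

    /// Fetch a register operand; masked so it always indexes one of the 4 registers.
    fn fetch_reg(&mut self) -> usize {
        (self.fetch() & 0x03) as usize
    }

    /// Fetch, decode and execute a single instruction.
    pub fn step(&mut self) {
        match self.fetch() {
            // ld y, x: load a byte from address x into register y
            0x01 => {
                let y = self.fetch_reg();
                let x = self.fetch() as usize;
                self.registers[y] = self.memory[x];
            }
            // add y, x: add register x to register y (wrapping, no flags here)
            0x02 => {
                let y = self.fetch_reg();
                let x = self.fetch_reg();
                self.registers[y] = self.registers[y].wrapping_add(self.registers[x]);
            }
            // This sketch treats every other opcode as "halt".
            _ => self.halted = true,
        }
    }

    /// Run until the machine halts.
    pub fn run(&mut self) {
        while !self.halted {
            self.step();
        }
    }
}
```

Keeping `step` separate from `run` is a deliberate choice: it makes single-stepping from tests or a debugger trivial later on.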
The messy part here is that in practice these instructions tend to be pretty subtle and require some care to get exactly right. For example, loading from memory may actually have side effects on some machines (like blocking to wait for keyboard input!), and addition generally sets flags like "the result was too large so it was wrapped", so you should actually be using overflowing_add, and so on. Be sure to read the details carefully!
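Here's a small sketch of both subtleties. The flag names, the `Bus` type and the input port address are all made up for illustration, and to keep the example testable the port reads from a queue instead of actually blocking on the keyboard; check your machine's spec for what it really does.

```rust
use std::collections::VecDeque;

/// Hypothetical flag set; real machines differ in which flags exist.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Flags {
    pub zero: bool,
    pub carry: bool,
}

/// Add two bytes and report the flags the addition would set.
pub fn add_with_flags(a: u8, b: u8) -> (u8, Flags) {
    // overflowing_add returns the wrapped result plus "did it wrap?"
    let (result, carry) = a.overflowing_add(b);
    (result, Flags { zero: result == 0, carry })
}

/// Assumed address of a memory-mapped input port (not from any real machine).
pub const INPUT_PORT: u8 = 0xFF;

/// Memory with one "magic" address whose reads have a side effect.
pub struct Bus {
    pub ram: [u8; 256],
    pub input: VecDeque<u8>,
}

impl Bus {
    pub fn new() -> Self {
        Bus { ram: [0; 256], input: VecDeque::new() }
    }

    /// Note `&mut self`: reading can change state, so it can't be a plain index.
    pub fn read(&mut self, addr: u8) -> u8 {
        if addr == INPUT_PORT {
            // Consumes a queued key; yields 0 when nothing is pending.
            self.input.pop_front().unwrap_or(0)
        } else {
            self.ram[addr as usize]
        }
    }
}
```

Routing every load through something like `Bus::read` from the start is much easier than retrofitting it after you've scattered `self.memory[x]` everywhere.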
You'll probably want to have unit tests for everything, though this is much more effective if you have something "known good" to validate against, ideally provided test programs along with their expected output. Otherwise you may just be reinforcing whatever incorrect assumptions you made!
You might want a debugger for your VM. This can be as simple as reading commands in a loop: at minimum a step command that calls "vm.step()", plus a bunch of inspection commands like printing memory ranges, showing register values, or disassembling. Unfortunately it's pretty tricky to do the classic "run until I hit a key" nicely, since you can't portably read stdin without blocking.
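A bare-bones version of that command loop might look like the sketch below. The `Vm` here is a placeholder whose `step` only bumps the program counter, and the command names (s, r, m, q) are just ones I picked; swap in your real VM and whatever commands you like.

```rust
use std::io::{self, BufRead, Write};

/// Placeholder VM so the sketch stands alone; not a real implementation.
pub struct Vm {
    pub registers: [u8; 4],
    pub memory: [u8; 256],
    pub pc: u8,
}

impl Vm {
    pub fn new() -> Self {
        Vm { registers: [0; 4], memory: [0; 256], pc: 0 }
    }

    /// A real `step` would fetch, decode and execute; this one only advances pc.
    pub fn step(&mut self) {
        self.pc = self.pc.wrapping_add(1);
    }
}

/// Read one command per line from `input` and write results to `out`.
/// Commands: s = step, r = registers, m START LEN = memory dump, q = quit.
pub fn debug_loop<R: BufRead, W: Write>(vm: &mut Vm, input: R, mut out: W) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        let mut words = line.split_whitespace();
        match words.next() {
            Some("s") => {
                vm.step();
                writeln!(out, "pc = {:#04x}", vm.pc)?;
            }
            Some("r") => writeln!(out, "{:02x?}", vm.registers)?,
            Some("m") => {
                // Missing or unparsable arguments fall back to defaults.
                let start: usize = words.next().and_then(|w| w.parse().ok()).unwrap_or(0);
                let len: usize = words.next().and_then(|w| w.parse().ok()).unwrap_or(16);
                // Clamp the range so a bad request can't index out of bounds.
                let start = start.min(vm.memory.len());
                let end = start.saturating_add(len).min(vm.memory.len());
                writeln!(out, "{:02x?}", &vm.memory[start..end])?;
            }
            Some("q") => break,
            Some(other) => writeln!(out, "unknown command: {other}")?,
            None => {} // blank line: do nothing
        }
    }
    Ok(())
}
```

For interactive use you'd call it as `debug_loop(&mut vm, std::io::stdin().lock(), std::io::stdout())`. Taking generic readers and writers rather than hard-coding stdin and stdout means you can also drive the debugger from a test with a string of canned commands.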
Once you have that, "real" VMs start getting really, really messy. For example, it's common to have to simulate how the screen is drawn at the instruction-cycle level, to handle the common trick of altering what's on screen while it's being drawn, which can make that code really convoluted. Then there's emulating hardware bugs, or, for more modern hardware, JIT-compiling to native code.
I'm happy to answer anything more specific you're having issues with!