Building my first assembler

Hello,

I am currently in the process of building an assembler for the nand2tetris Hack platform specification. It has been quite challenging for me, but I am happy to have gotten this far!

The program accepts a filename command line argument and reads the file.

For each symbolic command, it carries out the following tasks:

  1. Parses the symbolic command into its underlying fields (completed)

  2. For each field, generates the corresponding machine-language bits (incomplete)

  3. Assembles the binary codes into a complete machine instruction (incomplete)

I am currently working towards step 2. The assembly language has predefined instruction types and symbols (tables.json), but there are also user-defined variables and symbols that must be collected before binary translation can occur. To do this, I must make two passes over my assembly code:

For the first pass, I need to remove comments and whitespace. This pass is also a good opportunity to collect user-defined variables and symbols.

On the second pass, I plan to map each command to its binary translation. (There may be more that I need to do at this step, but for now, I want to focus on building the first pass.)
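As a rough illustration of the second-pass mapping, here is a minimal sketch for the simplest case: a numeric A-instruction such as `@21`. In the Hack machine language an A-instruction is a leading 0 bit followed by a 15-bit value. The function name `translate_a_instruction` is hypothetical, and symbolic addresses such as `@LOOP` are deliberately left unresolved here (they would need the symbol table first).

```rust
/// Translates a numeric A-instruction (e.g. "@21") into its 16-bit
/// binary string. Hypothetical helper, not taken from the original repo.
/// Returns `None` for anything that is not `@` followed by a number
/// that fits in 15 bits (symbols like "@LOOP" are not handled here).
pub fn translate_a_instruction(command: &str) -> Option<String> {
    // Drop the leading '@' and parse the rest as an unsigned number.
    let value = command.strip_prefix('@')?.parse::<u16>().ok()?;

    // The top bit is the opcode (0 for A-instructions), so the value
    // itself must fit in the remaining 15 bits.
    if value > 0x7FFF {
        return None;
    }

    // Zero-padded, 16 binary digits wide.
    Some(format!("{:016b}", value))
}
```

C-instructions would need the lookup tables for the comp, dest, and jump fields, which is presumably what tables.json is for.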

I have an idea and would love to hear your input. I am very much open to other ideas as well!

Ok...

For the sake of efficiency, I want to use BufReader:
lib.rs

use std::fs::File;
use std::io::{BufReader, Read};

pub fn run(filename: String) -> std::io::Result<()> {
    let assembly = File::open(filename)?;
    let mut buffered = BufReader::new(assembly);
    let mut contents = String::new();
    buffered.read_to_string(&mut contents)?;
    
    // Return a file with comments and white space removed
    // Collect user defined symbols and variables
    let filtered_contents = first_pass(contents)?;

    // Parse commands and generate binary translation
    // etc
    second_pass(filtered_contents)?;

    // Possibly more code here
    Ok(())
}

As you can see, I would like first_pass() to perform two tasks:

  1. Filter out comments and white space
  2. Collect user defined symbols and variables
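The two tasks above could be combined in a single loop over the lines. The following is only a sketch under some assumptions: the name `first_pass_sketch` and its return type are hypothetical (the real `first_pass` returns a `Result`), and it only records label declarations of the form `(LOOP)`. In the Hack specification a label stands for the address of the next instruction, while variables such as `@i` are usually assigned RAM addresses later, starting at 16, so they are not collected here.

```rust
use std::collections::HashMap;

/// Strips comments and whitespace, and records each `(LABEL)` together
/// with the address of the instruction that follows it.
/// Hypothetical sketch, not the original repo's `first_pass`.
pub fn first_pass_sketch(contents: &str) -> (Vec<String>, HashMap<String, u16>) {
    let mut commands = Vec::new();
    let mut labels = HashMap::new();

    for line in contents.lines() {
        // Keep only the part before a `//` comment, if any.
        let code = line.split("//").next().unwrap_or("");

        // Remove every whitespace character, including inner spaces.
        let cleaned: String = code.chars().filter(|c| !c.is_whitespace()).collect();
        if cleaned.is_empty() {
            continue;
        }

        // A label declaration does not produce an instruction; it names
        // the address of the next real command.
        if let Some(name) = cleaned
            .strip_prefix('(')
            .and_then(|rest| rest.strip_suffix(')'))
        {
            labels.insert(name.to_string(), commands.len() as u16);
        } else {
            commands.push(cleaned);
        }
    }

    (commands, labels)
}
```

Returning the cleaned commands as a `Vec<String>` avoids writing an intermediate file, which may make a `BufWriter` unnecessary for this step.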

I am having trouble understanding how to utilize BufReader and BufWriter inside of first_pass.

I started playing around with BufWriter and BufReader inside first_pass() in lib.rs.
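For reference, one way BufReader and BufWriter could be used together is to stream the input line by line instead of loading it into a single String. This is a minimal sketch, assuming a hypothetical function `filter_lines` that is generic over any buffered reader and any writer, so it works with `BufReader<File>` as well as in-memory buffers.

```rust
use std::io::{self, BufRead, BufWriter, Write};

/// Copies `reader` to `writer` line by line, dropping comments,
/// whitespace, and empty lines. Hypothetical helper for illustration.
pub fn filter_lines<R: BufRead, W: Write>(reader: R, writer: W) -> io::Result<()> {
    // Buffer the output so each `writeln!` does not hit the sink directly.
    let mut out = BufWriter::new(writer);

    // `lines()` comes from the `BufRead` trait and yields `io::Result<String>`.
    for line in reader.lines() {
        let line = line?;
        let code = line.split("//").next().unwrap_or("");
        let cleaned: String = code.chars().filter(|c| !c.is_whitespace()).collect();
        if !cleaned.is_empty() {
            writeln!(out, "{}", cleaned)?;
        }
    }

    // Flush explicitly so write errors are reported instead of being
    // silently dropped when the BufWriter goes out of scope.
    out.flush()
}
```

With files, this might be called as `filter_lines(BufReader::new(File::open("in.asm")?), File::create("out.asm")?)`, where both file names are placeholders.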

What are your thoughts on this approach in general?

Is BufWriter a good idea?

Are there any other approaches that may be more favorable than mine?

Instead of this

let assembly = File::open(filename)?;
let mut buffered = BufReader::new(assembly);
let mut contents = String::new();
buffered.read_to_string(&mut contents)?;

You can do

let contents = std::fs::read_to_string(filename)?;

This is both simpler and more efficient, because it pre-allocates room in the string based on the file size.

Also, your repo doesn't currently build (tested at commit hash 9797263ba3ef968678ba6b89dabe16440cf044c7). May want to get that fixed.
