Nom: Referencing external variables from parser(s)

I'm fairly new to the `nom` crate, and while I think I have the hang of it, I've run into a problem. I'm trying to write a parser for the MikuMikuDance .pmx 3D model file format. Parsing the file header was no problem, but when I started on the vertex parser, I realised I need a value from the header (call it X) that tells me to read X additional bytes for each vertex. I'm not sure whether there's a macro that allows referencing an external variable. I've looked at a few crates and examples built on nom and I'm stumped. What I have below is incomplete, pending a solution to this issue:

#[macro_use]
extern crate nom;

pub mod pmx {
	// Imported inside the module so the parsers below can see these names.
	use nom::{le_f32, le_u32, le_u8};

	#[derive(Clone,Debug,PartialEq,Eq)]
	pub enum Encoding {
		UTF16LE,
		UTF8,
	}

	#[derive(Clone,Debug,PartialEq,Eq)]
	pub struct Settings {
		encoding: Encoding,
		uv: u8,
		vertex_index_size: u8,
		texture_index_size: u8,
		material_index_size: u8,
		bone_index_size: u8,
		morph_index_size: u8,
		rigid_body_index_size: u8,
	}

	// `Eq` cannot be derived here because `f32` does not implement it.
	#[derive(Clone,Debug,PartialEq)]
	pub struct Header {
		version: f32,
		settings: Settings,
		name_local: String,
		name_global: String,
		comments_local: String,
		comments_global: String,
	}

	// `Eq` cannot be derived here because `f32` does not implement it.
	#[derive(Clone,Debug,PartialEq)]
	pub struct Vertex {
		position: [f32; 3],
		normal: [f32; 3],
		uv: [f32; 2],
		uva: Vec<[f32; 4]>,
	}

	// `Eq` cannot be derived here because the fields contain `f32` values.
	#[derive(Clone,Debug,PartialEq)]
	pub struct Model {
		header: Header,
		vertices: Vec<Vertex>,
	}

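	named!(pub utf16<String>,
		do_parse!(
			len: le_u32 >>
			text: take!(len) >>
			// Assumes `len` is a byte count; bytes are paired into little-endian UTF-16 units.
			(String::from_utf16_lossy(
				&text.chunks_exact(2)
					.map(|c| u16::from(c[0]) | (u16::from(c[1]) << 8))
					.collect::<Vec<u16>>()
			))
		)
	);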

	named!(pub settings<Settings>,
		do_parse!(
			settings_count: le_u8 >>
			encoding: switch!(le_u8,
				0 => value!(Encoding::UTF16LE) |
				1 => value!(Encoding::UTF8)
			) >>
			uv: le_u8 >>
			vertex_index_size: le_u8 >>
			texture_index_size: le_u8 >>
			material_index_size: le_u8 >>
			bone_index_size: le_u8 >>
			morph_index_size: le_u8 >>
			rigid_body_index_size: le_u8 >>
			extra: take!(settings_count - 8) >>
			
			(Settings {
				encoding: encoding,
				uv: uv,
				vertex_index_size: vertex_index_size,
				texture_index_size: texture_index_size,
				material_index_size: material_index_size,
				bone_index_size: bone_index_size,
				morph_index_size: morph_index_size,
				rigid_body_index_size: rigid_body_index_size,
			})
		)
	);

	named!(pub header<Header>,
		do_parse!(
			tag!("PMX ") >>
			version: le_f32 >>
			settings: settings >>
			name_local: utf16 >>
			name_global: utf16 >>
			comments_local: utf16 >>
			comments_global: utf16 >>
			
			(Header {
				version: version,
				settings: settings,
				name_local: name_local,
				name_global: name_global,
				comments_local: comments_local,
				comments_global: comments_global,
			})
		)
	);
	
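	named!(pub vertex<Vertex>,
		do_parse!(
			// Incomplete: `header.settings.uv` is needed here to know how many
			// additional values to read, but the header is not in scope.
			px: le_f32 >>
			py: le_f32 >>
			pz: le_f32 >>
			nx: le_f32 >>
			ny: le_f32 >>
			nz: le_f32 >>
			uvx: le_f32 >>
			uvy: le_f32 >>
			(Vertex {
				position: [px, py, pz],
				normal: [nx, ny, nz],
				uv: [uvx, uvy],
				uva: Vec::new(), // placeholder until the count is available
			})
		)
	);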
}

I think the easiest thing to do here is to add an argument to the vertex parser. The first argument is always the input &[u8], but any further arguments are yours to define and are supplied at the invocation, so you could pass a u8 or u32 for your length value and then use it however your parser needs.

The macro will be something like named_args!(pub vertex(bytes: u8)<Vertex>, ...); the input slice is added for you as the first parameter, so only the extra arguments are listed. The named_args! macro simply lets you add these arguments. The nom documentation also shows how you could write the function manually, if necessary.

You'll still invoke the vertex function as normal. If you invoke it inside another nom parser, you'll need the call! macro, for example call!(vertex, header.settings.uv), which passes the additional argument while handling the input and output in the usual manner.
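
As a dependency-free sketch of what such a parser does when written by hand (plain Rust rather than nom macros, so it compiles without the crate), something like the following could work. The helper names `read_f32_le` and `read_f32s`, the parameter name `uv_count`, and the `()` error type are assumptions made for illustration; a real nom parser would return `nom::IResult` instead. It also assumes each additional value is four f32s, matching the `Vec<[f32; 4]>` field in the question.

```rust
/// Same shape as the `Vertex` struct in the question.
#[derive(Debug, Clone, PartialEq)]
pub struct Vertex {
    pub position: [f32; 3],
    pub normal: [f32; 3],
    pub uv: [f32; 2],
    pub uva: Vec<[f32; 4]>,
}

/// Hypothetical helper: read one little-endian f32 and return the rest of the input.
fn read_f32_le(input: &[u8]) -> Result<(&[u8], f32), ()> {
    if input.len() < 4 {
        return Err(()); // not enough bytes left
    }
    let value = f32::from_le_bytes([input[0], input[1], input[2], input[3]]);
    Ok((&input[4..], value))
}

/// Hypothetical helper: fill `out` with consecutive f32s and return the rest of the input.
fn read_f32s<'a>(mut input: &'a [u8], out: &mut [f32]) -> Result<&'a [u8], ()> {
    for slot in out.iter_mut() {
        let (rest, value) = read_f32_le(input)?;
        *slot = value;
        input = rest;
    }
    Ok(input)
}

/// The extra `uv_count` argument carries the value that was parsed from the header.
pub fn vertex(input: &[u8], uv_count: u8) -> Result<(&[u8], Vertex), ()> {
    let mut position = [0.0f32; 3];
    let mut normal = [0.0f32; 3];
    let mut uv = [0.0f32; 2];
    let input = read_f32s(input, &mut position)?;
    let input = read_f32s(input, &mut normal)?;
    let mut input = read_f32s(input, &mut uv)?;

    // Read as many additional vec4 values as the header asked for.
    let mut uva = Vec::with_capacity(uv_count as usize);
    for _ in 0..uv_count {
        let mut extra = [0.0f32; 4];
        input = read_f32s(input, &mut extra)?;
        uva.push(extra);
    }
    Ok((input, Vertex { position, normal, uv, uva }))
}
```

The caller parses the header first and then passes the relevant field along, e.g. `vertex(rest, header.settings.uv)`, which is the role call! plays inside a nom macro.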


named_args!... Alright, thanks. It was right in front of my face; now I feel silly. Hehe.

Oh! Also, is there a way in named! or named_args! to declare a variable that isn't a parser? I want to use a 4K ring buffer, but having the caller supply it seems unorthodox, since it has no significance outside the parser function.

method! works for implementing nom parsers as methods on a struct that holds the state.

Otherwise, you can just fall back to writing the fn manually:

named! ( my_parser ( &str ) -> &str,
    do_parse!(
        ...
    )
);
// equivalent to:
fn my_parser(i: &str) -> nom::IResult<&str, &str> {
    do_parse!(i,
        ...
    )
}
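
Once the function is written by hand, a parser-local buffer is just a local variable in its body. Here is a dependency-free sketch of that idea in plain Rust. The two-byte record format, the `decode` name, and the `()` error type are invented purely for illustration and have nothing to do with PMX or with nom's API.

```rust
const RING_SIZE: usize = 4096;

/// Decode a toy stream (invented for this sketch) of two-byte records:
///   [0, b] => emit the literal byte b
///   [1, d] => emit the byte that was written d + 1 positions earlier
/// Returns the unconsumed input together with the decoded bytes.
pub fn decode(mut input: &[u8]) -> Result<(&[u8], Vec<u8>), ()> {
    // Parser-local state: the caller never sees or supplies the ring buffer.
    let mut ring = [0u8; RING_SIZE];
    let mut write_pos = 0usize;
    let mut out = Vec::new();

    while input.len() >= 2 {
        let (tag, arg) = (input[0], input[1]);
        let byte = match tag {
            0 => arg,
            1 => {
                let distance = arg as usize + 1;
                if distance > out.len() {
                    return Err(()); // back-reference points before the start
                }
                ring[(write_pos + RING_SIZE - distance) % RING_SIZE]
            }
            _ => break, // unknown tag: stop and leave it for the caller
        };
        // Record the byte in the ring so later back-references can find it.
        ring[write_pos] = byte;
        write_pos = (write_pos + 1) % RING_SIZE;
        out.push(byte);
        input = &input[2..];
    }
    Ok((input, out))
}
```

The signature stays the same as any other parser of this shape (input in, remaining input plus value out), so the buffer remains an implementation detail.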

Would do_parse! be the return statement or would it be better to assign it?
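
For what it's worth, a macro invocation in Rust is an expression, so either form should work: leave it as the last expression of the function to return it directly, or bind it with `let` if the result needs further processing. The sketch below illustrates this with a hypothetical `first_byte!` macro standing in for a parser macro; it is not part of nom, and the `()` error type is an assumption made to keep the example dependency-free.

```rust
/// Hypothetical stand-in for a parser macro: expands to an expression
/// of type Result<(&[u8], u8), ()>.
macro_rules! first_byte {
    ($i:expr) => {
        match $i.split_first() {
            Some((b, rest)) => Ok((rest, *b)),
            None => Err(()),
        }
    };
}

/// Form 1: the macro invocation is the tail expression, so its value is returned.
fn as_tail(i: &[u8]) -> Result<(&[u8], u8), ()> {
    first_byte!(i)
}

/// Form 2: bind the result first, e.g. to post-process it before returning.
fn as_binding(i: &[u8]) -> Result<(&[u8], u8), ()> {
    let result = first_byte!(i);
    result.map(|(rest, b)| (rest, b.wrapping_add(1)))
}
```

The first form is the shorter one when nothing else needs to happen; the second is useful when the parsed value has to be adjusted or combined with local state before it is returned.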