First time project: Modular synthesizer

Hi! I've recently been working on a modular audio synthesizer in my spare time, and it has been truly incredible to work with Rust's compiler! I've put all the source code on GitHub: GitHub - smj-edison/synth: Yet another synth written in rust.

I've run into some limitations with my current design, however. From the get-go, I decided not to use the typical digital synthesizer design, that is, a node network with pointers scattered all over memory that is hard to follow. I instead opted for a more top-down approach. I figured this would keep the RAM less fragmented (for caching), as well as allow me to use a lot more compile-time checks.

For example, instead of doing (node network approach):

let osc = OscillatorNode::new();
let gain = GainNode::new();

osc.connect(gain);
gain.connect(audio_out);

fn process_audio() -> f32 {
  gain.receive()
}
  

I'm designing it more like (top-down approach):

let osc = OscillatorNode::new();
let gain = GainNode::new();

fn process_audio() -> f32 {
  osc.process();
  gain.receive(osc);
  gain.get_out()
}

Another thing that I wanted to be able to do was to have common IO between nodes. This led to a design like this (from engine/src/node.rs):

pub trait AudioNode {
    fn process(&mut self);
    fn receive_audio(&mut self, input_type: InputType, input: f32) -> Result<(), SimpleError>;
    fn get_output_audio(&self, output_type: OutputType) -> Result<f32, SimpleError>;
}

This design was working pretty well until I started setting up "pipelines," a function where sound is inputted, travels through a series of nodes, and spits out the processed audio. It ended up becoming somewhat unwieldy, looking like this (in src/pipeline/midi_oscillator.rs):

fn process(&mut self) {
    // -- snip --
    self.osc.process();

    self.envelope.receive_audio(InputType::Gate, if self.notes_on > 0 {1.0} else {0.0}).unwrap();
    self.envelope.process();

    self.gain.receive_audio(InputType::In, self.osc.get_output_audio(OutputType::Out).unwrap()).unwrap();
    self.gain.set_gain(self.envelope.get_output_audio(OutputType::Out).unwrap());
    self.gain.process();

    self.output_out = self.gain.get_output_audio(OutputType::Out).unwrap();
}

Another annoyance is that I have three or four different node types that accept audio on a single channel (InputType::In), but for each one I have to write

fn receive_audio(&mut self, input_type: InputType, input: f32) -> Result<(), SimpleError> {
    match input_type {
        InputType::In => self.input_in = input,
        _ => bail!("Cannot receive {:?}", input_type),
    }

    Ok(())
}

I'm sure there are a lot of other places in my code that aren't as ergonomic as they could be; those were the biggest ones I could think of. It would be great to get advice on big structural things, as well as on anything that could be done in a more "rusty" way.

I'm happy to do any refactors! I figure now's the best time before my codebase gets any bigger :slight_smile:

Thank you in advance!


Hi Mason! Very cool project ... audio synthesis is really a fascinating topic.

Some shallow stuff I notice right away:

  • You'll want to write something in your README, e.g. about your top-down approach.

  • If you have multiple crates in a repository, you want to use a Cargo workspace (so that there's only one Cargo.lock and only one build directory).

Thoughts on the code

pub enum OutputType {
    Out,
    None,
}

// and then somewhere else

    fn get_output_audio(&self, output_type: OutputType) -> Result<f32, SimpleError> {
        match output_type {
            OutputType::Out => Ok(self.output_out),
            _ => bail!("Cannot output {:?}", output_type),
        }
    }

A main part of the Rust philosophy is turning runtime errors into compile time errors by making illegal state unrepresentable. So I would probably remove None from the enum and instead wrap it in an Option if None is necessary. If there's only one output type I'd of course ditch the enum completely.
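As a rough illustration of that point, here is a minimal sketch, assuming a node with a single output port. The simplified `Gain` struct and the `read_output` helper are hypothetical and not taken from the repository.

```rust
/// Only ports that really exist are listed; there is no `None` variant.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OutputType {
    Out,
}

/// Hypothetical, simplified gain node used only for illustration.
pub struct Gain {
    output_out: f32,
}

impl Gain {
    /// Every `OutputType` value is valid here, so no `Result` is needed.
    pub fn get_output_audio(&self, output_type: OutputType) -> f32 {
        match output_type {
            OutputType::Out => self.output_out,
        }
    }
}

/// "No port selected" is expressed with `Option` at the call site
/// instead of a `None` variant inside the enum.
pub fn read_output(node: &Gain, selected: Option<OutputType>) -> Option<f32> {
    selected.map(|port| node.get_output_audio(port))
}

fn main() {
    let gain = Gain { output_out: 0.5 };
    println!("{:?}", read_output(&gain, Some(OutputType::Out)));
    println!("{:?}", read_output(&gain, None));
}
```

Because the `match` is exhaustive without a catch-all arm, adding a new port later makes the compiler point at every place that needs updating.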

This also means using more granular traits. You don't want to have receive_audio in your AudioNode trait if some nodes are unable to receive audio (like oscillators). So it probably makes sense to split up your trait in two parts, I think in audio these are commonly called Sink and Source.
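A minimal sketch of such a split could look like the following. The trait names `AudioSource` and `AudioSink`, the simplified node structs, and the `connect_once` helper are assumptions made for illustration, not the repository's actual API.

```rust
/// Something that produces audio samples.
pub trait AudioSource {
    fn get_output_audio(&self) -> f32;
}

/// Something that accepts audio samples.
pub trait AudioSink {
    fn receive_audio(&mut self, input: f32);
}

/// Hypothetical oscillator: a source only, it cannot receive audio.
pub struct Oscillator {
    value: f32,
}

impl AudioSource for Oscillator {
    fn get_output_audio(&self) -> f32 {
        self.value
    }
}

/// Hypothetical gain node: both a sink and a source.
pub struct Gain {
    input_in: f32,
    gain: f32,
}

impl AudioSink for Gain {
    fn receive_audio(&mut self, input: f32) {
        self.input_in = input;
    }
}

impl AudioSource for Gain {
    fn get_output_audio(&self) -> f32 {
        self.input_in * self.gain
    }
}

/// Moves one sample from a source into a sink.
pub fn connect_once(source: &impl AudioSource, sink: &mut impl AudioSink) {
    sink.receive_audio(source.get_output_audio());
}

fn main() {
    let osc = Oscillator { value: 0.5 };
    let mut gain = Gain { input_in: 0.0, gain: 0.5 };
    connect_once(&osc, &mut gain);
    println!("{}", gain.get_output_audio());
}
```

With this shape, passing an `Oscillator` as the sink argument is rejected at compile time, because `Oscillator` does not implement `AudioSink`.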

I also don't think that a catch-all InputType enum is a good idea. Checking at runtime, in every receive_audio implementation, that the given input type is actually supported by the current node is not only cumbersome but also introduces potential runtime errors that could be avoided. For example the trait could be made generic over a Command type, so that FilterOffset can only be input to a Filter but not e.g. a Gain.

It looks like you might be able to merge process and get_output_audio. For example with a Filter I'd assume that you shouldn't be able to call get_output_audio without calling process first. And calling process twice but get_output_audio only once is something that your API currently allows but is probably undesirable in general? I don't see the point of processing if you don't want the output, but I might be missing something.
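One way to picture the merged method is the sketch below. The `AudioProcessor` trait and the one-pole low-pass `Filter` are hypothetical stand-ins chosen for illustration; the real nodes may need a different signature (an oscillator, for instance, has no input).

```rust
/// Hypothetical trait: processing a sample and reading the result
/// happen in a single call, so the two cannot get out of sync.
pub trait AudioProcessor {
    fn process(&mut self, input: f32) -> f32;
}

/// Simple one-pole low-pass filter used as a stand-in node.
pub struct Filter {
    coefficient: f32,
    state: f32,
}

impl AudioProcessor for Filter {
    fn process(&mut self, input: f32) -> f32 {
        // Move the internal state a fraction of the way toward the input.
        self.state += self.coefficient * (input - self.state);
        self.state
    }
}

fn main() {
    let mut filter = Filter { coefficient: 0.5, state: 0.0 };
    let first = filter.process(1.0);
    let second = filter.process(1.0);
    println!("{first} {second}");
}
```

If several downstream nodes need the same sample, the returned value can be kept in a local variable and handed to each of them.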

Lastly I'd encourage you to look at the existing Rust audio libraries and how they're implemented. dasp and nannou_audio seem interesting. Oh and you should probably also ask for feedback at the Rust Audio discourse :slight_smile:


Thank you for the advice! I'll add some things to the README, and use cargo workspaces :slight_smile:

My thought process (correct me if I'm wrong) for defining both receive_audio and get_output_audio in the AudioNode trait was so I could confidently store any audio node of the type Box<dyn AudioNode> in an array (to simplify the process of creating a pipeline and potentially be dynamic). I would say that then something like Box<dyn AudioSink + AudioNode + AudioSource> would work, except now not all the nodes implement AudioSink. I suppose I could fix this by typecasting, but that introduces runtime checks again. Would enums be able to resolve this?

I wasn't quite sure what you meant by:

For example the trait could be made generic over a Command type, so that FilterOffset can only be input to a Filter but not e.g. a Gain.

Would you mind elaborating? What I understood from it was that it would be worthwhile creating a unique trait for each type of audio in and out, but I wasn't sure :slight_smile:

The reason I separated process and get_output_audio was in case multiple nodes needed to receive audio from one node, but I'm now realizing that I just need to store that value instead of calling receive multiple times. :man_facepalming:

Huh, I'll be sure to check out those crates, they seem well thought out. I didn't realize there was a rust audio discourse either!

Also thank you for taking the time to really look over my code and thought process!

You're welcome :slight_smile:

I don't see how Vec<Box<dyn AudioNode>> helps with creating a pipeline. I imagine that for creating a functioning pipeline it's important that you connect the right nodes with each other. In particular with Rust you don't want malformed pipelines to compile (e.g. oscillator -> oscillator). This requires distinct Sink & Source types ... I don't see how a catchall type helps with that at all. I imagine that each source node can be connected to several sink nodes, for which Vec<Box<dyn AudioSink>> would work just fine.
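A minimal sketch of that fan-out, assuming hypothetical `AudioSource` and `AudioSink` traits and simplified node structs (the `Debug` supertrait is only there so the boxed sinks can be inspected):

```rust
use std::fmt::Debug;

pub trait AudioSource {
    fn get_output_audio(&self) -> f32;
}

/// `Debug` is required only so boxed sinks can be printed.
pub trait AudioSink: Debug {
    fn receive_audio(&mut self, input: f32);
}

/// Hypothetical source node.
pub struct Oscillator {
    value: f32,
}

impl AudioSource for Oscillator {
    fn get_output_audio(&self) -> f32 {
        self.value
    }
}

/// Hypothetical sink nodes.
#[derive(Debug)]
pub struct Gain {
    input_in: f32,
}

#[derive(Debug)]
pub struct Filter {
    input_in: f32,
}

impl AudioSink for Gain {
    fn receive_audio(&mut self, input: f32) {
        self.input_in = input;
    }
}

impl AudioSink for Filter {
    fn receive_audio(&mut self, input: f32) {
        self.input_in = input;
    }
}

/// Reads the source once and hands the same sample to every sink.
pub fn fan_out(source: &dyn AudioSource, sinks: &mut [Box<dyn AudioSink>]) {
    let sample = source.get_output_audio();
    for sink in sinks.iter_mut() {
        sink.receive_audio(sample);
    }
}

fn main() {
    let osc = Oscillator { value: 0.25 };
    let mut sinks: Vec<Box<dyn AudioSink>> = vec![
        Box::new(Gain { input_in: 0.0 }),
        Box::new(Filter { input_in: 0.0 }),
    ];
    fan_out(&osc, &mut sinks);
    println!("{:?}", sinks);
}
```

An `Oscillator` cannot be pushed into `sinks`, since it does not implement `AudioSink`, so an oscillator -> oscillator connection fails to compile.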

Would you mind elaborating?

I was thinking of adding an associated Input type to the Source trait but I now realize that this isn't actually such a great idea. I think really the best way would be to keep it simple with e.g.

impl Filter {
    fn set_offset(&mut self, offset: f32) {
        self.filter_offset_in = offset;
    }
}

That is, instead of having an InputType::FilterOffset that can be passed to every single node's receive_audio function. I would remove the InputType parameter from receive_audio and use that function only for receiving audio.

I hope that cleared things up :slight_smile:

Lastly, something to consider is how limiting your top-down approach is. I imagine that manually playing the transport layer between nodes becomes cumbersome really fast. Using pointers of course makes building up complex node networks much simpler. Given Rust's guarantees you would of course have to use something like reference-counted smart pointers for networks like the following:

[image: example node network]

Pointers are generally cheap, so I don't think that you have to avoid them to build a memory efficient node network. Debuggability is of course more complex but I think that can be reasonably achieved via visualizations / being able to execute the network step by step.
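To give a feel for the reference-counted approach, here is a minimal single-threaded sketch using `Rc<RefCell<...>>`. The pull-style `Node` trait, the node structs, and `build_network` are all hypothetical, and the network built here is an arbitrary example with one source shared by two gains, not a reconstruction of any particular diagram.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical pull-style node: asking for a sample pulls from its inputs.
pub trait Node {
    fn sample(&mut self) -> f32;
}

/// A node that can be shared by several downstream nodes.
pub type SharedNode = Rc<RefCell<dyn Node>>;

/// Stand-in for an oscillator: always yields the same value.
pub struct Constant {
    value: f32,
}

impl Node for Constant {
    fn sample(&mut self) -> f32 {
        self.value
    }
}

pub struct Gain {
    input: SharedNode,
    gain: f32,
}

impl Node for Gain {
    fn sample(&mut self) -> f32 {
        // The borrow ends as soon as the input has produced its sample.
        self.input.borrow_mut().sample() * self.gain
    }
}

pub struct Mixer {
    inputs: Vec<SharedNode>,
}

impl Node for Mixer {
    fn sample(&mut self) -> f32 {
        self.inputs.iter().map(|node| node.borrow_mut().sample()).sum()
    }
}

/// One source feeding two gains, which are summed by a mixer.
pub fn build_network() -> Mixer {
    let source: SharedNode = Rc::new(RefCell::new(Constant { value: 1.0 }));
    let quiet: SharedNode = Rc::new(RefCell::new(Gain {
        input: Rc::clone(&source),
        gain: 0.25,
    }));
    let loud: SharedNode = Rc::new(RefCell::new(Gain {
        input: Rc::clone(&source),
        gain: 0.5,
    }));
    Mixer { inputs: vec![quiet, loud] }
}

fn main() {
    let mut mixer = build_network();
    println!("{}", mixer.sample());
}
```

`Rc` and `RefCell` are not thread-safe, so a network that has to cross threads would need something like `Arc` with a lock instead. A stateful source that is pulled by several consumers would also need to cache its sample per step, which this sketch leaves out.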
