Hi, I'm currently working on implementing some machine learning models in Rust.
First, I want to implement a trait like `Layer` representing a neural network layer; this is needed for dynamic layer use and composition.
It should support forward (applying the layer function), backward (computing the derivative information of the layer), and updating its parameters (adding a scaled delta of the parameters into `self`).
The second and third features hit the trait-object limitation "The trait cannot have any method taking Self as an argument" (Trait Objects - The Rust Programming Language).
In other words, it is impossible, for example, to add two trait objects together.
The following is an example containing two problematic parts:
```rust
use std::any::Any;

pub trait Layer<T> {
    /// Returns the kind of the layer (like `Linear`, `Conv2D`, ...).
    fn kind(&self) -> &str;
    /// Returns the dimension of the input vector.
    fn dim_input(&self) -> usize;
    /// Returns the dimension of the predict vector.
    fn dim_predict(&self) -> usize;
    /// Adds the layer gradient `dlayer`, scaled by `beta`, to `self`. (Problematic part 1)
    /// `dlayer` is the value returned by `backward`.
    fn add_scaled_dlayer(&mut self, dlayer: &Box<dyn Any>, beta: T);
    /// Calculates the forward step.
    fn forward(&self, input: &[T], predict: &mut [T]);
    /// Calculates the backward step. (Problematic part 2)
    /// `dlayer` stores the derivative information (its type should be the same as the layer itself).
    fn backward(&self, input: &[T], dpredict: &[T], dlayer: &mut Box<Self>, dinput: Option<&mut [T]>);
}
```
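One standard way to keep such a trait object-safe is to constrain the offending method with `where Self: Sized`: the trait can then be boxed as `Box<dyn Layer<T>>`, but the constrained method is simply unavailable through the trait object. A minimal sketch of that trade-off, with a hypothetical `Linear` layer and a toy gradient (not the actual trait above, which I've trimmed for brevity):

```rust
pub trait Layer<T> {
    fn kind(&self) -> &str;
    fn forward(&self, input: &[T], predict: &mut [T]);
    // Opting this method out of the vtable keeps the trait object-safe,
    // but it cannot be called on a Box<dyn Layer<T>>.
    fn backward(&self, input: &[T], dpredict: &[T], dlayer: &mut Box<Self>)
    where
        Self: Sized;
}

struct Linear {
    weight: f32,
}

impl Layer<f32> for Linear {
    fn kind(&self) -> &str {
        "Linear"
    }
    fn forward(&self, input: &[f32], predict: &mut [f32]) {
        for (p, x) in predict.iter_mut().zip(input) {
            *p = self.weight * x;
        }
    }
    fn backward(&self, input: &[f32], dpredict: &[f32], dlayer: &mut Box<Self>) {
        // Toy gradient for y = w * x: dw = sum(x_i * dpredict_i).
        dlayer.weight = input.iter().zip(dpredict).map(|(x, d)| x * d).sum();
    }
}

// forward works fine through the trait object...
fn apply(layer: &dyn Layer<f32>, input: &[f32]) -> Vec<f32> {
    let mut out = vec![0.0; input.len()];
    layer.forward(input, &mut out);
    out
}
// ...but `layer.backward(...)` would not compile in `apply`.
```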
So far I have considered the workaround of using `Box<dyn Any>` instead of `Box<Self>`, but that makes it impossible to implement a `Model` trait containing multiple `Layer`s (which is also needed for dynamic use).
This is because a `Model` should check that the input and output dimensions of consecutive layers are consistent, and its `add_scaled_dmodel` method (the analogue of `Layer::add_scaled_dlayer`) must call the `add_scaled_dlayer` method of each layer.
If downcasting `Box<dyn Any>` into `Box<Self>` were possible, this workaround would resolve the problem, but that also seems to be impossible...
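For what it's worth, the downcast does work when the target is a known concrete type: `downcast_ref` recovers it from the type-erased `Box<dyn Any>` inside the concrete impl (the difficulty is only doing this generically through `Self`). A minimal sketch with a hypothetical `Linear` struct:

```rust
use std::any::Any;

struct Linear {
    weights: Vec<f32>,
}

impl Linear {
    /// Adds `beta * dlayer` to the weights, recovering the concrete
    /// type from the type-erased gradient with `downcast_ref`.
    fn add_scaled_dlayer(&mut self, dlayer: &Box<dyn Any>, beta: f32) {
        let d = dlayer
            .downcast_ref::<Linear>()
            .expect("dlayer must be a Linear gradient");
        for (w, dw) in self.weights.iter_mut().zip(&d.weights) {
            *w += beta * dw;
        }
    }
}
```

The cost is that every concrete impl repeats this boilerplate and the type check moves to runtime, but it does let each layer consume its own gradient type behind the erased interface.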
Furthermore, using `Box<dyn Layer<T>>` instead of `Box<Self>` makes it impossible to extract the concrete `Self` type from the `Box<dyn Layer<T>>`...
This extraction is needed, for example, to add the scaled `dlayer` into `self` in the `add_scaled_dlayer` method.
Some machine learning libraries seem to circumvent this problem by storing all parameters in a flat vector type like `Vec` everywhere.
For example, see leaf::layer::Layer - Rust.
However, I want to implement more complex layers like recurrent units, and serializing all their parameters into one vector makes the code hard to write.
In this direction, it might be better to use a macro for serializing the parameters into the vector and extracting them again.
I tried this approach, but it seemed impossible to write a macro that serializes multiple member objects into the vector sequentially (especially when managing the offsets into the vector).
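To make the flat-`Vec` idea concrete, here is a minimal sketch (with hypothetical `(offset, len)` bookkeeping) of how several parameter tensors of a layer could live in one buffer, so that `add_scaled` becomes a single elementwise loop. Keeping these offsets in sync by hand across every layer type is exactly the tedium a macro would have to automate:

```rust
/// All parameters of a layer flattened into one buffer; each logical
/// tensor is an (offset, len) view into `data`.
struct FlatParams {
    data: Vec<f32>,
    views: Vec<(usize, usize)>, // hypothetical per-tensor bookkeeping
}

impl FlatParams {
    /// Builds the flat buffer from separate tensors, recording offsets.
    fn from_tensors(tensors: &[&[f32]]) -> FlatParams {
        let mut data = Vec::new();
        let mut views = Vec::new();
        for t in tensors {
            views.push((data.len(), t.len()));
            data.extend_from_slice(t);
        }
        FlatParams { data, views }
    }

    /// self += beta * other, over all tensors at once.
    fn add_scaled(&mut self, other: &FlatParams, beta: f32) {
        for (p, d) in self.data.iter_mut().zip(&other.data) {
            *p += beta * d;
        }
    }

    /// Borrows the i-th logical tensor back out of the flat buffer.
    fn tensor(&self, i: usize) -> &[f32] {
        let (off, len) = self.views[i];
        &self.data[off..off + len]
    }
}
```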
If you have any ideas, please let me know.