C++ vs. Rust: Storing particle data in a stream of bytes, safely reading/writing data

Hi,

I'm trying (as an exercise) to convert an existing C++ library into Rust code. The original C++ library can be found here:

http://www.disneyanimation.com/technology/partio.html

Source code:

https://github.com/wdas/partio

So far I have avoided some topics like inheritance, but I have ported parts of a test.cpp file (which uses parts of the library) and tried to use similar names for the structs and members. The Rust code can be found here:

https://github.com/wahn/rs_partio

But let's get to the questions I have. Here is a function used in the original C++ code:

Partio::ParticlesDataMutable* makeData()                                                                                      
{                                                                                                                             
    Partio::ParticlesDataMutable& foo=*Partio::create();                                                                      
    Partio::ParticleAttribute positionAttr=foo.addAttribute("position",Partio::VECTOR,3);                                     
    Partio::ParticleAttribute lifeAttr=foo.addAttribute("life",Partio::FLOAT,2);                                              
    Partio::ParticleAttribute idAttr=foo.addAttribute("id",Partio::INT,1);                                                    
                                                                                                                              
    for(int i=0;i<5;i++){                                                                                                     
        Partio::ParticleIndex index=foo.addParticle();                                                                        
        float* pos=foo.dataWrite<float>(positionAttr,index);                                                                  
        float* life=foo.dataWrite<float>(lifeAttr,index);                                                                     
        int* id=foo.dataWrite<int>(idAttr,index);                                                                             
                                                                                                                              
        pos[0]=.1*i;                                                                                                          
        pos[1]=.1*(i+1);                                                                                                      
        pos[2]=.1*(i+2);                                                                                                      
        life[0]=-1.2+i;                                                                                                       
        life[1]=10.;                                                                                                          
        id[0]=index;                                                                                                          
                                                                                                                              
    }                                                                                                                         
    return &foo;                                                                                                              
}                                                                                                                             

Data gets stored by a class called ParticlesSimple:

class ParticlesSimple:public ParticlesDataMutable,                                                                            
                      public Provider                                                                                         
{                                                                                                                             
...
private:                                                                                                                      
...
    std::vector<char*> attributeData; // Inside is data of appropriate type                                                   
...
};

The dataWrite part basically casts a pointer into the byte buffer to a pointer of the appropriate type:

class ParticlesDataMutable:public ParticlesData                                                                               
{                                                                                                                             
...
    //! Get a pointer to the data corresponding to the given particleIndex and                                                
    //! attribute given by the attribute handle.                                                                              
    template<class T> inline T* dataWrite(const ParticleAttribute& attribute,                                                 
        const ParticleIndex particleIndex) const                                                                              
    {                                                                                                                         
        // TODO: add type checking                                                                                            
        return static_cast<T*>(dataInternal(attribute,particleIndex));                                                        
    }                                                                                                                         
};
...
void* ParticlesSimple::                                                                                                       
dataInternal(const ParticleAttribute& attribute,const ParticleIndex particleIndex) const                                      
{                                                                                                                             
    assert(attribute.attributeIndex>=0 && attribute.attributeIndex<(int)attributes.size());                                   
    return attributeData[attribute.attributeIndex]+attributeStrides[attribute.attributeIndex]*particleIndex;                  
}                                                                                                                             

I just wanted to get to the point where I have some compilable Rust code that I can share and discuss. For now I made some fields pub so I can access them from outside the library, and hacked the values directly into attribute_data:

// lib
#[derive(Debug)]                                                                                                              
pub struct ParticlesSimple {                                                                                                  
...
    pub attribute_data: Vec<Box<[u8]>>,                                                                                       
...
}                                                                                                                             
// examples (using lib)
fn make_data() -> partio::ParticlesSimple {                                                                                   
    // use builder to create defaults                                                                                         
    let builder: partio::ParticlesSimpleBuilder = partio::ParticlesSimpleBuilder::new();                                      
    let mut foo: partio::ParticlesSimple = builder.finalize();                                                                
    // add attributes                                                                                                         
    let position_attr: partio::ParticleAttribute =                                                                            
        foo.add_attribute("position", partio::ParticleAttributeType::VECTOR, 3);                                              
    let life_attr: partio::ParticleAttribute =                                                                                
        foo.add_attribute("life", partio::ParticleAttributeType::FLOAT, 2);                                                   
    let id_attr: partio::ParticleAttribute =                                                                                  
        foo.add_attribute("id", partio::ParticleAttributeType::INT, 1);                                                       
    // add some particle data                                                                                                 
    for i in 0..5 {                                                                                                           
        let index: partio::ParticleIndex = foo.add_particle();                                                                
        // TODO: dataWrite<...>(attr, index)                                                                                  
        let ref mut data_ref = foo.attribute_data;                                                                            
        println!("{:p}", data_ref as *const _);                                                                               
        // HELP with this part, please
    }                                                                                                                         
    foo // return                                                                                                             
}                                                                                                                             

The part I marked above with HELP currently looks, for example, like this (ugly, I know):

        {                                                                                                                     
            // float* life=foo.dataWrite<float>(lifeAttr,index);                                                              
            let ref mut life = data_ref[life_attr.attribute_index];                                                           
            // life[0]=-1.2+i;                                                                                                
            let life_0: f64 = -1.2_f64 + i as f64;
            let raw_bytes: [u8; 8] = unsafe { std::mem::transmute(life_0) };                                                  
            for bi in 0..8 {                                                                                                  
                life[bi + 0 * 8 + 2 * 8 * i] = raw_bytes[bi];                                                                 
            }                                                                                                                 
            // life[1]=10.;                                                                                                   
            let life_1: f64 = 10.0_f64;                                                                                       
            let raw_bytes: [u8; 8] = unsafe { std::mem::transmute(life_1) };                                                  
            for bi in 0..8 {                                                                                                  
                life[bi + 1 * 8 + 2 * 8 * i] = raw_bytes[bi];                                                                 
            }                                                                                                                 
            println!("life data{:?}", life);                                                                                  
        }                                                                                                                     

Any ideas on how I could solve this problem in Rust in an elegant way?

For serializing data you could write the same kind of dataWrite method. I'm not sure what the index is for, but here's one that appends bytes to a vector:

trait DataWriter<T> {
    fn data_write(&mut self, data: &T);
}

impl<T> DataWriter<T> for Vec<u8> where T: Copy {
    fn data_write(&mut self, data: &T) {
        // View the value as raw bytes (native endianness) and append them.
        let ptr = data as *const T as *const u8;
        unsafe {
            self.extend_from_slice(std::slice::from_raw_parts(ptr, std::mem::size_of::<T>()))
        }
    }
}

fn main(){
    let mut foo = Vec::new();
    
    foo.data_write(&1.0);
}
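
In the C++ code the index selects the particle: dataInternal returns the attribute's base pointer plus stride * particleIndex. Here is a minimal sketch of the same addressing without unsafe, assuming each attribute is stored as one flat byte buffer of f32 components. The helpers write_f32s and read_f32s are hypothetical names for illustration, not part of partio or rs_partio:

```rust
use std::convert::TryInto;

/// Size in bytes of one f32 component.
const F32_SIZE: usize = std::mem::size_of::<f32>();

/// Hypothetical helper: copies `values` into the slot of particle `index`.
/// The stride is `values.len() * F32_SIZE` bytes, mirroring
/// `attributeData[attr] + attributeStrides[attr] * particleIndex` in C++.
/// Panics if the buffer is too small for that particle.
fn write_f32s(buf: &mut [u8], index: usize, values: &[f32]) {
    let stride = values.len() * F32_SIZE;
    let slot = &mut buf[index * stride..(index + 1) * stride];
    for (chunk, v) in slot.chunks_exact_mut(F32_SIZE).zip(values) {
        // to_ne_bytes gives the native-endian bytes without any transmute.
        chunk.copy_from_slice(&v.to_ne_bytes());
    }
}

/// Hypothetical helper: reads `count` f32 values back for particle `index`.
fn read_f32s(buf: &[u8], index: usize, count: usize) -> Vec<f32> {
    let stride = count * F32_SIZE;
    buf[index * stride..(index + 1) * stride]
        .chunks_exact(F32_SIZE)
        .map(|chunk| f32::from_ne_bytes(chunk.try_into().unwrap()))
        .collect()
}
```

With a buffer sized for five particles of two floats each (vec![0u8; 5 * 2 * 4]), write_f32s(&mut buf, i, &[-1.2 + i as f32, 10.0]) would fill the "life" attribute of particle i. The slice indexing is bounds-checked, so a wrong index panics instead of corrupting memory.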

Thanks kornel,

I did what you suggested and that makes the example test code much more readable:

https://github.com/wahn/rs_partio/commit/472fa7e4c4fec5af0d5466ca1a27695ecb8f6db6

As for deserialization: without type metadata in the byte stream it looks like an inherently unsafe problem, because there's nothing stopping you from expecting the wrong type or getting corrupted data (if the data goes over the network or is saved to disk).

The code could be made nicer though.

In Rust there's a pattern for "consuming" data from a slice. For example if you want to skip 4 elements/bytes:

slice = &slice[4..];

Rust optimizes this to the C pointer arithmetic slice += 4, but keeps track of the slice's length, so it can still perform bounds checks and eliminate them where they are provably unnecessary.

You can also use

let (data, slice) = slice.split_at(4);

to get both a slice for the data at the beginning and a slice for the rest, with those bytes "removed".
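
As a small sketch of that consuming pattern, the function below takes a mutable reference to a slice and advances it. The name take is made up for this example, and returning an Option on short input is my own choice rather than something required by the pattern:

```rust
/// Takes the first `n` bytes off the front of `input` and returns them,
/// advancing `input` past them. Returns None (and leaves `input`
/// untouched) if fewer than `n` bytes remain.
fn take<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    if input.len() < n {
        return None;
    }
    let (head, rest) = input.split_at(n);
    *input = rest; // the caller's slice now starts after the consumed bytes
    Some(head)
}
```

Calling take(&mut cursor, 4) repeatedly walks through a buffer four bytes at a time, and the length check replaces the panic that split_at would otherwise raise on short input.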

So with that you could write something like

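struct DataReader<'a> { data: &'a [u8] }
impl<'a> DataReader<'a> {
    // The caller must make sure that any bit pattern is a valid T (e.g. f32, u32).
    fn read<T: Copy>(&mut self) -> T {
        let len = std::mem::size_of::<T>();
        // Panics if fewer than `len` bytes remain.
        let (head, rest) = self.data.split_at(len);
        // The "transmute" step, done as an unaligned read of the first `len` bytes.
        let res: T = unsafe { std::ptr::read_unaligned(head.as_ptr() as *const T) };
        self.data = rest;
        res
    }
}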
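
If the set of stored types is small, the unsafe read can be avoided altogether. The following is only a sketch under that assumption: FromBytes and SafeReader are hypothetical names, and only f32 and i32 are covered. It guards against truncated input, but it still cannot detect that the stream actually holds a different type than the one requested:

```rust
use std::convert::TryInto;

/// Hypothetical trait for types that can be rebuilt from native-endian bytes.
trait FromBytes: Sized {
    fn from_bytes(bytes: &[u8]) -> Option<Self>;
}

impl FromBytes for f32 {
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        Some(f32::from_ne_bytes(bytes.try_into().ok()?))
    }
}

impl FromBytes for i32 {
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        Some(i32::from_ne_bytes(bytes.try_into().ok()?))
    }
}

struct SafeReader<'a> {
    data: &'a [u8],
}

impl<'a> SafeReader<'a> {
    /// Reads one `T` from the front of the data. Returns None, consuming
    /// nothing, if fewer than size_of::<T>() bytes remain.
    fn read<T: FromBytes>(&mut self) -> Option<T> {
        let len = std::mem::size_of::<T>();
        if self.data.len() < len {
            return None;
        }
        let (head, rest) = self.data.split_at(len);
        let value = T::from_bytes(head)?;
        self.data = rest;
        Some(value)
    }
}
```

Usage would look like reader.read::<f32>() followed by reader.read::<i32>(), with each call returning None once the data runs out.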

Thanks again, kornel,

I used your suggestions in the current write method to partially implement the export to Houdini's .bgeo file format:

https://github.com/wahn/rs_partio/commit/55b6185a7d4545f6ed3d584f0175e8de1fe36f15

 wtr.write_u32::<BigEndian>(houdini_type).unwrap();

If you make the method return std::io::Result<()> (i.e. I/O error or nothing ()), you'll be able to replace all these .unwrap() calls with

try!(wtr.write_u32::<BigEndian>(houdini_type));

and in Rust nightly (stable since Rust 1.13) with

wtr.write_u32::<BigEndian>(houdini_type)?;

which, as a bonus, won't kill the whole program on an I/O error :)
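
To illustrate the error propagation, here is a small sketch that uses only the standard library instead of the byteorder crate, with to_be_bytes producing the big-endian bytes. The function write_attr_header and its size parameter are invented for this example and are not taken from rs_partio:

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for part of a .bgeo header writer: writes two
/// big-endian u32 values and hands any I/O error back to the caller
/// instead of panicking.
fn write_attr_header<W: Write>(wtr: &mut W, houdini_type: u32, size: u32) -> io::Result<()> {
    wtr.write_all(&houdini_type.to_be_bytes())?;
    wtr.write_all(&size.to_be_bytes())?;
    Ok(())
}
```

The caller then decides what to do with a failure, for example by matching on the returned Result or by propagating it further with another ?.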

OK. Done:

https://github.com/wahn/rs_partio/commit/cc31475dbc01fc488d80f4fb05d15c028b4c8630

Thanks again ;)