C++ vs. Rust: Storing particle data in stream of bytes, safely read/write data


#1

Hi,

I’m trying (as an exercise) to convert an existing C++ library into Rust code. The original C++ library can be found here:

http://www.disneyanimation.com/technology/partio.html

Source code:

So far I have avoided some topics like inheritance, but I implemented parts of a test.cpp file (which uses parts of the library) and tried to use similar names for the structs and members. The Rust code can be found here:

But let’s get to the questions I have. Here is a function used in the original C++ code:

Partio::ParticlesDataMutable* makeData()                                                                                      
{                                                                                                                             
    Partio::ParticlesDataMutable& foo=*Partio::create();                                                                      
    Partio::ParticleAttribute positionAttr=foo.addAttribute("position",Partio::VECTOR,3);                                     
    Partio::ParticleAttribute lifeAttr=foo.addAttribute("life",Partio::FLOAT,2);                                              
    Partio::ParticleAttribute idAttr=foo.addAttribute("id",Partio::INT,1);                                                    
                                                                                                                              
    for(int i=0;i<5;i++){                                                                                                     
        Partio::ParticleIndex index=foo.addParticle();                                                                        
        float* pos=foo.dataWrite<float>(positionAttr,index);                                                                  
        float* life=foo.dataWrite<float>(lifeAttr,index);                                                                     
        int* id=foo.dataWrite<int>(idAttr,index);                                                                             
                                                                                                                              
        pos[0]=.1*i;                                                                                                          
        pos[1]=.1*(i+1);                                                                                                      
        pos[2]=.1*(i+2);                                                                                                      
        life[0]=-1.2+i;                                                                                                       
        life[1]=10.;                                                                                                          
        id[0]=index;                                                                                                          
                                                                                                                              
    }                                                                                                                         
    return &foo;                                                                                                              
}                                                                                                                             

Data gets stored by a class called ParticlesSimple:

class ParticlesSimple:public ParticlesDataMutable,                                                                            
                      public Provider                                                                                         
{                                                                                                                             
...
private:                                                                                                                      
...
    std::vector<char*> attributeData; // Inside is data of appropriate type                                                   
...
};

The dataWrite part basically casts pointers to bytes of the appropriate type:

class ParticlesDataMutable:public ParticlesData                                                                               
{                                                                                                                             
...
    //! Get a pointer to the data corresponding to the given particleIndex and                                                
    //! attribute given by the attribute handle.                                                                              
    template<class T> inline T* dataWrite(const ParticleAttribute& attribute,                                                 
        const ParticleIndex particleIndex) const                                                                              
    {                                                                                                                         
        // TODO: add type checking                                                                                            
        return static_cast<T*>(dataInternal(attribute,particleIndex));                                                        
    }                                                                                                                         
};
...
void* ParticlesSimple::                                                                                                       
dataInternal(const ParticleAttribute& attribute,const ParticleIndex particleIndex) const                                      
{                                                                                                                             
    assert(attribute.attributeIndex>=0 && attribute.attributeIndex<(int)attributes.size());                                   
    return attributeData[attribute.attributeIndex]+attributeStrides[attribute.attributeIndex]*particleIndex;                  
}                                                                                                                             

I just wanted to get to the point where I have some compilable Rust code that I can share and discuss. For now I made some fields pub so I can access them from outside the library, and hacked the values into attribute_data:

// lib
#[derive(Debug)]                                                                                                              
pub struct ParticlesSimple {                                                                                                  
...
    pub attribute_data: Vec<Box<[u8]>>,                                                                                       
...
}                                                                                                                             
// examples (using lib)
fn make_data() -> partio::ParticlesSimple {                                                                                   
    // use builder to create defaults                                                                                         
    let builder: partio::ParticlesSimpleBuilder = partio::ParticlesSimpleBuilder::new();                                      
    let mut foo: partio::ParticlesSimple = builder.finalize();                                                                
    // add attributes                                                                                                         
    let position_attr: partio::ParticleAttribute =                                                                            
        foo.add_attribute("position", partio::ParticleAttributeType::VECTOR, 3);                                              
    let life_attr: partio::ParticleAttribute =                                                                                
        foo.add_attribute("life", partio::ParticleAttributeType::FLOAT, 2);                                                   
    let id_attr: partio::ParticleAttribute =                                                                                  
        foo.add_attribute("id", partio::ParticleAttributeType::INT, 1);                                                       
    // add some particle data                                                                                                 
    for i in 0..5 {                                                                                                           
        let index: partio::ParticleIndex = foo.add_particle();                                                                
        // TODO: dataWrite<...>(attr, index)                                                                                  
        let ref mut data_ref = foo.attribute_data;                                                                            
        println!("{:p}", data_ref as *const _);                                                                               
        // HELP with this part, please
    }                                                                                                                         
    foo // return                                                                                                             
}                                                                                                                             

The part I marked above with HELP looks for example like this (ugly I know):

        {                                                                                                                     
            // float* life=foo.dataWrite<float>(lifeAttr,index);                                                              
            let ref mut life = data_ref[life_attr.attribute_index];                                                           
            // life[0]=-1.2+i;                                                                                                
            let life_0: f64 = -1.2_f64 + i as f64;
            let raw_bytes: [u8; 8] = unsafe { std::mem::transmute(life_0) };                                                  
            for bi in 0..8 {                                                                                                  
                life[bi + 0 * 8 + 2 * 8 * i] = raw_bytes[bi];                                                                 
            }                                                                                                                 
            // life[1]=10.;                                                                                                   
            let life_1: f64 = 10.0_f64;                                                                                       
            let raw_bytes: [u8; 8] = unsafe { std::mem::transmute(life_1) };                                                  
            for bi in 0..8 {                                                                                                  
                life[bi + 1 * 8 + 2 * 8 * i] = raw_bytes[bi];                                                                 
            }                                                                                                                 
            println!("life data{:?}", life);                                                                                  
        }                                                                                                                     

Any ideas on how I could solve this problem in Rust in an elegant way?


#2

For serializing data you could write the same kind of dataWrite method. I’m not sure what the index is for, but here’s one that appends bytes to a vector:

trait DataWriter<T> {
    fn data_write(&mut self, data: &T);
}

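impl<T> DataWriter<T> for Vec<u8> where T: Copy {
    fn data_write(&mut self, data: &T) {
        // View the value as raw bytes in native byte order.
        // Only sound for types without padding bytes (e.g. f32, f64, i32).
        let ptr = data as *const T as *const u8;
        unsafe {
            self.extend_from_slice(std::slice::from_raw_parts(ptr, std::mem::size_of::<T>()))
        }
    }
}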

fn main(){
    let mut foo = Vec::new();
    
    foo.data_write(&1.0);
}
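If the index is meant to select one particle's slot inside an attribute buffer (as in the C++ dataInternal, which computes base + stride * particleIndex), the write can also be done without unsafe. The following is only a minimal sketch under that assumption: write_f32s and read_f32s are hypothetical names, not part of partio, and native byte order and f32 values are assumed.

```rust
/// Hypothetical helper: copy `values` into the slot of particle `index`
/// in a byte buffer that stores `count` f32 values per particle.
fn write_f32s(buf: &mut [u8], index: usize, count: usize, values: &[f32]) {
    assert_eq!(values.len(), count);
    let size = std::mem::size_of::<f32>();
    let stride = count * size; // bytes per particle, like attributeStrides in the C++ code
    let slot = &mut buf[index * stride..(index + 1) * stride];
    for (chunk, v) in slot.chunks_exact_mut(size).zip(values) {
        chunk.copy_from_slice(&v.to_ne_bytes());
    }
}

/// Hypothetical counterpart: read the `count` f32 values of particle `index`.
fn read_f32s(buf: &[u8], index: usize, count: usize) -> Vec<f32> {
    let size = std::mem::size_of::<f32>();
    let stride = count * size;
    buf[index * stride..(index + 1) * stride]
        .chunks_exact(size)
        .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```

With a buffer sized for five particles and two floats each, write_f32s(&mut buf, i, 2, &[-1.2 + i as f32, 10.0]) would play the role of the two life[...] assignments in the C++ loop.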

#3

Thanks kornel,

I did what you suggested and that makes the example test code much more readable:


#4

As for deserialization: without type metadata in the byte stream it looks like an inherently unsafe problem, because there’s nothing stopping you from expecting the wrong type or getting corrupted data (if the data goes over the network or is saved to disk).

The code could be made nicer though.

In Rust there’s a pattern for “consuming” data from a slice. For example if you want to skip 4 elements/bytes:

slice = &slice[4..];

Rust compiles this down to the equivalent of the C pointer arithmetic slice += 4, but the slice still knows its length, so bounds checks keep working and can often be optimized away.

You can also use

let (data, slice) = slice.split_at(4);

to get both a slice for the data at the beginning and a slice for the rest, with those bytes “removed”.
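Put together, the two forms might be used like this. This is just a sketch: skip_then_take is a made-up name, and treating the four bytes as a big-endian u32 is an assumption for illustration.

```rust
/// Skip the first 4 bytes, then split off the next 4 as a big-endian u32.
/// Returns the parsed value and the remaining bytes.
fn skip_then_take(mut slice: &[u8]) -> (u32, &[u8]) {
    slice = &slice[4..]; // "consume" 4 bytes; panics if fewer are left
    let (data, rest) = slice.split_at(4); // also panics if the slice is too short
    let value = u32::from_be_bytes([data[0], data[1], data[2], data[3]]);
    (value, rest)
}
```

Both operations only adjust the pointer and length of the slice; no bytes are copied until the value itself is built.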

So with that you could write something like

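struct DataReader<'a> { data: &'a [u8] }
impl<'a> DataReader<'a> {
    // Unsafe to call: the caller must guarantee the next bytes are a valid T.
    unsafe fn read<T: Copy>(&mut self) -> T {
        let len = std::mem::size_of::<T>();
        assert!(self.data.len() >= len); // never read past the end
        // Stand-in for the transmute: an unaligned read, since the bytes may not be aligned for T.
        let res: T = unsafe { std::ptr::read_unaligned(self.data.as_ptr() as *const T) };
        self.data = &self.data[len..];
        res
    }
}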
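For plain numeric attributes the unsafe part can probably be avoided altogether by going through fixed-size byte arrays. Here is a rough sketch of that idea: SafeReader, take, read_f32 and read_i32 are hypothetical names, and native byte order is an assumption. A stream that is too short yields None instead of a panic, but a wrong expected type still produces a garbage value, since there is no type metadata to check against.

```rust
struct SafeReader<'a> {
    data: &'a [u8],
}

impl<'a> SafeReader<'a> {
    /// Take the next `N` bytes, or `None` if the stream is too short.
    fn take<const N: usize>(&mut self) -> Option<[u8; N]> {
        if self.data.len() < N {
            return None;
        }
        let (head, rest) = self.data.split_at(N);
        self.data = rest; // "consume" the bytes that were just read
        let mut out = [0u8; N];
        out.copy_from_slice(head);
        Some(out)
    }

    /// Read one f32 in native byte order.
    fn read_f32(&mut self) -> Option<f32> {
        self.take::<4>().map(f32::from_ne_bytes)
    }

    /// Read one i32 in native byte order.
    fn read_i32(&mut self) -> Option<i32> {
        self.take::<4>().map(i32::from_ne_bytes)
    }
}
```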

#5

Thanks again, kornel,

I used your suggestions in the current write method to partially implement the export to Houdini’s .bgeo file format:


#6
 wtr.write_u32::<BigEndian>(houdini_type).unwrap();

If you make the method return std::io::Result<()> (i.e. either an I/O error or the empty value ()), you’ll be able to replace all these .unwrap() calls with

try!(wtr.write_u32::<BigEndian>(houdini_type));

and in Rust nightly with

wtr.write_u32::<BigEndian>(houdini_type)?;

which as a bonus won’t kill the whole program on I/O error :)
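As a rough illustration of the whole method shape (the ? operator has been stable since Rust 1.13): write_header and its count parameter are hypothetical, and std's to_be_bytes with write_all stands in for byteorder's write_u32::<BigEndian> so that the sketch needs no external crate.

```rust
use std::io::{self, Write};

/// Hypothetical header writer that propagates I/O errors instead of unwrapping.
/// `to_be_bytes` + `write_all` stand in for byteorder's `write_u32::<BigEndian>`.
fn write_header<W: Write>(wtr: &mut W, houdini_type: u32, count: u32) -> io::Result<()> {
    wtr.write_all(&houdini_type.to_be_bytes())?; // returns early on an I/O error
    wtr.write_all(&count.to_be_bytes())?;
    Ok(())
}
```

The caller then decides what to do with the error, for example printing a message rather than aborting.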


#7

OK. Done:

Thanks again ;)