Safe code got SIGSEGV, using Serde


#1

I began learning Rust this month. While implementing a new Serializer for the Serde library, my program hit a segmentation fault, even though it contains no unsafe code.
Does this mean there is a bug in Serde?

I did write some evil code like the following, just to cause the SIGSEGV; it would not make sense in a real project:

#![allow(unconditional_recursion)]
fn end( self ) -> Result<()> { self.end() }

OS: FreeBSD 11.1-RELEASE
Rust: rust-nightly-1.25.0.20180321

The following code (lib.rs) reproduces the SIGSEGV:

#![allow(unused_variables)]
#![allow(unconditional_recursion)]

extern crate serde;

mod error {
    use std::error;
    use std::fmt::{self, Debug, Display};
    use std::result;
    use serde::ser;

    pub struct Error;

    pub type Result<T> = result::Result<T, Error>;

    impl error::Error for Error {
        fn description(&self) -> &str { "" }
        fn cause(&self) -> Option<&error::Error> { None }
    }

    impl Display for Error {
        fn fmt( &self, f: &mut fmt::Formatter ) -> fmt::Result { Display::fmt( &"", f ) }
    }

    impl Debug for Error {
        fn fmt( &self, f: &mut fmt::Formatter ) -> fmt::Result { Debug::fmt( &"", f ) }
    }

    impl ser::Error for Error {
        fn custom<T: Display>( _msg: T ) -> Error { Error }
    }
}

use serde::ser::{self, Serialize};
use error::{Error, Result};

pub struct Serializer;

impl Serializer {
    pub fn to_string<T>( value: &T ) -> Result<String> where T: Serialize {
        let mut serializer = Serializer;
        value.serialize( &mut serializer )?;
        Ok( String::from("") )
    }
}

impl<'a> ser::Serializer for &'a mut Serializer {
    type Ok = ();
    type Error = Error;
    type SerializeSeq           = Self;
    type SerializeTuple         = Self;
    type SerializeTupleStruct   = Self;
    type SerializeTupleVariant  = Self;
    type SerializeMap           = Self;
    type SerializeStruct        = Self;
    type SerializeStructVariant = Self;

    fn serialize_bool( self, v: bool) -> Result<()> { Ok(()) }
    fn serialize_i8 ( self, v: i8  ) -> Result<()> { Ok(()) }
    fn serialize_i16( self, v: i16 ) -> Result<()> { Ok(()) }
    fn serialize_i32( self, v: i32 ) -> Result<()> { Ok(()) }
    fn serialize_i64( self, v: i64 ) -> Result<()> { Ok(()) }
    fn serialize_u8 ( self, v: u8  ) -> Result<()> { Ok(()) }
    fn serialize_u16( self, v: u16 ) -> Result<()> { Ok(()) }
    fn serialize_u32( self, v: u32 ) -> Result<()> { Ok(()) }
    fn serialize_u64( self, v: u64 ) -> Result<()> { Ok(()) }
    fn serialize_f32( self, v: f32 ) -> Result<()> { Ok(()) }
    fn serialize_f64( self, v: f64 ) -> Result<()> { Ok(()) }
    fn serialize_char( self, v: char ) -> Result<()> { Ok(()) }
    fn serialize_str ( self, v: &str ) -> Result<()> { Ok(()) }
    fn serialize_bytes( self, v: &[u8] ) -> Result<()> { Ok(()) }
    fn serialize_none( self ) -> Result<()> { Ok(()) }
    fn serialize_some<T>( self, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn serialize_unit( self ) -> Result<()> { Ok(()) }
    fn serialize_unit_struct( self, _name: &'static str ) -> Result<()> { Ok(()) }
    fn serialize_unit_variant( self, _name: &'static str, _variant_index: u32, variant: &'static str) -> Result<()> { Ok(()) }
    fn serialize_newtype_struct<T>( self, _name: &'static str, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn serialize_newtype_variant<T>( self, _name: &'static str, _variant_index: u32, _variant: &'static str, value: &T) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn serialize_seq( self, _len: Option<usize> ) -> Result<Self::SerializeSeq> { Ok( self ) }
    fn serialize_tuple( self, len: usize ) -> Result<Self::SerializeTuple> { Ok( self ) }
    fn serialize_tuple_struct( self, _name: &'static str, len: usize) -> Result<Self::SerializeTupleStruct> { Ok( self ) }
    fn serialize_tuple_variant( self, _name: &'static str, _variant_index: u32, variant: &'static str, _len: usize) -> Result<Self::SerializeTupleVariant> { Ok( self ) }
    fn serialize_map(self, _len: Option<usize>) -> Result<Self::SerializeMap> { Ok( self ) }
    fn serialize_struct( self, _name: &'static str, _len: usize) -> Result<Self::SerializeStruct> { Ok( self ) }
    fn serialize_struct_variant( self, _name: &'static str, _variant_index: u32, variant: &'static str, _len: usize) -> Result<Self::SerializeStructVariant> {Ok( self ) }
}

impl<'a> ser::SerializeSeq for &'a mut Serializer {
    type Ok = ();
    type Error = Error;

    fn serialize_element<T>( &mut self, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }

    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeTuple for &'a mut Serializer {
    type Ok = ();
    type Error = Error;

    fn serialize_element<T>( &mut self, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }

    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeTupleStruct for &'a mut Serializer {
    type Ok = ();
    type Error = Error;

    fn serialize_field<T>( &mut self, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }

    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeTupleVariant for &'a mut Serializer {
    type Ok = ();
    type Error = Error;

    fn serialize_field<T>( &mut self, value: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }

    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeMap for &'a mut Serializer {
    type Ok = ();
    type Error = Error;
    fn serialize_key<T>( &mut self, key: &T ) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn serialize_value<T>(&mut self, value: &T) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeStruct for &'a mut Serializer {
    type Ok = ();
    type Error = Error;
    fn serialize_field<T>(&mut self, _key: &'static str, value: &T) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn end( self ) -> Result<()> { self.end() }
}

impl<'a> ser::SerializeStructVariant for &'a mut Serializer {
    type Ok = ();
    type Error = Error;
    fn serialize_field<T>(&mut self, _key: &'static str, value: &T) -> Result<()> where T: ?Sized + Serialize { Ok(()) }
    fn end( self ) -> Result<()> { self.end() }
}

////////////////////////////////////////////////////////////////////////////////

#[cfg(test)]
mod tests {
    #[test]
    fn test() { let result = super::Serializer::to_string( &[1] ).unwrap(); }
}

#2

Does this also segfault?

#![allow(unconditional_recursion)]

fn end() {
    end();
}

fn main() {
    end();
}

#3

No, just stack overflow.

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
^C^C^C^C^C^CAbort (core dumped)


#4

I suspect Serde is a red herring and this doesn’t have anything to do with Serde. It would be helpful if you could minimize the repro code further. ¯\_(ツ)_/¯ But I would recommend not disabling unconditional_recursion in any case.


#5

It gives

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
/root/entrypoint.sh: line 7:     5 Aborted                 timeout --signal=KILL ${timeout} "$@"

That is on the playground, with the test changed to a main function.


#6

I don’t think the playground is running with FreeBSD 11.1-RELEASE.


#7

The proof-of-concept code is already in #1.


#8

Yeah as I said it would be helpful if you could minimize it further. The code you gave has 50 functions! Please try reproducing the segfault with fewer than 50 functions.


#9

What would you expect to happen when you run out of stack by calling a function recursively? Every system I have used, whether some kind of Unix or Linux, sends SIGSEGV on a stack overflow.
The warning you disabled exists to catch exactly this kind of obvious stack overflow.

Also, there was no stack overflow in the release configuration; some tail-call optimization may be going on.
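
For comparison, here is a minimal sketch of an `end` that terminates. To keep it self-contained it does not use Serde at all: `FinishSeq` is a made-up stand-in for a trait like `SerializeSeq`, and `finish` is a hypothetical helper, so treat this as an illustration rather than the real API.

```rust
// Hypothetical stand-in for a Serde trait such as `SerializeSeq`, reduced to
// the single method discussed in this thread. This is NOT the real trait.
trait FinishSeq {
    fn end(self) -> Result<(), String>;
}

struct Serializer;

impl<'a> FinishSeq for &'a mut Serializer {
    // Returning a value directly means the call finishes after one frame.
    // Writing `self.end()` here instead would call this same method again
    // and again, which is what the `unconditional_recursion` lint warns about.
    fn end(self) -> Result<(), String> {
        Ok(())
    }
}

// Hypothetical helper that drives the trait the way a caller would.
fn finish() -> Result<(), String> {
    let mut serializer = Serializer;
    (&mut serializer).end()
}
```

With the lint left enabled, the recursive version would have produced a compiler warning before the program was ever run.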


#10

I’m not very familiar with Serde, but I guess these functions are required to implement a dummy Serde Serializer. Most of them are there only to make rustc and Serde happy.


#11

I know I’ve written some evil code. I just wanted to make clear that it’s possible for safe code to get a SIGSEGV.


#12

IIRC, stack overflow leads to a panic on Linux, but I am not sure how that works and if it can be easily ported to BSDs.


#13

FreeBSD may need some work here - I left a FIXME on a suspicious bit recently when I was refactoring the stack guard detection in this file.

But the only practical difference is whether you get the message about stack overflow with a SIGABRT, or just a raw SIGSEGV. There’s no problem for memory safety here, as the program will still terminate either way.
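
If you want to see which of the two you got, a parent process can read it from the child's exit status. A rough sketch, Unix only: `describe` is a made-up helper, and the signal numbers are hard-coded on the assumption of Linux or FreeBSD (a real program would take them from the `libc` crate).

```rust
use std::os::unix::process::ExitStatusExt; // Unix-only: provides `signal()`
use std::process::ExitStatus;

// Assumed signal numbers. They match Linux and FreeBSD, but they are
// hard-coded here only to keep the sketch free of external crates.
const SIGABRT: i32 = 6;
const SIGSEGV: i32 = 11;

// Hypothetical helper: reports how a child process was terminated.
fn describe(status: ExitStatus) -> &'static str {
    match status.signal() {
        Some(SIGABRT) => "aborted (SIGABRT)",
        Some(SIGSEGV) => "segmentation fault (SIGSEGV)",
        Some(_) => "killed by another signal",
        None => "not killed by a signal",
    }
}
```

The `ExitStatus` would typically come from `std::process::Command::status` run on the overflowing binary.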


#14

Also, a big difference in the OP reproducer is that it uses #[test], which will run in a separate thread, compared to the single-threaded main->end… recursion. The FIXME I mentioned is specifically dealing with pthread stacks.
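
To compare the two cases without the test harness, you can run the recursion on a spawned thread yourself. A minimal sketch, with the recursion bounded so that it is safe to run; `recurse` and `recurse_on_thread` are made-up names, and the stack size is whatever you pass in.

```rust
use std::thread;

// Bounded recursion: each call adds one stack frame until `depth` reaches 0.
// The addition happens after the recursive call returns, so every level
// keeps its own frame alive.
fn recurse(depth: u32) -> u32 {
    if depth == 0 {
        0
    } else {
        1 + recurse(depth - 1)
    }
}

// Runs `recurse` on a spawned thread with an explicit stack size, similar to
// how a #[test] function runs off the main thread.
fn recurse_on_thread(depth: u32, stack_bytes: usize) -> u32 {
    thread::Builder::new()
        .name("worker".to_string())
        .stack_size(stack_bytes)
        .spawn(move || recurse(depth))
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}
```

For example, `recurse_on_thread(1000, 1024 * 1024)` returns 1000. Raising the depth far beyond what the stack can hold is what would exercise the spawned-thread guard page instead of the main thread's.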


#15

Thanks for your explanation!
I found this: Abort on stack overflow instead of re-raising SIGSEGV. It says:

This caused some confusion, as it was unexpected that safe code would be able to cause a segfault, while it’s easy to overflow the stack in safe code.

That’s exactly my case.


#16

You mean to link to this: Abort on stack overflow instead of re-raising SIGSEGV #31333 (link is missing a colon)