[RFC] Rust versus NULL

I am a database professional, not a developer. And I have my personal war against SQL NULL, for several reasons. To sum up:

  1. From a theoretical point of view, Ted Codd proposed two different markers for unknown and missing values. SQL mixed them into a single NULL marker, which sometimes behaves as a missing value, and sometimes as an unknown value.
  2. From a practical point of view, the vast majority of incorrect SQL queries (queries that work but don't return the correct results) don't handle NULL correctly in one or more places.
  3. Despite this, I admit that using NULL in relational databases is necessary, in some cases.
  4. However, most of the time columns are NULLable because that is the default, and they contain NULL values because the developer didn't consider any alternative, or because the NULLs were inserted by mistake.

Now I'm learning Rust, and I found out that it has a great way to deal with the idea of NULL without actually implementing NULL - I'm thinking about Option of course.

I wrote an article to show people who follow me the Rust way. This may be of little use (SQL NULL is here to stay), but in my opinion opening minds and trying to open a discussion are always good things.

Here is the article:

I hope to see comments from this community, here or on that page. Also, since I'm a Rust newbie, if I wrote anything stupid I would be grateful to whoever shows me my mistakes.

Thank you in advance!

4 Likes

Very nice. I spotted one small mistake; you probably meant Option<String> and Some("Greta") here:

If a variable is Option(<String>) you can assign it Option("Greta") , but not just "Greta" .


1 Like

Great article!

Some thoughts:

This doesn’t solve one problem: the inconsistency of NULL semantics in SQL, which I described in What does NULL mean in SQL?. For Rust, this problem doesn’t exist: using None in an expression will usually cause the program to panic, which always makes sense, whether you consider None an unknown value or a missing value. In SQL a failure like this is not desirable, so it would be better to have different markers with different behaviours.

The enum mechanism in Rust is very powerful, and Option is just an example. To handle unknown values, we can define a new enum: (I'm sure there is a better name than MaybeUnknown)

#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
pub enum MaybeUnknown<T> {
    Unknown,
    Known(T),
}

Here's an example usage, where raw scores (integers in the range 0..=100) are scaled by 0.1.

mod maybe_unknown {
    #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]
    pub enum MaybeUnknown<T> {
        Unknown,
        Known(T),
    }
}

use maybe_unknown::MaybeUnknown::*;

fn main() {
    let raw_scores = vec![
        Known(Some(30)),
        Known(Some(60)),
        Known(None),
        Known(Some(90)),
        Known(Some(100)),
        Unknown,
    ];
    println!("Raw scores: {:?}", raw_scores);

    let scaled_scores: Vec<_> = raw_scores
        .iter()
        .map(|&raw_score| match raw_score {
            Unknown => Unknown,
            Known(score) => Known(score.map(|s| f64::from(s) / 10.0)),
        })
        .collect();
    println!("Scaled scores: {:?}", scaled_scores);
}

(playground)

The table of raw scores looks like

Index  Raw Score
0      30
1      60
2      No Score
3      90
4      100
5      Unknown

and the result is

Index  Scaled Score
0      3.0
1      6.0
2      No Score
3      9.0
4      10.0
5      Unknown

Note that a score of 0, No Score, and Unknown are distinct from each other.

2 Likes

@2e71828, ouch! I fixed it now. Thanks for reporting!

1 Like

Nice read! Coming from Java/Python, not having to worry about null is definitely one of my favorite things in Rust, along with the absence of exceptions (at least the ones intended to be handled in normal use), which are likewise replaced with an enum.

using None in an expression will usually cause the program to panic

I don't think this is accurate (or I misunderstood). If the signature of a function or a type contains an Option, then it's usually expected to be None sometimes, and unwrap is used only in cases where the value is guaranteed in some way not to be None.
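
To illustrate, here is a minimal sketch. The `find_score` and `describe` functions and the names in them are made up for this example and are not from the article. The caller handles `None` with `match` or `unwrap_or` rather than `unwrap`, so there is nothing that can panic.

```rust
// Hypothetical lookup: returns None when there is no score for `name`.
fn find_score(name: &str) -> Option<u32> {
    match name {
        "Greta" => Some(90),
        "Hans" => Some(0),
        _ => None,
    }
}

// Handle both cases explicitly with `match`; no `unwrap` involved.
fn describe(name: &str) -> String {
    match find_score(name) {
        Some(score) => format!("{name}: {score}"),
        None => format!("{name}: no score"),
    }
}

fn main() {
    println!("{}", describe("Greta"));
    println!("{}", describe("Nobody"));

    // `unwrap_or` supplies a fallback instead of panicking on None.
    let fallback = find_score("Nobody").unwrap_or(0);
    println!("fallback: {fallback}");
}
```

Note that a score of 0 (`Some(0)`) and a missing score (`None`) stay distinct until the caller deliberately collapses them with `unwrap_or(0)`.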

2 Likes

@L.F yeah. I thought about it, but your example is much better than what was in my head. I was thinking that an enum could contain Unknown, None and Some(T), but actually it makes more sense to separate these ideas (Option and MaybeUnknown). Thanks a lot!

Does Rust have a way to define the behaviour of operators with enums? I'm thinking that it would be great to implement MaybeUnknown in a way that this code would return 0:

use maybe_unknown::MaybeUnknown::*;
let a: MaybeUnknown<u32> = Unknown;
a - a

1 Like

Yes; most operators can be overloaded by implementing one of the traits in std::ops.
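
As a rough sketch of what that looks like, the following implements `std::ops::Add` for a simplified `MaybeUnknown<u32>`. The rule that any unknown operand makes the result unknown is an assumption chosen for illustration, not something settled in this thread.

```rust
use std::ops::Add;

#[derive(Clone, Copy, Debug, PartialEq)]
enum MaybeUnknown<T> {
    Unknown,
    Known(T),
}

// Implementing `Add` is what makes the `+` operator usable on this type.
impl Add for MaybeUnknown<u32> {
    type Output = MaybeUnknown<u32>;

    fn add(self, rhs: Self) -> Self::Output {
        match (self, rhs) {
            (MaybeUnknown::Known(a), MaybeUnknown::Known(b)) => MaybeUnknown::Known(a + b),
            // Assumed rule: any unknown operand makes the result unknown.
            _ => MaybeUnknown::Unknown,
        }
    }
}

fn main() {
    let a = MaybeUnknown::Known(2u32);
    let b = MaybeUnknown::Known(3u32);
    let u: MaybeUnknown<u32> = MaybeUnknown::Unknown;

    println!("{:?}", a + b);
    println!("{:?}", a + u);
}
```

The other arithmetic operators follow the same pattern with `Sub`, `Mul`, and so on.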

1 Like

@Heliozoa, thanks. I used a lot of Python (even though I never became an expert), and I agree with you!

With that sentence I meant that you cannot do something like this:

let n = None;
let m = 10;
let x = n - m;

This is good, because if you write that you're not thinking about what you're doing. I thought the context made it clear, but if you find it unclear I should fix it.
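
For comparison, here is a small sketch of how I understand the explicit alternatives. The helper name `minus` is made up for this example. The subtraction only compiles once the code states what should happen when the value is missing.

```rust
// Subtract `m` from an optional value; a missing value stays missing.
fn minus(n: Option<i32>, m: i32) -> Option<i32> {
    n.map(|value| value - m)
}

fn main() {
    let n: Option<i32> = None;
    let m = 10;

    // let x = n - m; // rejected by the compiler: no `-` between Option<i32> and i32

    // Propagate the missing value.
    let x = minus(n, m);
    println!("{:?}", x);

    // The same helper with a value that is present.
    let y = minus(Some(25), m);
    println!("{:?}", y);

    // Or pick an explicit default for the missing case.
    let z = n.unwrap_or(0) - m;
    println!("{z}");
}
```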

1 Like

Brilliant. I'll try to implement a MaybeUnknown type and I'll share the result.

(Edit for context: this was written in response to a (now deleted) code sample above)

The trouble here is that you assume two Unknowns refer to the same value, which isn’t necessarily the case. You need to track the identity of various Unknowns to know whether to return 0 or Unknown. Maybe something like this (untested):

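use std::ops::Sub;
use std::sync::atomic::{AtomicUsize, Ordering};

// Opaque tag that distinguishes one unknown value from another.
#[derive(Eq, PartialEq, Copy, Clone, Debug)]
struct Identity(usize);

impl Identity {
    fn new() -> Self {
        // Global counter, so every call yields a fresh identity.
        static NEXT: AtomicUsize = AtomicUsize::new(0);
        Identity(NEXT.fetch_add(1, Ordering::Relaxed))
    }
}

#[derive(Copy, Clone)]
enum MaybeUnknown<T> {
    Unknown(Identity),
    Known(T),
}

impl Sub for MaybeUnknown<u32> {
    type Output = MaybeUnknown<u32>;

    fn sub(self, rhs: MaybeUnknown<u32>) -> Self::Output {
        use MaybeUnknown::*;

        match (self, rhs) {
            // The same unknown minus itself is 0, whatever its value is.
            (Unknown(x), Unknown(y)) if x == y => Known(0),
            (Known(a), Known(b)) => Known(a - b),
            // Anything else involving an unknown yields a new unknown.
            _ => Unknown(Identity::new()),
        }
    }
}

2 Likes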

Oh, you're absolutely right. I made a huge false assumption.
