Behavior-altering features considered harmful?

mzabaluev · December 10, 2021, 4:41pm

Features are typically used to manage dependencies, or to incrementally enable API. Sometimes, however, feature flags are used to modify behavior of APIs otherwise enabled.

Here's why it can be problematic, on the example of time and its feature serde-human-readable which enables human-readable parsing and formatting in the serde impls in case when the Serializer/Deserializer advertises the format as human-readable.

Consider library a crate with this Cargo.toml:

[package]
name = "a"
version = "0.1.0"
edition = "2021"

[dependencies]
time = { version = "0.3", features = ["serde"] }
serde_json = "1.0"

It relies on omission of the serde-human-readable feature to format the output string as a JSON array:

use time::OffsetDateTime;

pub fn json_clock() -> String {
    let t = OffsetDateTime::now_utc();
    serde_json::to_string(&t).unwrap()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn formats_as_array() {
        let json = json_clock();
        let v: serde_json::Value = serde_json::from_str(&json).unwrap();
        assert!(v.is_array());
    }
}

Taken on its own, the crate passes the test. Now add it to a dependency tree of another crate that additionally enables serde-human-readable in time:

[package]
name = "b"
version = "0.1.0"
edition = "2021"

[dependencies]
a = { path = "../a" }
time = { version = "0.3", features = ["serde-human-readable"] }

The binary of this crate prints a JSON string literal:

use a::json_clock;

fn main() {
    println!("{}", json_clock());
}

If these crates are put together in a workspace, the test in a also fails.

This example is a bit contrived, but there are likely other instances where behavior selected by feature omission is liable to be stealthily modified outside of the crate relying on this behavior.

Should such features be considered a footgun? Does it deserve a mention in the API guidelines?

alice · December 10, 2021, 5:18pm

It is generally considered good practice to not have features change behavior.

bluss · December 10, 2021, 5:22pm

Cargo doc about this more or less has this as consequence: Features - The Cargo Book “features should be additive”.

Maybe what it doesn't say, but is also a consequence, is that the feature design should be such, that if you depend on and use a crate Foo, then any other crate in the same project should be able to enable features in Foo without that ruining your original use of the crate Foo. It should continue to build and it should continue to work in a compatible way.

cuviper · December 10, 2021, 5:26pm

In this example, they could instead provide a time::serde_human_readable module with the alternative implementation, like we did with indexmap::serde_seq.

steffahn · December 10, 2021, 5:27pm

I don’t think that the time crate guarantees anything about the precise way its types are serialized. So it might not even be considered a breaking change, if e.g. the crate was changed so that serde-human-readable is always enabled by default; and a test-case like this might be invalid because it tests unstable implementation details.

At least in general, serialization accesses private fields... so precise serialization behavior of types is quite commonly unspecified and allowed to change over “non-breaking” changes like adding a new private field, or otherwise changing the internals of a serializable type.

cuviper · December 10, 2021, 5:47pm

Serialization is an ABI of sorts -- I would expect stronger guarantees than that. If someone is serializing over the network or to disk, there's no guarantee that it will be the exact same version deserializing, nor even the same target arch/os/etc. Implementations should try to be backwards compatible (new version reading old data), but ideally the new version should also serialize in a way that past versions can understand. This may mean that the derived implementation is insufficient.

geebee22 · December 10, 2021, 6:16pm

Yes. I would say it's an outright bug.

mzabaluev · December 10, 2021, 6:29pm

Unfortunately, this is less convenient than enabling a human-readable format in the default derive (as hardly anyone would like to read a time value from a broken-down tuple). The decision not to have it always on is motivated by code size and dependency management: the parsing and formatting features it depends on bring in some sizable code and a dependency on std (which could be made unnecessary with some rework, but that's another issue).

cuviper · December 10, 2021, 6:36pm

That sounds more like a Debug concern to me, where I think it would be fair game to make changes. Serialization is not generally for humans.

mzabaluev · December 10, 2021, 6:59pm

Why, we've been using serde_json and serde_yaml with configuration structures and other similar data to a great effect. This is why paying regard to is_human_readable is very useful in serde impls. Newtypes and helper mods are helpful, but less convenient.

jhpratt · December 10, 2021, 11:07pm

I am the maintainer of the time crate for those not aware.

This is correct. The only guarantee is that the serialized format can be deserialized by a compatible version of time (0.3 in this case). The actual format is not specified unless you're using #[serde(with = …)], in which case the stability guarantee is placed on the user.

The time crate can deserialize a binary format even when serde-human-readable is enabled. This was an intentional decision to avoid breakage. It will also serialize to a binary format if the serializer is not declared human readable.

cuviper · December 10, 2021, 11:11pm

But can it deserialize the human format when that feature is not enabled? If not, that's trouble.

If you are able to make that work, then most of the objection goes away. That is, if serialization depends on the feature while deserialization always handles both, then things should just work.

jhpratt · December 10, 2021, 11:30pm

No, it cannot as the underlying features required are not the same. Both serialization and deserialization depend on the feature. How is that trouble? Features are additive and global, so if anyone in the dep tree requires it, it's permitted.

I don't want someone who enables serde to also enable the entirety of the parsing infrastructure, which is quite significant.

cuviper · December 10, 2021, 11:32pm

Serialization redefines what it means to be "global". e.g. There might be a client/server pair built separately, but only one of them gets the feature added, perhaps indirectly from a dependency.

jhpratt · December 10, 2021, 11:38pm

That's fair. If there is a way to do this without enabling the full parsing infrastructure, I'm all ears. Parsing is stupid difficult to say the least.

cuviper · December 10, 2021, 11:51pm

I don't know how to make that work automatically, but it would be safe with an explicit opt-in at the source level, like a #[serde(with = "...")] helper as I mentioned in my first reply.

jhpratt · December 11, 2021, 12:21am

Anything that would work with #[serde(with = "…")] (aside from trivial things like the existing timestamp format) would still require the parsing feature to be enabled. This is the case with the pending pull request that makes those annotations super easy to create.

cuviper · December 11, 2021, 12:25am

Right, but that's the opt-in. Anything that doesn't use the attribute will use the binary format regardless of the other features, so it will be compatible between different builds.

jhpratt · December 11, 2021, 12:34am

Ah I understand what you're getting at now. It would be a breaking change to pull support for this now, but I'll make a note of that for a future breaking release.

mzabaluev · December 11, 2021, 8:24pm

Maybe when time-rs/time#400 is resolved, there will be fewer reasons not to roll the serde-human-readable functionality unconditionally into the serde feature?

Topic		Replies	Views
Thoughts about using nightly features from de-facto std crates (e.g. serde)? community	7	565	April 1, 2022
Time & serde_json help	4	3623	January 12, 2023
New features on lib.rs website announcements	20	6293	November 23, 2023
Manifest: Conditionally enable library dependency features help	5	251	August 23, 2022
Features best practices: changing Debug representation?	13	471	October 3, 2022

Behavior-altering features considered harmful?

Related Topics