Feedback requested on Shr for negative BigInt

I would like community feedback on this issue:
https://github.com/rust-num/num/issues/347

The gist is that Shr for primitive integers always rounds down (an arithmetic shift), but Shr for BigInt rounds toward zero, which means the two give different results when shifting negative values. This behavior has existed ever since BigInt was first added to the standard library in 2012!
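
To make the mismatch concrete, here is a minimal sketch (assuming the round-toward-zero behavior described above):

    extern crate num_bigint;
    use num_bigint::BigInt;

    fn main() {
        // Primitive Shr is an arithmetic shift: floor(-3 / 2) == -2.
        assert_eq!(-3i32 >> 1, -2);
        // BigInt currently shifts the magnitude, truncating toward zero.
        assert_eq!(BigInt::from(-3) >> 1usize, BigInt::from(-1));
    }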

The current implementation is just:

    BigInt::from_biguint(self.sign, self.data >> *rhs)

I doubt that was intended to behave differently than negative i32, for instance -- it's just the most obvious way to implement Shr in this case. Rounding down will require some conditional fixups.
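
For illustration, the fixup could look something like this (a hypothetical free function, not the actual patch; it builds on the existing truncating shift):

    extern crate num_bigint;
    extern crate num_traits;
    use num_bigint::BigInt;
    use num_traits::{One, Signed, Zero};

    // Shift right, rounding toward negative infinity like primitive integers:
    // do the truncating shift, then subtract 1 if the value was negative and
    // any one-bits were shifted out.
    fn shr_floor(n: &BigInt, k: usize) -> BigInt {
        // Current behavior: shift the magnitude, rounding toward zero.
        let truncated = n.clone() >> k;
        // Bits that fell off the end; nonzero iff the division was inexact.
        let lost = n - (truncated.clone() << k);
        if n.is_negative() && !lost.is_zero() {
            truncated - BigInt::one()
        } else {
            truncated
        }
    }

With this, shr_floor(&BigInt::from(-3), 1) gives -2, matching -3i32 >> 1.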

But I think matching primitive behavior makes the most sense across the board, so I'd like to just change the behavior and call it a bug fix, without a semver bump. Would anybody object to this?

3 Likes

I don't use num directly, so my input should be taken with a grain of salt, but this seems like a big enough change in behavior to warrant a major bump.

3 Likes

I'd prefer shr to follow the expected two's-complement behavior. Currently I'm emulating it myself using this code:

    // Round-trip through two's-complement bytes: encode both BigInts as
    // signed little-endian bytes, reinterpret those bytes as unsigned
    // values, apply the unsigned op, then read the result back as signed.
    fn signed_bitwise_op<Op>(&mut self, n1: BigInt, n2: BigInt, t: usize, f: Op)
        where Op: FnOnce(BigUint, BigUint) -> BigUint
    {
        let n1_b = n1.to_signed_bytes_le();
        let n2_b = n2.to_signed_bytes_le();

        let u_n1 = BigUint::from_bytes_le(&n1_b);
        let u_n2 = BigUint::from_bytes_le(&n2_b);

        let result = BigInt::from_signed_bytes_le(&f(u_n1, u_n2).to_bytes_le());

        self.interms[t - 1] = Number::Integer(result);
    }

I couldn't find a more efficient way of doing it. There's a lot of needless copying going on.

1 Like

If I understand correctly, a right-shift through that signed_bitwise_op will give you a logical shift, no? Or do you also preserve the sign bit yourself? Are there other ops that you emulate this way?

There is BigInt::from_biguint(Sign, BigUint) already, but we could probably add a 2's-complement-signed version of this. Then in the other direction, I'm surprised we don't have something that returns (Sign, BigUint), but we could add that and another that outputs a 2's-complement-signed BigUint. (All of this regardless of whether we change Shr.)
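
For the BigUint-to-signed direction, such a conversion might look like this sketch (hypothetical name and signature; it takes the bit width explicitly, since a BigUint has no inherent width):

    extern crate num_bigint;
    use num_bigint::{BigInt, BigUint};

    // Interpret `u` as a two's-complement value that is `width` bits wide.
    fn from_twos_complement(u: &BigUint, width: usize) -> BigInt {
        let one = BigUint::from(1u8);
        if ((u >> (width - 1)) & &one) == one {
            // Sign bit set: the encoded value is u - 2^width.
            BigInt::from(u.clone()) - (BigInt::from(1u8) << width)
        } else {
            BigInt::from(u.clone())
        }
    }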

It is a quiet breaking change, just saying. The point of semver is not to be pretty, but to be actually meaningful. By all means, feel free to release a major version.

Also, due to the way num is split into multiple crates, most types in num 0.1 and num 0.2 would actually be compatible; the only exception would be the num-bigint types. Just bump the major version of num and num-bigint; this should reduce the impact on other crates.

1 Like

Yes, you are right. I wanted an arithmetic shift, but just found that shifting -1 right by 1 gives 0, not -1, so I am getting a logical shift.

The thing that terrifies me about a semver bump for num is that it's often a public dependency, with its traits exposed in the public interfaces of other crates. I really don't want a "numpocalypse" akin to libc's 0.1-to-0.2 transition.

Maybe I could make a separate semver bump for just the num-bigint sub-crate, but the num::BigInt re-export wouldn't be able to follow it.

Once again, the impact of bumping num wouldn't be as big as it appears. Let's say that only num and num-bigint would be bumped.

Let's say that I'm using a crate that depends on num 0.1 and implements num::Zero for some of its types. My program uses num 0.2 and tries to use that trait...

And it works. Why? Because num 0.1 and num 0.2 both depend on num-traits 0.1. It's the exact same trait.

That said, ideally libraries would re-export any crates (or parts of them) that they consider fully part of their public API, but unfortunately that's not always the case :frowning:.

Hmm. You're probably right, and I may even be able to use the semver-trick to keep BigUint compatible. Although if I do bump num-bigint, I may reach for some other breaking changes too, like hiding big_digit as an implementation detail.

I should also take this time to say: I could really use help maintaining num! I maintain it very carefully, knowing it has been widely used since long before I had anything to do with it, so I would want someone who shares that attitude. But the reality is that I just don't make much time to actually improve it.

1 Like

I'd suggest doing whatever Python is doing. They have a native big integer type.

It should obviously behave like i32/i64.

This happened due to the poor choice of representing BigInt as (BigUint, Sign) rather than using the normal two's-complement representation (which, in addition to causing issues like this, is bad for performance, since you need to branch on the sign on every operation).

This is part of the bigger issue of num-bigint being implemented pretty terribly (it looks like it was written with the goal of providing bignums with minimal effort and code, rather than providing the fastest, lowest memory overhead and most expressive library).

Just to mention a few of the most critical issues:

  1. No bignum type with small bignum optimization
  2. Bignums use 32-bit units even on 64-bit machines and fail to take advantage of "add with carry" instructions, probably halving performance (see the limb-addition sketch after this list)
  3. No SIMD usage (another 2-4x performance loss at least...)
  4. No way to have "borrowed" bignums, only an owned vector of BigDigits
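
To illustrate point 2, here is a minimal portable sketch of u64-limb addition with carry propagation (illustration only, using std's overflowing_add rather than any dedicated carry intrinsic):

    // Add two little-endian u64-limb magnitudes, propagating the carry.
    fn add_limbs(a: &[u64], b: &[u64]) -> Vec<u64> {
        let (long, short) = if a.len() >= b.len() { (a, b) } else { (b, a) };
        let mut out = Vec::with_capacity(long.len() + 1);
        let mut carry = false;
        for (i, &x) in long.iter().enumerate() {
            let y = short.get(i).copied().unwrap_or(0);
            let (s1, c1) = x.overflowing_add(y);
            let (s2, c2) = s1.overflowing_add(carry as u64);
            out.push(s2);
            carry = c1 || c2;
        }
        if carry {
            out.push(1);
        }
        out
    }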

This also holds for the num crate as a whole: e.g. Ratio implements comparison by recursive division (!) rather than taking the time to add a trait for extending multiplication and using that like any sensible implementation would do (this is likely 10-100x slower in case it's not obvious).
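
For reference, a cross-multiplication comparison is a one-liner (a sketch assuming normalized ratios with positive denominators; with fixed-width integers this is exactly where a widening/extending multiplication trait would be needed to avoid overflow):

    extern crate num_bigint;
    use num_bigint::BigInt;
    use std::cmp::Ordering;

    // Compare a/b with c/d (b, d > 0) without dividing: a/b < c/d iff a*d < c*b.
    fn cmp_ratios(a: &BigInt, b: &BigInt, c: &BigInt, d: &BigInt) -> Ordering {
        (a * d).cmp(&(c * b))
    }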

Long-term, a rewrite from scratch is required.

5 Likes

What are you picturing for this? Ref/Mut (or Slice/SliceMut) types make sense to me for types like SoA vecs where it seems reasonable to want to have a type with the same ownership semantics as &mut [T] (contiguous subset, mutation without reallocation...), but it wouldn't seem so useful for BigInt where even AddAssign can reallocate.

Rather than just a list of grievances, I would be more receptive to PRs, especially those that don't break API. I am just one maintainer, who didn't even write most of this code in the first place.

Or if you want to do a whole rewrite that breaks API, you're welcome to create your own crates, of course.

4 Likes

If you go for two's-complement behavior, IMO you should go all the way and implement all bitwise operations for it, removing the need for a separate unsigned type. E.g. in this case, (n & -1) == n, because -1 is conceptually represented as an infinite string of one-bits. Using the two's-complement representation for one operation and sign-magnitude for all the rest would be inconsistent.
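
Concretely, these are the identities (shown here with i64 as a finite stand-in) that a fully two's-complement BigInt would be expected to satisfy:

    fn main() {
        let n: i64 = 42;
        assert_eq!(n & -1, n);      // -1 acts as an infinite string of one-bits
        assert_eq!(!n, -n - 1);     // complement identity
        assert_eq!(-1i64 >> 1, -1); // arithmetic shift preserves the sign
    }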

1 Like

Alternatively, to avoid semver problems, you could just add another, explicitly two's-complement-based type, and keep the old sign-magnitude types untouched, but deprecated.

Sharing a layout between the two would be nice (like Vec<u8> and String), but I think there's still value in two different types, for things like 4 - 10.

> for things like 4 - 10.

What would be the expected behavior in this case?

panic!() would make sense, like 4_u32 - 10.

It doesn't seem necessary to have two separate types just for the sake of a built-in assert!(x >= 0).