Many people expect the characters in their language to be in the "correct" order in the Unicode code charts. Because collation varies by language and not just by script, it is not possible to arrange the encoding for characters so that simple binary string comparison produces the desired collation order for all languages
a bit weird considering the "[] sort before [null]", where there isn’t a “counterpart” against which to compare.
Good point. I will add this detail to the proposal:
Length of the array shall not be considered by the comparator logic.
And empty array shall sort before non empty arrays.
You should explicitly mention negative zeros, and whether you’re making a total order or not.
Yes the idea is to make a total ordering for the combined set of Integer and Float.
f64 values that are <= -2^127 will sort before all i128 integers.
f64 values that are >= 2^127-1 will sort after all i128 integers.
NaN, Not a Number values shall sort after all i128 integers.
-Infinity shall sort before all numbers.
+Infinity shall sort after all numbers.
NaN shall sort after +Infinity.
Additionally I will amend proposal with
f64 Negative ZERO shall short before f64 Positive ZERO.
I’m not sure you need to include these in JSON; you can have them be external, like Bound::Unbounded.
Excellent. To begin with, unboundedness are never part of any sorted set
of values. They just come from query. Will amend the proposal and code.
I think that just changes the uncertain case to [null]-vs-[null,null]. I'd expect a rule like "if the comparison runs out of items, the side the ran out of items sorts before the side that still has more".
Notice that, since JSON is a text format, it cannot distinguish between different NaN payloads or different representations of the same subnormal value, and so much of the complexity of that spec can be eliminated.
...then again, JSON doesn't even have NaN or Infinity, so that same argument could be taken even further...
Edit: I incorrectly thought denormals had degenerate representations. I don't know what that page is even talking about with having multiple representations for the same finite value.
Thanks for pointing me to total-ordering for floats.
Still I don't understand how same floating point value can have different representation in IEEE-754 format.
The total ordering section has this to say-
different representations of the same floating point number are ordered by their exponent multiplied by the sign bit.
If we are talking about the possibility of different representation in JSON text format for the same value, may be we can simply state that - "parse floating point string to 64-bit IEEE-754 format and proceed to comparison".
As far as representing -Infinity, +Infinity, -NaN, +NaN in JSON format, we can always extend the specification, like JSON5 which is backward compatible with JSON, if it make sense to have them in a document database.
And finally we can arrive at:
-Infinity shall sort before all numbers.
-qNaN negative quite-NaN shall sort after -Infinity.
-sNaN positive signalling-NaN shall sort after -qNaN.
normal and sub-normal shall sort after -sNaN.
+sNaN signalling-NaN shall sort after normal numbers.
+qNaN quite-NaN shall sort after +sNaN.
+Infinity shall sort after all numbers.
-0.0 and +0.0 shall be treated as equal.
Different representation of floating point numbers are usually
normalised in IEEE-754 format.
Ordering between sNaN and qNaN in the same class being
based on the integer payload, multiplied by the sign bit.
Ordering between Integer and floating-point:
Floating point value shall be converted to integer and compared.
Floating point values that are <= -2^127 will sort before all i128 integers.
Floating point values that are >= 2^127-1 will sort after all i128 integers.
Are the above rules enough for total-order for all numbers ? Please share your thoughts.