I don't understand what footgun you're talking about. You can already add strings by using `a + &b`, so I don't see how allowing `a + b` creates a footgun.
You're talking about branch misprediction? Did you not say that this is "small fry"?
The check would only happen after it turns out the left side doesn't have enough space. This means that either we `realloc`, or we try using the right's capacity. `realloc` is orders of magnitude more expensive than an extra check, so this is not significant.
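The check described above can be sketched as a free function (the `String + String` operator itself doesn't exist in today's Rust, so `concat` is a hypothetical name for illustration). This is a minimal sketch of the proposed policy: append into the left if it has room, otherwise fall back to the right's spare capacity before reallocating.

```rust
// Hedged sketch of the discussed optimization: try the left buffer
// first, then the right buffer, then reallocate. `concat` is a
// made-up name standing in for a hypothetical `String + String`.
fn concat(mut a: String, mut b: String) -> String {
    if a.is_empty() {
        // Left is empty: hand back the right's buffer untouched,
        // saving both the realloc and the memcpy.
        return b;
    }
    if a.capacity() - a.len() >= b.len() {
        // Left has spare room: plain append, no reallocation.
        a.push_str(&b);
        a
    } else if b.capacity() - b.len() >= a.len() {
        // Right has spare room: prepend left into right, trading a
        // realloc for shifting right's existing bytes.
        b.insert_str(0, &a);
        b
    } else {
        // Neither fits: a growing append, which may realloc.
        a.push_str(&b);
        a
    }
}

fn main() {
    assert_eq!(concat(String::from("foo"), String::from("bar")), "foobar");
    assert_eq!(concat(String::new(), String::from("baz")), "baz");
    println!("ok");
}
```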
No, I'm talking about the fact that, asymptotically, appending (with amortized reallocation) is O(1) while prepending is always O(n). It's worse if you're always reusing the right side, but not notably so. Bringing up branch prediction was meant to point out that I'm *not* talking about it (except for the minor "slightly worse" effect in the "normal" append case), when the rest of the post is about asymptotic behavior.
I have no idea what you're talking about. What's n here?
`realloc` may take time proportional to the amount of memory reallocated, because it may have to copy data to a new location. `a + b` never takes longer asymptotically than `a + &b` does.
I didn't see the following mentioned: the `String + String` overload is just an inferior API which shouldn't exist. For that matter, `String + &str` shouldn't have existed either. It's a rudiment from old languages, like Java or Python, which have long since moved on to more ergonomic APIs.
Nobody uses string addition in modern Python. People use format strings, i.e. `f"hello, {world}!"`. It's shorter, has less syntactic noise, and is more configurable (e.g. it's easy to use alternative formatting for parameters).
Similarly, nobody uses string addition in Rust. You should use `format!()`, or `write!()` if you want to reuse an existing allocation. Would you rather see this

```rust
String::from("hello, ") + world.to_string() + String::from("!")
```

or this

```rust
format!("hello, {world}!")
```

I think it's obvious which API is more ergonomic, composable and configurable.
You're essentially in the first case here, "not enough data to care", when you're talking about a single operation. Performance only matters when you look at the context of usage, which is why when containers like `String` or `Vec` reallocate, they double (or grow by some other factor) the capacity, which brings the amortized cost down to O(1).
When you say `a + b` is the same cost as `a + &b`, that's only true when it's only appending to the left. When you're prepending to the right you pay O(a.len() + b.len()); when you're appending to the left you only pay O(b.len()); and when you're iterating on `a + b` you generally have to keep the result as the new `a` or the new `b`, or conditionally either (hence the three iterating cases I mention above). When you keep the result as `a`, you expect a.len() >> b.len() and the whole iteration is O(n) (where n is the final length), but in the other cases a.len() <=~ b.len(), so the iteration is O(n²).
Neither `a + b` nor `a + &b` "appends" to the left or to the right. Both operations return a new `String`. `+=` appends.
Since you are saying my claim is false, can you demonstrate example strings and capacities where `a + b` would take much longer than `a + &b`?
If you say that this is only true for each individual operation but not in total, then... this makes no sense, because the total time is the sum of the individual times. But if you insist on this, can you show some example code where `a + b` is slower than `a + &b`?
You claimed here that you're talking about reusing the right string allocation:
> Right? So you're essentially arguing for the suitability of using `+` in a context where the right string is likely to have capacity. The most likely case for that is some form of:
>
> ```rust
> for left in lefts {
>     right = left + right;
> }
> ```
You've previously agreed that this is poorly performing, right? There seems to be a disconnect when I argue that this therefore shouldn't be supported without it being obvious that the right side is not being reused; making the caller look into why will hopefully allow them to avoid the O(n²) cost.
In the reallocation case, yes, it may already have to copy the data. But then you're also copying the data on the right side, meaning you're now copying on the order of 2n bytes. You're still guaranteed to do more copying.
No, I'm not. I'm not arguing for the suitability of using `+` in your code. That's a bad algorithm.
All I am saying is that if somebody is using `a + &b` today, and could replace that with `a + b` (which is often the case), then that would be an improvement. It is more ergonomic, never less performant, and sometimes more performant.
I don't quite understand what you're saying. If you're reallocating to a new memory location, you have to move the contents of both strings in memory. With the proposed optimization (reusing the right argument's allocation), you also have to move the contents of both strings in memory. What the optimization is saving is a call to `realloc`, except if the left argument is empty, in which case it's saving even more: `realloc` and `memcpy`.
Now, there is a case where it might make sense to forgo the optimization: if `realloc` will happen in place, or will happen quickly using virtual memory, `realloc` may be better than a `memcpy`. This might be the case if the left string is very long. That's a policy decision that the implementation can make.
Again, if it's only ergonomics, then being able to reuse the right buffer doesn't matter. If it's only the performance of a single operation, then nothing matters, because the lengths involved are not going to be big enough to matter short of gigabyte-sized strings, an extreme edge case. The problem is that it's both: the ergonomics of `String + String` imply that `a = a + b` and `b = a + b` have equivalent performance, which they very much do not, and your suggestion that it could reuse the right string's buffer doesn't improve the situation.
You say the algorithm is bad: I agree. The problem is what the ergonomics guide you to in terms of performance.
Technically those two statements do have equivalent performance, but I understand you're talking about repeatedly concatenating small strings to the end vs the beginning of a large string.
Well... the implication doesn't make sense to me. Just like the performance of `insert` and `drain` depends on whether you're adding or removing things at the beginning or at the end, even though you're using the same API, a similar thing happens with adding small + large vs large + small. If you know how `Vec` and `String` are represented, this is natural and there is no reason to expect performance symmetry -- spare capacity is only maintained at one end and not the other.
Strictly, `insert` and `drain` have O(n) performance unless you go out of your way to operate at the end, which has explicit append/push/pop APIs that are O(1): there's a reason for that!
O(n) is an upper bound. Removing the second-to-last element of a `Vec` (via `remove` or via `drain`) is O(1) time.