I will edit my comments to use the notation @BurntSushi uses, namely U+FFFF instead of 0xFFFF.
Can you cite the exact passage on Wikipedia you are referring to? The phrase "must never appear in a valid UTF-8 sequence" only shows up once, and it does not appear relevant.
The table in the Wikipedia article is talking about "UTF-8 code units (individual bytes or octets)", not Unicode scalar values. So it means those bytes will never appear in UTF-8, not that those Unicode scalar values will never appear.
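Here is a minimal sketch of the distinction, in case it helps. The helper names `is_never_valid_utf8_byte` and `utf8_bytes` are made up for illustration. The byte set is the one the UTF-8 definition excludes from well-formed sequences (0xC0, 0xC1, and 0xF5 through 0xFF), which I am assuming is what the table refers to.

```rust
/// Hypothetical helper: true if `byte` is a code unit that can never occur
/// anywhere in well-formed UTF-8 (0xC0, 0xC1, and 0xF5..=0xFF).
fn is_never_valid_utf8_byte(byte: u8) -> bool {
    matches!(byte, 0xC0 | 0xC1 | 0xF5..=0xFF)
}

/// Hypothetical helper: encode one Unicode scalar value as UTF-8 and
/// return the resulting code units (bytes).
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}
```

With these, `utf8_bytes('\u{FFFF}')` gives the three bytes `[0xEF, 0xBF, 0xBF]`, none of which is in the forbidden set, and `std::str::from_utf8` accepts them. The single byte `0xFF`, on the other hand, is rejected by `std::str::from_utf8(&[0xFF])`. So the scalar value U+FFFF is encodable, while the byte 0xFF is not a legal code unit.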
It doesn't say that. You are mixing up Unicode and its various encodings.
Wow! I am brain dead. You've been saying this the whole time, and I finally understand what you mean. I'm very sorry for wasting your time. It clearly states above the table that it's talking about "UTF-8 code units". Thank you for your patience.
Yes, it does indeed. I'm sorry about my brain fart.