The "Using UTF-8 with Gentoo" guide states:

"Unicode throws away the traditional single-byte limit of character sets, and even with two bytes per-character this allows a maximum 65,536 characters."

Many people assume Unicode allows only 65,536 characters, but this is incorrect. Current versions of Unicode allow a maximum of 1,114,112 code points (U+0000 through U+10FFFF). See http://www.tbray.org/ongoing/When/200x/2003/04/26/UTF for a good explanation of supplementary planes.

Most of this confusion arises because early versions of Unicode did allow only 65,536 code points. This changed with Unicode 2.0, which introduced the surrogate mechanism; the first characters beyond the original 65,536 were assigned in version 3.1. The above link might be a good one to add to the list of resources at the bottom of the guide.

Reproducible: Always

Steps to Reproduce:
1. Read the UTF-8 guide
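For reference, here is a small Python 3 sketch of where the 1,114,112 figure comes from and what a character outside the first 65,536 looks like once encoded. The variable names are only illustrative, and U+20000 was picked as an arbitrary example from the supplementary CJK range; nothing here is taken from the guide itself.

```python
# Unicode code points run from U+0000 to U+10FFFF,
# organised as 17 planes of 65,536 code points each.
PLANE_SIZE = 0x10000        # 65,536 code points per plane
PLANE_COUNT = 17            # plane 0 (the BMP) plus 16 supplementary planes
MAX_CODE_POINT = 0x10FFFF

total_code_points = PLANE_SIZE * PLANE_COUNT
print(total_code_points)    # 1114112

# U+20000 is a CJK ideograph that lives outside the first 65,536 code points.
ch = chr(0x20000)
plane = ord(ch) // PLANE_SIZE                  # which plane the character is in
utf8_len = len(ch.encode("utf-8"))             # bytes needed in UTF-8
utf16_units = len(ch.encode("utf-16-be")) // 2 # 16-bit code units in UTF-16

print(plane)        # 2
print(utf8_len)     # 4
print(utf16_units)  # 2 (a surrogate pair)
```

So a two-byte-per-character view only covers plane 0; anything in the supplementary planes needs four bytes in UTF-8 or a surrogate pair in UTF-16.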
Looking at this again, together with the sentence that follows it, it is quite unclear whether you are trying to say that Unicode allows only 65,536 characters or not. Either way, it needs to be rewritten. 65,536 code points would probably be enough if it weren't for Chinese: there are already more than 65,000 Chinese characters in Unicode, and there are plans to add another 40,000!
You're right. I rephrased the paragraph and added the link to the 'Char vs bytes' article. Thanks for reporting.
I think you have rewritten the relevant section very clearly and concisely.