March 9th, 2004

secret knowledge
  • quirrc


It looks like LJ accepts raw (non-URL-encoded) UTF-8 from clients. I do not know whether that is a bug or intentional, but it was one of the reasons why some (i.e. random) users on Asian (CJK) locales could not use Semagic for a long time, receiving "Invalid UTF stream error". That was definitely a bug in Semagic too, of the same nature as described here: isalpha and the other similar functions used in URL encoding are locale-dependent (surprise, surprise). But it would have been discovered long ago if LJ did not accept non-encoded input at all. As it is, some strings seem to work unencoded while others are accepted only when encoded. I would suggest that the code at LJ at least be re-checked (if not changed to reject non-encoded input), because this behavior may be associated with, or lead to, other errors (maybe somehow related to composite/precomposed characters; I'm not a specialist here).
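To illustrate the locale problem, here is a minimal sketch of a URL encoder that does not call isalpha or its relatives at all. It is not Semagic's actual code: the names `is_unreserved` and `percent_encode` are hypothetical, and the set of characters left unencoded (ASCII letters, digits, and `-`, `_`, `.`, `~`) is my assumption, not something LJ documents. The point is only that the check compares against explicit ASCII ranges, so every byte of a multi-byte UTF-8 sequence gets encoded no matter which locale is active.

```cpp
#include <string>

// Hypothetical helper: true only for ASCII letters, digits, and "-_.~".
// Explicit ranges are used instead of isalpha()/isalnum(), whose results
// depend on the current C locale and can be true for bytes >= 0x80.
inline bool is_unreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '_' || c == '.' || c == '~';
}

// Hypothetical helper: percent-encodes a UTF-8 string one byte at a time.
// Each byte that is not unreserved becomes "%XX" with uppercase hex digits.
inline std::string percent_encode(const std::string& utf8) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8.size() * 3);  // worst case: every byte is encoded
    for (unsigned char c : utf8) {
        if (is_unreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];     // high nibble
            out += hex[c & 0x0F];   // low nibble
        }
    }
    return out;
}
```

With this sketch, `percent_encode("a b")` gives `a%20b`, and the two UTF-8 bytes of the Cyrillic letter "т" (0xD1 0x82) come out as `%D1%82` regardless of the user's locale.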