LiveJournal Client Discussions


ver=1 [Apr. 14th, 2002|11:23 am]
From the news: it looks like LJ has switched to UTF-8.

To indicate to LJ that your client supports UTF-8, you need to set the "ver" key on your requests. People who don't send the key are assumed to be at protocol version 0; version 1 is identical to zero with the addition of UTF-8 support.

Here's my login input from LogJam:


(Hopefully avva will correct me if I've got any of it wrong? :P )
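For anyone following along, here is a minimal Python sketch of what a login request body with "ver" set might look like. It is not LogJam's actual output: the field names (`mode`, `user`, `password`, `clientversion`), the client string, and the example credentials are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def build_login_body(user, password, client="Python-Sketch/0.1"):
    """Build a form-encoded login body that announces protocol version 1.

    Field names other than "ver" are assumed here, not taken from the post.
    """
    fields = {
        "mode": "login",
        "user": user,
        "password": password,
        "ver": "1",  # tells the server this client speaks UTF-8
        "clientversion": client,  # hypothetical client identifier
    }
    # Non-ASCII characters are percent-encoded from their UTF-8 bytes.
    return urlencode(fields, encoding="utf-8")


body = build_login_body("exampleuser", "pässword")
print(body)
```

The same `ver=1` pair would go on every other request too, not just the login.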

From: avva
2002-04-14 11:49 am (UTC)
Does LogJam already support UTF-8 properly? That's grand of you ;)
From: evan
2002-04-14 12:15 pm (UTC)
I'm using GTK2.0 which uses UTF-8 internally, so yeah, in a sense...

(I only really noticed it when GTK started printing warnings because people on my friends list have high characters in their fullname. :P)

Sadly, when I take even a simple GTK 2.0 program that displays a text entry box and run it in a Japanese locale, it segfaults as soon as I hit shift-space (the "switch to Japanese input" key):
trout:~/utf% LC_ALL=ja_JP.UTF-8 ./simple
zsh: segmentation fault LC_ALL=ja_JP.UTF-8 ./simple

When I run it through gdb it doesn't segfault, it just behaves like I entered a normal space. :(
From: phil99
2002-04-14 11:58 am (UTC)
UTF-8... I guess my biggest question is: is this going to be necessary to implement in the future? Or will we all be able to carry on without it?
From: evan
2002-04-14 12:16 pm (UTC)
Depending on your programming language, it should be near-trivial to implement.

Not supporting UTF-8 makes your program sorta useless to anyone using LJ in another language, which may or may not be of concern to you. :)
From: phil99
2002-04-14 12:19 pm (UTC)
Well, my current effort is Perl... I haven't got a clue - I just send the damn text to the server and hope it works :) And if it doesn't, try again until something stops breaking.
From: jerronimo
2002-04-14 01:50 pm (UTC)
could you put an lj-cut on that, or break the line or something?

thanks. :]
From: sprote
2002-04-14 03:02 pm (UTC)


Thanks to Evan's clues I got UTF-8 support working very quickly in Journalert. All it requires is sending the "ver=1" with every request, and interpreting the response text's charset as UTF-8. (That includes the contents of an event body once you've finished decoding the "%xx" stuff into 8-bit.)
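The two decoding steps described above can be sketched in Python roughly as follows. This is an illustration, not Journalert's code: the alternating name/value line layout, the `events_1_event` key, the treatment of "+" as a space, and the sample response are all assumptions.

```python
from urllib.parse import unquote_to_bytes


def parse_flat_response(text):
    """Pair up alternating name/value lines into a dict (assumed layout)."""
    lines = text.splitlines()
    return dict(zip(lines[0::2], lines[1::2]))


def decode_event(value):
    """Undo the %xx encoding to get raw bytes, then read them as UTF-8."""
    # Assumption: "+" stands for a space, as in ordinary form encoding.
    raw = unquote_to_bytes(value.replace("+", " "))
    # With ver=1 the bytes are expected to be UTF-8; bad bytes are replaced.
    return raw.decode("utf-8", errors="replace")


# Hypothetical response text, made up for this example.
sample = "success\nOK\nevents_1_event\nna%C3%AFve+caf%C3%A9\n"
fields = parse_flat_response(sample)
event_text = decode_event(fields["events_1_event"])
print(event_text)
```

The order matters: the "%xx" sequences have to be turned back into bytes first, and only then decoded as UTF-8, because a single character may span several escaped bytes.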

I've noticed that events retrieved via 'getevents' come back with a metadata property 'unknown8bit' whose value is '1'; I assume that means these were posted by a version-0 client so the server doesn't know what the charset is. It seems to do an OK job of translating from the prior default (CP1252, aka WinLatin1, aka ISO-Latin-1-plus-useful-extra-chars) to UTF-8, so I haven't had to add any code in my client to deal with that.
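For a client that did want to handle bytes of unknown charset itself, one common approach is to try UTF-8 first and fall back to CP1252. The Python sketch below shows that general technique only; it is not what the LJ server does internally, and the function name is made up for this example.

```python
def decode_legacy(raw: bytes) -> str:
    """Decode bytes as UTF-8 if valid, otherwise fall back to CP1252."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A handful of CP1252 byte values are undefined; replace those
        # rather than failing on them.
        return raw.decode("cp1252", errors="replace")


# Already-valid UTF-8 passes through unchanged.
print(decode_legacy("café".encode("utf-8")))
# CP1252 bytes (0xE9 for e-acute, 0x93/0x94 for curly quotes) hit the fallback.
print(decode_legacy(b"caf\xe9"))
print(decode_legacy(b"\x93quoted\x94"))
```

This heuristic works reasonably well because text written in CP1252 with high characters is rarely also valid UTF-8, but it is still a guess about the original charset.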
From: avva
2002-04-14 04:41 pm (UTC)

Re: Thanks!

Good note about the unknown8bit bit.
Maybe we should remove that when sending to a ver 1 client (because it was just converted to UTF-8, so is no longer unknown8bit).