problems with lastsync
[Apr. 4th, 2004|01:47 am]
LiveJournal Client Discussions
i've implemented a client that uses the syncitems/lastsync routines. i read evan's code and wrote a similar implementation, popping off syncitems as they were downloaded, and looping until syncitems is empty.
my algorithm was stuck in an infinite loop, and here i was pulling my hair out for hours thinking 'Perhaps the client is broken?'.
calling syncitems first, the server is returning an array of items, and one item in particular has the itemid 441. (testing this on my own journal)
then, following getevents through its course, the final result set has NO itemid of 441. furthermore, calling getevents with selecttype of "one" and "itemid" of 441 returns an empty result set.
if this item doesn't exist, why in the world is syncitems returning it to start with? is this a bug? what's going on?
It existed at one point. So, let's say this happens:
- User makes post, gets id 441. User makes more posts.
- User downloads their journal. 441 exists at this point, so it's downloaded too.
- User deletes post 441.
- User re-syncs their journal...
If 441 just disappeared, the client wouldn't know what happened to it. So instead, it's there, but it should say that the type is deleted or something like that. (Or updated or something... I forget what the types are.)
2004-04-04 11:35 am (UTC)
ok, fair enough. that makes sense. i checked the action for that sync item, and it shows it as "update".
following the logic in the algorithm you described a few posts down (and following evan's similar logic in the algorithm he posted as a reply, from logjam), won't deleted posts cause you to loop forever? (you never get that itemid, hence you never mark it as downloaded)
if the syncitem was marked as "delete" instead of update, i think this could be dealt with. kind of makes sense that way.
am i being obtuse?
find the oldest "time" in this hash for items that have downloaded == 0
mark THIS item as downloaded (so we don't use the same time twice and loop forever)
You'll never loop if you follow my instructions to the letter. :) Whenever you use something as the lastsync time, you mark it as downloaded. That way, if there's a problem of any sort, the worst case scenario is that you send one request per entry you have from syncitems, which is bad, but still better than looping forever.
And if the item is marked as updated, and you don't get it back from getevents, you can assume it's deleted.
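For anyone following along, here is a minimal sketch of the loop described above, not the actual logjam or server code. The names `sync`, `fetch_since` and `make_fake_server` are made up for illustration, and the fake server is an assumption standing in for a getevents call that takes a lastsync time. To keep the sketch short it returns entries modified at or after lastsync, which may not match the real server's boundary handling.

```python
def make_fake_server(live, batch=100):
    """Hypothetical stand-in for getevents with a lastsync time.

    live maps itemid -> (time, body) for entries that still exist.
    Assumption: returns up to `batch` entries modified at or after lastsync.
    """
    def fetch_since(lastsync):
        hits = sorted((t, i, b) for i, (t, b) in live.items() if t >= lastsync)
        return [(i, b) for t, i, b in hits[:batch]]
    return fetch_since


def sync(sync_items, fetch_since):
    """sync_items maps itemid -> time, as reported by syncitems.

    Returns (entries, deleted, requests).
    """
    downloaded = {itemid: False for itemid in sync_items}
    entries = {}
    requests = 0
    while True:
        pending = [i for i in sync_items if not downloaded[i]]
        if not pending:
            break
        # find the oldest time among items not yet downloaded
        oldest = min(pending, key=lambda i: sync_items[i])
        # mark THIS item as downloaded before asking, so the same
        # time is never used twice and the loop must terminate
        downloaded[oldest] = True
        requests += 1
        for itemid, body in fetch_since(sync_items[oldest]):
            entries[itemid] = body
            if itemid in downloaded:
                downloaded[itemid] = True
    # anything syncitems listed that never came back is assumed deleted
    deleted = sorted(i for i in sync_items if i not in entries)
    return entries, deleted, requests
```

With three sync items (440, 441, 442) where 441 no longer exists, this makes two requests and reports 441 as deleted. In the worst case, where nothing comes back at all, it makes one request per sync item and then stops.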
2004-04-04 12:52 pm (UTC)
aha! very sneaky. thanks for clearing that up.
2004-04-04 01:08 pm (UTC)
if a user has a habit of deleting a lot of their entries, syncitems has the potential to be very slow and very painful on the server. that's one extra getevents call for EVERY entry that it can't find.
i came up with a more optimistic algorithm, except it will also loop forever in the rare event that a user deletes 100 entries in a swath.
if syncitems differentiated between "update" and "delete", you could write a much more elegant, less taxing algorithm.
if that entry was deleted before ~spring 2003 you will get its id anyway. you always need to check the number of entries returned; e.g. if it's zero for selecttype "one", that means the entry was deleted
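A small sketch of that check, under assumptions: `request_for_one` and `is_deleted` are hypothetical helper names, the request is shown as a plain dict with only the two fields mentioned in this thread, and the response is assumed to carry its entries in a list under an `events` key. A real request would need more fields (authentication and so on).

```python
def request_for_one(itemid):
    """Build the getevents arguments for a single entry.

    Only the fields discussed in this thread are included.
    """
    return {"selecttype": "one", "itemid": itemid}


def is_deleted(response):
    """Treat an empty result set as 'the entry was deleted'.

    Assumption: the response is a dict whose entries live in a
    list under the "events" key.
    """
    return len(response.get("events", [])) == 0
```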
2004-04-04 11:55 am (UTC)
it sounds like what this implies, then, is that if i am making an algorithm to archive entries from the server:
- i should NOT use syncitems to initially create the archive. instead i should just loop getevents (selecttype = syncitems) and keep looping with latest eventtime, until it returns 0 entries.
- after the archive is created, i should use syncitems combined with getevents (selecttype = one) to get creates/updates/deletes.
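The first step above could look roughly like this. It is only a sketch of the "keep looping until 0 entries come back" idea: `initial_archive`, `fetch_after` and `make_fake_server` are invented names, and the fake server is an assumption that returns a small batch of entries with a time strictly later than the cursor. One caveat of this simplified version: entries sharing the exact same time across a batch boundary would be skipped.

```python
def make_fake_server(live, batch=2):
    """Hypothetical stand-in for getevents driven by a time cursor.

    live maps itemid -> (time, body). Assumption: returns at most
    `batch` entries whose time is strictly after the cursor.
    """
    def fetch_after(cursor):
        hits = sorted((t, i, b) for i, (t, b) in live.items() if t > cursor)
        return [(i, t, b) for t, i, b in hits[:batch]]
    return fetch_after


def initial_archive(fetch_after, start="1970-01-01 00:00:00"):
    """Download everything by advancing a cursor to the latest time seen."""
    archive = {}
    cursor = start
    while True:
        got = fetch_after(cursor)
        if not got:
            break  # 0 entries returned: the archive is complete
        for itemid, time, body in got:
            archive[itemid] = (time, body)
            # "YYYY-MM-DD HH:MM:SS" strings sort chronologically
            cursor = max(cursor, time)
    return archive
```

Because the cursor strictly advances on every non-empty batch, the loop cannot repeat a request, which is the same termination argument as in the syncitems loop discussed earlier in the thread.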
itemids are needed only to get the update time for a particular entry. you store them and assign them to the new entries you get with getevents/syncitems. in this case you should not care if one of the ids is not used. also there is a selecttype "multiple" (check the ljprotocol.pl source)