if you want bots to cache, make the resources cacheable!
[Aug. 30th, 2007|08:07 pm]
LiveJournal Client Discussions
LiveJournal's bot policy page says:
"You are encouraged to cache the results of your bot's requests, which saves us bandwidth and CPU time. Bots making repeated requests on the same resource (URL) in a short amount of time will be blocked."
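However, the HTTP responses LiveJournal sends for FOAF data are, per the heuristics in RFC 2616, nearly uncacheable. They include neither a Last-Modified nor an Expires header, so an HTTP cache has nothing on which to base a freshness lifetime. There's a disconnect here: the policy asks bots to cache, but the responses give HTTP-level caches nothing to work with.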
$ curl -I http://frank.livejournal.com/data/foaf
HTTP/1.0 200 OK
Date: Thu, 30 Aug 2007 22:27:28 GMT
Cache-Control: private, proxy-revalidate
Keep-Alive: timeout=30, max=100
Content-Type: application/rdf+xml; charset=utf-8
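To make the problem concrete, here is a rough sketch of how a client-side cache might work out a freshness lifetime along the lines of RFC 2616 section 13.2.4. The function name `freshness_lifetime` is made up for illustration, the 10% fraction is the "typical setting" the RFC mentions rather than anything a particular framework is known to use, and the code assumes well-formed date headers.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how long a response may be treated as fresh, or None.

    Hypothetical helper: explicit max-age first, then Expires,
    then a heuristic based on Last-Modified.
    """
    # Header names are case-insensitive, so normalise them.
    h = {name.lower(): value for name, value in headers.items()}
    date = parsedate_to_datetime(h["date"]) if "date" in h else None

    # 1. Explicit lifetime from Cache-Control: max-age=N
    for directive in h.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))

    # 2. Explicit lifetime from Expires, measured from Date
    if "expires" in h and date is not None:
        return max(parsedate_to_datetime(h["expires"]) - date, timedelta(0))

    # 3. Heuristic: a fraction (assumed 10% here) of the time since
    #    the resource was last modified
    if "last-modified" in h and date is not None:
        age = date - parsedate_to_datetime(h["last-modified"])
        return max(age / 10, timedelta(0))

    # Nothing to go on: the response has to be treated as stale.
    return None


# The headers from the FOAF response shown above.
lj_headers = {
    "Date": "Thu, 30 Aug 2007 22:27:28 GMT",
    "Cache-Control": "private, proxy-revalidate",
    "Content-Type": "application/rdf+xml; charset=utf-8",
}
print(freshness_lifetime(lj_headers))  # prints None
```

With no max-age, no Expires and no Last-Modified, every branch falls through, which is why a cache following these rules has to go back to the server every time.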
Clearly a custom application can cache the data however it wants. But it would be a lot more convenient if we could take advantage of HTTP-level caching support in web client frameworks. I've just spent much of the day struggling with such a framework, trying in vain to convince it to cache FOAF resources so I didn't have to re-invent the wheel. :-/
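For reference, the re-invented wheel could be as small as the sketch below: an application-level cache that ignores the HTTP headers and keeps each response for a fixed time. The class name `TTLCache`, its parameters and the one-hour default are all assumptions for illustration, not anything LiveJournal recommends.

```python
import time
import urllib.request


class TTLCache:
    """Cache response bodies by URL for a fixed number of seconds."""

    def __init__(self, ttl_seconds=3600, fetch=None, clock=time.monotonic):
        self.ttl = ttl_seconds
        # fetch and clock are injectable so the cache can be tested offline.
        self.fetch = fetch or self._http_get
        self.clock = clock
        self._entries = {}  # url -> (expiry_time, body)

    @staticmethod
    def _http_get(url):
        # Plain GET using only the standard library.
        with urllib.request.urlopen(url) as response:
            return response.read()

    def get(self, url):
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no request is made
        body = self.fetch(url)
        self._entries[url] = (now + self.ttl, body)
        return body
```

Something like `TTLCache().get("http://frank.livejournal.com/data/foaf")` would then hit the server at most once an hour per URL. The catch is that the lifetime is a guess on the client's part, which is exactly what proper cache headers would avoid.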