LiveJournal Client Discussions
if you want bots to cache, make the resources cacheable! [Aug. 30th, 2007|08:07 pm]
LiveJournal's bot policy page says:

"You are encouraged to cache the results of your bot's requests, which saves us bandwidth and CPU time. Bots making repeated requests on the same resource (URL) in a short amount of time will be blocked."

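However, the HTTP responses LiveJournal sends for FOAF data are nearly uncacheable under the rules in RFC 2616. They carry no Expires header and no max-age directive, so there is no explicit freshness lifetime, and no Last-Modified header, which is what heuristic expiration is normally based on. With no Last-Modified and no ETag, a client can't even make a conditional request to revalidate a copy it already has. There's something of a disconnect between the policy and the responses: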
$ curl -I http://frank.livejournal.com/data/foaf
HTTP/1.0 200 OK
Date: Thu, 30 Aug 2007 22:27:28 GMT
Server: Apache
Cache-Control: private, proxy-revalidate
Vary: Accept-Encoding
Content-length: 45098
Keep-Alive: timeout=30, max=100
Connection: keep-alive
Content-Type: application/rdf+xml; charset=utf-8
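
To make the complaint concrete, here is a rough Python sketch that checks a set of response headers for the hints an HTTP cache would look for. The function name `cache_hints` and the keys it returns are made up for illustration, and the checks are a simplification of RFC 2616, not a full implementation of its caching rules.

```python
def cache_hints(headers):
    """Report which caching hints a response carries (simplified, hypothetical helper)."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    directives = [
        d.strip().lower()
        for d in h.get("cache-control", "").split(",")
        if d.strip()
    ]
    has_max_age = any(d.startswith("max-age=") for d in directives)
    return {
        # An explicit freshness lifetime comes from Expires or max-age.
        "explicit_freshness": "expires" in h or has_max_age,
        # Heuristic expiration is normally derived from Last-Modified.
        "heuristic_basis": "last-modified" in h,
        # Conditional requests need a validator: Last-Modified or ETag.
        "validator": "last-modified" in h or "etag" in h,
        # "private" (or "no-store") rules out shared caches such as proxies.
        "shared_cache_ok": "private" not in directives
        and "no-store" not in directives,
    }


# The headers from the curl output above.
foaf_headers = {
    "Date": "Thu, 30 Aug 2007 22:27:28 GMT",
    "Server": "Apache",
    "Cache-Control": "private, proxy-revalidate",
    "Vary": "Accept-Encoding",
    "Content-length": "45098",
    "Content-Type": "application/rdf+xml; charset=utf-8",
}

for hint, present in cache_hints(foaf_headers).items():
    print(hint, present)
```

For the FOAF response every entry comes back False: no freshness lifetime, nothing to base a heuristic on, no validator, and no shared caching.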

Clearly a custom application can cache the data however it wants. But it would be a lot more convenient if we could take advantage of the HTTP-level caching support in web client frameworks. I've just spent much of the day struggling with such a framework, trying in vain to convince it to cache FOAF resources so I wouldn't have to reinvent the wheel. :-/
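
For comparison, the "custom application" route looks something like the sketch below: an in-memory cache that ignores HTTP semantics entirely and just refuses to re-fetch a URL within a fixed time window. Everything here is an assumption for illustration: the class name `TTLCache`, the one-hour default, and the `fetch` callable, which stands in for whatever HTTP client the bot already uses.

```python
import time


class TTLCache:
    """Cache fetched bodies per URL for a fixed number of seconds (hypothetical helper)."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch        # callable: url -> response body
        self._ttl = ttl_seconds    # arbitrary lifetime, since the server gives none
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # url -> (time stored, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh by our own rule: no request made
        body = self._fetch(url)    # missing or stale: hit the server again
        self._entries[url] = (now, body)
        return body
```

It works, but the lifetime is a guess and every expiry means a full re-download, which is exactly the wheel that a Last-Modified or ETag header would let the framework handle.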