Active Document Cacher

Many uPortal deployers run into the same question: how can remote web content (RSS, WebProxy, iCal, ...) be cached locally to improve performance and reduce dependency on remote services? Yale, UW-Madison, and many others have implemented their own local solutions to this problem, with limited success and portability. This page collects requirements and design information for a general-purpose application that solves this problem.

Requirements

  1. Clients retrieve cached documents via a URL that points into the document cacher service.
  2. Actively retrieve a configured URL at a specified interval
    • Ability to vary the interval based on absolute timing or on timing relative to the last successful or failed retrieval
  3. Configure complex retrieval intervals
    • One idea here would be to allow cron expressions
  4. Specify the action to take when a retrieval fails
    • Continue serving old data
    • Serve a per-URL error message
  5. Set an optional max age for cached data
  6. Share the cached data between multiple server nodes (Optional?)
  7. Allow 'easy' configuration via a single long URL that carries all of the configuration parameters
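
Requirements 4 and 5 interact, so a small sketch may help frame the discussion. The Java below is only an illustration: `FailureAction`, `CachedDocument`, and `CachePolicy` are hypothetical names, and the rule that data older than the max age is replaced by the per-URL error message is an assumption, not something this page has decided.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical failure actions mirroring requirement 4. */
enum FailureAction { SERVE_STALE, SERVE_ERROR_MESSAGE }

/** Hypothetical snapshot of what the cacher holds for one URL. */
final class CachedDocument {
    final String content;            // body from the last successful retrieval, or null
    final Instant retrievedAt;       // time of that successful retrieval, or null if none
    final boolean lastAttemptFailed; // true if the most recent retrieval attempt failed

    CachedDocument(String content, Instant retrievedAt, boolean lastAttemptFailed) {
        this.content = content;
        this.retrievedAt = retrievedAt;
        this.lastAttemptFailed = lastAttemptFailed;
    }
}

/** Hypothetical per-URL policy combining requirements 4 and 5. */
final class CachePolicy {
    private final FailureAction failureAction;
    private final Duration maxAge;     // null means "no max age" (requirement 5 is optional)
    private final String errorMessage; // per-URL message served instead of data

    CachePolicy(FailureAction failureAction, Duration maxAge, String errorMessage) {
        this.failureAction = failureAction;
        this.maxAge = maxAge;
        this.errorMessage = errorMessage;
    }

    /** Returns the text to serve for this document at the given time. */
    String resolve(CachedDocument doc, Instant now) {
        if (doc.content == null) {
            return errorMessage; // nothing has ever been retrieved successfully
        }
        // Assumption: data older than maxAge is never served, whatever the failure action.
        boolean expired = maxAge != null && doc.retrievedAt.plus(maxAge).isBefore(now);
        if (expired) {
            return errorMessage;
        }
        if (doc.lastAttemptFailed && failureAction == FailureAction.SERVE_ERROR_MESSAGE) {
            return errorMessage;
        }
        return doc.content; // fresh data, or old data served under SERVE_STALE
    }
}
```

Under these assumptions, a policy of `SERVE_STALE` with a one-hour max age keeps serving the old feed for up to an hour after the last successful retrieval and then falls back to the error message.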

Design Ideas

  1. Defined DocumentRetrievalService (DRS) interface
  2. DRS lookup by document URI
  3. Cache service interface
    • Allows for per-document cache settings
  4. How to store the service configuration?
    • XStream: local XML file
    • Embedded database (similar to bookmarks?)
  5. Quartz for scheduling
    • Need a database-backed job store?
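
One possible reading of design ideas 1 to 3 is sketched below in Java. Every name and signature here is hypothetical (`supports`, `retrieve`, `DocumentRetrievalRegistry`, `CacheSettings`, `DocumentCache`); `FixedContentRetrievalService` is only a stand-in for a real HTTP retriever, and the in-memory maps stand in for whatever storage is eventually chosen. Configuration storage and Quartz scheduling are left out.

```java
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical DRS contract: fetches the document behind a URI. */
interface DocumentRetrievalService {
    boolean supports(URI documentUri);
    String retrieve(URI documentUri) throws IOException;
}

/** Illustrative stand-in for a real retriever: serves fixed content for one URI scheme. */
final class FixedContentRetrievalService implements DocumentRetrievalService {
    private final String scheme;
    private final String content;

    FixedContentRetrievalService(String scheme, String content) {
        this.scheme = scheme;
        this.content = content;
    }

    public boolean supports(URI documentUri) {
        return scheme.equalsIgnoreCase(documentUri.getScheme());
    }

    public String retrieve(URI documentUri) throws IOException {
        if (!supports(documentUri)) {
            throw new IOException("Unsupported URI: " + documentUri);
        }
        return content;
    }
}

/** Looks up the DRS responsible for a document URI (design idea 2). */
final class DocumentRetrievalRegistry {
    private final List<DocumentRetrievalService> services = new ArrayList<>();

    void register(DocumentRetrievalService service) {
        services.add(service);
    }

    DocumentRetrievalService lookup(URI documentUri) {
        for (DocumentRetrievalService service : services) {
            if (service.supports(documentUri)) {
                return service; // first registered match wins
            }
        }
        throw new IllegalArgumentException("No retrieval service for " + documentUri);
    }
}

/** Hypothetical per-document cache settings (design idea 3). */
final class CacheSettings {
    final long refreshIntervalSeconds;
    final long maxAgeSeconds; // 0 means "no max age"

    CacheSettings(long refreshIntervalSeconds, long maxAgeSeconds) {
        this.refreshIntervalSeconds = refreshIntervalSeconds;
        this.maxAgeSeconds = maxAgeSeconds;
    }
}

/** Hypothetical cache service contract with per-document settings. */
interface DocumentCache {
    void put(URI documentUri, String content, CacheSettings settings);
    String get(URI documentUri);                // null when nothing is cached
    CacheSettings settingsFor(URI documentUri); // null when nothing is cached
}

/** Minimal single-node implementation; sharing between nodes is not addressed. */
final class InMemoryDocumentCache implements DocumentCache {
    private final Map<URI, String> contents = new ConcurrentHashMap<>();
    private final Map<URI, CacheSettings> settings = new ConcurrentHashMap<>();

    public void put(URI documentUri, String content, CacheSettings cacheSettings) {
        contents.put(documentUri, content);
        settings.put(documentUri, cacheSettings);
    }

    public String get(URI documentUri) {
        return contents.get(documentUri);
    }

    public CacheSettings settingsFor(URI documentUri) {
        return settings.get(documentUri);
    }
}
```

A scheduled job would then call `registry.lookup(uri).retrieve(uri)` and store the result with `cache.put(...)`, while portal requests only ever read from the cache.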