Service Availability Strategies

Every CAS deployer must eventually answer the question, "What should happen if CAS goes down?"  The question matters because CAS is the gatekeeper that decides whether your constituents can use your web applications.  Most deployers start with the default settings, but if you are concerned with user experience and service uptime, you will need to decide how far you want to go to protect them.  This page discusses the most common answers to this question, their benefits, and the costs associated with each.

Ticket Registry Strategies

CAS stores information about users in a ticket registry that keeps track of when users logged in, their usernames, the services they have logged into, how long their sessions remain valid, and so on.  The ticket registry is one of the pieces that affects CAS service availability, as we will see shortly.
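
As a rough illustration of the kind of record a ticket registry holds, the sketch below models one user's SSO session.  The class and field names (SsoSession, max_age_seconds, and so on) are hypothetical, chosen for this example only, and do not reflect CAS's actual ticket classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SsoSession:
    """Hypothetical model of the data a ticket registry tracks per user."""
    username: str
    login_time: float            # seconds since the epoch
    max_age_seconds: float       # how long the session is good for
    services: List[str] = field(default_factory=list)  # services logged into

    def is_expired(self, now: float) -> bool:
        # The session is no longer valid once its lifetime has elapsed.
        return now - self.login_time >= self.max_age_seconds


session = SsoSession(username="jdoe", login_time=1000.0, max_age_seconds=7200.0)
session.services.append("https://app.example.org/")
```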

Option 1: Standalone ticket registry

By default, CAS is configured to use an in-memory ticket registry that is constructed every time the server is started.  This allows first-time deployers to get an instance of the CAS server up and running quickly and with little effort, at the cost of users' SSO information being erased if the server goes down.  Some say this is an acceptable loss for two reasons:

  1. Users are still logged into applications with existing sessions
  2. Going to a new application merely requires the user to log in again

If we were talking solely about whether users are required to log in again in order to use a new application, then the default ticket registry might be enough.  However, CAS server versions 3.1.1 and higher use a feature called Single Sign Out (SSOut) that relies upon users' SSO information to expire users' application sessions whenever they log out of CAS.  If that information has been lost, CAS no longer knows which application sessions to expire.  This has important implications whenever users log out and leave their browsers open, believing that no one else can use their protected applications.  This issue is mitigated through the use of distributed ticket registries and eliminated through the use of persistent ticket registries.
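
To make the Single Sign Out concern concrete, here is a minimal sketch of an in-memory registry, assuming logout works by notifying every service recorded for the user.  All names (InMemoryRegistry, record_login, logout, the example URLs) are hypothetical and only model the behavior described above; they are not CAS's API.

```python
from typing import Dict, List


class InMemoryRegistry:
    """Hypothetical registry: state lives only in this process's memory."""

    def __init__(self) -> None:
        # Built fresh on every start, so a restart begins with no sessions.
        self._services_by_user: Dict[str, List[str]] = {}

    def record_login(self, username: str, service: str) -> None:
        self._services_by_user.setdefault(username, []).append(service)

    def logout(self, username: str) -> List[str]:
        # Return the services whose application sessions should be expired.
        return self._services_by_user.pop(username, [])


registry = InMemoryRegistry()
registry.record_login("jdoe", "https://mail.example.org/")
registry.record_login("jdoe", "https://wiki.example.org/")

registry = InMemoryRegistry()          # simulate a CAS server restart
notified = registry.logout("jdoe")     # nothing is left to notify
```

After the simulated restart, notified is empty, so both application sessions stay alive even though the user has logged out of CAS.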

Option 2: Distributed ticket registry

Distributed ticket registries reduce the risk of losing users' SSO information by storing it on multiple CAS servers.  A CAS cluster can recover as long as a single machine is up, although this does not completely eliminate the possibility of loss.  This approach provides additional benefits over the default ticket registry:

  1. Harder to interrupt users' experiences
  2. Harder to completely lose users' SSO information

Though there are some benefits over the default ticket registry, there are also some drawbacks to this approach:

  1. Users' SSO information is volatile; lost if all servers crash
  2. CAS servers must be aware of one another
  3. Server-to-server replication becomes very chatty as the number of CAS servers goes up
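
The trade-offs listed above can be sketched with a toy cluster in which every write is copied to every other live node.  Full replication to all peers is an assumption made for illustration; real replication behavior depends on the cache product and its configuration, and the Node and Cluster names are hypothetical.

```python
from typing import Dict, List, Optional


class Node:
    def __init__(self) -> None:
        self.up = True
        self.tickets: Dict[str, str] = {}   # ticket id -> username

    def crash(self) -> None:
        # Memory is volatile: a crashed node loses its copy.
        self.up = False
        self.tickets = {}


class Cluster:
    def __init__(self, size: int) -> None:
        self.nodes: List[Node] = [Node() for _ in range(size)]
        self.replication_messages = 0

    def add_ticket(self, ticket_id: str, username: str) -> None:
        live = [n for n in self.nodes if n.up]
        if not live:
            raise RuntimeError("no CAS node is available")
        live[0].tickets[ticket_id] = username      # node handling the login
        for peer in live[1:]:                      # one message per peer
            peer.tickets[ticket_id] = username
            self.replication_messages += 1

    def find_ticket(self, ticket_id: str) -> Optional[str]:
        for node in self.nodes:
            if node.up and ticket_id in node.tickets:
                return node.tickets[ticket_id]
        return None


cluster = Cluster(size=3)
cluster.add_ticket("TGT-1", "jdoe")
cluster.nodes[0].crash()
cluster.nodes[1].crash()
survives = cluster.find_ticket("TGT-1")    # still held by the third node
cluster.nodes[2].crash()
lost = cluster.find_ticket("TGT-1")        # every copy is gone
```

Under this assumption each write costs n - 1 messages in an n-node cluster, which is why replication traffic grows as servers are added.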

The following distributed ticket registries are available depending on the version of CAS you are deploying:

Ticket Registry    CAS Version    Description
JBoss Cache        3.0.6          Uses JBoss Cache to store and replicate information between servers over TCP / UDP
Memcached          3.3.0          Uses Memcached to store information and a third-party patch to replicate information
Ehcache            3.5.0          Uses Ehcache to store and replicate information between servers over TCP / UDP

Option 3: Persistent ticket registry

Persistent ticket registries eliminate the risk of losing users' SSO information by storing it in an external datastore that can survive crashes.  This is the high road in terms of ticket registries, as it buys very attractive benefits over both the default and distributed ticket registries:

  1. Much harder to interrupt users' experiences
  2. Much harder to completely lose users' SSO information
  3. Server-to-server replication is reduced / eliminated
  4. CAS servers in a cluster do not need to know about one another
  5. Easier to scale a CAS cluster

Though there are a considerable number of benefits from this approach, there are also some drawbacks:

  1. Persistent ticket registries can become bottlenecks
  2. Need to ensure datastore availability through replication
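
The persistent approach can be sketched as follows.  SQLite from Python's standard library stands in for the external datastore; this is an assumption for illustration, and the table layout, class, and method names are hypothetical rather than taken from CAS.  Two registry objects play the role of two CAS servers that share the datastore without knowing about each other.

```python
import os
import sqlite3
import tempfile
from typing import Optional


class PersistentRegistry:
    """Hypothetical registry backed by a datastore outside the CAS process."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS tickets (id TEXT PRIMARY KEY, username TEXT)"
        )
        self._conn.commit()

    def add_ticket(self, ticket_id: str, username: str) -> None:
        self._conn.execute(
            "INSERT INTO tickets (id, username) VALUES (?, ?)", (ticket_id, username)
        )
        self._conn.commit()

    def find_ticket(self, ticket_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT username FROM tickets WHERE id = ?", (ticket_id,)
        ).fetchone()
        return row[0] if row else None

    def close(self) -> None:
        self._conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "tickets.db")

server_a = PersistentRegistry(db_path)
server_a.add_ticket("TGT-1", "jdoe")
server_a.close()                               # simulate server A going down

server_b = PersistentRegistry(db_path)         # a different CAS server
found = server_b.find_ticket("TGT-1")          # ticket survived the crash
server_b.close()
```

Every lookup and write goes through the one datastore, which illustrates both drawbacks above: it can become the bottleneck, and its own availability has to be ensured.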

The following persistent ticket registries are available depending on the version of CAS you are deploying:

Ticket Registry    CAS Version    Description
JPA                3.2.1          Uses JPA framework (Hibernate, OpenJPA, TopLink, EclipseLink) to store information in datastore

Summary

Every CAS deployer should spend some time reviewing the options available for ticket registries against their organizational needs and the resources available to support the costs.

Ticket Registry: Default
Type: Standalone
# of Machines: 1
CAS Version: N / A
Pros:
  • Quick and easy
Cons:
  • Data lost when server goes down

Ticket Registry: JBoss Cache
Type: Distributed
# of Machines: 2+
CAS Version: 3.0.6
Pros:
  • Replication options (TCP/UDP, unicast/multicast)
  • Data is somewhat resilient; improved availability
Cons:
  • Data lost when all servers go down
  • Network chatty
  • CAS servers aware of one another

Ticket Registry: Memcached
Type: Distributed
# of Machines: 2+
CAS Version: 3.3.0
Pros:
  • Light-weight alternative to JBoss Cache
  • Separate process from CAS
  • Data is somewhat resilient; improved availability
Cons:
  • Data lost when all servers go down
  • Network chatty
  • Requires third-party patch
  • Limited replication options (?)
  • Comprehensive data retrieval impossible
  • CAS servers aware of one another

Ticket Registry: Ehcache
Type: Distributed
# of Machines: 2+
CAS Version: 3.5.0
Pros:
  • Data is somewhat resilient; improved availability
Cons:
  • CAS servers aware of one another
  • Network chatty

Ticket Registry: JPA
Type: Persistent
# of Machines: 1+
CAS Version: 3.2.1
Pros:
  • Separate process from CAS
  • Data is very resilient; highest availability
  • CAS servers are not aware of one another
Cons:
  • Datastore availability must be ensured
  • Datastore can be bottleneck