This occurs roughly once in every 500-600 logins (from what we've seen) and is probably the last bug that needs to be addressed in order to achieve 100% service availability. Please see the following ClearPass transaction (including the proxy ticket request, etc.).
Service Ticket: ST-1234
<cas:serviceResponse xmlns:cas='http://www.yale.edu/tp/cas'> <cas:authenticationSuccess> <cas:user>jdoe</cas:user> </cas:authenticationSuccess> </cas:serviceResponse>
Proxy Ticket Response
<cas:serviceResponse xmlns:cas='http://www.yale.edu/tp/cas'> <cas:proxyFailure code='INVALID_REQUEST'> 'pgt' and 'targetService' parameters are both required </cas:proxyFailure> </cas:serviceResponse>
https://cas.example.com/clearPass?ticket= (proxy ticket)
<cas:clearPassResponse xmlns:cas='http://www.yale.edu/tp/cas'> <cas:clearPassFailure>No authentication information provided.</cas:clearPassFailure> </cas:clearPassResponse>
This doesn't happen often, but it has happened to 7 people in the past 5 hours. One of those cases I've traced to a SocketTimeoutException, which led to additional exceptions and ultimately to error.authentication.credentials.bad. The others, however, show no sign of such exceptions.
Granted, the failure rate is under 1%, but it still fails for some users under certain circumstances. What we observe is that the pgtInit response doesn't return a PGT-IOU, even though we still get a cas:authenticationSuccess XML response.
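For anyone trying to reproduce this, here is a minimal sketch of how the symptom could be detected on the client side. It assumes the standard CAS 2.0 response layout, where a PGT-IOU arrives in a cas:proxyGrantingTicket element inside cas:authenticationSuccess. The function name and the PGTIOU value in the usage note are made up for illustration and are not taken from our logs.

```python
import xml.etree.ElementTree as ET

# Namespace used by the CAS responses quoted in this post.
CAS_NS = {"cas": "http://www.yale.edu/tp/cas"}


def extract_pgt_iou(validation_xml):
    """Return (user, pgt_iou) from a CAS 2.0 validation response.

    pgt_iou is None when authentication succeeded but the response
    carried no cas:proxyGrantingTicket element (the failure case here).
    """
    root = ET.fromstring(validation_xml)
    success = root.find("cas:authenticationSuccess", CAS_NS)
    if success is None:
        raise ValueError("response is not a cas:authenticationSuccess")
    user = success.findtext("cas:user", default="", namespaces=CAS_NS).strip()
    pgt_iou = success.findtext(
        "cas:proxyGrantingTicket", default=None, namespaces=CAS_NS
    )
    return user, (pgt_iou.strip() if pgt_iou else None)


# The validation response from the failing transaction above.
FAILING_RESPONSE = (
    "<cas:serviceResponse xmlns:cas='http://www.yale.edu/tp/cas'>"
    " <cas:authenticationSuccess> <cas:user>jdoe</cas:user>"
    " </cas:authenticationSuccess> </cas:serviceResponse>"
)
```

Calling extract_pgt_iou(FAILING_RESPONSE) gives the user jdoe with no PGT-IOU, which is the condition worth logging together with the service ticket so the failures can be correlated with server-side exceptions.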
Could this be caused by replication latency, given that we are currently storing tickets in a replicated memcached environment?
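One way to test the replication-latency theory would be to retry the ticket lookup after a short delay and record how many attempts it takes. The sketch below is generic Python, not CAS or Couchbase code: `lookup` stands in for whatever reads the ticket from the registry, and the function name, attempt count and delay are all assumptions on my part.

```python
import time


def lookup_with_retry(lookup, key, attempts=3, delay_seconds=0.2, sleep=time.sleep):
    """Call lookup(key) up to `attempts` times, pausing between misses.

    `lookup` is a hypothetical callable that returns the ticket, or None
    if it is not (yet) visible on this node.
    Returns (value, tries_used); value is None if every attempt missed.
    """
    for attempt in range(1, attempts + 1):
        value = lookup(key)
        if value is not None:
            return value, attempt
        if attempt < attempts:
            # Give replication a moment to catch up before trying again.
            sleep(delay_seconds)
    return None, attempts
```

If lookups that miss on the first try succeed on a later one, latency becomes a more likely cause. If they keep missing, the ticket was probably never stored in the first place.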
CAS 3.5.2 with ClearPass / Couchbase with replication