Attributes

There are several different views of Attributes that differ based on the level of complexity in your environment.

If you have never looked at a standard, you may think that an Attribute is a set of name+value strings. This is the old RFC 822 approach:

mail: Howard.Gilbert@yale.edu
affiliation: Staff
department: ITS
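
As a minimal sketch of this view, the Python below turns the three lines above into name/value pairs. The helper name `parse_pairs` is hypothetical and not part of CAS. Real RFC 822 parsing also handles folded lines and repeated names, which this ignores.

```python
def parse_pairs(text):
    """Parse 'name: value' lines into a dict of strings (hypothetical helper)."""
    pairs = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        pairs[name.strip()] = value.strip()
    return pairs


record = parse_pairs(
    "mail: Howard.Gilbert@yale.edu\n"
    "affiliation: Staff\n"
    "department: ITS\n"
)
```

Each name maps to exactly one string here, which is the limitation the later sections address.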

This approach worked well when every enterprise created its own directory with its own format. Then someone decided to create an LDAP standard so that, in theory, someone at Yale could get information about someone at Rutgers by using a formally defined binary network protocol. The problem was that everyone had to agree to use the same name for the same thing. Programs wouldn't work if they were expecting to get an attribute called "E-mail" and instead got "mailaddr".

LDAP defined one set of properties for "people" in general. Educause expanded this with a few additional properties for people who are at universities. This EduPerson standard contains most of the information that you plausibly might want to disclose.

An EduPerson/LDAP attribute has a more complicated name and a long unique numeric code.

"eduPersonAffiliation" has the unique registered OID of "1.3.6.1.4.1.5923.1.1.1.1". There is no guarantee (however unlikely it would be in reality) that no other directory system anywhere in the world will use the name "eduPersonAffiliation" to mean some entirely different thing, but the number is guaranteed to only mean the "eduPersonAffiliation" in the Educause standard. The standard suggests a list of typical values for this attribute: "faculty, student, staff, alum, member, affiliate, employee" but the list may be extended.

Then SAML came along. SAML is built on XML, which is a character standard instead of a binary standard. More importantly, it has a different approach to guaranteeing uniqueness. Simple names like "eduPersonAffiliation" are qualified not by a registered number, but instead by a unique namespace URI. Thus when this particular LDAP attribute is rendered in Shibboleth/SAML terms, the resulting XML looks like:

<Attribute xmlns:typens="urn:mace:shibboleth:1.0"
AttributeName="urn:mace:dir:attribute-def:eduPersonAffiliation" AttributeNamespace="urn:mace:shibboleth:1.0:attributeNamespace:uri">
<AttributeValue xsi:type="typens:AttributeValueType">member</AttributeValue>
</Attribute>

For those who don't speak XML, this says that there is an attribute named "urn:mace:dir:attribute-def:eduPersonAffiliation" in namespace "urn:mace:shibboleth:1.0:attributeNamespace:uri" that has the value "member" in the schema for namespace "urn:mace:shibboleth:1.0".
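
For those who would rather read code than XML, the same reading can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration only, not how CAS or Shibboleth processes the message. The `xmlns:xsi` declaration is an addition made here so that the fragment parses on its own, and the `short_name` line shows just one possible way to derive a nickname.

```python
import xml.etree.ElementTree as ET

# The fragment from the text, plus an xmlns:xsi declaration (added here,
# an assumption) so that the xsi: prefix is bound and the fragment parses
# without its enclosing document.
FRAGMENT = """
<Attribute xmlns:typens="urn:mace:shibboleth:1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    AttributeName="urn:mace:dir:attribute-def:eduPersonAffiliation"
    AttributeNamespace="urn:mace:shibboleth:1.0:attributeNamespace:uri">
  <AttributeValue xsi:type="typens:AttributeValueType">member</AttributeValue>
</Attribute>
""".strip()

root = ET.fromstring(FRAGMENT)

name = root.get("AttributeName")            # the fully qualified name
namespace = root.get("AttributeNamespace")  # the namespace that qualifies it
values = [v.text for v in root.findall("AttributeValue")]

# One possible simplification: keep only the last component of the URN.
short_name = name.rsplit(":", 1)[-1]
```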

Now I would agree that the casual use of attribute nicknames leaves open the opportunity for some confusion. However, this particular piece of XML may be horribly complicated and over-qualified for any useful purpose. Shibboleth understands this and significantly reduces the complexity of the attributes delivered to applications. However, if SAML is going to be one of the formats supported by CAS for Service Ticket validation, then we have to be clear about the filtering that may be applied between the <Attribute> SAML structure delivered to Yale by Rutgers and the corresponding simplified <Attribute> SAML structure delivered by Yale CAS to some library application.

In making this decision, we have to remember that many attributes may not come from SAML or even from any logic that knows about LDAP OID fields. In many cases, they come from the columns of a database table or some other set of simple name-value pairs.
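
A rough sketch of the database case, using Python's built-in `sqlite3` with an in-memory database. The table name, the column names, and the `example-user` key are all hypothetical. The point is only that the column names become the attribute names, with no OID or namespace anywhere in sight.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name

# Hypothetical table; a real institution would have its own schema.
conn.execute("CREATE TABLE people (netid TEXT, mail TEXT, department TEXT)")
conn.execute(
    "INSERT INTO people VALUES (?, ?, ?)",
    ("example-user", "Howard.Gilbert@yale.edu", "ITS"),
)

row = conn.execute(
    "SELECT mail, department FROM people WHERE netid = ?",
    ("example-user",),
).fetchone()

# Each selected column becomes one simple name/value pair.
attributes = {column: row[column] for column in row.keys()}
conn.close()
```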

An Attribute object therefore has several fields:

A name which uniquely identifies this attribute to all programs. [We decided not to expose a separate namespace, so it is up to the CAS plugins to make sure that no two attributes have the same name and to try, as much as possible, to ensure that the same attribute from different sources has the same name.]

A set of values. It is a set because one person may have several phone numbers, several E-mail addresses, and several eduPersonAffiliations. The same person may, for example, be both a student and an employee.

An institutional source or scope. The same attribute name may be in use at more than one school. When a person can authenticate simultaneously at both institutions, then we need to know that name=eduPersonAffiliation,scope=yale.edu has a value of "faculty", while name=eduPersonAffiliation,scope=rutgers.edu has a value of "alumnus". If scope is null, this implies that the attribute comes from the local institution.
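
The three fields above can be sketched as a small Python data class. This is an illustration under stated assumptions, not the actual CAS class: the class name, the field names, and the `by_key` index are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Attribute:
    """Sketch of the three fields described above (not the real CAS class)."""
    name: str                    # unique name, with no separate namespace
    values: FrozenSet[str]       # a set: one person may have several values
    scope: Optional[str] = None  # None means the local institution


yale = Attribute("eduPersonAffiliation", frozenset({"faculty"}), "yale.edu")
rutgers = Attribute("eduPersonAffiliation", frozenset({"alumnus"}), "rutgers.edu")
local = Attribute("eduPersonAffiliation", frozenset({"student", "employee"}))

# Index by (name, scope) so the same name from two institutions does not collide.
by_key = {(a.name, a.scope): a for a in (yale, rutgers, local)}
```

Keying on the pair rather than on the name alone is what lets one person carry "faculty" from Yale and "alumnus" from Rutgers at the same time.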

Having said this, CAS provides the framework for generating Attribute objects (the authentication process and its various plugins). It also provides a framework for delivering attribute information to services in various formats during Service Ticket validation. It is up to the institution using CAS to make sure that all the plugins and services are consistent.

In particular, CAS doesn't know or care if an attribute name is "eduPersonAffiliation" or the longer "urn:mace:dir:attribute-def:eduPersonAffiliation". Either will work if the authentication plugins create it and the services consume it. There will be problems if the plugins want to generate one form of name and the services expect to see the other. This is not a problem for the CAS infrastructure. It is an institutional configuration problem.

Currently, there is no "right answer". You will get different suggestions if you talk to different groups of people. At some point, CAS may have to add a name-mapping layer corresponding to what the W3C calls "ontology" to solve the "different names in different standards for the same thing" problem. For now, any mapping is your problem.
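
If you do end up writing that mapping yourself, it might look something like the Python sketch below. Nothing here is a CAS API: `NAME_MAP` and `normalize` are hypothetical, and the choice of the short name as the target is just one of the options discussed above.

```python
# Hypothetical mapping table: every name a plugin might produce, mapped to
# the one name the institution has chosen to deliver to services.
NAME_MAP = {
    "urn:mace:dir:attribute-def:eduPersonAffiliation": "eduPersonAffiliation",
    "affiliation": "eduPersonAffiliation",
}


def normalize(attributes):
    """Rename attributes via NAME_MAP, merging value sets that collapse together."""
    result = {}
    for name, values in attributes.items():
        target = NAME_MAP.get(name, name)  # unmapped names pass through
        result.setdefault(target, set()).update(values)
    return result


merged = normalize({
    "urn:mace:dir:attribute-def:eduPersonAffiliation": {"member"},
    "affiliation": {"staff"},
    "mail": {"Howard.Gilbert@yale.edu"},
})
```

Here two differently named sources end up as one "eduPersonAffiliation" attribute holding both values, while "mail" is left alone.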