Data Flow

Data passes through OpenRegistry via a series of well-defined stages, each of which can be customized to meet the needs of your institution. This flow is represented graphically here: (or-architecture-dataflow.png)

The stages are defined below. Generally speaking, each stage reads only from the nearest tables (r, s, c) to its left on the graph.

Load

Data is loaded from batch-oriented sources into the "raw" (r) tables. Data is loaded as is, and represents exactly what was provided by the System of Record (SOR).

Data provided by real-time sources does not need to be staged and is passed directly to Validate.
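
The two entry paths can be sketched roughly as follows. This is a minimal illustration only: raw_table, load_batch, and receive_realtime are hypothetical names, and an in-memory list stands in for the "raw" (r) tables.

```python
from typing import Callable

# Hypothetical stand-in for the "raw" (r) tables.
raw_table: list = []


def load_batch(records: list) -> None:
    """Stage batch records exactly as the SOR provided them (no changes)."""
    raw_table.extend(dict(r) for r in records)


def receive_realtime(record: dict, validate: Callable[[dict], bool]) -> bool:
    """Real-time records skip staging and are handed straight to validation."""
    return validate(record)
```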

Customizations

Validate

Validation checks inbound data against the syntax agreed upon with the SOR, and enforces the security model that determines what data the SOR is allowed to assert.
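
As a rough sketch of these two checks, assuming a per-SOR policy object (SorPolicy, its fields, and the attribute names below are all hypothetical, not part of OpenRegistry):

```python
import re
from dataclasses import dataclass, field


@dataclass
class SorPolicy:
    # Hypothetical policy: which attributes the SOR may assert, and an
    # agreed syntax (regular expression) for some of those attributes.
    allowed_attributes: set
    patterns: dict = field(default_factory=dict)


def validate(record: dict, policy: SorPolicy) -> list:
    """Return a list of error strings; an empty list means the record passed."""
    errors = []
    for name, value in record.items():
        # Security model: the SOR may only assert attributes it is allowed to.
        if name not in policy.allowed_attributes:
            errors.append(f"{name}: SOR is not allowed to assert this attribute")
            continue
        # Syntax: the value must match the agreed pattern, if one is defined.
        pattern = policy.patterns.get(name)
        if pattern is not None and not re.fullmatch(pattern, str(value)):
            errors.append(f"{name}: value does not match the agreed syntax")
    return errors


hr_policy = SorPolicy(
    allowed_attributes={"given_name", "family_name", "date_of_birth"},
    patterns={"date_of_birth": r"\d{4}-\d{2}-\d{2}"},
)
```

Here a record asserting an attribute outside allowed_attributes, or a date_of_birth not in YYYY-MM-DD form, would come back with one error per problem.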

Customizations

Normalize

Normalization is the process of transforming the data according to institution-specific rules, so that all data looks the same regardless of how it is represented by the SOR.

Defined Transformations

  • CorrectCase: Transform inbound data from UPPER CASE or other formats to Mixed Case.
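
A transformation along the lines of CorrectCase might look like the sketch below. It is not the OpenRegistry implementation; the function name and the rule used (capitalize each run of letters) are assumptions, and real institution-specific rules would likely need exceptions for names such as "McDonald" or "van der Berg".

```python
import re

# A "word" here is any run of letters; hyphens, apostrophes, and spaces
# act as separators, so each part of a compound name is capitalized.
_WORD = re.compile(r"[^\W\d_]+")


def correct_case(value: str) -> str:
    """Convert UPPER CASE or lower case input to Mixed Case."""
    return _WORD.sub(lambda m: m.group(0).capitalize(), value)
```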

Standardize

Standardization is the process of mapping the data from SOR-specific interface definitions to the standard OpenRegistry data model. After standardization is complete, data is stored in the "standardized" (s) tables.
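
To illustrate the idea, the sketch below maps two differently shaped SOR records onto one common shape. The SOR names, field names, and FIELD_MAPS structure are hypothetical and do not reflect the actual OpenRegistry data model.

```python
# Hypothetical per-SOR maps from SOR-specific field names to standard names.
FIELD_MAPS = {
    "hr": {"FNAME": "given_name", "LNAME": "family_name", "DOB": "date_of_birth"},
    "sis": {"first": "given_name", "last": "family_name", "birthdate": "date_of_birth"},
}


def standardize(sor: str, record: dict) -> dict:
    """Rename the fields of an SOR record to the standard names.

    Fields with no entry in the SOR's map are dropped in this sketch.
    """
    mapping = FIELD_MAPS[sor]
    return {mapping[key]: value for key, value in record.items() if key in mapping}
```

After this step, records from either source can be compared field by field, which is what the later stages rely on.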

Customizations

Reconcile

Reconciliation is the process of matching records from one SOR with those already known to the system. It is key to maintaining a single identity for a person regardless of how many sources report that person.

Customizations

  • Reconciliation is configurable based on the attributes available in the "standardized" (s) tables, and on a per-SOR basis. That is, SORs that do not provide data of sufficient quality for normal reconciliation may be reconciled differently from those that do.
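
Per-SOR reconciliation could be sketched as below, under the assumption that each SOR is configured with the set of attributes it is matched on. MATCH_ATTRIBUTES, the SOR names, and the exact-match rule are all hypothetical; a real deployment might use fuzzier matching.

```python
from typing import Optional

# Hypothetical configuration: which standardized attributes each SOR is
# reconciled on. A lower-quality source is matched on more attributes.
MATCH_ATTRIBUTES = {
    "hr": ("national_id",),
    "guest": ("family_name", "given_name", "date_of_birth"),
}


def reconcile(sor: str, record: dict, known_persons: list) -> Optional[str]:
    """Return the person_id of the single matching person, or None."""
    keys = MATCH_ATTRIBUTES[sor]
    # A record missing any match attribute cannot be reconciled automatically.
    if any(record.get(k) is None for k in keys):
        return None
    matches = [p for p in known_persons if all(p.get(k) == record[k] for k in keys)]
    # Zero matches means a new person; several matches are ambiguous.
    return matches[0]["person_id"] if len(matches) == 1 else None
```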

Identify

Identification is the process of assigning identifiers to newly added persons.

Customizations

  • Identifier assignment is configurable based on locally defined formats.
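
One way to picture a locally defined format is a format string applied to a sequence number, as in this sketch. The make_assigner helper and the "OR" prefix are illustrative assumptions, and a real assigner would need persistent, collision-free state rather than an in-memory counter.

```python
import itertools
from typing import Callable


def make_assigner(fmt: str, start: int = 1) -> Callable[[], str]:
    """Build a function that returns the next identifier in the given format."""
    counter = itertools.count(start)

    def assign() -> str:
        return fmt.format(next(counter))

    return assign


# Hypothetical local format: the prefix "OR" followed by seven digits.
assign_id = make_assigner("OR{:07d}")
```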

Elect

Election is the process of deciding which conflicting data provided by multiple SORs is the "true" data for a person. For example, if two SORs assert a different official name for the same person, only one name can be treated as official by OpenRegistry.

Defined Elections

  • First: The first value asserted by the first SOR is elected.
  • Hierarchy: A hierarchy of SORs determines which value is elected.
  • Provisional: An SOR may be considered "provisional" for another SOR, in which case the provisional SOR's records are invalidated when a record from the other SOR appears.
  • Last: The most recent value asserted by the most recent SOR is elected.
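
Three of these elections can be sketched as simple selection functions. The Assertion type and its asserted_at ordering field are hypothetical, and "first" and "last" are read here as earliest and most recent assertion, which is one interpretation of the definitions above. Provisional is omitted because it invalidates records rather than selecting among them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assertion:
    sor: str          # which SOR asserted the value
    value: str        # the asserted value, e.g. an official name
    asserted_at: int  # hypothetical ordering key (larger means more recent)


def elect_first(assertions: list) -> str:
    """First: the earliest asserted value wins."""
    return min(assertions, key=lambda a: a.asserted_at).value


def elect_last(assertions: list) -> str:
    """Last: the most recently asserted value wins."""
    return max(assertions, key=lambda a: a.asserted_at).value


def elect_hierarchy(assertions: list, hierarchy: list) -> Optional[str]:
    """Hierarchy: the value from the highest-ranked SOR wins.

    `hierarchy` lists SOR names from most to least authoritative.
    Assertions from SORs not in the hierarchy are ignored.
    """
    rank = {sor: i for i, sor in enumerate(hierarchy)}
    ranked = [a for a in assertions if a.sor in rank]
    if not ranked:
        return None
    return min(ranked, key=lambda a: rank[a.sor]).value
```

For example, given an earlier assertion from one SOR and a later one from another, elect_first and elect_last return different values, while elect_hierarchy depends only on the order of the SORs in the configured hierarchy.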

Customizations

  • Elected values may be overridden manually (via self-service or helpdesk).
  • Elected values may be reverted (XXX what does this mean?)

Calculate

XXX Attribute Calculation Engine

Export

Data is exported to Downstream Systems via batch, real-time, and provisioning interfaces.

Customizations