
Data Model ERDs

(revised October 2008, February 2009)

...

...

Source/batch files are loaded into raw (R) tables
The data is normalized and moved into standardized (S) tables, e.g. prs_sor_role_records and associated tables
Where a person has multiple records from a given SOR, the "best" biodem data is elected into prs_sorprc_persons
1. This covers, e.g., correction of typos and name changes
Where a person has multiple SORs, the "best" biodem data is elected into prc_persons
1. Note: The current table definition implies same SOR for best name & biodem
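The two election steps above can be sketched as follows. This is only an illustration: the page does not define what makes a record "best", so the ranking rule used here (prefer records that carry a date of birth, then the most recently updated) is an assumption, as are the BiodemRecord class and all field and function names. Electing a whole record at once keeps the best name and biodem data on the same SOR, matching the note about the current table definition.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class BiodemRecord:
    """Hypothetical shape of one standardized biodem record."""
    person_id: int
    sor: str                      # system of record that asserted the data
    name: str
    date_of_birth: Optional[date]
    updated: date


def elect_best(records):
    """Elect one record from a non-empty list.

    Assumed rule (not from the source): prefer records that have a
    date of birth, then the most recently updated one.
    """
    if not records:
        raise ValueError("no records to elect from")
    return max(records, key=lambda r: (r.date_of_birth is not None, r.updated))


def elect_per_sor(records):
    """Step 1: elect the best record within each SOR."""
    by_sor = {}
    for rec in records:
        by_sor.setdefault(rec.sor, []).append(rec)
    return {sor: elect_best(recs) for sor, recs in by_sor.items()}


def elect_person(records):
    """Step 2: elect the best record across the per-SOR winners."""
    return elect_best(list(elect_per_sor(records).values()))
```

In this sketch a later, corrected record from the same SOR (a fixed typo or a name change) wins the first election simply because it is newer and more complete.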

The database is a "black box": nothing accesses it directly except core Registry code. All manipulation is done via APIs.
Where possible, tables should be consolidated to keep the number of tables down and simplify administering them. As a general rule of thumb, if two tables have the same structure and vary by only one column name, the tables should be consolidated.
As a general rule, only Calculated data is referenced for publishing outside the Registry.

Each SOR can assert only one set of biodem data and one official name.
prs_sor_roles must have only one entry per role, where role is department + title.
When an SOR role assertion disappears, the record remains in prc_role_records, possibly with a stop date added.
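A minimal sketch of the role rules above, assuming an in-memory list stands in for the role table. The RoleRecord class, the reconcile_roles function and the choice to always set a stop date are hypothetical; the page only says a stop date is "possibly" added.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class RoleRecord:
    """Hypothetical role row; a role is identified by department + title."""
    sor: str
    department: str
    title: str
    start_date: date
    stop_date: Optional[date] = None

    @property
    def role_key(self):
        return (self.department, self.title)


def reconcile_roles(stored, asserted_keys, sor, today):
    """Close roles that the SOR no longer asserts.

    Records are never deleted. An open record whose role key is missing
    from the SOR's current assertions gets a stop date (assumed policy).
    """
    for rec in stored:
        if rec.sor == sor and rec.stop_date is None and rec.role_key not in asserted_keys:
            rec.stop_date = today
    return stored
```

Keeping the closed record preserves role history, which is the reason the assertion is not simply deleted.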

Table names are prefixed CCT_ where CC indicates the responsible component and T indicates the type of table as enumerated above.
Table and column names are all lowercase, with underscores (_) to separate words/fragments. StudlyCaps are not used.
Natural English word order is preferred over major/minor ordering: start_date, not date_start.
Column names should avoid incorporating the table name.
The suffix _id indicates a row identifier.
The suffix _t indicates a type identifier, as defined in ctx_data_types.
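The naming rules above lend themselves to a mechanical check. The sketch below is an assumption-laden illustration, not part of the Registry: the function names are hypothetical, the type letter in the prefix is not restricted because the list of table types is not reproduced here, and the "incorporates the table name" test is a simple substring heuristic.

```python
import re

# Three lowercase letters (CC component + T type), an underscore, then
# lowercase words separated by single underscores.
TABLE_RE = re.compile(r"[a-z]{3}_[a-z0-9]+(?:_[a-z0-9]+)*")
COLUMN_RE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def check_table_name(name):
    """Return True if the name looks like cct_some_words."""
    return TABLE_RE.fullmatch(name) is not None


def check_column_name(table, column):
    """Return a list of convention problems found in a column name."""
    problems = []
    if COLUMN_RE.fullmatch(column) is None:
        problems.append("not lowercase_with_underscores")
    # Heuristic: strip the cct_ prefix and look for the rest in the column.
    stem = table.split("_", 1)[1] if "_" in table else table
    if stem and stem in column:
        problems.append("incorporates table name")
    return problems
```

For example, check_table_name("prc_persons") passes, while a column such as persons_name on that table is flagged for repeating the table name.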

...