Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 8 Next »

Data Model ERDs

(revised October 2008)

Currently Being Revised

Initial Design

Source Files

Data Model Design

Process Flow (by example)

  1. Source/batch files are loaded into raw (R) tables
  2. The data is normalized and moved into standardized (S) tables, eg prs_sor_role_records and associated tables
  3. Where a person has multiple records from a given SOR, the "best" biodem data is elected into prs_sor_persons
    1. This covers (eg) correction of typos and name changes
  4. Where a person has multiple SORs, the "best" biodem data is elected into prc_persons
    1. Note: The current table definition implies same SOR for best name & biodem

Guidelines

  1. The database is a "black box", so nothing sees it except for core Registry code. All manipulation is done via APIs.
  2. Where possible, tables should be consolidated to keep the number of tables down and simplify administering them. As a general rule of thumb, if two tables have the same structure and vary by only one column name, the tables should be consolidated.

Naming Conventions

  1. Table names are prefixed CCT_ where CC indicates the responsible component and T indicates the type of table as enumerated above.
  2. Table and column names are all lowercase, with underscores (_) to separate words/fragments. StudlyCaps are not used.
  3. Natural english is preferred over major/minor. So start_date, not date_start.
  4. Column names should avoid incorporating the table name.
  5. The suffix _id indicates a row identifier.
  6. The suffix _t indicates a type identifier, as defined in ctx_data_types.

Terminology

Table Data Descriptions

  • (Pre)defined: Definitions provided out of the box, may be added to by local deployment
  • Instantiated: Definitions added by local deployment
  • Standardized: Normalized data transformed from SOR specific format to common format
  • Raw: Untouched data from SOR
  • Calculated: Data calculated by transformation on raw, standardized, or calculated data

Table Types

  • Calculated (C): Tables holding calculated data. May also include Standardized data.
  • Dictionary (D): Support tables that hold (pre)defined definitions external to the Registry. Example: A list of countries.
  • Original (O): Tables that hold data original to/originated by the Registry.
  • Raw (R): Tables that directly reflect the source (or "raw") data, with only minimal formatting changes.
  • View (V): A database view, constructed of underlying tables.
  • Standardized (S): Tables holding standardized data.
  • Supporting (X): Tables that relate to the management of the system, and that are generally only instantiated or modified by the system or an administrator.
  • No labels