Batch Interface

Target release: 0.9.4
Theme: Batch Interface
Document status: DRAFT
Document owner
Designer
Developers
QA

Goals

  • Implement a flexible system for processing batch records

Background and strategic fit

 Loading records into the registry via a batch interface is a very common pattern, and should be supported out of the box.

Assumptions

Requirements

# | Title | Description | Priority | Notes
1 | SQL Load | Load System of Record (SOR) data from arbitrarily defined views or tables in common RDBMSs (Oracle, MySQL, Postgres, etc.) into PRR tables. | Required |
2 | Validate Fields | Attach field-specific validation rules. | Required | Ex 1: Birth month must be 01-12. Ex 2: Valid options for ROLE_CD are 1, 2, 3, 4, Y, N.
3 | Standardize Fields | Map inbound fields to the registry data model, including the ability to group multiple columns from the same (or different) tables together, to split one field into multiple columns, and to assign types where appropriate. | Required | Ex 1: Split "LAST,FIRST MIDDLE" into the appropriate prs_names fields. Ex 2: One table with LNAME1, FNAME1, LNAME2, FNAME2 should create two rows in prs_names.
4 | Process Diffs | Under normal operations, only process records that have changed. | Required |
5 | Full Reload | On demand, perform a full reload of the feed by dropping all records and then reloading them. | Required |
6 | Frequency | The frequency of batch processing should be configurable on a per-feed basis. | Required | E.g. every 10 minutes, every hour, nightly, weekly.
7 | Selection Criteria | Allow selection of rows for processing via SQL expressions. | Required | Ex 1: WHERE STATUS='A'
8 | Dictionaries | Support loading of data dictionaries, not just role records. | Required | Ex 1: Departmental hierarchy
9 | Notification | Generate email or other event-style notifications upon completion of processing of individual feeds, with recipients configurable per feed. | Required |
10 | Configure via API | Configuration of batch processing should be supported over the API so that configuration can be managed via an identity console. | Required |
11 | Status via API | Current feed processing status should be obtainable via the API. | Required |
12 | Graceful Failure | Feed processing should fail gracefully. For example, a failure in one record should not cause the entire feed to fail. | Required |
13 | Thresholds | A threshold value should be configurable on a per-feed basis, with a configurable default value, to prevent large changes without manual intervention. | Required |
14 | Documentation | Documentation for configuring and managing feed processing services should be added to the OR wiki. | Required |
15 | File Load | Load System of Record (SOR) data from arbitrarily defined files in common formats (CSV, fixed width, XML, etc.). | Optional |
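
The examples given for requirements 2 and 3 could be sketched roughly as follows. This is only an illustration, not a proposed design: the function names, the BIRTH_MONTH column name, and the output keys (family, given, middle) are assumptions made for the sketch, and the actual prs_names column names may differ.

```python
import re

# Hypothetical per-field validation rules, keyed by inbound column name.
# BIRTH_MONTH is an assumed column name; ROLE_CD comes from the requirement notes.
VALIDATORS = {
    "BIRTH_MONTH": lambda v: re.fullmatch(r"0[1-9]|1[0-2]", v) is not None,
    "ROLE_CD": lambda v: v in {"1", "2", "3", "4", "Y", "N"},
}


def validate(record):
    """Return the names of fields in `record` that fail their rule."""
    return [
        field
        for field, rule in VALIDATORS.items()
        if field in record and not rule(record[field])
    ]


def split_name(raw):
    """Split "LAST,FIRST MIDDLE" into name parts.

    The keys (family, given, middle) are placeholders, not the actual
    prs_names column names.
    """
    family, _, rest = raw.partition(",")
    given, _, middle = rest.strip().partition(" ")
    return {"family": family.strip(), "given": given, "middle": middle.strip()}


def unpivot_names(row):
    """Turn LNAME1/FNAME1, LNAME2/FNAME2 into one output row per name pair."""
    out = []
    for i in (1, 2):
        family, given = row.get(f"LNAME{i}"), row.get(f"FNAME{i}")
        if family or given:
            out.append({"family": family, "given": given})
    return out
```

Returning the list of failing fields, rather than raising on the first error, lets a caller reject or log a single record without stopping the rest of the feed, which is in line with requirement 12.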

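One possible way to combine requirement 4 (process only changed records) with requirement 13 (thresholds) is to fingerprint each source row and compare against the fingerprints stored from the previous run. The sketch below assumes that approach; the function names, the use of SHA-256 over a JSON rendering of the row, and the 10% default are illustrative assumptions, not decisions recorded in this document.

```python
import hashlib
import json


def row_hash(row):
    """Stable fingerprint of a row; key order does not affect the result."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compute_diff(previous, current):
    """Compare fingerprints from the last run with the rows seen now.

    previous: dict mapping record key -> fingerprint from the last run
    current:  dict mapping record key -> row (a dict) from this run
    """
    added = [k for k in current if k not in previous]
    removed = [k for k in previous if k not in current]
    changed = [
        k for k in current
        if k in previous and row_hash(current[k]) != previous[k]
    ]
    return {"added": added, "changed": changed, "removed": removed}


def exceeds_threshold(diff, previous_count, threshold=0.10):
    """True when the share of affected records is above `threshold`.

    The 10% default is an arbitrary placeholder, not a value from the
    requirements.
    """
    affected = sum(len(keys) for keys in diff.values())
    if previous_count == 0:
        return False  # first load: nothing to compare against
    return affected / previous_count > threshold
```

In this sketch a feed whose diff exceeds the threshold would be held for manual review instead of being applied. Treating a first load as never exceeding the threshold is a simplification that the real design may want to revisit.
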
User interaction and design

Questions

Below is a list of questions to be addressed as a result of this requirements document:

Question | Outcome
Should the need to support file-based batch loading impact the design for requirement #7? |
Requirement #10 may affect the ability to manipulate the configuration via the command line. |

Not Doing