Batch Interface
Goals
- Implement a flexible system for processing batch records
Background and strategic fit
 Loading records into the registry via a batch interface is a very common pattern, and should be supported out of the box.
Assumptions
Requirements
# | Title | Description | Priority | Notes |
---|---|---|---|---|
1 | SQL Load | Load System of Record (SOR) data from arbitrarily defined views or tables in common RDBMSs (Oracle, MySQL, Postgres, etc.) into PRR tables. | Required | |
2 | Validate Fields | Attach field-specific validation rules. | Required | Ex 1: Birth month must be 01-12. Ex 2: Valid options for ROLE_CD are 1,2,3,4,Y,N. |
3 | Standardize Fields | Map inbound fields to the registry data model, including the ability to group multiple columns from the same – or different – tables together, to split one field into multiple columns, and to assign types where appropriate. | Required | Ex 1: Split "LAST,FIRST MIDDLE" into appropriate prs_names fields. Ex 2: One table with LNAME1, FNAME1, LNAME2, FNAME2 should create two rows in prs_names. |
4 | Process Diffs | Under normal operations, only process records that have changed. | Required | |
5 | Full Reload | On demand, perform a full reload of the feed by dropping all records and then reloading them. | Required | |
6 | Frequency | The frequency of batch processing should be configurable on a per-feed basis. | Required | E.g., every 10 minutes, every hour, nightly, weekly, etc. |
7 | Selection Criteria | Allow selection of rows for processing via SQL expressions. | Required | Ex 1: WHERE STATUS='A' |
8 | Dictionaries | Support loading of data dictionaries, not just role records. | Required | Ex 1: Departmental hierarchy |
9 | Notification | Generate email or other event-style notifications upon completion of processing of individual feeds, with recipients configurable per feed. | Required | |
10 | Configure via API | Configuration of batch processing should be supported over the API so that configuration can be managed via an identity console. | Required | |
11 | Status via API | Current feed processing status should be obtainable via the API. | Required | |
12 | Graceful Failure | Feed processing should fail gracefully. For example, a failure in one record should not cause the entire feed to fail. | Required | |
13 | Thresholds | A threshold value should be configurable on a per-feed basis, with a configurable default value, to prevent large changes without manual intervention. | Required | |
14 | Documentation | Documentation for configuring and managing feed processing services should be added to the OR wiki. | Required | |
15 | File Load | Load System of Record (SOR) data from arbitrarily defined files in common formats (CSV, fixed width, XML, etc.). | Optional | |
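
To make the examples in the Notes column concrete, the following Python sketch illustrates how requirements #2, #3, and #12 might fit together. It is only an illustration, not a proposed design: the function names, the `BIRTH_MONTH` and `NAME` input columns, and the `family`/`given`/`middle` output keys are all assumptions made for this example, since this document does not define the prs_names columns or the inbound field names (only `ROLE_CD`, `LNAME1`, `FNAME1`, `LNAME2`, and `FNAME2` come from the table above).

```python
import re
from dataclasses import dataclass, field

# Requirement #2, Ex 2: the allowed ROLE_CD values from the Notes column.
VALID_ROLE_CODES = {"1", "2", "3", "4", "Y", "N"}


def validate(record: dict) -> list:
    """Return a list of validation errors (empty when the record is valid)."""
    errors = []
    # Requirement #2, Ex 1: birth month must be a two-digit value 01-12.
    # BIRTH_MONTH is a hypothetical column name.
    if not re.fullmatch(r"0[1-9]|1[0-2]", str(record.get("BIRTH_MONTH", ""))):
        errors.append("BIRTH_MONTH must be 01-12")
    if record.get("ROLE_CD") not in VALID_ROLE_CODES:
        errors.append("ROLE_CD must be one of 1,2,3,4,Y,N")
    return errors


def split_name(full_name: str) -> dict:
    """Split 'LAST,FIRST MIDDLE' into name parts (requirement #3, Ex 1).

    The output keys are placeholders for whatever prs_names actually uses.
    """
    last, _, rest = full_name.partition(",")
    given, _, middle = rest.strip().partition(" ")
    return {"family": last.strip(), "given": given, "middle": middle.strip()}


def names_from_columns(row: dict) -> list:
    """Turn LNAME1/FNAME1 and LNAME2/FNAME2 into one entry per name
    (requirement #3, Ex 2). Empty name pairs are skipped."""
    names = []
    for i in (1, 2):
        last, first = row.get(f"LNAME{i}"), row.get(f"FNAME{i}")
        if last or first:
            names.append({"family": last or "", "given": first or "", "middle": ""})
    return names


@dataclass
class FeedResult:
    loaded: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (record, errors) pairs


def process_feed(records) -> FeedResult:
    """Process every record; an invalid record is set aside rather than
    aborting the whole feed (requirement #12)."""
    result = FeedResult()
    for record in records:
        errors = validate(record)
        if errors:
            result.rejected.append((record, errors))
            continue
        # NAME is a hypothetical inbound column holding 'LAST,FIRST MIDDLE'.
        names = [split_name(record.get("NAME", ""))]
        result.loaded.append({**record, "names": names})
    return result
```

In this sketch, calling `process_feed` on a feed containing one valid and one invalid record returns a `FeedResult` with one entry in `loaded` and one in `rejected`, so the rejected records and their errors could feed the notification described in requirement #9.
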
User interaction and design
Questions
Below is a list of questions to be addressed as a result of this requirements document:
Question | Outcome |
---|---|
Should the need to support file-based batch loading impact the design for requirement #7? | |
Does requirement #10 affect the ability to manipulate the configuration via the command line? | |