The Longitudinal Record Enterprise Data Warehouse (EDW) bundle contains the files that make up the patient’s longitudinal record in HealtheEDW. This feed type includes longitudinal record data only for a tenant’s specific population.
Each table is represented by up to two files: a <table_name> file and a <table_name>_delete file. The <table_name>_delete files contain a hash_value column. Any row in the <table_name> table whose hash_value column matches a value in the delete file must be deleted. The <table_name> files contain the rows that should be inserted. An update to a record is represented by a delete of the old record followed by an insert of the new record.
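The delete-then-insert handling described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `apply_delivery`, the in-memory list standing in for the warehouse table, and the `name` column are all assumptions made for the example. Only the hash_value column comes from the description above.

```python
import csv
import io


def apply_delivery(existing_rows, delete_csv, insert_csv):
    """Apply one table's delete file, then its insert file.

    existing_rows is a list of dicts standing in for the warehouse table
    (an assumption for this sketch; a real load would target a database).
    """
    # Collect every hash_value listed in the <table_name>_delete file.
    # An empty string (zero-length file) simply yields no rows.
    doomed = {row["hash_value"] for row in csv.DictReader(io.StringIO(delete_csv))}

    # Deletes are applied first, because an update arrives as a delete
    # of the old record followed by an insert.
    kept = [row for row in existing_rows if row["hash_value"] not in doomed]

    # Then append the rows from the <table_name> file.
    kept.extend(csv.DictReader(io.StringIO(insert_csv)))
    return kept


# Hypothetical data: "a1" is replaced by "c3", "b2" is untouched.
existing = [
    {"hash_value": "a1", "name": "old"},
    {"hash_value": "b2", "name": "keep"},
]
result = apply_delivery(existing, "hash_value\na1\n", "hash_value,name\nc3,new\n")
print([row["hash_value"] for row in result])
```

In a relational database, the same pattern would typically be a `DELETE` joined on hash_value followed by a bulk insert, ideally inside one transaction so that an update is never half applied.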
Complete the following steps to process the data:
Bundle Metadata:
Starting in February 2021, bundle metadata is available in the Data Syndication API. This metadata contains a complete list of all files that are available in a delivery and indicates whether each file is a full replacement or an incremental file with a corresponding delete file.
Files in a Delivery:
Not every file listed in the documentation is present in every delivery. Because of the way the data is processed, a file may be empty in some deliveries and absent in others. If one of the transformations reads data upstream and writes to five different tables in the warehouse, and there is one new row in one of the five tables, five files are produced (four are empty and one contains the single new row). If the same transformation writes data to five tables in the data warehouse, but there is no new data for any of the five tables, then no files are generated.
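A consumer therefore needs to tolerate both absent and empty files. The following sketch shows one way to classify the expected files in a downloaded delivery. The function name `classify_files`, the `.csv` suffix, and the `<table_name>.csv` naming pattern are assumptions for illustration; the actual file names in a delivery may differ.

```python
from pathlib import Path


def classify_files(delivery_dir, table_names, suffix=".csv"):
    """Return a dict mapping each table name to 'missing', 'empty', or 'data'."""
    status = {}
    for name in table_names:
        # Assumed naming pattern: <table_name> plus a suffix.
        path = Path(delivery_dir) / f"{name}{suffix}"
        if not path.exists():
            # No file at all: the transformation produced nothing.
            status[name] = "missing"
        elif path.stat().st_size == 0:
            # Zero-length file: the transformation ran but wrote no rows here.
            status[name] = "empty"
        else:
            status[name] = "data"
    return status
```

Only the tables reported as `data` need to be loaded. Treating `missing` and `empty` as normal outcomes rather than errors avoids failing a load on a delivery that is valid but sparse.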
File Headers:
While it is generally the goal that the columns in the header row appear in a consistent order, there are exceptions. It is recommended to use an ETL tool that supports parsing files by column header rather than by position. If your ETL tool can parse files only by position, a preprocessing step is required on the consuming side, after the files are downloaded, to rearrange the columns into the order that the tool expects.
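For tools that can read columns only by position, the preprocessing step might look like the following sketch. It reads each row by column name and rewrites the file in a fixed order. The function name `reorder_columns` and the column names `a` and `b` are hypothetical.

```python
import csv
import io


def reorder_columns(csv_text, column_order):
    """Rewrite CSV text so that its columns follow column_order."""
    # DictReader keys each value by its header name, so the incoming
    # column order does not matter.
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out)  # writes CRLF line endings by default
    writer.writerow(column_order)
    for row in reader:
        # Columns absent from the input become empty strings;
        # input columns not listed in column_order are dropped.
        writer.writerow([row.get(col, "") for col in column_order])
    return out.getvalue()


# The header arrives as b,a but the downstream tool expects a,b.
print(reorder_columns("b,a\r\n2,1\r\n", ["a", "b"]))
```

Whether unexpected columns should be dropped silently, as here, or reported is a design choice for the consuming side.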
Version | Notes |
---|---|
V1 | Initial version |
The Longitudinal Record EDW (long-record-edw) feed type is designed in the following ways:
Topic | Description |
---|---|
Data Types | Cerner recommends that you use the data types in the documentation when you load the files into a relational database. |
Dates | Date fields are syndicated using the International Organization for Standardization (ISO) 8601 format with a Coordinated Universal Time (UTC) offset. |
File Format | .CSV files are created using the Request for Comments (RFC) 4180 standard. See the RFC 4180 page on the Internet Engineering Task Force (IETF) website for more information. |
Files With No Data | When no data is present in a file, a file with a zero length is syndicated. |
Quoting and Escaping | Fields that contain line breaks (CRLF), double quotation marks, or commas are enclosed in double quotation marks. A double quotation mark inside a field is escaped by preceding it with another double quotation mark. |
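The quoting and date conventions in the table above can be handled with a standard CSV parser and an ISO 8601 parser. The sketch below uses Python's `csv` and `datetime` modules. The helper names `read_rows` and `parse_timestamp`, the sample column names, and the sample timestamp are assumptions for illustration; the exact timestamp layout in the feed should be confirmed against real files.

```python
import csv
import io
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a UTC offset and normalize it to UTC."""
    # Older Python versions reject a trailing "Z", so map it to +00:00.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    # Note: a value without any offset would be interpreted as local time.
    return datetime.fromisoformat(value).astimezone(timezone.utc)


def read_rows(csv_text):
    """Parse RFC 4180 style CSV text into a list of rows."""
    # newline="" keeps CRLF sequences inside quoted fields intact.
    return list(csv.reader(io.StringIO(csv_text, newline="")))


# One record whose note field contains doubled quotes, a comma, and a CRLF.
sample = (
    "id,note,updated\r\n"
    '1,"He said ""hi"",\r\nthen left",2021-02-01T12:30:00-05:00\r\n'
)
header, record = read_rows(sample)
print(repr(record[1]))
print(parse_timestamp(record[2]))
```

A zero-length file parses to an empty list with `read_rows`, so callers should check for that case before unpacking a header row.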