Understanding Uplift

While clinical documents follow various specifications (CCDA/CCD/CDA, HL7, FHIR, etc.) there is enough variation across systems/vendors to make it challenging for consumers (both human and machine) to handle them. The Convert API performs a lossless enrichment and standardization process (“uplift”) to make its outputs more consistent and suitable for downstream processing.

Overview

The conversion process consists of three phases, each of which improves the quality of the generated output:

  • No Data Left Behind - Capture the full clinical picture from the input data.
  • Terminology Uplift - Transform uncoded or poorly-coded data into complete clinical concepts.
  • Standardization - Format the data in a consistent, reliable way.

Visualizing Uplift

The Developer Portal helps you visualize the uplift performed by the Convert API.

The “Uplift” tab summarizes the FHIR resources in the output bundle, showing how many additional codings were added by the uplift process.

Selecting an individual resource class highlights the specific codings that were added. In the example below, a free-text Condition resource received two additional codings from standard reference systems (SNOMED and ICD-10).

No Data Left Behind

The Orchestrate APIs leverage years of experience with real-world clinical data to find critical information buried in source documents, whether it’s located in a non-standard place or coded in a vendor-specific way.

The following examples are not exhaustive, but they illustrate a few ways the Convert API goes beyond simple “conversion” to capture the full clinical narrative.

Discovering Concepts in Referenced Tables

Some source EMRs do not populate a displayName for a coded concept (e.g., a problem or medication), but only include a description in the originalText reference. Orchestrate can follow those references to discover the proper displayName.

For example, the following C-CDA medication entry contains a reference to “#medication_14”:

  <manufacturedProduct classCode="MANU">
    <templateId root="2.16.840.1.113883.10.20.22.4.23"/>
    <manufacturedMaterial classCode="MMAT">
      <code code="309362" codeSystem="2.16.840.1.113883.6.88" codeSystemName="RxNorm">
        <originalText>
          <reference value="#medication_14"/>
        </originalText>
      </code>
    </manufacturedMaterial>
  </manufacturedProduct>

That reference points back to a cell in the narrative table of the same document:

<td ID="medication_14">Clopidogrel 75 MG oral tablet</td>

Orchestrate can follow these buried references and create a coherent CodeableConcept:

  {
    "system": "urn:oid:2.16.840.1.113883.6.88",
    "code": "309362",
    "display": "Clopidogrel 75 MG oral tablet",
    "userSelected": true
  }
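
The reference-following step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Orchestrate's implementation: the `SECTION_XML` wrapper and the `resolve_display` helper are hypothetical, and XML namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical, trimmed-down section: one narrative table cell plus a coded
# entry that points at it. Real C-CDA documents are namespaced and much larger.
SECTION_XML = """
<section>
  <text>
    <table><tbody><tr>
      <td ID="medication_14">Clopidogrel 75 MG oral tablet</td>
    </tr></tbody></table>
  </text>
  <manufacturedMaterial classCode="MMAT">
    <code code="309362" codeSystem="2.16.840.1.113883.6.88" codeSystemName="RxNorm">
      <originalText><reference value="#medication_14"/></originalText>
    </code>
  </manufacturedMaterial>
</section>
"""


def resolve_display(section: ET.Element, code: ET.Element) -> Optional[str]:
    """Return displayName if present, else the narrative text the code references."""
    if code.get("displayName"):
        return code.get("displayName")
    ref = code.find("originalText/reference")
    if ref is None:
        return None
    target_id = ref.get("value", "").lstrip("#")
    # Search the whole section for the element carrying the referenced ID.
    for element in section.iter():
        if element.get("ID") == target_id:
            return "".join(element.itertext()).strip()
    return None


section = ET.fromstring(SECTION_XML)
code = section.find(".//manufacturedMaterial/code")
coding = {
    "system": "urn:oid:" + code.get("codeSystem"),
    "code": code.get("code"),
    "display": resolve_display(section, code),
}
print(coding)
```

Preferring an explicit displayName and falling back to the referenced narrative keeps the source's own wording whenever it is available.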

Extracting Data with Missing Codes

When the source EMR does not have a proper code for a problem or procedure, many C-CDAs end up with an empty code element. Orchestrate can still generate a CodeableConcept from the other available text.

For example, this C-CDA procedure table has no code for procedure type, but it does have a description:

  <tr>
    <td ID="date_19">8/10/2023 11:00:00 PM</td>
    <td ID="procedure_19">Bronchoscopic bronchial thermoplasty, ablation of airway smooth muscle</td>
    <td ID="proceduretype_19"> </td>
  </tr>

Orchestrate can extract the procedure text, determine its corresponding code (in this case using ICD-9), and generate a CodeableConcept:

  {
    "system": "http://hl7.org/fhir/sid/icd-9-cm",
    "code": "32.27",
    "display": "Bronchoscopic bronchial thermoplasty, ablation of airway smooth muscle",
    "userSelected": false
  }
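
The first half of that process, recovering the description when the code cell is empty, might look like the sketch below. The helper names (`cell_text`, `concept_from_row`) and the ID-prefix convention are assumptions made for illustration; assigning the ICD-9 code itself requires a terminology lookup that is not shown here.

```python
import xml.etree.ElementTree as ET

# The narrative row from the example above.
ROW_XML = """
<tr>
  <td ID="date_19">8/10/2023 11:00:00 PM</td>
  <td ID="procedure_19">Bronchoscopic bronchial thermoplasty, ablation of airway smooth muscle</td>
  <td ID="proceduretype_19"> </td>
</tr>
"""


def cell_text(row, prefix):
    """Return the stripped text of the first cell whose ID starts with prefix + '_'."""
    for cell in row.iter("td"):
        if cell.get("ID", "").startswith(prefix + "_"):
            return "".join(cell.itertext()).strip()
    return ""


def concept_from_row(row):
    """Build a text-only CodeableConcept from the procedure description, if any."""
    description = cell_text(row, "procedure")
    if not description:
        return None
    return {"text": description}


row = ET.fromstring(ROW_XML)
concept = concept_from_row(row)
# An empty type cell signals that a coding still has to be derived from the text.
needs_coding = cell_text(row, "proceduretype") == ""
print(concept, needs_coding)
```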

Handling Non-Standard or Proprietary Sections

Orchestrate recognizes standard template sections from all common CDA variations (including, but not limited to, HITSP C32 and C-CDA R1.1 and R2.1). When the system encounters an unrecognized template ID, it still extracts the generic HTML (narrative) content of that section.

Decoding Poorly Formatted Fields

Orchestrate will use available context clues and robust parsing mechanisms to extract data even when it is poorly formatted. For example:

  • Deciphering non-standard fields (like “Y” or “N” where true/false was expected).
  • Gracefully handling parsing errors (such as missing quotes or undeclared namespaces) that might render XML invalid for standard parsers.
  • Recognizing dozens of different date/time formats, including partial data.
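
As a rough illustration of the first and last items, the Python sketch below maps loose yes/no markers to booleans and tries several date layouts, keeping partial dates partial. The accepted values and formats are assumptions chosen for the example, not the list Orchestrate actually supports, and recovery from malformed XML is not covered.

```python
from datetime import datetime

TRUE_VALUES = {"y", "yes", "true", "t", "1"}
FALSE_VALUES = {"n", "no", "false", "f", "0"}

# Compact HL7-style timestamps are disambiguated by length so that a partial
# date such as "202308" is not misread as a full one.
COMPACT_FORMATS = {
    4: ("%Y", "%Y"),
    6: ("%Y%m", "%Y-%m"),
    8: ("%Y%m%d", "%Y-%m-%d"),
    14: ("%Y%m%d%H%M%S", "%Y-%m-%dT%H:%M:%S"),
}
OTHER_FORMATS = [
    ("%m/%d/%Y %I:%M:%S %p", "%Y-%m-%dT%H:%M:%S"),
    ("%m/%d/%Y", "%Y-%m-%d"),
    ("%Y-%m-%d", "%Y-%m-%d"),
]


def parse_flag(raw):
    """Map loose yes/no markers to True/False; return None when unrecognized."""
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    return None


def parse_date(raw):
    """Return an ISO-style string at the precision the input carried, or None."""
    value = raw.strip()
    if value.isdigit():
        known = COMPACT_FORMATS.get(len(value))
        candidates = [known] if known else []
    else:
        candidates = OTHER_FORMATS
    for in_fmt, out_fmt in candidates:
        try:
            return datetime.strptime(value, in_fmt).strftime(out_fmt)
        except ValueError:
            continue
    return None


print(parse_flag("Y"), parse_date("8/10/2023 11:00:00 PM"), parse_date("202308"))
```

Returning None instead of raising lets a caller keep the original string alongside the record when nothing matches, so no data is dropped.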

Terminology Uplift

The Convert API uses the Terminology API to transform uncoded, proprietary-coded, or free-text source terms (CodeableConcepts) into well-formed codings in standard reference systems like ICD-10-CM, SNOMED, and LOINC®. This can happen in many ways; several common scenarios are described in detail below.

CodeSystem OID Matching

Source data coded with a reference CodeSystem OID will be matched to a corresponding coding with the appropriate system URI, reference code, and display. This results in a standardized coding conformant with the FHIR specification.

For example: The original source data references an ICD-10-CM code using the CodeSystem OID (2.16.840.1.113883.6.90). The Convert API will create an additional coding with the proper ICD-10-CM system URI (http://hl7.org/fhir/sid/icd-10-cm), reference code, and display value.

  Field   | Source Data                    | Additional Coding
  Display | Type 2 diabetes                | Type 2 diabetes mellitus without complications
  Code    | E119                           | E11.9
  System  | urn:oid:2.16.840.1.113883.6.90 | http://hl7.org/fhir/sid/icd-10-cm
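
A minimal Python sketch of this kind of matching follows. The `OID_TO_SYSTEM` table holds a handful of well-known OID-to-URI pairs, while `uplift_coding` and `normalize_icd10cm` are hypothetical helpers written for this example; the reference display value would come from a terminology lookup, so the sketch leaves it out.

```python
ICD10CM = "http://hl7.org/fhir/sid/icd-10-cm"

# A few well-known code system OIDs and their FHIR system URIs.
OID_TO_SYSTEM = {
    "2.16.840.1.113883.6.90": ICD10CM,
    "2.16.840.1.113883.6.96": "http://snomed.info/sct",
    "2.16.840.1.113883.6.88": "http://www.nlm.nih.gov/research/umls/rxnorm",
    "2.16.840.1.113883.6.1": "http://loinc.org",
}


def normalize_icd10cm(code):
    """Insert the decimal point after the third character when it is missing."""
    code = code.strip().upper()
    if "." not in code and len(code) > 3:
        return code[:3] + "." + code[3:]
    return code


def uplift_coding(coding):
    """Return an additional coding with a standard system URI, or None if unknown."""
    system = coding.get("system", "")
    oid = system[len("urn:oid:"):] if system.startswith("urn:oid:") else system
    uri = OID_TO_SYSTEM.get(oid)
    if uri is None:
        return None
    code = coding.get("code", "")
    if uri == ICD10CM:
        code = normalize_icd10cm(code)
    return {"system": uri, "code": code}


source = {"system": "urn:oid:2.16.840.1.113883.6.90", "code": "E119", "display": "Type 2 diabetes"}
additional = uplift_coding(source)
print(additional)
```

The source coding is left untouched and the standardized coding is returned separately, which matches the "additional coding" behavior described above.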

Equivalent Concepts

Additional reference codings may be included if they represent an equivalent concept to the original source data.

For example: The original source data references a SNOMED code. The Convert API will create an additional coding for the corresponding ICD-10-CM code.

  Field   | Source Data                                          | Additional Coding
  Display | Mitral stenosis and aortic insufficiency (disorder)  | Rheumatic disorders of both mitral and aortic valves
  Code    | 194734000                                            | I08.0
  System  | http://snomed.info/sct                               | http://hl7.org/fhir/sid/icd-10-cm

A single input can yield several equivalent concepts. For instance, an NDC code may result in equivalent NDC, RxNorm and CVX codes.
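
One way to picture this step is as a lookup that appends equivalent codings to a CodeableConcept without duplicating any that are already present. In the sketch below, the `EQUIVALENTS` table is a hypothetical stand-in for the Terminology API and contains only the SNOMED-to-ICD-10-CM pair from the example above.

```python
# Hypothetical equivalence table keyed by (system, code).
EQUIVALENTS = {
    ("http://snomed.info/sct", "194734000"): [
        {
            "system": "http://hl7.org/fhir/sid/icd-10-cm",
            "code": "I08.0",
            "display": "Rheumatic disorders of both mitral and aortic valves",
        },
    ],
}


def add_equivalents(concept):
    """Return a copy of the concept with equivalent codings appended once each."""
    codings = list(concept.get("coding", []))
    seen = {(c.get("system"), c.get("code")) for c in codings}
    # Iterate over a snapshot so newly appended codings are not re-expanded.
    for source in list(codings):
        key = (source.get("system"), source.get("code"))
        for extra in EQUIVALENTS.get(key, []):
            extra_key = (extra["system"], extra["code"])
            if extra_key not in seen:
                seen.add(extra_key)
                codings.append(dict(extra))
    return {**concept, "coding": codings}


original = {
    "coding": [
        {
            "system": "http://snomed.info/sct",
            "code": "194734000",
            "display": "Mitral stenosis and aortic insufficiency (disorder)",
        }
    ]
}
uplifted = add_equivalents(original)
print([c["code"] for c in uplifted["coding"]])
```

Because existing (system, code) pairs are tracked, running the function a second time on its own output adds nothing new.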

Natural Language Processing

Uncoded source data may result in a reference coding if the textual description is sufficient to derive a representative reference.

For example: The original source data references a lab with code “hba1c”. The Convert API can match that to the appropriate SNOMED procedure coding.

  Field   | Source Data | Additional Coding
  Display | hba1c       | Hemoglobin A1c measurement (procedure)
  Code    | hba1c       | 43396009
  System  | LabService  | http://snomed.info/sct
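
Real natural language processing is far more involved, but the basic idea of deriving a reference coding from text can be shown with a toy normalizer and synonym table. Everything in this sketch (`SYNONYMS`, `REFERENCE`, `match_text`) is an assumption made for illustration and does not describe how the Terminology API works internally.

```python
import re

# Toy synonym table mapping normalized source text to a canonical term.
SYNONYMS = {
    "hba1c": "hemoglobin a1c",
    "hgb a1c": "hemoglobin a1c",
    "a1c": "hemoglobin a1c",
    "hemoglobin a1c": "hemoglobin a1c",
}

# Canonical terms and their reference codings (only the example above).
REFERENCE = {
    "hemoglobin a1c": {
        "system": "http://snomed.info/sct",
        "code": "43396009",
        "display": "Hemoglobin A1c measurement (procedure)",
    },
}


def normalize(text):
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_text(text):
    """Return a reference coding for the text, or None when nothing matches."""
    key = SYNONYMS.get(normalize(text))
    return dict(REFERENCE[key]) if key else None


print(match_text("hba1c"))
```

Returning None for unmatched text mirrors the hedge in the description above: a coding is only added when the text is sufficient to derive one.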

Standardization

The standardization process ensures that the resulting output not only adheres to the relevant standards (FHIR, HL7, C-CDA, etc.) but also presents the data in a consistent, reliable, easy-to-use format.

  • Sections/resources are presented in a consistent order across similar documents.
  • Human-readable blocks follow a consistent format.
  • Information is present in consistent fields regardless of vendor.
  • Terminology is consistent across similar clinical concepts.