What is XML?
XML (eXtensible Markup Language) is a flexible, text-based, and platform-independent format used to store and transport data in a well-structured way that is both human- and machine-readable. XML permits users to define custom tags to describe the meaning and organization of their data. For example: <book><title>The Hitchhiker's Guide</title></book>.
An XML document is self-descriptive and structured as a hierarchical tree of elements. Every document has a single root element that encapsulates all other content. Elements can contain text, child elements, and attributes (name-value pairs that offer supplementary information). These documents are conventionally stored with the .xml file extension.
The integrity of this structure can be enforced using the following:
- DTD (Document Type Definition): Provides basic validation rules.
- XSD (XML Schema Definition): Offers advanced rules, including data types and constraints.
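The tree structure described above can be explored with a few lines of code. The following is a minimal sketch in Python (standard library only), not IRIS code; the lang attribute and the author element are assumptions added to the sample so that attributes and multiple child elements are visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the "lang" attribute and <author> element are added for illustration.
xml_text = """<book lang="en">
  <title>The Hitchhiker's Guide</title>
  <author>Douglas Adams</author>
</book>"""

root = ET.fromstring(xml_text)  # the single root element of the document


def describe(element, depth=0):
    """Return one line per element: tag, attributes, and trimmed text."""
    text = (element.text or "").strip()
    lines = [f"{'  ' * depth}{element.tag} {element.attrib} {text!r}"]
    for child in element:  # iterating an element yields its child elements
        lines.extend(describe(child, depth + 1))
    return lines


tree_lines = describe(root)
print("\n".join(tree_lines))
```

Each printed line corresponds to one node of the tree, indented by its depth below the root.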
Transforming XML Documents
This part of the article describes the following:
- Parsing and converting general XML to HL7 standard.
- Parsing CCDA (Consolidated Clinical Document Architecture) documents (XML) and converting them to HL7 format.
In these implementations, both formats are first transformed into the InterSystems IRIS SDA (Standardized Data Architecture) format. It is considered a standard, efficient, and less error-prone approach because it effectively utilizes the platform's pre-built classes. Once the data is in the SDA format, it can be seamlessly transformed into any target standard, such as HL7 v2, FHIR, or CCDA.
Parsing General XML Documents
A general XML document is self-describing and works with such custom tags as <name>, <sex>, and <address>. This section explains how to parse such a document and utilize it to construct an HL7 message via the intermediate SDA (Standardized Data Architecture) format.
Before starting the conversion, you need to choose the appropriate workflow:
- Recommended Approach: Transforming to SDA. This is the most efficient method. It involves reading the XML document as a data stream and converting it directly into the SDA format within an Interoperability Production. This is the standard practice, and it works well for large-scale data processing.
- Alternative Approach: Manual Conversion. You can read the XML file as an object and perform the conversion programmatically. This method offers finer control but is typically more complex to implement and is less scalable.
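To make the manual approach concrete, here is a minimal sketch in Python (standard library only) that reads a small Patient document and builds two HL7 v2 segments by hand. It is an illustration outside IRIS, not the platform's API. The helper names and the element-to-field mapping (PID-3, PID-5, PID-7, PID-8, PV1-2, PV1-3, PV1-7) are assumptions chosen for this example, and a complete message would also need an MSH segment.

```python
import xml.etree.ElementTree as ET

PATIENT_XML = """<Patient>
  <PatientID>12345</PatientID>
  <PatientName>DOE^JOHN</PatientName>
  <DateOfBirth>19900101</DateOfBirth>
  <Sex>M</Sex>
  <PatientClass>I</PatientClass>
  <AssignedPatientLocation>GEN^A1</AssignedPatientLocation>
  <AttendingDoctor>1234^DOCTOR^JOHN</AttendingDoctor>
</Patient>"""


def build_segment(name, fields):
    """Build one HL7 v2 segment from a {field_number: value} mapping."""
    values = [fields.get(i, "") for i in range(1, max(fields) + 1)]
    return "|".join([name] + values)


def patient_xml_to_hl7(xml_text):
    """Map the custom Patient elements to PID and PV1 segments (illustrative mapping)."""
    root = ET.fromstring(xml_text)

    def get(tag):
        return root.findtext(tag, default="")

    pid = build_segment("PID", {1: "1", 3: get("PatientID"), 5: get("PatientName"),
                                7: get("DateOfBirth"), 8: get("Sex")})
    pv1 = build_segment("PV1", {1: "1", 2: get("PatientClass"),
                                3: get("AssignedPatientLocation"), 7: get("AttendingDoctor")})
    return "\r".join([pid, pv1])  # HL7 v2 segments are separated by carriage returns


segments = patient_xml_to_hl7(PATIENT_XML).split("\r")
print(segments)
```

Every field position is hard-coded in this sketch, which shows why the manual route becomes harder to maintain as the number of mapped elements grows.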
Reading the XML Document
InterSystems IRIS provides a comprehensive set of classes for parsing XML streams. Two key classes for this are the following:
- %XML.Reader: Offers a programmatic way to read an XML stream and load its content into an object of a class that extends %XML.Adaptor.
- EnsLib.EDI.XML.Document: Used within an interoperability production to represent and parse an XML document dynamically.
Using %XML.Reader and %XML.Adaptor classes
This is a robust and straightforward technique for parsing an XML file or stream into an in-memory object by combining the %XML.Adaptor and %XML.Reader classes.
Let’s take the XML below as an input file for our example:
<Patient>
  <PatientID>12345</PatientID>
  <PatientName>DOE^JOHN</PatientName>
  <DateOfBirth>19900101</DateOfBirth>
  <Sex>M</Sex>
  <PatientClass>I</PatientClass>
  <AssignedPatientLocation>GEN^A1</AssignedPatientLocation>
  <AttendingDoctor>1234^DOCTOR^JOHN</AttendingDoctor>
</Patient>
First, you must create a class definition that represents the structure of your XML document. This class must extend %XML.Adaptor. Once the XML is loaded, the data becomes available as properties of your object, allowing for easy access and subsequent manipulation in your code.
Class MyApp.Messages.PatientXML Extends (%Persistent, %XML.Adaptor)
{
Parameter XMLNAME = "Patient";
Property PatientID As %String;
Property PatientName As %String;
Property Age As %String;
Property DateOfBirth As %String;
Property Sex As %String;
Property PatientClass As %String;
Property AssignedPatientLocation As %String;
Property AttendingDoctor As %String;
/// Pass exactly one input: a file name (e.g. C:\learn\hl7msg\test.xml), a stream, or a string
ClassMethod XMLToObject(xmlStream As %Stream.Object = "", xmlString As %String = "", filename As %String = "") As %Status
{
    Set reader = ##class(%XML.Reader).%New()
    // Begin processing of the XML input: file first, then stream, then string
    If filename'="" {
        Set sc = reader.OpenFile(filename) ; open the file directly
    }
    ElseIf $IsObject(xmlStream) {
        Set sc = reader.OpenStream(xmlStream) ; parse from a stream
    }
    ElseIf xmlString'="" {
        Set sc = reader.OpenString(xmlString) ; parse from a string
    }
    Else {
        Return $$$ERROR($$$GeneralError,"No file name, string, or stream found")
    }
    If $$$ISERR(sc) { Do $system.OBJ.DisplayError(sc) Return sc }
    // Associate this class with the root element of the XML document
    // (alternative: Do reader.Correlate(..#XMLNAME,$classname()))
    Do reader.CorrelateRoot($classname())
    // Next() loads the correlated element into patient and reports errors in sc
    Do reader.Next(.patient,.sc)
    If $$$ISERR(sc) { Do $system.OBJ.DisplayError(sc) Return sc }
    ZWrite patient
    Return $$$OK
}
}
The XMLToObject method parses XML data from a file, stream, or string, creating a class instance, which can be later utilized for either programmatic conversions or within an Interoperability Production.
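The same idea, mapping the child elements of a known root onto the properties of a predefined class, can be sketched outside IRIS as well. The Python example below (standard library only) is an analogy for illustration, not the %XML.Reader API; the dataclass and the xml_to_object function are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class PatientXML:
    """Hypothetical Python counterpart of the class definition above."""
    PatientID: str = ""
    PatientName: str = ""
    DateOfBirth: str = ""
    Sex: str = ""
    PatientClass: str = ""
    AssignedPatientLocation: str = ""
    AttendingDoctor: str = ""


def xml_to_object(xml_text, root_name="Patient"):
    """Parse xml_text and copy each known child element into a PatientXML instance."""
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        raise ValueError(f"expected root <{root_name}>, got <{root.tag}>")
    # Elements missing from the document simply leave the property empty.
    values = {f.name: root.findtext(f.name, default="") for f in fields(PatientXML)}
    return PatientXML(**values)


patient = xml_to_object("<Patient><PatientID>12345</PatientID><Sex>M</Sex></Patient>")
print(patient.PatientID, patient.Sex)
```

As in the ObjectScript version, the class definition acts as the contract: only elements that match a declared property are picked up.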
Utilizing the EnsLib.EDI.XML.Document
The EnsLib.EDI.XML.Document class provides runtime and XPath-based access to any XML content, without requiring a predefined schema or class. It is ideal when you need to extract values dynamically at runtime. Simply load your XML into this class and employ its methods to access elements on the fly with XPath expressions.
This class also offers the capability to persist an XML document directly to the EnsLib_EDI_XML.Document table by saving the object instance.
ClassMethod ParseXML(xmlfile As %String="")
{
Set ediXMLDoc = ##class(EnsLib.EDI.XML.Document).ImportFromFile(xmlfile,,.sc)
If $$$ISERR(sc) {
Quit
}
; pass XPath into GetValueAt method
Write ediXMLDoc.GetValueAt("/Patient/PatientID") ;returns the patient id
}
You can then store the value returned by the GetValueAt() method in a class property, a local variable, or a JSON structure.
The XML File Business Service (EnsLib.EDI.XML.Service.FileService) uses this virtual document class for its parsing operations.
Note: When retrieving excessively large strings (those exceeding 3,641,144 characters) via GetValueAt("XPath") in InterSystems IRIS, you typically receive a <MAXSTRING> error. Your code should include appropriate handling for this limitation.
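Path-based access without a predefined class can be illustrated with a short Python sketch (standard library only). This is not the EnsLib.EDI.XML.Document API: get_value_at is a hypothetical helper that resolves simple absolute paths only, and returning an empty string for an unknown path is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def get_value_at(root, path, default=""):
    """Resolve a simple absolute path such as /Patient/PatientID against root."""
    parts = path.strip("/").split("/")
    if parts[0] != root.tag:
        return default  # the path does not start at this document's root element
    if len(parts) == 1:
        return (root.text or default).strip()
    # ElementTree paths are relative to the element, so drop the root step.
    return root.findtext("/".join(parts[1:]), default=default)


root = ET.fromstring("<Patient><PatientID>12345</PatientID><Sex>M</Sex></Patient>")
patient_id = get_value_at(root, "/Patient/PatientID")
misspelled = get_value_at(root, "/Patient/PatiendID")
print(repr(patient_id), repr(misspelled))
```

The second lookup shows a practical pitfall of dynamic access: a misspelled path raises no error here and just yields an empty value, so paths are worth double-checking.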
XSD Schema Loading in InterSystems IRIS
An XSD (XML Schema Definition) outlines the structure, elements, types, and validation rules for an XML document to ensure data consistency and validity.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Patient">
<xs:complexType>
<xs:sequence>
<xs:element name="PatientID" type="xs:string"/>
<xs:element name="PatientName" type="xs:string"/>
<xs:element name="DateOfBirth" type="xs:string"/>
<xs:element name="Sex" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
In IRIS, loading an XSD schema serves a few primary purposes:
- Validation: It enables incoming XML documents to be verified against a predefined structure, ensuring data integrity.
- ObjectScript Class Generation: It automates the creation of ObjectScript classes that mirror the XML structure, simplifying programmatic access. (You can load your XSD file and produce a class definition in Studio via Tools > Add-Ins > XML Schema Wizard.)
- DTL Transformations: It facilitates schema-based transformations within the Data Transformation Language (DTL), enabling seamless data mapping between different formats.
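To show what validation against the Patient schema above amounts to, here is a deliberately simplified Python sketch (standard library only). It is not an XSD validator: it hand-codes the one rule that the sample schema expresses, namely a Patient root whose children appear exactly once each in the declared order. The names REQUIRED_SEQUENCE and check_patient_structure are hypothetical.

```python
import xml.etree.ElementTree as ET

# Child elements declared in the xs:sequence of the sample schema, in order.
REQUIRED_SEQUENCE = ["PatientID", "PatientName", "DateOfBirth", "Sex"]


def check_patient_structure(xml_text):
    """Return a list of problems; an empty list means the structure matches."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "Patient":
        problems.append(f"unexpected root element <{root.tag}>")
    child_tags = [child.tag for child in root]
    if child_tags != REQUIRED_SEQUENCE:
        problems.append(f"expected children {REQUIRED_SEQUENCE}, found {child_tags}")
    return problems


valid = ("<Patient><PatientID>1</PatientID><PatientName>DOE^JOHN</PatientName>"
         "<DateOfBirth>19900101</DateOfBirth><Sex>M</Sex></Patient>")
missing_sex = ("<Patient><PatientID>1</PatientID><PatientName>DOE^JOHN</PatientName>"
               "<DateOfBirth>19900101</DateOfBirth></Patient>")
print(check_patient_structure(valid))
print(check_patient_structure(missing_sex))
```

A real schema processor derives such rules from the XSD itself, along with data types and other constraints, which is the work a loaded schema performs for you.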
Importing an XML Schema
To import an XML schema (.xsd) file into InterSystems IRIS, follow the steps below:
- Navigate to the Management Portal.
- Go to Interoperability > Interoperate > XML > XML Schema Structures.
- Click the Import button.
- In the dialog box, select your schema.xsd file and click OK to complete the import.
All imported XSD schemas are saved in the ^EnsEDI.XML.Schema global within each namespace. The first subscript of the global is the schema name, just as displayed in the Management Portal.
The path to the source XSD file is stored at ^EnsEDI.XML.Schema(<schema name>,"src",1).
Important: If the source file is deleted from its original location, any future validation attempts against the schema will result in an error.
Validate the XML Using Schema
Once the schema is loaded, you can validate your XML document against it using the custom code below. To do that, you will be required to provide the XML file name and the schema name as parameters:
/// XMLSchema: the name of an imported schema, or the full path of the schema file, e.g., /xml/xsd/patient.xsd
ClassMethod XMLSchemaValidation(xmlFileName As %String, XMLSchema As %String = "") As %Status
{
    // Load the file named by the caller (for example, C:\test.xml)
    Set ediXMLdoc1 = ##class(EnsLib.EDI.XML.Document).ImportFromFile(xmlFileName,,.sc)
    If $$$ISERR(sc) Return sc
    // DocType selects the schema that Validate() checks the document against
    Set ediXMLdoc1.DocType = XMLSchema
    Return ediXMLdoc1.Validate()
}
Embedded Python sample:
Class pySamples.XML Extends %RegisteredObject
{
ClassMethod GetError(er)
{
Return $SYSTEM.Status.GetErrorText(er)
}
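ClassMethod pyXMLSchemaValidation(xmlFileName = "C:\\hl7msg\\test.xml", XMLSchema = "Patient") [ Language = python ]
{
    import iris
    xml_status = iris.ref()
    ediXMLdoc = iris.cls("EnsLib.EDI.XML.Document").ImportFromFile(xmlFileName,1,xml_status)
    if xml_status.value != 1:
        # pass the status value itself, not the reference wrapper
        print("XML Parsing error: ", iris.cls(__name__).GetError(xml_status.value))
    else:
        # mirror the ObjectScript version: select the schema, then validate
        ediXMLdoc.DocType = XMLSchema
        status = ediXMLdoc.Validate()
        if status != 1:
            print("XML validation error: ", iris.cls(__name__).GetError(status))
        else:
            print(ediXMLdoc)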
}
}
Getting Schema from the Object:
Set object = ##class(MyApp.Messages.PatientXML).%New() ;replace your class here
Set XMLSchema = object.XMLSchema() ; this will return the XML schema for this class.
Before exploring SDA and other healthcare messaging standards, let's take a brief look at CDA.
Clinical Document Architecture (CDA)
The Clinical Document Architecture (CDA) is a healthcare standard developed by HL7 for electronic clinical documents. It defines how these documents are structured, encoded, and exchanged to ensure both human and machine readability.
CDA is an XML-based standard for representing such clinical documents as the following:
- Discharge summaries
- Progress notes
- Referral letters
- Imaging or lab reports
A CDA document is generally wrapped by the <ClinicalDocument> element and has two major parts: a header and a body.
1. Header (Required): It is located between the opening <ClinicalDocument> tag and the <structuredBody> element. It contains metadata about the document, describing who the document is about, who created it, when, why, and where.
Key elements in the header:
- Patient demographics (recordTarget)
- Author (doctor, system)
- Custodian (organization responsible)
- Document type and template IDs
- Encounter info
- Legal authenticator
2. Body (Required): It contains the clinical content of the report, wrapped by the <structuredBody> element, and can be either unstructured or composed of structured markup. It is typically divided into recursively nested document sections:
- Unstructured: free-form text that might come with an attachment (e.g., PDF).
- Structured: XML sections with coded entries for allergies, medications, problems, procedures, lab results, etc.
<ClinicalDocument>
... CDA Header ...
<structuredBody>
<section>
<text>...</text>
<observation>...</observation>
<substanceAdministration>
<supply>...</supply>
</substanceAdministration>
<observation>
<externalObservation>...
</externalObservation>
</observation>
</section>
<section>
<section>...</section>
</section>
</structuredBody>
</ClinicalDocument>
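Because sections can nest recursively, code that reads a CDA body usually walks it recursively too. The Python sketch below (standard library only) does this for a simplified skeleton like the one above. The section titles are invented for illustration, and real CDA documents use the urn:hl7-org:v3 namespace and additional wrapper elements that are omitted here.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free skeleton; section texts are hypothetical.
CDA_SKELETON = """<ClinicalDocument>
  <structuredBody>
    <section><text>Allergies</text></section>
    <section>
      <text>Results</text>
      <section><text>Chemistry</text></section>
    </section>
  </structuredBody>
</ClinicalDocument>"""


def list_sections(parent, depth=0):
    """Return (depth, text) for every section below parent, in document order."""
    found = []
    for section in parent.findall("section"):  # direct child sections only
        found.append((depth, section.findtext("text", default="")))
        found.extend(list_sections(section, depth + 1))  # recurse into nested sections
    return found


root = ET.fromstring(CDA_SKELETON)
body = root.find("structuredBody")
sections = list_sections(body)
for depth, title in sections:
    print("  " * depth + title)
```

The recursion mirrors the document structure: each section is reported with its nesting depth, so nested content stays associated with its parent section.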
CCDA to HL7 Conversion
Converting a CCDA (Consolidated Clinical Document Architecture) document to an HL7 v2 message is a common interoperability use case in InterSystems IRIS. While you can still employ a direct, single-step DTL (Data Transformation Language) mapping, we recommend opting for an intermediate data format called SDA (Standardized Data Architecture) as the most robust approach.
Step 1: C-CDA to SDA (XSLT)
The first step is to transform the incoming C-CDA document into an SDA object. (SDA is the InterSystems clinical data format, which simplifies the representation of clinical information.)
- Why should we use SDA? C-CDA is a complex, hierarchical XML structure with numerous templates and sections. Attempting to map it directly to the flat, segment-based configuration of an HL7 v2 message is extremely difficult and often requires intricate yet brittle logic. SDA serves as a simplified, intermediate model that extracts the essential clinical data from the C-CDA, avoiding the structural complexities of XML.
- How does it work? InterSystems IRIS provides a library of pre-built XSLT files (typically located in the install-dir\CSP\xslt\SDA3 directory) designed for C-CDA to SDA conversion. This transformation generally requires using a Business Process or a Business Operation to invoke the correct XSLT.
All InterSystems healthcare products come with a library of XSLTs to transform CDA documents into SDA, and vice versa. You can examine the available root-level XSLTs at install-dir\CSP\xslt\.
For instance:
- CCDA-to-SDA transforms a Consolidated CDA 1.1 CCD into SDA.
- CCDAv21-to-SDA transforms a Consolidated CDA 2.1 CCD into SDA.
- SDA-to-C32v25 transforms SDA into a C32 v2.5 document.
Converting from or to SDA
Once the XML is loaded into a class object, it is ready for transformation. At this point, you should create a custom DTL to map your data structure into HS.SDA3.Container, or into a specific HS.SDA3.* class, to build an SDA document.
FHIR
Utilize the IRIS built-in Data Transforms for conversion to/from SDA. You can refer to the article for FHIR conversion.
HL7 V2
- You can programmatically convert an HL7 message to SDA using the class method HS.Gateway.HL7.HL7ToSDA3.GetSDA(). For example, do ##class(HS.Gateway.HL7.HL7ToSDA3).GetSDA(pRequest,.tSDA).
- Note: There is currently no method for programmatic conversion directly from SDA back to HL7 v2.
Key Classes, Tables, Globals, Links
Classes
- %XML.*.cls: All XML-relevant classes can be found in this package.
- Ens.Util.XML.Validator: This class contains practical methods for verifying XML.
- EnsLib.EDI.XML.Service.FileService: It is a Business service host.
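- %XML.XSLT.Transformer: Performs XSLT transformations.
- %XML.Writer: Writes objects out as XML.
- %XML.Reader: Reads XML into objects of XML-enabled classes.
- %XML.Adaptor: Makes a class XML-enabled when used as a superclass.
- %XML.Document: Represents an XML document as a DOM.
- %XML.Schema: Generates an XML schema from XML-enabled classes.
- %XML.String: It is a Datatype class for XML.
- %XML.SAX.Parser: Provides SAX-based parsing of XML.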
- HS.SDA3.Container: It is a primary SDA container class.
- HS.SDA3.*.cls: SDA classes.
Tables
- EnsLib_EDI_XML.Document: This table is used to store EDI XML documents.
Globals
- ^EnsEDI.XML.Schema: This global stores XSD schemas.
- ^EnsLib.EDI.XML.DocumentD: This global holds the data for the EnsLib_EDI_XML.Document table.
This article provides an overview of the basics of XML.