Declarative Amsterdam

Case study of a semantic library underpinning the four-corner model for document exchange

G. Ken Holman
Crane Softwrights Ltd.

Abstract

The four-corner model network topology enables parties using non-declarative data to leverage declarative content in a document exchange network, thus triggering the network effect of enabling each new user to access all existing users without knowing their trading partner's use of non-declarative data.

Keywords:
  • Semantics,
  • Semantic components,
  • XML,
  • XSD,
  • Business documents,
  • UN/CEFACT CCTS,
  • OASIS UBL

Introduction to business document exchange

Business documents such as invoices, purchase orders, and waybills have been exchanged between the computer systems of trading partners for decades. The sender and the receiver agree on a choreography for exchanging information back and forth, but each uses a different application with its own independently developed data model in which to express the document information.

For most of this time the exchange bridging the two data models has been using paper, as shown in Figure 1.

Figure 1. Historical use of paper interchange of document information
[Diagram labels: Sender and Receiver, each with their own Business Practices, Application, and Data Model; Business Document; Print; Scan Entry; Keyboard Entry; Profile of Choreography.]

The electronic equivalent of the business document in a two-corner model network topology would require the sender’s application to serialize the sender’s data model into a stream of bytes that can be interpreted by the receiver’s application in order to populate the receiver’s data model. But there is impedance in this because the data models are not the same and/or the formats are not the same, as shown in Figure 2.

Figure 2. Impedance in document exchange electronic formats
[Diagram labels: Sender and Receiver, each with their own Business Practices, Application, Data Model, and Data File; the exchange between the two data files is marked with an x.]

What the sender and receiver need is a common declarative basis upon which to exchange document information. Importantly, that basis must be available to both parties with the least possible disruption to either of them.

This paper is a case study of how an example four-corner model for document interchange addresses this need by leveraging the Organization for the Advancement of Structured Information Standards (OASIS) Universal Business Language (UBL) declarative semantic library and interchange syntax.

The four-corner model for document exchange
(business or otherwise)

One way to avoid the document impedance is to use the three-corner model topology, in which both parties delegate the responsibility for interchange to a common third-party intermediary that assumes the responsibility of moving the information from one format to the other. This approach was very popular starting around the 1990s in projects run by large prime contractors promising to solve all of the problems of both parties:

Figure 3. Impedance delegated to a common third party intermediary
[Diagram labels: Sender and Receiver, each with their own Business Practices, Application, Data Model, and Data File; Intermediary; Intermediary's Data Store.]

But such delegation of responsibility has many challenges, attractive as it may seem on the surface. There are questions of data sovereignty (where in the world is my data being stored?), data integrity (how do I know the information is being properly interpreted?), and data security (why is all my data in one place that might be vulnerable?).

One hears the call “simply use declarative markup and all your problems will be solved!”, though we know it is not that simple.

It isn’t enough simply to have a declarative interchange vocabulary with which to exchange semantics. Before going into the detail of UBL as an example declarative interchange vocabulary, consider the impact of imposing any declarative interchange vocabulary in a document exchange scenario:

Figure 4. Mapping content to a common format for interchange
[Diagram labels: XML Sender and XML Receiver, each with their own Hardware/Software Platform, Business Practices, Application, and Data Model; Content Mapping; XML Integration; XML Format Mapping; Document Interchange; OASIS Universal Business Language Purview and Standardization.]

Yes, interchange is achieved, but both parties have incurred the burden of syntax integration and syntax format mapping. The sending party has to generate the declarative markup containing the document content and the receiving party has to interpret the declarative markup in order to obtain the document content. It may be that either or both parties are not equipped to modify their applications to meet this need.

Consider, then, the four-corner model network topology where each trading partner has their own independent delegated network representative, called an “access point”, responsible for sending their data over the network and receiving the data from the network, as shown in Figure 5.

Figure 5. Delegation of content mapping and interchange
[Diagram labels: Sender and Receiver, each with their own Business Practices, Application, Data Model, and Data File; Sender's Access Point and Receiver's Access Point (XML Sender, XML Receiver, Document Exchange); Content Mapping; XML Integration; XML Format Mapping; Document Interchange; OASIS Universal Business Language Purview and Standardization.]

In this topology, the access points are either contracted out to other companies as services or built in-house. The sender’s access point is responsible only for knowing the sender’s data format, without any knowledge of the receiver’s data format. Similarly, the receiver’s access point is responsible only for knowing the receiver’s data format, without any knowledge of the sender’s data format.

Both access points know the declarative document interchange format, and in this example that is UBL. But a four-corner model network can pick any one declarative format to be the method of conveying document content.
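The division of responsibility described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sender's comma-separated layout, the receiver's JSON layout, and the dictionary standing in for the common interchange format are all invented here, and in the networks discussed in this paper the common format would be UBL XML. The last function counts directed format mappings to show why a single common format scales better than bilateral agreements.

```python
import json

# Hypothetical common interchange representation: a dict with agreed keys.
# In the networks described in the text this role is played by UBL XML.


def sender_to_common(csv_line: str) -> dict:
    """Sender's access point: knows only the sender's (invented) CSV layout."""
    invoice_id, currency, amount = csv_line.split(",")
    return {"id": invoice_id, "currency": currency, "amount": amount}


def common_to_receiver(common: dict) -> str:
    """Receiver's access point: knows only the receiver's (invented) JSON layout."""
    private = {
        "docNo": common["id"],
        "total": f'{common["amount"]} {common["currency"]}',
    }
    return json.dumps(private, sort_keys=True)


def exchange(csv_line: str) -> str:
    """Sender's format -> common format -> receiver's format."""
    return common_to_receiver(sender_to_common(csv_line))


def mappings_needed(parties: int) -> tuple:
    """Directed mappings needed: (every pair bilaterally, via one common format)."""
    return parties * (parties - 1), 2 * parties
```

Neither conversion function refers to the other party's private layout, so either side can be replaced without the other noticing. With ten parties, the bilateral approach needs 90 directed mappings while the common format needs 20.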

This network topology was introduced to the e-commerce XML world by the Peppol project in Europe, as shown in Figure 6.

Figure 6. Peppol overview of peer-to-peer network definition
[Diagram labels: Roaming agreement; SLA requirements; Standardized processes; Standardized documents; Standardized profiles; Registry infrastructure; Federated trust.]

Note how there is no hardware being introduced between access points: this is a true peer-to-peer network with no intermediaries. The access point is a trusted network representative, and the connection between the trading partner and their access point is private, secure, and unknown to all other parties. Thus there are no questions of data sovereignty. There are no repositories vulnerable to attack. The peer-to-peer exchange is accomplished using robust and secure point-to-point connections with no intermediaries.

The lack of any central hardware makes the network resilient and effortlessly expandable, as shown in Figure 7.

Figure 7. Effortlessly expandable peer-to-peer network
[Diagram labels: Roaming agreement; SLA requirements; Standardized processes; Standardized documents; Standardized profiles; Registry infrastructure; Federated trust.]

This topology is very resilient. Trading partners can change their access point provider at any time, or use multiple access point providers, one for each of the different document types to be supported. Consider in this example that an access point has gone out of business, in which case the trading partner can employ the services of another access point, even one that is already servicing other trading partners, as shown in Figure 8.

Figure 8. Trading partners can change access point providers at any time

Once each access point can successfully interchange with one trading partner, it can successfully interchange with every member of the network.

Figure 9. Interchange with one means interchange with all
[Diagram labels: Supplier, Buyer 1, and Buyer 2, each with their own Business Practices, Application or ERP, Data Model, and Access Point; private formats Supplier's Data, B1 XML, and B2 XML; UBL XML between the access points.]

What makes this topology successful is the use of a single, declarative, adopted interchange syntax between access points. In the four-corner models of Peppol (worldwide) and the Business Payments Coalition (BPC, US), this interchange syntax for business documents is UBL. Notably, the BPC is also implementing a four-corner model network for remittances, using the same architecture but with the ISO 20022 syntax between access points.

The sending access point is responsible for converting the private sender format into UBL, and the receiving access point is responsible for converting UBL into the private receiver format. This is illustrated in an image from the BPC project:

Figure 10. Looking inside the four corners in the BPC project

In that diagram item (1) is the full UBL schema suite as published by the OASIS technical committee, (2) is a subset of the schema that isn’t required but may help with some integration tools, and (3) is an XSLT stylesheet of value validation constraints created typically (though not necessarily) from Schematron.

Figure 11 below shows Corner 3 in more detail. In that diagram, (1) is the full UBL schema suite as published by the OASIS technical committee, (2) is the BPC set of value constraints expressed in Schematron that is derived from a shared Google spreadsheet, and (3) is the XSLT stylesheet expression of the value constraints from Schematron.

The four-corner model network topology succeeds because every access point uses the same declarative syntax for document content, independent of the document formats used by sender and receiver in their legacy systems. In the Peppol and BPC projects for e-commerce documents, this syntax is the serialization of the OASIS UBL semantic library.

Figure 11. Detail of Corner 3 tasks upon receipt of the interchange document
[Diagram labels: Incoming UBL Business Document Instance from Corner 2; Corner 3; Syntax Constraints (UBL XSD); Data Integrity Constraints (BPC SCH, BPC XML, XSLT); Rejection UBL Message 'SV' Syntax Violation to Corner 2; Rejection UBL Message 'BV' Data Integrity Violation to Corner 2; Private ERP-C4 Data; Corner 4 ERP-C4 System.]
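The order of the Corner 3 checks labelled in Figure 11 can be sketched as follows. This is a minimal illustration, not the BPC implementation: the two checker callables are hypothetical stand-ins for XSD validation and for the Schematron-derived XSLT, and the returned dictionary is an invented structure. Only the 'SV' and 'BV' codes are taken from the figure.

```python
from typing import Callable, Optional

# Hypothetical checker signature: returns None when the document passes,
# or a human-readable reason when it fails.
Checker = Callable[[str], Optional[str]]


def corner3_receive(document: str, check_syntax: Checker, check_values: Checker) -> dict:
    """Run the syntax pass first, then the data integrity pass."""
    reason = check_syntax(document)   # stands in for validation against the UBL XSD
    if reason is not None:
        return {"accepted": False, "code": "SV", "reason": reason}
    reason = check_values(document)   # stands in for the Schematron-derived XSLT
    if reason is not None:
        return {"accepted": False, "code": "BV", "reason": reason}
    # Only a document that passed both checks is converted for Corner 4.
    return {"accepted": True, "code": None, "reason": None}


# Toy checkers, for demonstration only; real checks would use the published artefacts.
def toy_syntax(document: str) -> Optional[str]:
    return None if document.startswith("<Invoice") else "unexpected document element"


def toy_values(document: str) -> Optional[str]:
    return None if "currencyID=" in document else "amount lacks a currency"
```

The data integrity pass never runs on a document that failed the syntax pass, which is why a single rejection code suffices for each message returned to Corner 2.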

OASIS Universal Business Language (UBL)

The OASIS UBL technical committee follows the Open-edi approach separating static semantic information design from syntactic data constraint expressions. The OASIS UBL committee is over 20 years old now. OASIS UBL ISO/IEC 19845 XML is used around the world in many business document interchange networks and environments. In UBL 2.3 business concepts govern 91 separate document types as onion-skins around a common core library of over 4000 information items.

For these 91 document types UBL standardizes a published set of static business document semantics and a published set of XSD schemas. User communities are expected to adopt subsets of the semantics according to their particular business needs. Never was it the intention of the committee that any one community implement every UBL business object. Nor was it ever the intention of the committee to model dynamic business semantics, as the ways that UBL is being used are as varied as the communities that are using UBL.

Figure 12. OASIS UBL ISO/IEC 19845
[Diagram: e-Invoicing Standards Intersecting Touchpoints. UBL-ISO/IEC 19845 areas: Pre-award Procurement (Tendering, Contracting, Catalogues), Post-award Procurement (Quotation, Order, DespatchAdvice, ReceiptAdvice, Invoice, RemittanceAdvice), Replenishment (ProductActivity, StockAvailability), Transportation Logistics (Transport Service Description, Transport Execution Plan, Goods Item Itinerary, Transportation Status, Waybill, FreightInvoice). ISO 20022 areas: Cash Management and Payments/Finance (Initiation, Clearing/Settlement, Customer Credit, Credit Notification, Credit Initiation). Note: not all document types of either specification are listed, only representative examples. Image courtesy Crane Softwrights Ltd. and may be reproduced in whole as desired.]

The sheer magnitude of the document specifications precludes maintaining the schemas by hand, but that was not the reason the committee recognized the benefits of adopting the way Open-edi separates the semantics of data from the syntax of data.

UBL was not designed using XML or XSD but, rather, using the Core Components Technical Specification (CCTS) Version 2.01, a syntax-neutral modeling approach for hierarchical information found in business documents. The focus of committee members is the CCTS model, whereas the XSD is machine generated without human intervention to produce validation artefacts that govern constraint checking of syntactic documents. The machine generation is governed by the OASIS Business Document Naming and Design Rules (BDNDR).

Open-edi standards ISO/IEC 14662 and ISO/IEC 15944

The focus of UBL is on the static semantic data model of the data transfers (i.e. messages or documents), not on the dynamic semantics of the interpretation of the content (i.e. business in general or business processes). The UBL committee expressly limited their attention to how to structure the content, and not how to use the content, because there was no way the committee would conceive of all of the possible uses of UBL in the real world. Dynamic business relationships constantly change the way data is used and the expectations of the content of the data, and so the UBL committee elected solely to standardize the way the content is structured and serialized so that it could be exchanged readily and consistently. No longer would business document projects have to conceive of their own business object structures to convey commonly-understood eBusiness concepts.

This important distinction is seen in the way the international standardization community views “eBusiness”. In the early 1990s the joint ISO/IEC JTC 1/SC 32/WG 1 eBusiness standards committee working group created the ISO/IEC 14662 Open-edi Reference Model. This prescribes the separation of abstract business concepts from concrete functional implementations of those abstractions. This allows for identification, focus, and standardization in respective areas of effecting electronic business, while recognizing that the environment in which business operates works independently from a functional implementation of that environment, yet relies heavily on that functional implementation to be realized.

ISO/IEC 15944 Part 20 outlines how the Business Operational View (BOV) establishes the business environment in which trading partners are doing business, the specific business scenarios that are being addressed by an implementation, the various roles that are party to the information being exchanged in a given scenario, and the semantic bundles of information needed for the roles to perform their part in the trading partner scenario in the business environment. The specification also outlines how the Functional Services View (FSV) establishes the transport of content between trading partners supporting the choreography of the exchange of syntactic user data in fulfillment of the semantic bundles of information.

It is this reification of the information bundles as user data that bridges the business semantics (the meaning of the data) and the services implementing the semantics (the syntax of the data). The UBL specification document itself directly reflects the separation of the information bundles from the user data in the table of contents and the section content, as shown in Figure 13.

Figure 13. Open-edi standards
[Diagram: ISO/IEC 14662 Open-edi Reference Model and ISO/IEC 15944-20 Linking BOV to FSV. Business Operational View (BOV), semantics (meaning): perspective of business transactions limited to those aspects regarding the making of business decisions and commitments among Persons, which are needed for the description of a business transaction; covers Environment, Scenarios, Roles, Information Bundles. Functional Services View (FSV), syntax (format): perspective of business transactions limited to those information technology interoperability aspects of Information Technology Systems needed to support the execution of transactions among Open-edi Community parties; covers User Data, Choreographies, Transport. ISO/IEC 19845 Universal Business Language (UBL) Specification: Section 2. UBL Business Objects; Section 3. UBL Schemas; Section 4. Additional Document Constraints; Section 5. UBL Digital Signatures. Other labels: User Community Open-edi Configuration; UBL Customization; Implemented BOV; Implemented FSV; Open-edi Implementation.]

Also, this underscores the committee’s focus only on what information is described and how it is serialized, without any focus on how the information is used: any dynamic semantics reflecting how business is performed using the information bundles is out of scope of the UBL committee and project. The only semantics being defined are those of the information bundles being exchanged.

The UBL committee members collaborate on the abstract information bundles, which are defined and arranged using the UN/CEFACT Core Components Technical Specification (CCTS) version 2.01 modeling principles. These principles are syntax independent. The committee then uses the OASIS Business Document Naming and Design Rules (BDNDR) to create the artefacts that govern the documents that are interchanged between access points, as shown in Figure 14.

Figure 14. Role for naming and design rules
[Diagram: the Open-edi Reference Model layout of Figure 13 (BOV: Environment, Scenarios, Roles, Information Bundles; FSV: User Data, Choreographies, Transport), with Naming and Design Rules placed between the Information Bundles and the User Data. Other labels: Universal Business Language Specification Sections 2 to 5; User Community Open-edi Configuration; Vocabulary Implementation; Implemented BOV; Implemented FSV; Open-edi Implementation.]

This has contributed to the worldwide success of deploying UBL in different business environments. While the UBL committee members have created a repertoire of business objects based on general accounting and business principles, UBL user communities have cherry-picked their own set of information bundles from this. For example, the suite of UBL business objects in the information bundles used in the US Business Payments Coalition project differs slightly from the suite of objects in the bundles used in the European Peppol project.

CCTS: semantic modeling for business documents

In 1999 the United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) worked with the Organization for the Advancement of Structured Information Standards (OASIS) to create ebXML “Electronic Business Using Extensible Markup Language” to provide an “open, XML-based infrastructure that enables the global use of electronic business information in an interoperable, secure, and consistent manner by all trading partners”. The ebXML specifications were subsequently published as the parts of ISO 15000:

  • ISO 15000-1: ebXML Collaborative Partner Profile Agreement (ebCPP)

  • ISO 15000-2: ebXML Messaging Service Specification (ebMS)

  • ISO 15000-3: ebXML Registry Information Model (ebRIM)

  • ISO 15000-4: ebXML Registry Services Specification (ebRS)

  • ISO 15000-5: ebXML Core Components Specification (CCS)

The precursor to ISO 15000-5 is the UN/CEFACT Core Components Technical Specification (CCTS) version 2.01, which was in play at the time that UBL began its development.

CCTS defines a core set of component types with content specifications and associated supplementary components. Whereas XSD data types are suitable for any kind of data, CCTS core component types are specifically designed for constructing information bundles for business documents.

Recall the base data types of XSD:

Table 1. Base simple data types of XSD
XSD simple data type overview
  • string and string sub-types
  • boolean
  • base64Binary
  • hexBinary
  • float
  • decimal, integer, and integer sub-types
  • double
  • anyURI
  • QName
  • NOTATION
  • duration, date, and time types

Using XSD one can compose many and varied complex types on a custom basis. The XSD designer is free to choose any data type for the element content, to declare any attribute, and to give each attribute any XSD type.

In contrast, using CCTS one cannot compose any custom base data types, as one is obliged to use only the Core Component Types (elements in XML), their Secondary Representation Terms (derived elements in XML), and their pre-defined properties called Supplementary Components (attributes in XML):

Table 2. CCTS Core Component Types and Supplementary Components
CCT Name | CCT Base | CCT Secondary | Supplementary Component Names (all are strings)
Amount | decimal | - | Currency Identifier; Currency Code List Version Identifier
Binary Object | base64 binary | Graphic, Picture, Sound, Video | Format; MIME Code; Encoding Code; Character Set Code; URI; File Name
Code | normalized string | - | List Identifier; List Agency Identifier; List Agency Name; List Name; List Version Identifier; Name; Language Identifier; List URI; List Scheme URI
Date Time | string | Date, Time | Format
Identifier | normalized string | - | Scheme Identifier; Scheme Name; Scheme Agency Identifier; Scheme Agency Name; Scheme Version Identifier; Scheme Data URI; Scheme URI
Indicator | string | - | Format
Measure | decimal | - | Unit Code; Unit Code List Version Identifier
Numeric | decimal | Value, Rate, Percent | Format
Quantity | decimal | - | Unit Code; Unit Code List Identifier; Unit Code List Agency Identifier; Unit Code List Agency Name
Text | string | Name | Language Identifier; Language Locale Identifier

Outside of the XML element content and prescribed available XML attributes implementing the Core Component Types and Supplementary Components, users of CCTS are not permitted to add any other types of elements nor any other attributes of any kind to the XML.
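The closed nature of this scheme can be illustrated with a short Python sketch that refuses any attribute not listed for the Core Component Type. It is an illustration only: the supplementary components are taken from Table 2 for three of the types, while the XML attribute spellings, the element name `PayableAmount`, and the helper `make_leaf` are assumptions introduced here rather than definitions from CCTS.

```python
import xml.etree.ElementTree as ET

# Supplementary components for three Core Component Types, following Table 2.
# The attribute spellings used as XML names are illustrative assumptions.
ALLOWED_ATTRIBUTES = {
    "Amount": {"currencyID", "currencyCodeListVersionID"},
    "Measure": {"unitCode", "unitCodeListVersionID"},
    "Text": {"languageID", "languageLocaleID"},
}


def make_leaf(name: str, cct: str, value: str, **attributes: str) -> ET.Element:
    """Build a leaf element, rejecting attributes the CCT does not define."""
    extra = sorted(set(attributes) - ALLOWED_ATTRIBUTES[cct])
    if extra:
        raise ValueError(f"{cct} does not define: {', '.join(extra)}")
    element = ET.Element(name, attributes)
    element.text = value
    return element


def serialize(element: ET.Element) -> str:
    """Return the element as a string of XML."""
    return ET.tostring(element, encoding="unicode")
```

For example, `serialize(make_leaf("PayableAmount", "Amount", "100.00", currencyID="EUR"))` yields an element carrying only the permitted attribute, whereas passing `colour="red"` raises `ValueError`.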

Users of CCTS derive unqualified data types from the Core Component Types, broken down as primary and secondary representation terms. In OASIS, the following 20 unqualified data types defined in the Business Document Naming and Design Rules (BDNDR) are available for each of the abstract business objects:

Table 3. OASIS BDNDR Unqualified Data Type Restrictions
Unqualified Data Type | Core Component Type | Restriction
Amount | Amount | Required currency identifier
Binary Object | Binary Object | Required MIME Code
Code | Code | -
Date Time | Date Time | xsd:dateTime
Date | Date Time | xsd:date
Time | Date Time | xsd:time
Graphic | Binary Object | Required MIME Code
Identifier | Identifier | -
Indicator | Indicator | xsd:boolean
Measure | Measure | Required Unit Code
Name | Text | -
Numeric | Numeric | -
Percent | Numeric | -
Picture | Binary Object | Required MIME Code
Quantity | Quantity | -
Rate | Numeric | -
Sound | Binary Object | Required MIME Code
Text | Text | -
Value | Numeric | -
Video | Binary Object | Required MIME Code

One builds hierarchical business document structures from the CCTS Core Component Types by creating three kinds of Business Information Entities (BIE). As with all tree-like document hierarchies, there are leaves with content, branches with leaves, branches with branches, and a trunk with branches.

Starting with the leaves of the tree, Basic Business Information Entities (BBIE) contain the actual document data sequences of octets, lexically constrained and structured in elements with attributes according to the unqualified data types. No octets of business content in the data stream are allowed to be anywhere other than within BBIEs.

The branches of the tree are the Association Business Information Entities (ASBIE), each one’s shape defined by a particular Library Aggregate Business Information Entity (Library ABIE). The ABIE shape contains a combination of zero or more BBIEs followed by zero or more ASBIEs. Library ABIEs are manifest as elements only as ASBIEs and not standalone on their own.

The trunks of the tree are the Document Aggregate Business Information Entities (Document ABIE) and these are the only ABIEs that are manifest directly as elements. They, too, contain a combination of zero or more BBIEs followed by zero or more ASBIEs (see Figure 15).
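The leaf, branch, and trunk roles described above can be modelled with three small Python classes. This is a sketch under stated assumptions: the class and function names are invented here, and the sample names (`Invoice`, `DeliveryAddress`, `Address`, `CityName`) are used loosely for illustration rather than as an accurate excerpt of the UBL model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class BBIE:
    """A leaf: the only place where document content is held."""
    name: str
    value: str


@dataclass
class ABIE:
    """A shape (Library or Document ABIE): BBIEs first, then ASBIEs."""
    name: str
    bbies: List[BBIE] = field(default_factory=list)
    asbies: List["ASBIE"] = field(default_factory=list)


@dataclass
class ASBIE:
    """A branch: supplies the element name, while its ABIE supplies the shape."""
    name: str
    shape: ABIE


def leaf_values(element_name: str, shape: ABIE, path: str = "") -> List[Tuple[str, str]]:
    """Collect (path, value) pairs; values are found only in BBIEs."""
    here = f"{path}/{element_name}"
    found = [(f"{here}/{leaf.name}", leaf.value) for leaf in shape.bbies]
    for branch in shape.asbies:
        # The Library ABIE's own name never appears in the path, only the ASBIE's.
        found.extend(leaf_values(branch.name, branch.shape, here))
    return found


address = ABIE("Address", bbies=[BBIE("CityName", "Amsterdam")])
invoice = ABIE(
    "Invoice",
    bbies=[BBIE("ID", "INV-1")],
    asbies=[ASBIE("DeliveryAddress", address)],
)
```

Calling `leaf_values(invoice.name, invoice)` walks the trunk and its branches, and the resulting paths mention `DeliveryAddress` but never `Address`, mirroring the rule that a Library ABIE is manifest as an element only through an ASBIE.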

Figure 15. Business Information Elements and their defining CCTS components
[Diagram labels: Core Definitions (Basic Core Component, Association Core Component, Aggregate Core Component, Core Component Type); Business Definitions (Document ABIE, Aggregate Business Information Entity (Library ABIE), Association Business Information Entity (ASBIE), Basic Business Information Entity (BBIE), Qualified Data Type, Unqualified Data Type); relationships "defined as" and "used in".]

UBL 2.3 has 91 document types and over 4000 constituent components (see Figure 16).

Figure 16. CCTS components in full UBL and subsets
[Diagram: the full UBL Common Library has Library ABIE (254), ASBIE (1750), and BBIE (2597), with Qualified and Unqualified Data Types "used in" from the Core Component Types; UBL Document ABIE counts are UBL 2.1: 65, UBL 2.2: 81, UBL 2.3: 91. A UBL Common Library Subset has Library ABIE (<254), ASBIE (<1750), and BBIE (<2597) with a Subset Document ABIE, and a Supplemental Library may add Supplemental ABIE, Supplemental ASBIE, Supplemental BBIE, and an Additional Document ABIE.]

The committee members focus on the business semantics by modeling the CCTS components for UBL in a shared Google spreadsheet, in which ABIEs are shown in magenta, ASBIEs in green, and BBIEs in blue. The sample semantics for a postal address are shown in Figure 17.

Figure 17. Committee spreadsheet

The spreadsheet has no concepts of syntax, only core component data types for the BBIE basic components. The ABIE shapes are ordered by the spreadsheet with each member component’s constrained cardinality. As a convention, all BBIEs of an ABIE are listed before the ASBIEs of the ABIE.
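The ordering convention is easy to check mechanically. The sketch below assumes an invented row layout of (ABIE name, component name, kind, cardinality); the real committee spreadsheet has its own columns, and the sample rows and cardinality strings are illustrative only.

```python
from typing import List, Tuple

# Hypothetical row layout: (ABIE name, component name, kind, cardinality).
Row = Tuple[str, str, str, str]

ROWS: List[Row] = [
    ("Address", "Address", "ABIE", ""),
    ("Address", "StreetName", "BBIE", "0..1"),
    ("Address", "CityName", "BBIE", "0..1"),
    ("Address", "Country", "ASBIE", "0..1"),
]


def ordering_problems(rows: List[Row]) -> List[str]:
    """Report every BBIE that is listed after an ASBIE of the same ABIE."""
    problems = []
    abies_with_asbie = set()
    for abie, component, kind, _cardinality in rows:
        if kind == "ASBIE":
            abies_with_asbie.add(abie)
        elif kind == "BBIE" and abie in abies_with_asbie:
            problems.append(f"{abie}: BBIE {component} is listed after an ASBIE")
    return problems
```

An empty result means that every ABIE lists its BBIEs before its ASBIEs, which is the order carried through to the serialized documents.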

Expecting the unexpected

The makeup of the original UBL technical committee included a lot of XML experience. Before the XML issues were resolved and the committee became weighted almost entirely toward business experts rather than XML experts, two important distinctions developed between the UBL perspective of business documents and the UN/CEFACT perspective of business documents. Both issues relate to expecting the unexpected from our users.

The rigidity of the UN/CEFACT NDR is unpalatable to the UBL committee members. In particular, all of the code lists with sets of values in a value domain are expressed as schema enumerations, and there is no accommodation whatsoever for foreign content.

Early in the development of UBL, the committee recognized that code lists are content, not structure. The committee wanted to keep its hands off all content, because content is the purview of the users of UBL, not the UBL committee. Accordingly, the OASIS BDNDR does not use schema enumerations for code lists. It is expected that users will use a second-pass value validation that can check code lists and many other aspects of values. How that second pass is implemented is out of the scope of UBL, but the committee has created two specifications to manage code lists: OASIS genericode for enumerating coded values and their associated metadata, and OASIS Context-value Association for mapping value checks to arbitrary hierarchical contexts.

Figure 18. Two-pass validation
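A minimal sketch of the two passes follows, using only the Python standard library. It rests on several assumptions: well-formedness parsing stands in for XSD validation in the first pass, the code list is a plain Python set rather than a genericode file, the context table is an invented simplification rather than a Context-value Association file, and the element and attribute names are illustrative and written without namespaces.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Hypothetical code list kept outside the schema, so users can change it
# without touching the structural constraints.
CURRENCY_CODES = {"CAD", "EUR", "USD"}

# Hypothetical context table: (element name, attribute name, allowed values).
CONTEXTS = [("PayableAmount", "currencyID", CURRENCY_CODES)]


def first_pass(xml_text: str) -> Optional[ET.Element]:
    """Structural pass; parsing stands in for schema validation in this sketch."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None


def second_pass(root: ET.Element) -> List[str]:
    """Value pass: check coded values in their contexts against external lists."""
    problems = []
    for element_name, attribute, allowed in CONTEXTS:
        for element in root.iter(element_name):
            value = element.get(attribute)
            if value not in allowed:
                problems.append(f"{element_name}/@{attribute}: {value!r} not in code list")
    return problems


def validate(xml_text: str) -> List[str]:
    """Run the value pass only on documents that passed the structural pass."""
    root = first_pass(xml_text)
    if root is None:
        return ["structure: document failed the first pass"]
    return second_pass(root)
```

Because the permitted values live in `CURRENCY_CODES` and not in the structural check, a user community can add or retire a code without any change to the first pass.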