Declarative Amsterdam


Declarative Amsterdam 2021

Nov 4: Tutorial day, Nov 5: Symposium; Preliminary program

Location: CWI, Amsterdam

Thu 4 November: Tutorial Day

9:00 registration (on site)
XProc - A pipelining language
Erik Siegel (Xatapult Content Engineering)
XProc is an XML-based programming language for complex document processing. Documents flow through pipelines in which steps perform processing such as conversion, validation, splitting, merging and reporting. It’s an almost perfect fit for the kind of processing necessary in complex document engineering.
In 2016 a W3C community group started working on XProc 3.0 to replace the never very popular 1.0 version (the 2.0 proposal never made it). Main goals were to make the language much more usable, understandable and concise, update the underlying standards (most notably XPath) and allow processing of non-XML documents as well.
The XProc 3.0 core specification has been stable for over a year now. There is one functioning processor (MorganaXProc-IIIse by Achim Berndzen) and one in the making (XML Calabash 3.0 by Norman Tovey-Walsh). There is also a book, XProc 3.0: A Programmer Reference by Erik Siegel, that describes the language.
This tutorial covers the basics of XProc 3.0. Participants who want to join the hands-on exercises: please download MorganaXProc-IIIse and try to flight-test it.
Erik Siegel works as a content engineer, XML specialist and technical writer. His main customers are in publishing and standardization. He is a member of the XProc 3.0 editorial committee.
10:30 XProc - A pipelining language. Part 2
11:15 coffee break
Saxon-JS Tutorial
Norm Tovey-Walsh (Saxonica), with Debbie Lockett
Saxon-JS is an XSLT 3.0 processor written in JavaScript and XSLT. It offers all of the traditional declarative features of XSLT in any modern browser and, on the server side, in Node.js. This tutorial will explain how to set up and use Saxon-JS. We’ll cover the interactive extensions that make Saxon-JS a powerful platform for developing browser-based applications. We’ll also explore how to use it on Node.js for traditional server-side automation tasks. Participants will be guided through a series of hands-on sessions where they will experience first-hand how easy and fun it is to build applications with Saxon-JS.
13:30 Saxon-JS Tutorial. Part 2
14:30 tea break
RumbleDB: Data independence for large, messy datasets
Ghislain Fourny (ETH Zürich), with Ingo Müller, Can Berker Cikis, Stefan Irimescu, Gustavo Alonso
We introduce Rumble, a query execution engine for large, heterogeneous, and nested collections of JSON objects built on top of Apache Spark. While data sets of this type are more and more wide-spread, most existing tools are built around a tabular data model, creating an impedance mismatch for both the engine and the query interface. In contrast, Rumble uses JSONiq, a standardized language specifically designed for querying JSON documents. The key challenge in the design and implementation of Rumble is mapping the recursive structure of JSON documents and JSONiq queries onto Spark's execution primitives based on tabular data frames. Our solution is to translate a JSONiq expression into a tree of iterators that dynamically switch between local and distributed execution modes depending on the nesting level. By overcoming the impedance mismatch in the engine, Rumble frees the user from solving the same problem for every single query, thus increasing their productivity considerably. As we show in extensive experiments, Rumble is able to scale to large and complex data sets in the terabyte range with a similar or better performance than other engines. The results also illustrate that Codd's concept of data independence makes as much sense for heterogeneous, nested data sets as it does on highly structured tables.
Hands-on ixml
Steven Pemberton (CWI)

We choose which representations of our data to use, JSON, CSV, XML, or whatever, depending on habit, convenience, or the context we want to use that data in. On the other hand, having an interoperable generic toolchain such as that provided by XML to process data is of immense value. How do we resolve the conflicting requirements of convenience, habit, and context, and still enable a generic toolchain? Invisible XML (ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content. For example, it can turn CSS code like

    body {color: blue; font-weight: bold}

into XML like

    <css>
      <rule>
        <simple-selector name="body"/>
        <block>
          <property>
            <name>color</name>
            <value>blue</value>
          </property>
          <property>
            <name>font-weight</name>
            <value>bold</value>
          </property>
        </block>
      </rule>
    </css>

or

    <css>
      <rule>
        <selector>body</selector>
        <block>
          <property name="color" value="blue"/>
          <property name="font-weight" value="bold"/>
        </block>
      </rule>
    </css>

depending on choice. This tutorial provides a hands-on introduction to ixml: how to specify how documents are transformed into XML, and what choices you have.
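
As a rough illustration of the CSS-to-XML mapping described above, the following Python sketch hand-parses a single CSS rule and builds the attribute-based XML form with the standard library. It is not an ixml processor: the function name `css_rule_to_xml` and the regular expression are assumptions made for this example, whereas a real ixml implementation would be driven by a grammar.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper pattern: matches only one simple rule of the form
# "selector {name: value; name: value}".
RULE = re.compile(r"\s*([^{\s]+)\s*\{([^}]*)\}\s*$")


def css_rule_to_xml(css: str) -> str:
    """Serialize one simple CSS rule as XML (illustrative, not ixml)."""
    match = RULE.match(css)
    if match is None:
        raise ValueError("not a simple CSS rule: %r" % css)
    selector, body = match.groups()

    root = ET.Element("css")
    rule = ET.SubElement(root, "rule")
    ET.SubElement(rule, "selector").text = selector
    block = ET.SubElement(rule, "block")
    for declaration in body.split(";"):
        if not declaration.strip():
            continue  # tolerate a trailing semicolon
        name, value = declaration.split(":", 1)
        ET.SubElement(block, "property",
                      name=name.strip(), value=value.strip())
    return ET.tostring(root, encoding="unicode")


print(css_rule_to_xml("body {color: blue; font-weight: bold}"))
```

Switching to the element-based form would only require changing how each `property` is built, which is the kind of choice an ixml grammar expresses declaratively.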

Steven Pemberton is a researcher affiliated with CWI Amsterdam, the Dutch national research centre for mathematics and informatics. His research is in interaction, and how the underlying software architecture can support users. He co-designed the ABC programming language that formed the basis for Python. Involved with the Web from the beginning, he organised two workshops at the first Web Conference in 1994. For the best part of a decade he chaired the W3C HTML working group, and has co-authored many web standards, including HTML, XHTML, CSS, XForms and RDFa. He now chairs the XForms group at W3C.
16:45 Hands-on ixml. Part 2

Fri 5 November: Symposium

9:00 registration (on site)
Features of a modern XML Resolver
Norm Tovey-Walsh (Saxonica)

XML Resolvers are a core extension feature in XML parsers and other applications in the XML stack. They allow you to transparently satisfy requests for DTDs, schemas, stylesheet modules, etc., with local copies of those resources. This offers improvements in both performance and security. XML Resolver 3.0, available in Java and (soon!) C#, provides full support for the XML Catalogs standard and a broad range of features designed to make deploying and using catalog-based resolution faster and easier. This talk will highlight the new features of the resolver including:

  • Dynamic catalog construction with caching.
  • Automatically loading catalogs from extension modules (jar files or assemblies).
  • Improved support for resources distributed in extension modules. 
  • Handling http: and https: entries transparently. 
  • Validation of catalog files. 
  • Namespace-based resource discovery by indirection through RDDL documents.
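
To make the idea of catalog-based resolution concrete, here is a minimal Python sketch of a lookup against an OASIS XML Catalog. It does not use or imitate the XML Resolver API: the function `resolve`, the sample catalog and the local paths are assumptions for illustration, and only exact `system` and `uri` entries are handled.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace defined by the OASIS XML Catalogs standard.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Sample catalog; the identifiers and local paths are made up.
SAMPLE_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <system systemId="http://example.com/dtd/book.dtd" uri="local/book.dtd"/>
  <uri name="http://example.com/xsl/common.xsl" uri="local/common.xsl"/>
</catalog>
"""


def resolve(catalog_xml: str, identifier: str) -> Optional[str]:
    """Return the local URI mapped to identifier, or None if unmapped."""
    catalog = ET.fromstring(catalog_xml)
    for entry in catalog:
        is_system = entry.tag == "{%s}system" % CATALOG_NS
        is_uri = entry.tag == "{%s}uri" % CATALOG_NS
        if is_system and entry.get("systemId") == identifier:
            return entry.get("uri")
        if is_uri and entry.get("name") == identifier:
            return entry.get("uri")
    return None


print(resolve(SAMPLE_CATALOG, "http://example.com/dtd/book.dtd"))
print(resolve(SAMPLE_CATALOG, "http://example.com/missing.dtd"))
```

A request for a mapped identifier is answered from the local copy, and an unmapped one returns `None` so the caller can fall back to fetching the original resource.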
Roaster - declarative routing for eXist-db
Juri Leino (eXist-db, exist-solutions)
Declarative approach to routing requests in eXist-db
A brief introduction to the status quo, followed by a presentation of a new approach to declaratively designing APIs and routing requests, with examples for several use cases.
I will explain the basics of routing in general.
After that, we will have a look at the status quo of routing requests in eXist-db, in particular REST, RESTXQ and controller.xq, and their pros and cons.
- REST
 - has some quirks
 - does not encourage RESTful interface best practices
 - can be hard to secure
- RESTXQ
 - route handlers can be somewhere in a package
 - can lead to duplicate code for multiple output formats
 - parameter handling, authentication and error handling are left to the user
- controller.xq
 - can be hard to secure
 - parameter handling, authentication and error handling are left to the user
 - complex controllers can get hard to read
 - can only pass strings as parameters to handlers
Because of these limitations of all routing options in eXist-db, I made several attempts to come up with a better solution that maps a route to a function.
In 2019 I had a series of small breakthroughs and a working prototype roughly modelled after the Express router known from Node.js.
By mid-2020 Wolfgang Meier expressed the need for a better routing option to use with TEI Publisher. I showed him what I had and he ran with it.
He had the brilliant idea to implement the OpenAPI standard and thus created a router where you first create the documentation. You declare which routes exist and what they expect and return.
In this configuration you also set things like headers, mime-types and more.
This ongoing collaboration is now part of e-editiones, the same society that governs TEI Publisher.
At the beginning we will have a look at an example JSON file that declares a simple API of an eXist-db package:
1. Using a test page created from our declaration
2. Looking at the JSON file itself
Then we will create a new route that can output different formats (HTML, XML, JSON, CSV).
I will show how to set arbitrary headers per route, in a handler function, dynamically, for caching, and also using middleware for all routes.
To round things off, I will show how to secure routes with cookies and basic auth, how to handle authorisation of requests and how to use a custom authentication method.
What's next?
What are our medium- and long-term goals, and how can you contribute?
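
The "declare first, then map routes to functions" idea from this abstract can be sketched in a few lines of Python. This is not Roaster, which runs in eXist-db; the API declaration, the handler names and the dispatch on `operationId` are assumptions chosen to mirror the structure of an OpenAPI document.

```python
import json
import re

# A hypothetical API declaration in the style of an OpenAPI document.
API = json.loads("""
{
  "paths": {
    "/documents/{doc_id}": {
      "get": {"operationId": "get_document"}
    },
    "/documents": {
      "get": {"operationId": "list_documents"}
    }
  }
}
""")

# Made-up data standing in for stored documents.
DOCUMENTS = {"1": "First document", "2": "Second document"}


def get_document(doc_id):
    return DOCUMENTS[doc_id]


def list_documents():
    return sorted(DOCUMENTS)


# Handlers are looked up by the operationId named in the declaration.
HANDLERS = {"get_document": get_document, "list_documents": list_documents}


def route(method, path):
    """Find the declared route matching the request and call its handler."""
    for template, operations in API["paths"].items():
        # Turn "/documents/{doc_id}" into a regex with one named group
        # per path parameter.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
        match = re.fullmatch(pattern, path)
        operation = operations.get(method.lower())
        if match and operation:
            handler = HANDLERS[operation["operationId"]]
            return handler(**match.groupdict())
    raise LookupError("no route declared for %s %s" % (method, path))


print(route("GET", "/documents/2"))
print(route("GET", "/documents"))
```

Because the declaration is plain data, the same file can drive documentation and a test page as well as the dispatch itself.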
10:45 coffee break
SCHEMA LayoutFX: Automating Content Layout Without Losing Flexibility
Klaus Kurre
The high cost of publishing print material in a globalized market is partly due to translation, but mostly to manual layout adaptations of the localized documents.
It comes as no surprise that the biggest advantage of XML (and DITA) does not lie in the reuse of content or other virtues, but rather in the possibilities of automating layout. Most XML-based documentation set-ups use XSL-FO to transform media-neutral content into professional-looking PDF documents. However, setting up an XML / XSL-FO / PDF transformation chain can be a tedious task that often can only be performed by technically gifted people.
Because of this knowledge gap, automating layout shares the problem of most IT projects: the discrepancy between users’ expectations and the final technical solution. Usually it takes some rounds between the person using the layout and the person automating the layout to finalize the implementation.
Layout automation and flexibility do not have to be in contradiction!
In this session, attendees will learn how SCHEMA LayoutFX bridges the gap between the scalability of an automated publishing process and marketing requirements for individual documents. I will show how it enables you to design and manage your (multilingual) layouts as well as to automate the whole process. Because of its graphical user interface and no-code programming, it makes publishing XML/DITA data to PDF or Word files much more accessible than most other solutions. Its quick implementation and easy maintenance will reduce costs significantly, and the DTP-like autonomy will at the same time help to regain control of the whole process. You simply don’t need to book XSL-FO programmers and run the “inner loop” before being able to look at results.
After studying physics in Aachen and Montréal, Klaus Kurre worked for ten years as a technical translator (German, English, and French). In 2004 Mr. Kurre established himself as a consultant and trainer for translation tools (CAT), and since 2008 he has additionally worked as a lead auditor for ISO 17100/ISO 18587 certification. From 2016 he worked as a trainer for the world-leading non-DITA CCMS SCHEMA ST4, and he now has a position in business development with Quanos Content Solutions to promote LayoutFX, a DITA-compatible SaaS publishing solution.
Extracting Microcontent from DITA Topics
Chris Despopoulos
We transform DITA to HTML in the browser via a Single Page App (SPA). Our product GUI is also an SPA, so why not put our content directly in the GUI? This talk shows the advantages of dynamic transforms, and how we use that technique to extract subsets from the single source and display them in the product.
Functional, Declarative Audio Applications
Nick Thompson (Independent)
Audio software, and particularly digital signal processing, is an application domain where the imperative, object-oriented programming model dominates. In part, this can be justified by the realtime constraints that underlie the domain, and by the fact that C/C++ has historically dominated the high-performance native software landscape. But this is not without cost: the high barrier to entry prevents developers from trying to write audio software, and the industry spends far more time than needed to deliver new products.
In this talk, we'll look at some of the complications that come from writing low level native audio software in C/C++ with an imperative, object oriented model. Then we'll reframe the conversation to show why a functional, declarative approach may be fundamentally more fitting for the problems we want to tackle when writing new audio software. 
Finally, I'll introduce Elementary Audio: a new JavaScript runtime for writing realtime, native audio applications with a functional, declarative API. We'll see how Elementary applies the declarative model to audio software, and then finish with a detailed example of a small drum synthesis library written in Elementary.
Nick Thompson is an audio software developer, contractor, and consultant. He is the owner of a small audio plugin company, Creative Intent, and the author of Elementary Audio and React-JUCE. Nick's interest lies in tools that enable and promote creativity and simplicity, both in music making and in software development.
SchemaCom - An XML Schema Comparator
Ihe Onwuka
People working with large XML vocabularies often face the task of upgrading to a new version [1]. Ideally such upgrades are guided by a specification of how to map the components of an XML instance to the new version of the vocabulary. SchemaCom was created to assist in situations where such a specification is not available. It highlights the differences (and similarities) between the constituent content models in the respective vocabularies; this information can then guide the analysis necessary to specify the missing mappings, and the tool can be applied without loss of generality between schemas representing different vocabularies (as opposed to different versions of the same one). A distinguishing feature is the delivery of the user interface as an XForm.
15:00 tea break
Aparecium, an XQuery / XSLT parser library for invisible XML
C. M. Sperberg-McQueen (Black Mesa Technologies)
'Invisible XML' ('ixml') is a method for treating non-XML information as if it were XML; it was proposed by Steven Pemberton in 2013. The basic idea is straightforward: a context-free grammar is used to describe the structure of the information, annotations in the grammar specify how the raw parse tree of a sentence in the language described by the grammar is to be represented in XML, and an ixml parser uses the grammar to parse the non-XML document into an XML form. This allows all the tools of the XML toolbox to be applied to the data: XQuery and XSLT for general processing, XForms for creating user interfaces to the data, XML schema languages for validation, and so on.
Aparecium is an ixml parser written in XQuery and XSLT, as a library of functions callable from XQuery and XSLT. (The name is a reference to a spell in the Harry Potter novels, which makes invisible writing visible.) When used to parse external resources, Aparecium can be thought of as a replacement for the standard doc() function which can read non-XML data and deliver it as XML; it can also be used to parse strings which obey a context-free grammar, such as CSS style specifications, XSLT pattern expressions, SVG path expressions, and so on. The latter makes Aparecium useful for handling XML formats which use micro-grammars for some portions of documents.
For simplicity, Aparecium is implemented as a pipeline of processes. First the extended BNF notation allowed in ixml grammars is translated to an equivalent unextended BNF. This grammar is then used by an Earley parser to parse the input; the result is a large set of 'Earley items' describing various aspects of the parse. From the set of Earley items, Aparecium then constructs a 'parse-forest grammar' describing the set of parse trees in the input. As a final step, a parse tree is extracted from the parse-forest grammar and returned to the caller. Alternate interfaces may be used to specify that the parse-forest grammar should be returned, instead; this may be helpful in cases of ambiguity, since it allows the caller to study the ambiguity and in some cases to extract the preferred parse tree.
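
The Earley stage of the pipeline described above can be illustrated with a small recognizer. This is a Python sketch, not Aparecium's XQuery/XSLT code: it only decides whether the input is a sentence of the grammar and does not build the parse-forest grammar. The names `earley_recognize` and `GRAMMAR` are hypothetical, the grammar is assumed to be in unextended BNF with single-character terminals, and rules with an empty right-hand side are assumed not to occur.

```python
from typing import Dict, List, Tuple

# Grammar: nonterminal -> list of alternatives, each a tuple of symbols.
# Symbols that are not keys of the grammar are single-character terminals.
Grammar = Dict[str, List[Tuple[str, ...]]]

# An Earley item: (lhs, rhs, dot position, origin position).
Item = Tuple[str, Tuple[str, ...], int, int]


def earley_recognize(grammar: Grammar, start: str, text: str) -> bool:
    """Return True if text is a sentence of grammar (no empty rules)."""
    chart: List[List[Item]] = [[] for _ in range(len(text) + 1)]

    def add(position: int, item: Item) -> None:
        if item not in chart[position]:
            chart[position].append(item)

    for rhs in grammar[start]:
        add(0, (start, rhs, 0, 0))

    for position in range(len(text) + 1):
        index = 0
        # chart[position] grows while we iterate, so use an explicit index.
        while index < len(chart[position]):
            lhs, rhs, dot, origin = chart[position][index]
            index += 1
            if dot == len(rhs):
                # Completer: advance items that were waiting for lhs.
                for w_lhs, w_rhs, w_dot, w_origin in list(chart[origin]):
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        add(position, (w_lhs, w_rhs, w_dot + 1, w_origin))
            elif rhs[dot] in grammar:
                # Predictor: expand the nonterminal after the dot.
                for alternative in grammar[rhs[dot]]:
                    add(position, (rhs[dot], alternative, 0, position))
            elif position < len(text) and text[position] == rhs[dot]:
                # Scanner: consume one matching input character.
                add(position + 1, (lhs, rhs, dot + 1, origin))

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(text)])


# Example grammar (made up): sums of the digits 0, 1 and 2.
GRAMMAR: Grammar = {
    "sum": [("sum", "+", "digit"), ("digit",)],
    "digit": [("0",), ("1",), ("2",)],
}

print(earley_recognize(GRAMMAR, "sum", "1+2+0"))
print(earley_recognize(GRAMMAR, "sum", "1+"))
```

The items collected in `chart` correspond to the 'Earley items' mentioned in the abstract; a full parser would go on to derive the set of parse trees from them.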
In some cases the caller will have the grammar in the non-XML form described in the ixml specification; in others, the grammar will be available as an XML document; sometimes the caller will have a URI for the grammar. The input may similarly be available either as a string or as a URI. Aparecium provides distinct calls for each of these situations, to simplify the use of Aparecium in constructing applications.
The talk will briefly describe the current status of Aparecium implementation and (the gods willing) show a simple demo; it will conclude with a discussion of some next steps in the work on Aparecium and in the development of broader support for invisible XML.
Declarative is a Feminist Issue
Betsy Haibel (Director of Software Engineering, LTSE)
Front-end web development is rooted in two declarative languages (HTML and CSS) and one imperative language (JavaScript) that can be written in a functional style. Front-end web development is also noted for contentious and ever-shifting gender dynamics – one year HTML and CSS are “for girls” and “not real programming,” another year it's JavaScript that's looked down upon. In this talk, we'll look at the history of front-end development through the twin lenses of gender and declarativity. Along the way, we'll see how gendered programming trends boosted the adoption of popular frameworks – and led to the quiet death of others. We'll get real about the social forces that have affected the credibility and “approachability” of declarative methods in front-end, and talk about how these same forces might play out in other declarative projects.