Program
Program entries
XForms, a Tutorial
One of the few declarative programming languages available is XForms, this month celebrating its tenth anniversary in its current instantiation. It is a W3C standard, and despite its name is not only about forms. Large projects, at large companies such as the National Health Service, the BBC and Xerox, have shown that by using XForms, programming time and cost of applications can be reduced to a tenth!
This tutorial introduces XForms, and shows several amazing applications that can be written in only a few dozen lines.
Steven Pemberton is a researcher affiliated with the CWI. Amongst other technologies, he co-designed ABC, the programming language that Python was based on, and web technologies such as CSS, HTML, XHTML, and XForms. He was chair of the W3C HTML working group for a decade, and still chairs the XForms working group.
Declarative vs Procedural
An introduction to the concept of declarative programming.
Steven Pemberton is a researcher affiliated with the CWI. Amongst other technologies, he co-designed ABC, the programming language that Python was based on, and web technologies such as CSS, HTML, XHTML, and XForms. He was chair of the W3C HTML working group for a decade, and still chairs the XForms working group.
Implementing XForms using interactive XSLT 3.0
Saxon-Forms is a (currently) partial XForms implementation developed using Saxon-JS, an XSLT 3.0 run-time written purely in JavaScript. It is designed for browsers: the mechanics of XForms, such as actions, are implemented using the ‘interactive’ XSLT 3.0 extensions available with Saxon-JS, which update form data in the (X)HTML page and handle user input through event-handling templates.
O’Neil Delpratt joined Saxonica from a research project at the University of Leicester in 2010. He is a co-developer of the Saxon product, with specific responsibility for Saxon on .NET and Saxon/C for C/C++/PHP/Python languages. Before joining Saxonica, he completed his post-graduate studies at the University of Leicester. His thesis title was “In-memory Representations of XML documents”, which coincided with a C++ software development of a memory efficient DOM implementation, called Succinct DOM.
Debbie Lockett joined the Saxonica development team in 2014 following post-doctoral research in Mathematics at the University of Leeds. Debbie has worked on performance benchmarking, the implementation of XQuery 3.1 features, and on developing the tools for creating Saxonica’s product documentation. She is now the lead developer for Saxon-JS.
Are we still Open Source? Dilemmas for a new XML Database
Over the last 5 years many NoSQL vendors have relicensed their software, and the landscape is only becoming more tumultuous with several vendors recently playing a licensing game which is akin to Musical Chairs. Developing a new NoSQL database is no small feat, and we must eventually choose some sort of license for our users. We examine what is driving the licensing changes in the wider database community, how they apply to a new entrant to the marketplace, and ultimately ask the question, are we still Open Source?
Adam Retter has been a core contributor to the Open Source eXist-db Native XML Database for 14 years. He was also an invited expert to the W3C XQuery Working Group and helped standardise XQuery 1.0, 3.0, and 3.1. Adam founded the EXQuery project, and developed the RESTXQ framework for XQuery. Recently, Adam has been developing FusionDB, a new multi-model NoSQL database which also supports XML natively.
Declarative Health: cityEHR
A multi-billion pound project to provide a national distributed patient-record system for the British National Health Service failed. One person, John Chelsom, recoded it using declarative techniques, and it is now running in several hospitals.
Dr. John Chelsom has worked for over 30 years in the field of Health Informatics. He qualified with a degree in Engineering Science from the University of Oxford and a PhD from City University, London, where he studied the application of artificial intelligence in medicine.
In 2010, he started the Open Health Informatics research programme at City University, London, looking to address the causes of failure of the National Programme for IT. This research led to the development of the open source cityEHR, an ontology-based health records system built on open standards, interfaces and development practices. cityEHR is now deployed as an operational EHR in several hospitals in England and is used for teaching students in health informatics.
Views from the past
The Views project was an outgrowth of the ABC project — the programming language that gave birth to Python. Views was specifically an effort to design a programming environment for the ABC programming language. Its main shortcoming may have been that it was ahead of its time. The Views system has been characterized as a browser avant la lettre; I will argue that it was much more than that.
Lambert Meertens became fascinated with computers at the age of 15, when he realized that there was no limit to the possibilities of computers but for the limitations of human imagination. This fascination has lasted to today. Enabling the general public to use computing technology productively and creatively has been a major motivation of his research.
Declarative Applications with XForms
In the 50s, when the first programming languages were designed, computers cost millions, and relatively, programmers were almost free. Those programming languages therefore reflected that relationship: it didn't matter if it took a long time to program, as long as the resulting program ran as fast as possible.
Now, that relationship has been reversed: compared to the cost of programmers, computers are almost free. And yet we are still programming them in direct descendants of the programming languages from the 50s: we are still telling the computers step by step how to solve the problem.
Declarative programming is a new approach to applications: rather than describing exactly how to reach the solution, it describes what the solution should look like, and leaves more of the administrative parts of the program to the computer.
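The difference can be sketched in a few lines of Python. This is only an illustration of the idea, not XForms: the `Model` class and the field names `price`, `quantity` and `total` are hypothetical. The relationship between the values is declared once, and the bookkeeping of keeping `total` up to date is left to the model rather than written out step by step.

```python
from typing import Callable, Dict


class Model:
    """Tiny model: a value is either stored directly or declared as a formula."""

    def __init__(self) -> None:
        self._values: Dict[str, float] = {}
        self._formulas: Dict[str, Callable[["Model"], float]] = {}

    def set(self, name: str, value: float) -> None:
        # Store a plain input value.
        self._values[name] = value

    def declare(self, name: str, formula: Callable[["Model"], float]) -> None:
        # State WHAT the value is; nothing is computed at this point.
        self._formulas[name] = formula

    def get(self, name: str) -> float:
        # Declared values are derived on demand from the current inputs.
        if name in self._formulas:
            return self._formulas[name](self)
        return self._values[name]


order = Model()
order.set("price", 4.0)
order.set("quantity", 3)
order.declare("total", lambda m: m.get("price") * m.get("quantity"))

print(order.get("total"))  # 12.0
order.set("quantity", 5)   # no explicit "recalculate" step is needed
print(order.get("total"))  # 20.0
```

A procedural version would have to remember to recompute the total in every place where the price or the quantity can change.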
One of the few declarative languages available is XForms, an XML-based language that despite its name is not only about forms. Large projects, at large companies such as the National Health Service, the BBC and Xerox, have shown that by using XForms, programming time and cost of applications can be reduced to a tenth!
This hands-on tutorial allows you to learn about the structure and workings of XForms, and gives you the opportunity to work on useful working programs.
It is a “bring your own device” tutorial. You will be required to install some files beforehand (details to follow), and check they are working. You will be able to work using the text editor of your choice during the tutorial.
Steven Pemberton is a researcher affiliated with CWI Amsterdam, the Dutch national research centre for mathematics and informatics. His research is in interaction, and how the underlying software architecture can support users. He co-designed the ABC programming language that formed the basis for Python. Involved with the Web from the beginning, he organised two workshops at the first Web Conference in 1994. For the best part of a decade he chaired the W3C HTML working group, and has co-authored many web standards, including HTML, XHTML, CSS, XForms and RDFa. He now chairs the XForms group at W3C.
An introduction to Greenfox, a schema language describing file system contents
Schema validation is like a prototype of the declarative approach: to describe, rather than to code. The tutorial introduces Greenfox, a schema language for file system contents. Resources are described by shapes, which are sets of constraints. The goals of the tutorial are twofold. First, it should help you get started with Greenfox. Second, it should raise awareness that the same set of constraints can be described in a more or less declarative way. This is made possible above all by a new feature of the upcoming version of Greenfox: resource relationships can be described independently of the constraints that use them. Participants may come away with an increased awareness that it is not sufficient to ask “if” something is declarative; we should also ask “how” declarative it is, and think about degrees of declarativeness.
Hans is a developer with a keen interest in XML technology. He likes to claim that "XML has nothing to do with XML!", but what he really wants to say is probably that XML technology has nothing to do with XML syntax. Or perhaps even that XML technology is a way of thinking about information which does emphasize tree structure but is independent of any particular mediatype. When after several years of advocating it, he found his idea still as uncommon as a green fox, he decided to give it a name, which is Greenfox.
Hands-on with Saxon-JS
XSLT in the browser has long been a promise, but nowadays the creators of web browsers have lost interest. They support only an early version of the standard, and even that support is often missing or about to be withdrawn.
This is a pity. Many XSLT transformations can be done on the server, but transformations on the client side can still be useful, for instance when XML documents are retrieved from an external source and have to be formatted.
Saxon-JS solves this problem. Additionally, and very interestingly, it can take the place of JavaScript. Saxon-JS is able to respond to user events such as clicks, keystrokes, focus events, touch events and much more. It can handle such events by applying XSLT to the HTML document that lives inside the browser, for instance by changing attributes or by adding or removing content.
My tutorial will give you hands-on experience of compiling stylesheets for use in the browser and putting them to work on some common use cases. Bring your own laptop if you want to participate fully.
Pieter Masereeuw is a self-employed XML consultant. Born in the pre-computer, pre-SGML world, he managed to find his way from punch cards to XML. He now is a great lover of the XML tool stack, especially XSLT and XQuery. He used Saxon-JS on a Raspberry Pi in order to bring his daughter’s automated cat feeder back to life after being broken.
Success factors and pitfalls of declarative approaches
Declarative programming may not seem to be widespread, but it has been used for several decades in some areas of information technology. Examples are SQL for querying databases, and regular expressions and grammars for text analysis. More recently, domain-specific languages have been used to take advantage of declarative methods, with varying degrees of success.
What can we learn from the successful applications of declarative programming? And perhaps more importantly, is there something that failed applications have in common?
In this presentation we will look at declarative techniques that are so ubiquitous that nobody notices them anymore. We will also look at two pitfalls that the author has encountered many times: genericity and reification.
Nico Verwer works as a freelance software developer, designer, architect and trouble-shooter. His clients are mainly companies in the fields of publishing, media and government services, but also fit20, the world market leader in High Intensity Resistance fitness training. Nico has no preferred programming language, because he values understanding the application domain over knowledge of a particular technology. However, he does prefer techniques and methods that minimize accidental complexity. During his career he has deleted more lines of code than he has written.
JayParser: an Invisible XML implementation in XSLT
At XML Prague this year I presented a proof-of-concept grammar parser in XSLT. In this presentation I plan to release an open-source, fully compliant Invisible XML parser based on this work.
The presentation will demonstrate iXML parsing, give some technical details of the implementation, and finish with a short overview of the advantages of this declarative approach to parsing text.
I also hope to show how it is possible for the parser to extend itself to other grammar languages.
Tomos Hillman has over a decade of experience with XML, XSLT, XQuery and related technologies, particularly in the field of digital publishing, quality analysis, and transformation. He has given training courses to various institutions including publishers, universities and the UN, as well as being a regular faculty member at the prestigious XML Summer School in Oxford.
Compiling XQuery to Native Machine Code
It has become a trend amongst modern high-performance databases to optimise queries by avoiding interpretation of the query language. They instead opt to compile queries to native machine code which can subsequently be executed directly by one or more CPUs and/or GPUs. Both Saxon and XSLTC have previously demonstrated, albeit by different means, compilation of XSLT and XQuery to Java bytecode which can then be executed by the Java Virtual Machine.
To the best of our knowledge, we are the first group to demonstrate the compilation of XQuery directly to native machine code. As part of a research effort to develop a new performant XQuery processor for our poly-store database (FusionDB), we are constructing an XQuery parser which emits LLVM IR (Intermediate Representation), and a JIT (just-in-time) compiler to produce native machine code which is then executed.
In this paper we review the current approaches to native query compilation, detail our challenges and progress in building a modern XQuery parser, and our use of LLVM for compiling XQuery to native machine code. We also consider how we might exploit LLVM's low-level IR optimizations to improve query performance.
Adam Retter has been a core contributor to the Open Source eXist-db Native XML Database for 15 years. He was also an invited expert to the W3C XQuery Working Group and helped standardise XQuery 1.0, 3.0, and 3.1. Adam founded the EXQuery project, and developed the RESTXQ framework for XQuery. Recently, Adam has been developing FusionDB, a new multi-model NoSQL database which also supports XML natively.
Petal - An in-browser editor for LwDITA
With the meteoric rise of HTML5 and the slow demise of XHTML, simple in-browser WYSIWYG editors for producing XML documents have all but disappeared. Today, there are numerous Open Source JavaScript components for simple in-browser editors (e.g. CKEditor, Editor.js, Quill, TinyMCE, etc.) that produce documents in a subset of HTML5. Likewise, there are free and commercial SaaS offerings that edit and publish reStructuredText or Markdown documents from within the browser (e.g. readthedocs.org, gitbook.com, and mkdocs.org).
The lack of simple in-browser XML editors is likely driven by several factors:
1. The complexity of offering a full XML editor. One of the advantages of XML is its flexibility: an author can arbitrarily decide on their document structure and on element and attribute names. However, this adds complexity to any such editor. Conversely, schema languages for XML that limit that flexibility by enforcing a certain document grammar require extra parsing and validation steps by the editor.
2. A decrease in XML processing support within the browser itself.
3. A perceived reduction in what constitutes an acceptable level of re-use and presentation for technical documentation.
When compared to HTML5, Markdown, and reStructuredText, we believe that there are still key advantages that can be exploited by using XML for technical documentation. For the purposes of authoring and publishing the documentation for FusionDB Server, we examined several markup formats but ultimately settled on LwDITA. We intend to show that LwDITA occupies an optimum position: it allows us to reap the benefits of XML, whilst remaining simple enough to allow us to develop a simple and compliant in-browser editor.
Whilst we acknowledge that there are a handful of existing commercial and/or enterprise offerings which include in-browser XML editors (e.g. Oxygen XML Web Author, easyDITA, Xopus, etc.), they are in themselves very comprehensive and complex products with accompanying costs. Rather, we intend to both discuss the construction of, and demonstrate, our in-browser LwDITA editor as an intentionally simple solution for editing technical documentation.
Charaf Eddine Cheraa is an Engineer at Evolved Binary. He started developing as a hobby using QBasic in 2004, and since then has used many development languages for different purposes. Recently, he has worked mostly on web apps.
On the design of the URL
Notations can affect the way we think, and how we operate; consider as a simple example the difference between Roman Numerals and Arabic Numerals, where Arabic Numerals allow us not only to more easily represent numbers, but they also ease manipulations of numbers and calculations with them.
One of the innovations of the World Wide Web was the URL. In the last 30 years, URLs have become an ever-present element of everyday life, so present that we scarcely even grant them a second thought. And yet they are a designed artefact: there is nothing natural about their structure -- each part is there as part of a design. This talk will look at the design issues behind the URL, what a URL is meant to represent, and how it relates to the resources it identifies, and its relationship with representational state transfer (REST) and the protocols that REST is predicated on. The talk will consider what mistakes, if any, were made, and with hindsight how if at all the design could have been improved. While it is too late now to change the design of URLs, we will consider what the lessons are that we can draw from their design to direct the future designs of notations.
Steven Pemberton is a researcher affiliated with CWI Amsterdam, the Dutch national research centre for mathematics and informatics. His research is in interaction, and how the underlying software architecture can support users. He co-designed the ABC programming language that formed the basis for Python. Involved with the Web from the beginning, he organised two workshops at the first Web Conference in 1994. For the best part of a decade he chaired the W3C HTML working group, and has co-authored many web standards, including HTML, XHTML, CSS, XForms and RDFa. He now chairs the XForms group at W3C.
Self-Generating Quality Control: A Case Study
This paper demonstrates how quality control infrastructure can be generated from a single requirements document. Taken from a recent project that is now being used in production at a large journal publisher, it discusses some of the challenges faced and techniques used when generating Schematron, XSpec tests, XML grammar checks, and documentation.
The project set out to implement quality control requirements for journal articles using Schematron. In pursuing this objective, the project also created quality control infrastructure for Schematron itself that streamlines the process for incorporating iterative changes to requirements.
The techniques used in this project and described in this paper may be generally applicable in other projects.
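The general idea of deriving several artefacts from one list of requirements can be sketched as follows. This is a minimal illustration and not the project's actual tooling: the requirement entries, their identifiers and the element names in the XPath expressions are hypothetical. Only the Schematron namespace and the `schema`, `pattern`, `rule` and `assert` elements come from the Schematron standard.

```python
import xml.etree.ElementTree as ET

SCH_NS = "http://purl.oclc.org/dsdl/schematron"

# Hypothetical requirements: (id, rule context, assertion test, message).
REQUIREMENTS = [
    ("REQ-1", "article-meta", "article-id", "Every article must have an article-id."),
    ("REQ-2", "fig", "caption", "Every figure must have a caption."),
]


def build_schematron(requirements):
    """Generate one Schematron pattern per requirement."""
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    for req_id, context, test, message in requirements:
        pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern", {"id": req_id})
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": context})
        assertion = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": test})
        assertion.text = message
    return schema


def build_docs(requirements):
    """Generate plain-text documentation from the same source."""
    return "\n".join(f"{req_id}: {message}" for req_id, _, _, message in requirements)


schema = build_schematron(REQUIREMENTS)
print(ET.tostring(schema, encoding="unicode"))
print(build_docs(REQUIREMENTS))
```

Because both outputs are produced from `REQUIREMENTS`, a change to a requirement only has to be made in one place.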
Vincent Lizzi is Head of Information Standards at Taylor & Francis, is a member of the NISO JATS Standing Committee (ANSI/NISO Z39.96), and has contributed to the development of XSpec.
Tomos Hillman has over a decade of experience with XML, XSLT, XQuery and related technologies, particularly in the field of digital publishing, quality analysis, and transformation. He has given training courses to various institutions including publishers, universities and the UN, as well as being a regular faculty member at the prestigious XML Summer School in Oxford.
Plain text processing in structured documents
Applications that analyze and process natural language can be used for tasks such as named entity recognition, anonymization, topic extraction and sentiment analysis.
In most cases, these applications use the plain text of a document, and may add or change markup.
This causes problems when the original document already contains markup that must be preserved.
The text to be analyzed may run across markup boundaries, and newly generated markup may lead to unbalanced (non well-formed) structures.
This presentation shows how the Separated Markup API for XML (SMAX) can be used to apply natural language processing to XML documents.
It preserves the existing document structure and allows for balanced insertion of new markup.
A demonstration will be given of the use of SMAX for extracting and marking references in legal documents.
This Link eXtractor was built for the Dutch center for governmental publications.
SMAX and Simple Pipelines of Event API Transformers (SPEAT) will be available as open source software at the time of Declarative Amsterdam.
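The idea of keeping text and markup apart can be sketched in Python. This is not the SMAX API. It is a simplified model in which the `Span` class, the function names and the sample sentence are all hypothetical, and the existing markup is assumed to be well nested. A new element that would cross an existing element boundary is split at that boundary, so the result stays balanced.

```python
from dataclasses import dataclass
from typing import List
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Span:
    start: int  # offset of the first character covered
    end: int    # offset just past the last character covered
    name: str   # element name


def insert_balanced(spans: List[Span], start: int, end: int, name: str) -> List[Span]:
    """Add a new element, cutting it wherever it would cross an existing boundary."""
    cuts = set()
    for s in spans:
        if s.start < start < s.end < end:    # existing span ends inside the new one
            cuts.add(s.end)
        elif start < s.start < end < s.end:  # existing span starts inside the new one
            cuts.add(s.start)
    points = [start, *sorted(cuts), end]
    pieces = [Span(a, b, name) for a, b in zip(points, points[1:])]
    return spans + pieces


def serialize(text: str, spans: List[Span]) -> str:
    """Merge plain text and well-nested spans back into an XML string."""
    events = []
    for order, s in enumerate(spans):
        # At one offset: end tags first (innermost first), then start tags (outermost first).
        events.append((s.start, 1, -s.end, order, f"<{s.name}>"))
        events.append((s.end, 0, -s.start, -order, f"</{s.name}>"))
    out, pos = [], 0
    for at, _kind, _key, _order, tag in sorted(events):
        out.append(escape(text[pos:at]))
        out.append(tag)
        pos = at
    out.append(escape(text[pos:]))
    return "".join(out)


text = "See article 5 of the Act."
existing = [Span(0, 25, "p"), Span(12, 16, "b")]    # <p>See article <b>5 of</b> the Act.</p>
marked = insert_balanced(existing, 4, 13, "ref")    # mark "article 5", which crosses <b>
print(serialize(text, marked))
# <p>See <ref>article </ref><b><ref>5</ref> of</b> the Act.</p>
```

The analysis step only ever sees the plain string `text`, so a match such as "article 5" can be found even though it runs across the start of the `b` element.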
Nico Verwer works as a freelance software developer, designer, architect and trouble-shooter. His clients are mainly companies in the fields of publishing, media and government services, but also fit20, the world market leader in High Intensity Resistance fitness training. Nico has no preferred programming language, because he values understanding the application domain over knowledge of a particular technology. However, he does prefer techniques and methods that minimize accidental complexity. During his career he has deleted more lines of code than he has written.
Development of language solutions based on TEI and ODD
The Text Encoding Initiative (TEI) is a vast, long-standing and widely used encoding standard, covering different areas in the humanities. High-quality documentation with many examples, and active discussions on the rationale behind the available elements and attributes and their intended use, are among the many qualities of the TEI. The TEI presents itself as guidelines, trying to cover as many areas and use cases in the humanities as possible. The TEI is also designed to be customized for use in specific situations. Customization is achieved via an ODD ("One Document Does it all"). An ODD offers a mechanism to override, restrict, eliminate and extend (parts of) the guidelines in a documented way. ODD can be seen as a powerful abstraction layer from which validation, documentation, but also processing models can be generated. A nice but complex feature of ODDs is that they can be chained, enabling you to have focused ODDs and to promote reuse. In my work on corpora and dictionaries at the Fryske Akademy, ODD is the basis from which I generate XSD, configuration, SQL, Java, bind.xml etc. In my presentation I will show you how we benefit from ODD in, for example, editing and publishing solutions, our goal being to enhance tool development and interoperability through standardization.
Outline of the presentation:
1. ODD: explanation and background, chaining, generation
2. Usages: corpora, dictionaries, lexicons, interoperability
3. Editing: oXygen (validation and customizing author mode)
4. The future: processing model and TEI Publisher
Eduard Drenth lives in the north of the Netherlands together with his wife and youngest son. Eduard Drenth has been in ICT for over 20 years. His main expertise is in Java, EE, databases and XML-technologies. Over the last four years he has mainly been working for language researchers and lexicographers on corpus linguistics, lexicons, dictionaries and digital editions at the Fryske Akademy. TEI and universal dependencies are the most important standards in the data layer. The Akademy can be seen as maintainer of Frisian language data and provider of (web)services. Both researchers and the general public are served using the same data and services. Since the Akademy is small it is important to limit the number of technologies, to have well documented, reusable libraries based on stable build processes. That's where ODD comes in...
Declarative Programming of TV Application Using NCL
NCL is the declarative programming language used to develop TV applications in IPTV systems and terrestrial TV, standardized by ITU[1] and the Brazilian TV Forum, respectively. Its main characteristics are: support for defining temporal synchronization among media assets and viewer interactions; layout reuse facilities ( and ); support for multi-device presentation; scripts in the lightweight, embeddable language Lua; and an API for building and modifying applications on the fly, called NCL editing command. This talk briefly introduces NCL, highlights its recent advances and discusses the future of the language.
Alan Guedes holds a Ph.D. from PUC-Rio, where he acts as research engineer at TeleMídia Lab. In his career, he worked in different TV/video research projects. Today, he works in the NCL player open source implementation and contributes to the NCL standards in both Brazilian TV Forum (Technical module) and ITU SG16 (Question 13 for IPTV). His research interests include Interactive Multimedia and Machine Learning applied to Multimedia.
Parsing text With XSLT 3
Although there is at least one parser generator that targets XSLT, this talk is about hand-written parsers. Some difficulties will be described, along with mitigations. Some techniques made possible by XSLT 3 will be described and some examples given. Attention to debugging and testing is also given.
The techniques have much wider application than formal parsing, and it turns out to be both useful and fun.
Liam Quin was for many years in charge of XML work at W3C; they left in 2018 and now run Delightful Computing, which is active in XML, Web and accessibility training and consulting.
XProc - A pipelining language
XProc is an XML-based programming language for complex document processing. Documents flow through pipelines in which steps perform processing like conversion, validation, split, merge, report, etc. It’s an almost perfect fit for the kind of processing necessary in complex document engineering. In 2016 a W3C community group started working on XProc 3.0 to replace the never very popular 1.0 version (the 2.0 proposal never made it). The main goals were to make the language much more usable, understandable and concise, to update the underlying standards (most notably XPath) and to allow processing of non-XML documents as well. The XProc 3.0 core specification has been stable for over a year now. There is one functioning processor (MorganaXProc-IIIse by Achim Berndzen) and one in the making (XML Calabash 3.0 by Norman Tovey-Walsh). There is a book (XProc 3.0: A Programmer Reference by Erik Siegel) that describes the language. This tutorial covers the basics of XProc 3.0. Participants who want to do the hands-on exercises: please download MorganaXProc-IIIse and give it a test run. For more information and, importantly, instructions for preparing for the tutorial, visit the tutorial's GitHub pages.
Erik Siegel (http://www.xatapult.nl/) works as a content engineer, XML specialist and technical writer. His main customers are in publishing and standardization. He is a member of the XProc 3.0 editorial committee.
SaxonJS Tutorial
SaxonJS is an XSLT 3.0 processor written in JavaScript and XSLT. It offers all of the traditional declarative features of XSLT in any modern browser and, on the server side, in Node.js. This tutorial will explain how to set up and use SaxonJS. We’ll cover the interactive extensions that make SaxonJS a powerful platform for developing browser-based applications. We’ll also explore how to use it on Node.js for traditional server-side automation tasks. Participants will be guided through a series of hands-on sessions where they will experience first-hand how easy and fun it is to build applications with SaxonJS.
Norm Tovey-Walsh is a Senior Software Developer at Saxonica. He has also been an active participant in international standards efforts at both the W3C and OASIS. At the W3C, Norm was chair of the XML Processing Model Working Group, co-chair of the XML Core Working Group, and an editor in the XQuery and XSLT Working Groups. He served for several years as an elected member of the Technical Architecture Group. At OASIS, he was chair of the DocBook Technical Committee for many years and is the author of DocBook: The Definitive Guide. Norm has spent more than twenty years developing commercial and open source software.
Debbie Lockett joined Saxonica back in early 2014 in the days of Saxon 9.6; when XPath 3.0 and XQuery 3.0 were brand new, and XSLT 3.0 was approaching "last call working draft" status. She had no idea what any of these things meant, and has learned everything she knows about software development and XML technologies while at Saxonica. Debbie previously worked as a post-doctoral researcher in Mathematics at the University of Leeds, writing papers on symmetries of infinite relational structures, and once taught an undergraduate course to a class of 200 students. Debbie has worked on SaxonJS since its inception in 2016, and is now a lead developer.
Hands-on ixml
We choose which representations of our data to use, JSON, CSV, XML, or whatever, depending on habit, convenience, or the context we want to use that data in. On the other hand, having an interoperable generic toolchain such as that provided by XML to process data is of immense value. How do we resolve the conflicting requirements of convenience, habit, and context, and still enable a generic toolchain? Invisible XML (ixml) is a method for treating non-XML documents as if they were XML, enabling authors to write documents and data in a format they prefer while providing XML for processes that are more effective with XML content. For example, it can turn CSS code like body {color: blue; font-weight: bold} into XML in which the selector, property names and values appear as elements or as attributes, depending on choice. More details at invisiblexml.org. This tutorial provides a hands-on introduction to ixml: how to specify how documents are transformed into XML, and what choices you have.
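To make the CSS example concrete, here is a hand-written Python sketch of the kind of transformation involved. It is not ixml and does not use an ixml grammar or processor. The function `css_to_xml`, the regular expression and the element names (`css`, `rule`, `selector`, `property`, `name`, `value`) are assumptions chosen for illustration, and only very simple rules are handled. The `as_attributes` flag mimics the choice between serialising a property as child elements or as attributes.

```python
import re
import xml.etree.ElementTree as ET

# Matches a simple rule of the form "selector { declarations }".
RULE = re.compile(r"\s*([^{\s]+)\s*\{([^}]*)\}")


def css_to_xml(css: str, as_attributes: bool = False) -> str:
    """Turn very simple CSS rules into an XML string."""
    root = ET.Element("css")
    for selector, block in RULE.findall(css):
        rule = ET.SubElement(root, "rule")
        ET.SubElement(rule, "selector").text = selector
        # Each non-empty "name: value" declaration becomes one property.
        for declaration in filter(None, (d.strip() for d in block.split(";"))):
            name, value = (part.strip() for part in declaration.split(":", 1))
            if as_attributes:
                ET.SubElement(rule, "property", {"name": name, "value": value})
            else:
                prop = ET.SubElement(rule, "property")
                ET.SubElement(prop, "name").text = name
                ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


source = "body {color: blue; font-weight: bold}"
print(css_to_xml(source))
print(css_to_xml(source, as_attributes=True))
```

With ixml the same mapping is described by a grammar for the input format rather than by procedural code like this.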
Steven Pemberton is a researcher affiliated with CWI Amsterdam, the Dutch national research centre for mathematics and informatics. His research is in interaction, and how the underlying software architecture can support users. He co-designed the ABC programming language that formed the basis for Python. Involved with the Web from the beginning, he organised two workshops at the first Web Conference in 1994. For the best part of a decade he chaired the W3C HTML working group, and has co-authored many web standards, including HTML, XHTML, CSS, XForms and RDFa. He now chairs the XForms group at W3C.
RumbleDB: Data independence for large, messy datasets
We introduce Rumble, a query execution engine for large, heterogeneous, and nested collections of JSON objects built on top of Apache Spark. While data sets of this type are more and more wide-spread, most existing tools are built around a tabular data model, creating an impedance mismatch for both the engine and the query interface. In contrast, Rumble uses JSONiq, a standardized language specifically designed for querying JSON documents. The key challenge in the design and implementation of Rumble is mapping the recursive structure of JSON documents and JSONiq queries onto Spark's execution primitives based on tabular data frames. Our solution is to translate a JSONiq expression into a tree of iterators that dynamically switch between local and distributed execution modes depending on the nesting level. By overcoming the impedance mismatch in the engine, Rumble frees the user from solving the same problem for every single query, thus increasing their productivity considerably. As we show in extensive experiments, Rumble is able to scale to large and complex data sets in the terabyte range with a similar or better performance than other engines. The results also illustrate that Codd's concept of data independence makes as much sense for heterogeneous, nested data sets as it does on highly structured tables.
Abbreviations used in this presentation:
CLI: Command-Line Interface
CSV: Comma-Separated Values
DAG: Directed Acyclic Graph (a graph with no directed cycles, also known as dependency graph)
ETL: Extract-Transform-Load, to import data in a database
FLWOR: for-let-where-orderby-return (pronounced as 'flower')
HDFS: Hadoop Distributed File System, it is an open-source framework for storing very large datasets on a cluster.
HTTP: Hypertext Transfer Protocol, the basis of the Web
JSON: JavaScript Object Notation
RDD: Resilient Distributed Dataset, Spark's data primitive
ROOT: CERN's native format for high-energy physics data.
S3: Simple Storage Service, Amazon's cloud storage service
SQL: Structured English Query Language
UDF: User-Defined Function
Ghislain Fourny is a senior scientist at ETH Zurich with a focus on databases and game theory. He holds a Master of Science in Computer Science and a Doctorate of Science from ETH Zürich. Ghislain teaches Big Data courses for computer scientists as well as non-computer scientists. His research interests cover query languages for large-scale, heterogeneous, nested datasets, as well as rebooting game theory with a non-Nashian form of free choice. Ghislain was a member of the W3C XML Query working group from 2011 to 2014 and is a co-designer of the JSONiq query language and of the Rumble engine.
Features of a modern XML Resolver
XML Resolvers are a core extension feature in XML parsers and other applications in the XML stack. They allow you to transparently satisfy requests for DTDs, schemas, stylesheet modules, etc., with local copies of those resources. This offers improvements in both performance and security. XML Resolver 3.0, available in Java and (soon!) C#, provides full support for the XML Catalogs standard and a broad range of features designed to make deploying and using catalog-based resolution faster and easier. This talk will highlight the new features of the resolver including:
- Dynamic catalog construction with caching.
- Automatically loading catalogs from extension modules (jar files or assemblies).
- Improved support for resources distributed in extension modules.
- Handling http: and https: entries transparently.
- Validation of catalog files.
- Namespace-based resource discovery by indirection through RDDL documents.
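The core idea of catalog-based resolution, answering a request for a remote resource with a local copy, can be sketched as follows. This is a minimal Python illustration, not the XML Resolver API; the `Catalog` class, the `resolve` function, and every URI and path are hypothetical. The exact and prefix mappings are loosely modelled on the kinds of entry an XML Catalog can hold.

```python
from typing import Dict, Optional


class Catalog:
    """A toy catalog: exact URI mappings plus prefix rewrites (hypothetical)."""

    def __init__(self, exact: Dict[str, str], rewrites: Dict[str, str]):
        self.exact = exact
        self.rewrites = rewrites

    def lookup(self, uri: str) -> Optional[str]:
        # An exact match wins over any prefix rewrite
        if uri in self.exact:
            return self.exact[uri]
        # Otherwise use the longest matching prefix, if there is one
        matches = [prefix for prefix in self.rewrites if uri.startswith(prefix)]
        if matches:
            prefix = max(matches, key=len)
            return self.rewrites[prefix] + uri[len(prefix):]
        return None


def resolve(uri: str, catalog: Catalog) -> str:
    """Return the local copy if the catalog knows one, else the original URI."""
    local = catalog.lookup(uri)
    return local if local is not None else uri


catalog = Catalog(
    exact={"http://example.com/schemas/book.dtd": "/opt/catalog/book.dtd"},
    rewrites={"http://example.com/xslt/": "/opt/catalog/xslt/"},
)

print(resolve("http://example.com/schemas/book.dtd", catalog))
print(resolve("http://example.com/xslt/common/util.xsl", catalog))
print(resolve("http://example.org/unknown.xsd", catalog))
```

Because the calling application only ever asks for the original URI, the mapping can change without touching documents or stylesheets.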
Norm Tovey-Walsh is a Senior Software Developer at Saxonica. He has also been an active participant in international standards efforts at both the W3C and OASIS. At the W3C, Norm was chair of the XML Processing Model Working Group, co-chair of the XML Core Working Group, and an editor in the XQuery and XSLT Working Groups. He served for several years as an elected member of the Technical Architecture Group. At OASIS, he was chair of the DocBook Technical Committee for many years and is the author of DocBook: The Definitive Guide. Norm has spent more than twenty years developing commercial and open source software.
Roaster - declarative routing for eXist-db
Declarative approach to routing requests in eXist-db
A brief introduction into the status quo, followed by a presentation of a new approach to declaratively design APIs and route requests, with examples for several use cases.
Introduction
I will explain the basics of routing in general. After that, we will have a look at the status quo of routing requests in eXist-db, in particular using rest, RestXQ and the controller.xq, and their pros and cons.
- rest
  - has some quirks
  - does not encourage RESTful interface best-practices
  - can be hard to secure
- restXQ
  - route handlers can be somewhere in a package
  - can lead to duplicate code for multiple output formats
  - parameter handling, authentication and error handling are left to the user
- controller.xq
  - can be hard to secure
  - parameter, authentication and error handling are left to the user
  - complex controllers can get hard to read
  - can only pass strings as parameters to handlers
History
Because of said limitations of all routing options on exist-db I made several attempts to come up with a better solution that maps a route to a function. In 2019 I had a series of small breakthroughs and a working prototype that was roughly modelled after the express router known from nodeJS. By mid 2020 Wolfgang Meier expressed the need for a better routing option to use with TEI-publisher. I showed him what I had and he ran with it. He had the brilliant idea to implement the OpenAPI standard and thus created a router where you first create the documentation. You declare which routes exist and what they expect and return. In this configuration you also set things like headers, mime-types and more. This ongoing collaboration is now part of e-editiones, the same society that governs TEI-publisher.
Hands-On
At the beginning we will have a look at an example JSON file that declares a simple API of an exist-db package.
1. Using a test page created from our declaration
2. Looking at the JSON file itself
Then we will create a new route that will output different formats (HTML, XML, JSON, CSV). I will show how to set arbitrary headers per route, in a handler function, dynamically, for caching, and also using a middleware for all routes. To round things up, I will show how to secure routes with cookies and basic auth, how to handle authorisation of requests, and how to use a custom authentication method.
What's next?
What are our medium and long-term goals and how can you contribute.
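The idea of declaring routes first and mapping each one to a function can be sketched independently of eXist-db. The following Python is an illustration only, not Roaster's API: the declaration imitates the general shape of an OpenAPI "paths" object with "operationId" entries, and the handler names, the `route` function, and the sample data are hypothetical.

```python
import re

# Declaration: which routes exist and which operation serves each (hypothetical)
API = {
    "paths": {
        "/documents": {"get": {"operationId": "list_documents"}},
        "/documents/{id}": {"get": {"operationId": "get_document"}},
    }
}

DOCS = {"1": "first", "2": "second"}


def list_documents(params):
    return 200, sorted(DOCS)


def get_document(params):
    doc = DOCS.get(params["id"])
    return (200, doc) if doc is not None else (404, "not found")


# The only link between declaration and code is the operation name
HANDLERS = {"list_documents": list_documents, "get_document": get_document}


def compile_path(template):
    """Turn '/documents/{id}' into a regex with one named group per parameter."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
    return re.compile("^" + pattern + "$")


def route(method, path, api=API, handlers=HANDLERS):
    """Find the declared operation for method and path, then call its handler."""
    for template, operations in api["paths"].items():
        match = compile_path(template).match(path)
        operation = operations.get(method.lower())
        if match and operation:
            return handlers[operation["operationId"]](match.groupdict())
    return 404, "no such route"


print(route("GET", "/documents"))
print(route("GET", "/documents/2"))
print(route("POST", "/documents"))
```

Handlers receive already-extracted parameters and never inspect the URL themselves, so adding a route means editing the declaration rather than the dispatch logic.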
Juri Leino is a software gardener from Berlin with over 15 years of experience in web development. In recent years he has joined the exist-db project as a core developer focussing on the XQuery runtime. Next to consulting for exist-solutions and jinntec he also maintains and develops node-exist and gulp-exist and created XQuery libraries like xbow, exist-jwt and dicey.
Extracting Microcontent from DITA Topics
We transform DITA to HTML in the browser via a Single Page App (SPA). Our product GUI is also an SPA – Why not put our content directly in the GUI? This talk shows the advantages of dynamic transforms, and how we use that technique to extract subsets from the single source and display them in the product.
Chris Despopoulos is an old hand at technical writing. He is currently Publications Manager at Turbonomic Inc. In this role, he works with a small team that uses Git to manage DITA source, and a number of home-grown processes that exploit DITA to manage docs as code, harvest content from source code (for the API docs), produce release notes, integrate with markdown and other formats, and rebrand the pubs product for a matrix of Agile teams and projects. Tired of waiting for the “experts” to do it, he designed and implemented 4D Pubs, a single-page app for online help… Static site, dynamic client.
Functional, Declarative Audio Applications
Audio software, and particularly digital signal processing, is an application domain where the imperative, object oriented programming model dominates. In part, this can be justified by the realtime constraints that underlie the domain, and by the fact that C/C++ has historically dominated the high performance native software landscape. But this is not without cost: the high barrier to entry prevents developers from trying to write audio software, and the industry spends far more time than needed to deliver new products. In this talk, we'll look at some of the complications that come from writing low level native audio software in C/C++ with an imperative, object oriented model. Then we'll reframe the conversation to show why a functional, declarative approach may be fundamentally more fitting for the problems we want to tackle when writing new audio software. Finally, I'll introduce Elementary Audio: a new JavaScript runtime for writing realtime, native audio applications with a functional, declarative API. We'll see how Elementary applies the declarative model to audio software, and then finish with a detailed example of a small drum synthesis library written in Elementary.
Nick Thompson is an audio software developer, contractor, and consultant. He is the owner of a small audio plugin company, Creative Intent, and the author of Elementary Audio and React-JUCE. Nick's interest lies in tools that enable and promote creativity and simplicity, both in music making and in software development.
SchemaCom - An XML Schema Comparator
People working with large XML vocabularies often face the task of upgrading to a new version [1]. Ideally such upgrades are guided by a specification of how to map the components of an XML instance to the new version of the vocabulary. SchemaCom was created to assist in situations where such a specification is not available. It highlights the differences (and similarities) between the constituent content models in the respective vocabularies; this information can then guide the analysis necessary to specify the missing mappings, and can be applied without loss of generality between schemas representing different vocabularies (as opposed to different versions of the same one). A distinguishing feature is the delivery of the user interface as an XForm.
Ihe Onwuka has been working with XML since 2003 and is a System Engineer with LS Technologies, assisting in the development and architecting of large complex data models for the US federal government. He is a great believer in functional programming and declarative technologies in general. Hobby-wise he enjoys street dance choreography and had a long rugby career during which he played in 4 of the 5 continents and only retired because none of the professional clubs came in with an offer big enough to entice him to carry on.
Aparecium, an XQuery / XSLT parser library for invisible XML
'Invisible XML' ('ixml') is a method for treating non-XML information as if it were XML; it was proposed by Steven Pemberton in 2013. The basic idea is straightforward: a context-free grammar is used to describe the structure of the information, annotations in the grammar specify how the raw parse tree of a sentence in the language described by the grammar is to be represented in XML, and an ixml parser uses the grammar to parse the non-XML document into an XML form. This allows all the tools of the XML toolbox to be applied to the data: XQuery and XSLT for general processing, XForms for creating user interfaces to the data, XML schema languages for validation, and so on. Aparecium is an ixml parser written in XQuery and XSLT, as a library of functions callable from XQuery and XSLT. (The name is a reference to a spell in the Harry Potter novels, which makes invisible writing visible.) When used to parse external resources, Aparecium can be thought of as a replacement for the standard doc() function which can read non-XML data and deliver it as XML; it can also be used to parse strings which obey a context-free grammar, such as CSS style specifications, XSLT pattern expressions, SVG path expressions, and so on. The latter makes Aparecium useful for handling XML formats which use micro-grammars for some portions of documents. For simplicity, Aparecium is implemented as a pipeline of processes. First the extended BNF notation allowed in ixml grammars is translated to an equivalent unextended BNF. This grammar is then used by an Earley parser to parse the input; the result is a large set of 'Earley items' describing various aspects of the parse. From the set of Earley items, Aparecium then constructs a 'parse-forest grammar' describing the set of parse trees in the input. As a final step, a parse tree is extracted from the parse-forest grammar and returned to the caller. 
Alternate interfaces may be used to specify that the parse-forest grammar should be returned, instead; this may be helpful in cases of ambiguity, since it allows the caller to study the ambiguity and in some cases to extract the preferred parse tree. In some cases the caller will have the grammar in the non-XML form described in the ixml specification; in others, the grammar will be available as an XML document; sometimes the caller will have a URI for the grammar. The input may similarly be available either as a string or as a URI. Aparecium provides distinct calls for each of these situations, to simplify the use of Aparecium in constructing applications. The talk will briefly describe the current status of Aparecium implementation and (the gods willing) show a simple demo; it will conclude with a discussion of some next steps in the work on Aparecium and in the development of broader support for invisible XML.
C. M. Sperberg-McQueen is the founder of Black Mesa Technologies LLC, a consultancy specializing in the use of descriptive markup to help memory institutions preserve cultural heritage information. He co-edited the XML 1.0 specification, the Guidelines of the Text Encoding Initiative, and the XML Schema Definition Language (XSDL) 1.1 specification.
Declarative is a Feminist Issue
Front-end web development is rooted in two declarative languages (HTML and CSS) and one imperative language (JavaScript) that can be written in a functional style. Front-end web development is also noted for contentious and ever-shifting gender dynamics – one year HTML and CSS are “for girls” and “not real programming,” another year it's JavaScript that's looked down upon. In this talk, we'll look at the history of front-end development through the twin lenses of gender and declarativity. Along the way, we'll see how gendered programming trends boosted the adoption of popular frameworks – and led to the quiet death of others. We'll get real about the social forces that have affected the credibility and “approachability” of declarative methods in front-end, and talk about how these same forces might play out in other declarative projects.
Betsy Haibel is a San Francisco-based engineering leader with over a decade of experience. She writes fiction and non-fiction in English and a variety of programming languages, and prior to the pandemic co-organized the Learn Ruby in DC meetup.
Getting started with RumbleDB
I will give a gentle introduction to getting started with RumbleDB: installing it locally, writing hello-world queries, downloading and querying a small dataset and then a larger one, and doing relational algebra. If time permits, I will also cover user-defined types, conversion to binary formats for better performance, and machine learning.
Ghislain Fourny is a senior scientist at ETH Zurich with a focus on databases and game theory. He holds a Master of Science in Computer Science and a Doctorate of Science from ETH Zürich. Ghislain teaches Big Data courses for computer scientists as well as non-computer scientists and is making his textbook publicly available online. His research interests cover query languages for large-scale, heterogeneous, nested datasets, as well as rebooting game theory with a non-Nashian form of free choice. Ghislain was a member of the W3C XML Query working group from 2011 to 2014 and is a co-designer of the JSONiq query language and of the RumbleDB engine.
Schematron tutorial
Schematron is a validation language that goes beyond the capabilities of DTD, W3C Schema and RelaxNG. Using XPath expressions, it can validate almost anything in an XML document.
No prior knowledge of Schematron is required for this tutorial. We start at the very beginning. But because the language is simple and elegant, we'll soon reach some more advanced topics (like abstract rules and using XSLT). The tutorial will be a mix of theory and hands-on exercises.
Erik Siegel (http://xatapult.com/) works as a content engineer, XML specialist and technical writer. His main customers are in publishing and standardization. He is a member of the XProc 3.0 editorial committee.
Introduction to Fore
'Fore' is a pure client-side, declarative and open source XML editing solution for structured XML following the principles of the XForms 2.0 standard though taking some freedom here and there.
It is specifically suited to build complex, form-based editing front-ends but can also be used to implement complete data-driven applications.
It is a pure client-side library implemented as a set of plain-vanilla Web Components, needing no special framework to run. It is written in ES6 JavaScript.
It supports XQuery/XPath 3.1 as expression language via the fantastic fontoXPath library. Alongside XML, Fore can also handle JSON via the XPath map syntax.
In this tutorial we want to give you an introduction to how to create your own interactive UI elements, including webforms, with Fore.
After a short introduction on the underlying principles and differences to XForms, we want participants to create the following examples:
- a todo app
- a clock
- a TEI-header editor (or other xml-fragment editor)
Juri Leino is a software gardener from Berlin with over 15 years of experience in web development. In recent years he has joined the exist-db project as a core developer focussing on the XQuery runtime. Next to consulting for exist-solutions and jinntec he also maintains and develops node-exist and gulp-exist and created XQuery libraries like xbow, exist-jwt and dicey.
Joern Turner is one of the directors at Jinntec GmbH and has been developing XML solutions for over two decades now. He founded the Chiba and betterFORM projects which implemented the XForms standard and contributes to several open source projects like TEI-Publisher, eXist-db, Roaster, Tuttle and others. Two years ago he started the Fore project as a follow-up of betterFORM.
Advanced ixml Hands-on
This year marked the
official release of ixml at invisiblexml.org, and several implementations
are now available. At last year's Declarative Amsterdam there was a
tutorial introduction to ixml, which introduced the concepts, covered all
elements of the language and its basic use. This tutorial will extend that
tutorial by covering subjects such as Whitespace handling, Ambiguity,
Multi-character tokens, Separators, and Insertions, and will include a case
study of the ixml grammar of ixml. A condition of doing this tutorial is
that attendees must already have done the introductory tutorial, which is available for
self-study. A similar technique of interweaving lecture with exercises, and
the tutorial being available for independent study after the conference
will be used.
Steven Pemberton is a
researcher affiliated with CWI Amsterdam, the Dutch national research
centre for mathematics and informatics. His research is in interaction, and
how the underlying software architecture can support users. He co-designed
the ABC programming language that formed the basis for Python. Involved
with the Web from the beginning, he organised two workshops at the first
Web Conference in 1994. For the best part of a decade he chaired the W3C
HTML working group, and has co-authored many web standards, including HTML,
XHTML, CSS, XForms and RDFa. He now chairs the XForms and ixml groups at
W3C. This year he was awarded the ACM SIGCHI Lifetime Practice
Award.
Declarative axiomatic and provable correct systems in Swift
Swift's type system allows the creation of small self-contained modules that are controlled by sending typed messages. The types of this message DSL encode commands that can follow spoken English quite closely, e.g.:
.increase(.brightness,by:10.point,on:light)
increases the brightness of light by 10 percentage points.
In this tutorial we will implement an application using «Khipu», a fully immutable implementation of Robert C. Martin's «Clean Architecture». Khipu is fully testable, with tests that are not only valid in TDD but on a par with the Scientific Method itself. We will explore «Partial Application» as an alternative to problematic class-centred designs.
Manuel Meyer has been developing iOS applications since 2009. He experiences a startup scene that proves unable to apply engineering principles or to professionalise itself in any other manner. In his search for better coding he discovered the Declarative Domain Paradigm, which he is currently exploring for his book “The Declarative Revolution”.
Online playground at https://swiftfiddle.com/jyzdziivsvfnrhmwdtmevhc37m
Declarative thinking in SQL, its teaching and its unused potential
Despite its many warts and wrinkles, SQL is undeniably a declarative success story. Teaching declarative thinking is often impeded by the procedural mindset of students originating from their early training almost exclusively in imperative programming languages. While SQL databases are optimized for executing SQL, students tend to write inefficient and clumsy code using disputable so-called “procedural extensions”. A few typical examples will be presented. While a limited kind of declarative constraint is supported in SQL, implementations generally fall short of the potential of database-wide declarative constraints. Some notions are illustrated by looking at declarative referential integrity, triggers as a poor substitute and an experimental RDBMS called “Rel”. As a concluding impulse for discussion, the presumed waste of talented declarative minds by the overpowering imperative bias in teaching is addressed.
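Declarative referential integrity can be shown in a few lines: the rule is stated once in the schema, and the database rejects violations without any checking code. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the table and column names and the `try_insert` helper are hypothetical, and SQLite stands in for whichever RDBMS the talk actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        -- The constraint is declared here; no trigger or application check is needed
        department_id INTEGER NOT NULL REFERENCES department(id)
    )
    """
)
conn.execute("INSERT INTO department (id, name) VALUES (1, 'Research')")


def try_insert(name, department_id):
    """Attempt an insert; report whether the database accepted it."""
    try:
        conn.execute(
            "INSERT INTO employee (name, department_id) VALUES (?, ?)",
            (name, department_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Ada", 1)    # department 1 exists
rejected = try_insert("Eve", 99)   # department 99 does not exist
print(accepted, rejected)
```

The procedural alternative, a SELECT to check that the department exists followed by a conditional INSERT, duplicates the rule in every program that writes to the table.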
Günter Burgstaller has taught “Database and Information Systems” at the HTBLuVA Wiener Neustadt’s informatics department since 2009. Austria’s unique technical college system, recognised in the EU as third-level education, trains students from the age of 14 to become highly sought-after engineers in the course of five years. Graduates are also eligible for university. Prior to teaching, Günter Burgstaller worked for 15 years as a database and systems engineer in various IT companies.
Atomic Data: a modular specification for making the semantic web more practical
The vision for a semantic web is over 20 years old, yet so little
of our internet is currently using linked data. Where did it go wrong,
and what can we do to revive this dream? Atomic Data is a modular
specification that aims to make the semantic web achievable again
through a developer-friendly ecosystem of tools that enable high data
interoperability and return ownership back to the user.
Joep Meindertsma is an entrepreneur and software developer who has
worked on the e-democracy tool Argu in the past years. During the
development of this software, he got interested in the semantic web and
linked data, which eventually led him to work on various RDF-related
projects. Currently, he's working on the Atomic Data ecosystem.
On the representation of abstractions
Data is an abstraction. In order to transfer it or talk about it, the abstraction has to be represented using some notation.
The design of notations is a neglected area of study, yet the designs affect both what you can represent, and what you can do with what is being represented.
XML is currently the most suitable and flexible of the available notations for representing data abstractions, and yet it has restrictions and some shortcomings that get in the way of properly representing abstractions.
This paper discusses the issues, and reflects on which aspects of XML get in the way of abstractions.
Steven Pemberton is a
researcher affiliated with CWI Amsterdam, the Dutch national research
centre for mathematics and informatics. His research is in interaction, and
how the underlying software architecture can support users. He co-designed
the ABC programming language that formed the basis for Python. Involved
with the Web from the beginning, he organised two workshops at the first
Web Conference in 1994. For the best part of a decade he chaired the W3C
HTML working group, and has co-authored many web standards, including HTML,
XHTML, CSS, XForms and RDFa. He now chairs the XForms and ixml groups at
W3C. This year he was awarded the ACM SIGCHI Lifetime Practice
Award.
Element classification: a bottom up perspective
XML is top-down oriented. In a DTD, an element type declaration specifies what content the element may contain, but there is no way to constrain the context in which that element may be included. In document-oriented languages, where mixed content is typical, it is worth looking up.
By seeing in what way an element may occur in its potential parent elements one can classify elements. It raises questions like: is it "suspicious" if an element may occur in the mixed content of some elements, and in the element content of others? Is it suspicious if an element that may occur in mixed content has itself element content? What kind of elements occur in a repeatable OR group? What elements can serve as word boundaries? What does a SEQ group "mean"?
Two sequence paradoxes will be mentioned, and by taking as an example both the content and the parents of the "paragraph next door" element we will find that what should be the second most used element in any document instance is actually missing in most industry standard markup languages.
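The bottom-up view can be sketched by inverting a set of content models: instead of asking what an element may contain, ask in which kinds of parent it may occur. The following Python is an illustration under stated assumptions, not the speaker's method; the toy vocabulary, the function names, and the class labels ("root", "block", "inline", "both") are all hypothetical.

```python
from collections import defaultdict

# Hypothetical content models: element -> (kind of content, allowed children)
CONTENT_MODELS = {
    "section": ("element", ["title", "para", "note"]),
    "note": ("element", ["para"]),
    "title": ("mixed", ["emph"]),
    "para": ("mixed", ["emph", "note"]),
    "emph": ("mixed", []),
}


def parents_by_kind(models):
    """Invert the models: for each child, collect its parents per content kind."""
    parents = defaultdict(lambda: {"mixed": set(), "element": set()})
    for parent, (kind, children) in models.items():
        for child in children:
            parents[child][kind].add(parent)
    return parents


def classify(models):
    """Label each element by the kinds of parent content it may occur in."""
    parents = parents_by_kind(models)
    result = {}
    for name in models:
        in_mixed = bool(parents[name]["mixed"])
        in_element = bool(parents[name]["element"])
        if in_mixed and in_element:
            result[name] = "both"      # a candidate for the "suspicious" question
        elif in_mixed:
            result[name] = "inline"
        elif in_element:
            result[name] = "block"
        else:
            result[name] = "root"      # no declared parent
    return result


classification = classify(CONTENT_MODELS)
print(classification)
```

In this toy vocabulary "note" is flagged because it may occur both in the element content of "section" and in the mixed content of "para".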
This year Diederik Gerth van Wijk celebrates the 45th anniversary of writing his first computer program, while studying economics at the Erasmus University Rotterdam. As an assistant to the department of computer science he had to write user manuals, which prompted him to invent his own markup system. His master's thesis was on using “Word Lists for Intentional Text Processing”, and after his graduation he started working as a programmer with one of the electronic publishing houses of Wolters Kluwer. For 15 years he was responsible for maintaining and developing the DTDs of the law publishing division, during which he joined the Dutch standardisation committee for SGML; he also became editor of
Case study of a semantic library underpinning the four-corner model for document exchange
ERP and business systems around the world are having to interconnect more today than ever before. Changing one's system to address interoperability can be a disruption and a risk. The four-corner model for document exchange addresses this in an interoperable network by requiring users' network representatives to exchange the semantic content of documents using a single declarative syntax. For business documents, multiple global networks have chosen the OASIS Universal Business Language (UBL) semantic library and XML syntax between access point network representatives. This is a model any industry sector can adopt to introduce successful interoperability into legacy environments.
G. Ken Holman is the editor of the Universal Business Language (UBL)
2.3 standard and the facilitator of the Business Payments Coalition
(BPC) Exchange Framework Semantics Workgroup. He has spent his career
helping committees and companies in determining semantic ontologies
and taxonomies, and then realizing them in declarative markup.
Fixing EPUB - Great Expectations
The Extensible Markup Language (XML) provides a way to share information with tagging specific to content domains, rhetorical forms, or other needs. By contrast, HTML markup has evolved synergistically with Web browsers in mind, and the tagging is largely specific to browsers.
Although EPUB was (is) based on XML, it's also based on HTML: it uses XHTML for the documents. As a result, there is little to no support for domain-specific markup, and the ebook experience seems unimaginative, constrained by the intersection of Web browsing and content silos.
What might happen if there could be communities evolving domain-specific markup for electronic texts? For 3D models or live mathematics, for poetry, for music and musical scores inside books? For context-specific dictionaries and glossaries, for per-chapter tables of contents and wayfinding cards, and so much more.
Is there any way to enrich ebook readers in this direction?
Liam Quin was for many years in charge of XML work at W3C; they left in 2018 and now run Delightful Computing, which is active in XML, Web and accessibility training and consulting.
Exploring the declarative nature of XProc 3.0
In this presentation we want to explore some aspects of the XProc 3.0 language that could be considered as declarative, while pointing out some other aspects that are potentially not. For doing so we will build the case by looking at some common patterns relevant to pipeline authors.
Stating that a programming language is "declarative" is not just a claim in computer science. It also has practical consequences. It is generally understood that declarative programming languages are easier to use and allow people to solve problems in a shorter amount of time compared to other programming paradigms. Claiming that a programming language is "declarative" implies that its use has commercial advantages over other approaches to solving the same problem.
The question whether XProc 3.0 is a declarative language is, to our knowledge, rarely addressed. Control structures like p:choose or p:try in XProc 3.0 might raise doubts about its declarative nature. On the other hand, the level of abstraction in XProc's step infrastructure might at least hint at declarative features. We felt that this tension offered enough motivation to give it some further thought.
This presentation will not take the computer science theory route. Instead, we will focus on user experience while solving problems with XProc 3.0.
Achim Berndzen earned an M.A. in philosophy at Aachen University and has more than 20 years of teaching experience in communications.
In 2014 he founded . He is the developer of MorganaXProc and MorganaXProc-III, a fully compliant XProc 3.0 processor with an emphasis on configurability and pluggability. Achim also works on projects that use the power of XProc and other XML technologies for customers.
Geert Bormans has long been an angle-bracket jack-of-all-trades. He loves the beauty of a well-architected solution or a pure and simplified process. Geert makes a living as an independent consultant providing XML or Linked Open Data solutions, mainly to the publishing industry. He does so with a broad geographical flexibility. Currently Geert works for the Swiss administration. He is involved in the publication of legislation using technologies such as XML, XSLT, Schematron, RDF and a lot of XProc 3.0.
Tools for thought as cultural practices, not computational objects
Over the last
few years, “Tools for thought” has become a hot concept among software
makers. We’re seeing a surge in note-taking apps and knowledge
management tools presenting themselves as TFTs. But there’s a strange
paradox here: taken at face value, the phrase “tool for thought” doesn’t
have the word ‘software’, ‘computer’ or ‘digital’ anywhere in it. It
sounds like an idea that belongs to philosophy and cognitive science
departments, rather than computing. How and when did it become
intertwined with computers? What are we to make of all the software
upstarts identifying with this movement? And how can we push beyond them
into a new generation of computational tools for thought?
Maggie Appleton is a
designer, anthropologist, and mediocre developer. She currently leads
design at Ought, an AI research lab exploring how machine learning can
support open-ended reasoning. She previously spent years working in
developer education, designing visual metaphors for programming
concepts. She’s enthusiastic about visual programming, end-user
development, and digital gardening.