JSONiq Updates for RumbleDB
JSON is ubiquitous in Big Data systems, enabling heterogeneous data to be expressed efficiently in a human-readable format. Analysing JSON data means using either extensions to SQL or languages with native JSON support, such as JSONiq. The former rely on extensions to the relational model and so cannot adequately express JSON’s heterogeneity. While the latter expresses analyses of document data naturally, standard JSONiq systems do not support modifying documents without complete overwrites. JSONiq Updates solves this by introducing Pending Update Lists and Update Primitives, which support several new expressions that enable fine-grained updates to JSON data.
In this presentation, we showcase these expressions in a JSON execution engine, namely RumbleDB, and introduce a unified Target-Selector-Content interface for expressions that enables extensible and simple processing. Further, we describe a simple scoping mechanism to dynamically enforce the mutability semantics of JSONiq Updates, and we outline RumbleDB’s integration with Databricks’ ACID-compliant Delta Lake to persist these updates. We also showcase the extent of the expressivity of JSONiq Updates in RumbleDB through the TPC-C and GitHub Archive benchmarks.
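To make the idea of Pending Update Lists and Update Primitives more concrete, the following is a minimal Python sketch, not RumbleDB's actual implementation. It assumes each primitive carries a target (the JSON object to modify), a selector (a key within it), and optional content, loosely mirroring the Target-Selector-Content interface mentioned above. The class names `UpdatePrimitive` and `PendingUpdateList`, the four primitive kinds, and the choice to apply primitives in the order they were collected are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class UpdatePrimitive:
    """Hypothetical primitive: one deferred change to a JSON object."""
    kind: str            # one of "insert", "delete", "replace", "rename"
    target: dict         # the JSON object to modify
    selector: str        # the key inside the target
    content: Any = None  # new value (insert/replace) or new key (rename)


class PendingUpdateList:
    """Hypothetical container that collects primitives and applies them later."""

    def __init__(self) -> None:
        self.primitives: List[UpdatePrimitive] = []

    def add(self, primitive: UpdatePrimitive) -> None:
        # Collecting a primitive does not touch the target document.
        self.primitives.append(primitive)

    def apply(self) -> None:
        # Assumption: apply in collection order; a real engine may reorder.
        for p in self.primitives:
            if p.kind == "insert":
                if p.selector in p.target:
                    raise KeyError(f"key already exists: {p.selector}")
                p.target[p.selector] = p.content
            elif p.kind == "delete":
                del p.target[p.selector]
            elif p.kind == "replace":
                if p.selector not in p.target:
                    raise KeyError(f"no such key: {p.selector}")
                p.target[p.selector] = p.content
            elif p.kind == "rename":
                p.target[p.content] = p.target.pop(p.selector)
            else:
                raise ValueError(f"unknown primitive kind: {p.kind}")
        self.primitives.clear()


doc = {"name": "Ada", "orders": 3, "tmp": True}

pul = PendingUpdateList()
pul.add(UpdatePrimitive("replace", doc, "orders", 4))
pul.add(UpdatePrimitive("delete", doc, "tmp"))
pul.add(UpdatePrimitive("rename", doc, "name", "customer"))
pul.add(UpdatePrimitive("insert", doc, "status", "active"))

before = dict(doc)  # copy taken before apply(): the document is still unchanged
pul.apply()         # all four fine-grained changes take effect here
```

In this sketch the document is modified only when `apply()` is called, so every update sees the same unmodified snapshot while the list is being built, and no change requires rewriting the whole document.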