While this setup is specific to Travis CI, you may find useful information for integrating eXistDB into other CI platforms as well.

It is language independent but was tested only with Java and nodeJS. Both setups are provided.

You can test applications that are served from the database or consume one of its APIs.

tl;dr

CI? You should have it.

The complete setup to test anything against one or more versions of eXistDB on TravisCI with caching enabled

Preface

I was recently working on node-exist. It is a Node package that consumes eXist's RPC API. In order to run the tests, the database needed to be up.

Mocking of the database responses could have solved that problem, but…

  • Now you got 2 problems
  • how to test against another version or multiple ones
  • validating mock responses (of multiple versions)

A much better solution is to run the tests on a continuous integration platform. There, every commit can be tested against different versions of the database in parallel, and there is no need to have them running on the development machine. Plus, everyone with access can verify whether a certain build is passing and which database versions are supported by your application.

There are more arguments in favor of continuous integration, which you can easily find online.

TravisCI seemed to be a reasonable choice, not only because eXistDB itself runs its automated tests here.

Which version to test against?

Travis automatically starts one build per entry in env, so with the following configuration your tests will run against both the 3.0.RC1 and the 2.2 release of eXist:

    env:
      - EXISTDBVERSION=eXist-3.0.RC1
      - EXISTDBVERSION=eXist-2.2

Add or remove versions as you need them.

before_install

It proved handy to store the installation folder of the current DB version in an environment variable `EXISTDBFOLDER`. It will be used by the scripts and commands that follow.

    before_install:
      - export EXISTDBFOLDER=${HOME}/exist/${EXISTDBVERSION}

installation and setup

To install the database we download the source from GitHub, extract it and then call its build routine.

    # GitHub serves a source tarball for every tag, branch and commit
    export TARBALLURL=https://github.com/eXist-db/exist/archive/${EXISTDBVERSION}.tar.gz
    mkdir -p ${EXISTDBFOLDER}
    # download and unpack, dropping the archive's top-level directory
    curl -L ${TARBALLURL} | tar xz -C ${EXISTDBFOLDER} --strip-components=1
    cd ${EXISTDBFOLDER}
    # build eXist from source
    ./build.sh

All of the above can be nicely packed into a setup script.

    install:
      - ci/setup-db.sh
  • Note: `EXISTDBVERSION` is the environment variable we defined at the very beginning. This can be a tag, a branch name or even a commit hash.
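
GitHub serves an archive for any ref under the same URL pattern, which is why the download step works unchanged for tags, branches and commits. A minimal sketch of that mapping follows; the helper name `exist_tarball_url` is made up for illustration and is not part of the original scripts.

```shell
# Hypothetical helper (not part of the original scripts): prints the GitHub
# archive URL for a given eXist ref (tag, branch name or commit hash).
exist_tarball_url() {
    printf 'https://github.com/eXist-db/exist/archive/%s.tar.gz\n' "$1"
}
```

Calling `exist_tarball_url "${EXISTDBVERSION}"` would yield the value that the setup script stores in `TARBALLURL`.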

Can't we start already?

Yes, but to do that we have to start eXist in the background and wait until it listens for requests. Finally, we verify that it is up by sending a very simple request.

    cd ${EXISTDBFOLDER}
    nohup bin/startup.sh &
    sleep 30
    curl http://127.0.0.1:8080/exist

Yes, you guessed it. There is a database start-up script.

    before_script:
      - ci/start-db.sh
  • Note: Without changing to its installation folder, a bunch of exceptions will end up in nohup.out complaining about log files that cannot be found and opened.
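
A fixed pause is simple, but it is either too long on a fast container or too short on a slow one. One possible alternative is to poll until the database answers. The sketch below assumes a helper named `wait_until`, which is hypothetical and not part of the original scripts; it retries an arbitrary probe command a limited number of times.

```shell
# Hypothetical helper: run a probe command until it succeeds or the
# attempts are used up.
# Usage: wait_until <attempts> <delay-in-seconds> <command> [args...]
wait_until() {
    local attempts
    local delay
    local i
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # The probe's output is discarded; only its exit status matters.
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        # Pause before the next attempt, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

In the start-up script, the pause and the check could then be merged into something like `wait_until 30 2 curl -sf http://127.0.0.1:8080/exist`, which gives up after 30 failed attempts. The attempt count and delay are arbitrary values chosen for illustration.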

That's it

Run your tests!

    script:
      - do test /

Now it is time to clean up the closet:

    after_script:
      - cd ${EXISTDBFOLDER}
      - bin/shutdown.sh

Tweaks

Caching

Downloading and building eXist from source can take up to 3 minutes, so you may want to speed up your tests by caching the built database. The archived cache still has to be loaded from S3, but you will save a minute or so (YMMV).

Is this version of eXistDB already cached?

In our case that boils down to: does the folder exist?

    if [ -d "$EXISTDBFOLDER" ]; then
        echo "Using cached eXist DB instance: ${EXISTDBVERSION}."
        exit 0
    fi

Teardown this Database

Remove any data that is or might be left behind by your tests and remove logs, too. To make sure that you will always get the latest version for branches or refs like HEAD, anything but releases should be excluded from caching.

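    # releases (eXist-*) stay in the cache, reset to a clean state
    if [[ "${EXISTDBVERSION}" == eXist* ]]; then
        echo "reset data and logfiles for ${EXISTDBVERSION}"
        cd ${EXISTDBFOLDER}
        ./build.sh clean-default-data-dir
        rm webapp/WEB-INF/logs/*.log
        exit 0
    fi
    # anything else (branches, refs like HEAD) is removed before caching
    echo "exclude ${EXISTDBVERSION} from cache"
    rm -rf ${EXISTDBFOLDER}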
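
The decision in this script comes down to a prefix match on the version string, assuming that release tags all start with `eXist`, as the two entries in env do. A small sketch of that check in isolation; the function name `is_release_version` is hypothetical:

```shell
# Hypothetical helper: succeeds if the given version string looks like a
# release tag (prefix "eXist", e.g. eXist-3.0.RC1 or eXist-2.2), meaning its
# build may stay in the cache.
is_release_version() {
    case "$1" in
        eXist*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

A branch name or a ref like HEAD does not match the prefix, so its folder would be removed before the cache is archived and rebuilt on the next run.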

Put together:

    before_cache:
      - ci/teardown-db.sh
    cache:
      directories:
        - ${HOME}/exist

What if Java is not your first language?

If, like me, you are not testing a Java application, you need to install Java 1.8 into the testing container with:

    addons:
      apt:
        packages:
          - oracle-java8-installer

and make this version the default by adding

- export JAVA_HOME=/usr/lib/jvm/java-8-oracle

to the before_install step.
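
Since eXist 3.0 requires Java 8, it can help to fail early if the container ends up with an older JVM. The following is only a sketch: the helper name `java_major_version` is made up, and it assumes version strings in either the legacy form (`1.8.0_66`) or the newer form (`11.0.2`).

```shell
# Hypothetical helper: extract the major Java version from a version string
# such as "1.8.0_66" (legacy scheme) or "11.0.2" (newer scheme).
java_major_version() {
    local version
    version="$1"
    # Legacy versions are reported as 1.<major>.<minor>; drop the "1." prefix.
    version="${version#1.}"
    # Keep everything before the first ".", "_" or "-".
    printf '%s\n' "${version%%[._-]*}"
}
```

The version string itself could be taken from the quoted value in the first line of `java -version`, which is printed to stderr, and the build aborted when the result is lower than 8.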

The Gist of the Story

(Integration) testing is necessary, so if your application depends on eXistDB then you should definitely have a look at this project.

examples on GitHub

results on Travis

contents

  • .travis.yml (Java setup)
  • node.travis.yml (nodeJS setup)
  • ci/* (helper scripts described above)
  • utility/* (ant, Java setup utility functions)
  • project/* (sample project)

XML Prague 2016


This year the XML Prague conference is taking place from the 11th to the 13th of February.

Topics will be:

  • Markup and the Extensible Web
  • Semantic visions and the reality
  • Publishing for the 21st century
  • XML databases and Big Data
  • State of the XML Union

As every year, members of the community and other interested parties are invited to join the 'eXist-db Unlike Preconference' to share and exchange knowledge around eXistdb.

eXist 3.0.RC1 available for download

It is our great pleasure to announce the first release candidate for eXist 3.0.

As always, the purpose of this release candidate is to collect feedback on performance and compatibility. Whilst it is considered feature complete, we would not yet recommend it for production environments.

eXist 3.0.RC1 is the culmination of over 550 changes made in the last six months. The main focus has been on fixing bugs, migrating to Java 8 and improving the performance of eXist. The move to Java 8 alone has brought many internal improvements in eXist, enabling us to work with a leaner and safer code base; resulting in better resource and lock management, and improved performance for our users.

New Features

  • Support for XQuery 3.1, including the array and map data types, serialisation and JSON parsing
  • Support for Braced URI Literals from XQuery 3.0
  • Facility to boost attributes in the Lucene full text index
  • eXist version detection for EXPath packages. Packages should explicitly specify which versions of eXist they are compatible with; eXist 2.2 is assumed by default.
  • Prototype support for Portable EXPath XQuery Extension Functions written in Haxe

Improved Performance

  • Sequence type checking on recursive function parameters has been drastically sped up
  • Lucene full-text and range indexes have been switched to "near realtime" behaviour. This improves query performance on frequently updated documents
  • Improved optimization of wildcard steps in path expressions, e.g. prefix:* and *:name
  • Better performance for util:eval
  • Optimisation of fn:fold-left and fn:fold-right

Mission Critical Bug Fixes

There have been numerous bug fixes and enhancements since eXist 2.2, the most critical are:

  • Patched a memory leak in the Java service wrapper that occurred on certain Linux systems
  • Solved a potential deadlock which manifested when storing XQuery files into the database under certain conditions
  • Fixed a memory leak when storing query results into the HTTP session; Web applications making use of the HTTP session should now consume less memory and scale further
  • Fixed an occasional deadlock when shutting down the database
  • Fixes to match highlighting with the Lucene full text index
  • Lucene range index now correctly handles != comparisons

Clean up and Refactoring

  • Rewritten HTML5 Serializer
  • Removed the legacy SOAP API and SOAP Server
  • Removed the legacy Full Text Index
  • Removed the Versioning extension; will be made available as a separate app package.
  • Rewritten XML:DB and XML-RPC APIs
  • Updated to the latest version of RESTXQ
  • Improved Java Admin Client document viewing and editing
  • Clean up of eXist's Test suite
  • Extensive internal refactoring to exploit new Java 8 features

Backwards Compatibility issues

  • eXist-3.0.RC1 is not binary compatible with previous versions of eXist; the on-disk database file format has been updated, so users should perform a full backup and restore to migrate their data.
  • eXist 3.0.RC1 and subsequent versions now require Java 8; Users must update to Java 8!
  • Due to the legacy Full Text Index being removed, the text (http://exist-db.org/xquery/text) XQuery module has also been removed. Users should now look toward fn:analyze-string.
  • There have been some small changes to some of the internal APIs. e.g. XQueryService has been moved from DBBroker to BrokerPool.
  • EXPath packages that incorporate Java libraries may no longer work with eXist 3.0 and may need to be recompiled for our API changes; packages should now explicitly specify the eXist versions that they are compatible with.

eXist-3.0.RC1 is available for download from bintray.com. The older Sourceforge download page is no longer updated. Maven artifacts for eXist-3.0.RC1 are available from our mvn-repo.

XQuery 3.1 Arrays and JSON Support

The current development version of eXistdb includes full support for the array data type and related features from the XQuery 3.1 Candidate Recommendation. In combination with maps, arrays allow for a more "natural" representation of JSON in XQuery. Processing JSON or interfacing with external services returning JSON has become a lot more straightforward.

But even if you are only mildly interested in JSON, arrays are a welcome addition to the XQuery language, mainly because unlike sequences, arrays can be nested. I guess most XQuery programmers have encountered a situation in which it would have been nice to return a sequence of sequences from a function. And sometimes you may want to indicate that particular items in a result sequence are empty. With arrays you can do all that. Arrays may contain other arrays or maps, sequences or even the empty sequence as members.

Recorded presentation of this article during the eXistdb user pre-conference in Prague

Array Constructors

Array constructors come in two flavors: square and curly constructors. The square constructor will look familiar to most people:

    let $array := [1, (), (3, 4)]
    return $array(3)

Within the square constructor, the "," is just a separator (like in a function call), so the resulting array will correspond to the comma-separated members. In the example above we're retrieving the third member using $array(3), which is the sequence containing numbers 3 and 4. Getting the second member with $array(2) will return the empty sequence accordingly.

The curly constructor behaves slightly differently: it takes a sequence of items and creates an array member from each of them:

    let $array := array { 1, (), (3, 4) }
    return $array(3)

The query above returns 4! See the difference? The "," in this case is the XQuery sequence constructor, so the sequence from which the array is built is 1, 3, 4.

As mentioned above, you can arbitrarily mix sequences, arrays and maps, resulting e.g. in:

    let $books := [
        map {
            "title": "eXist",
            "author": [ [ "Adam", "Retter" ], [ "Erik", "Siegel" ] ],
            "language": "English"
        },
        map {
            "title": "XQuery",
            "author": [ [ "Priscilla", "Walmsley" ] ],
            "language": "English"
        }
    ]

Array Lookups

Just like a map, an array is also a function, which accepts a single integer parameter corresponding to the position of the member to retrieve (as always in XQuery, counting starts at 1). We have seen simple examples above. For nested data structures, just chain the function calls, e.g.:

$books(2)("title")

Alternatively, there's a lookup operator, which is often a bit easier to read. It works on arrays as well as maps (but not on other data types):

$books?2?title

The lookup may also appear inside a predicate. In this case, the left hand argument (the array or map) is often skipped and defaults to the context item:

$books?*[?title = "eXist"]?language

The operator accepts an integer, name, parenthesized expression or a wildcard as its right hand argument. So to use e.g. a variable for the lookup, wrap it in parentheses:

    let $field := "title"
    return $books?2?($field)

Note: because $books is an array, the lookup argument must evaluate to a sequence of integers or you'll see an error. It is possible to look up more than one array item at a time, e.g.: $books?(1 to 2)?title.

The wildcard returns all values of a map or all members of an array. When used on a map, it results in a sequence of the map's values, whereas on an array, you get a sequence of its members:

["Hello", "world", "!"]?*

Use the wildcard as a quick way to iterate an array:

for $book in $books?* return $book?title

Function Library

The XQuery 3.1 functions spec also includes a huge library of functions to process and modify arrays. All functions use the prefix array.

    array:size             returns the size of the array
    array:head             returns the first member
    array:tail             an array with all members except the first
    array:subarray         creates an array containing a subset of members
    array:reverse          reverse members
    array:for-each         iterate over members
    array:filter           filter the arrays with a function
    array:fold-left        apply function to members and collect results from left to right
    array:fold-right       apply function to members and collect results from right to left
    array:for-each-pair    iterate members pair-wise
    array:append           append a member to an array
    array:insert-before    insert new member
    array:remove           remove member
    array:join             concatenates the contents of several arrays into a single array

Like all data types in XQuery, arrays are immutable and cannot be modified. The functions above will thus always return a new array. eXist tries to implement this in an efficient way for functions like array:tail, array:append, array:subarray and array:remove without creating redundant copies.

Please note that I did not implement array:sort yet. It will be added later.

Many of the functions mirror other functions already available in the standard function library, but take an array instead of a sequence as input. For example, array:fold-left works like fn:fold-left.

    declare function local:price($hoursPerTask as array(xs:integer), $rate as xs:double) as xs:double {
        array:fold-left($hoursPerTask, 0.0, function($sum, $hours) {
            $sum + $hours * $rate
        })
    };

    local:price([3, 8, 6, 5, 2], 96.0)

In this example we multiply the hours required for some task by our hourly rate and return the sum.

JSON Support

Obviously, representing JSON data within an XQuery has become straightforward using maps and arrays. The function fn:parse-json takes a string of JSON data and returns either a map (for a JSON object), an array, an atomic value (xs:string, xs:double for numbers or xs:boolean), or the empty sequence (corresponding to null in JSON):

    let $json := '{"user": "bob", "id": 10, "valid": true}'
    let $user := parse-json($json)
    return $user?id

Note that by default parse-json is rather strict about the JSON syntax. For example, strings must use double quotes and duplicate keys generate an error. You can tell the function to be more relaxed about the JSON syntax by passing in a map of options:

    let $json := "{'user': 'bob', 'id': 10, 'valid': true}"
    let $options := map { "liberal": true(), "duplicates": "use-last" }
    let $user := parse-json($json, $options)
    return $user?id

To see the function in action on a real-world example, assume we would like to retrieve a list of commits from a git repository, using the HTTP/JSON API provided by github:

    xquery version "3.1";

    import module namespace http="http://expath.org/ns/http-client";

    declare function local:log($json as array(*)) {
        <table>
        {
            for $entry in $json?*
            let $commit := $entry?commit
            return
                <tr>
                    <td>{$commit?committer?date}</td>
                    <td>{$commit?committer?name}</td>
                    <td>{$commit?message}</td>
                </tr>
        }
        </table>
    };

    let $url := "https://api.github.com/repos/eXist-db/exist/commits?since=2015-01-01T00:00:00Z"
    let $request := <http:request method="GET" href="{$url}" timeout="30"/>
    let $response := http:send-request($request)
    return
        if ($response[1]/@status = "200") then
            local:log(parse-json(util:binary-to-string($response[2])))
        else
            ()

Here we're using the HTTP client module to talk to the GitHub API, which gives us more control over the communication. But there's also a simpler approach, using the fn:json-doc function:

    let $url := "https://api.github.com/repos/eXist-db/exist/commits?since=2015-01-01T00:00:00Z"
    return local:log(json-doc($url))

fn:json-doc retrieves the contents of the given URI and parses them using fn:parse-json. It works with external resources as well as binary documents stored in eXist. To access stored resources, just use a local path, e.g. /db/test/data.json.

Serialization

eXistdb has supported serialization to JSON for several years, but the old serializer was based on mapping an XML query result to JSON, which caused some difficulties at times, e.g. if you had to produce an array for a certain property, even if it was empty. Contrary to this, the new JSON output method defined by the XQuery 3.1 Serialization spec is straightforward: it takes an array, map, atomic value or empty sequence and produces valid JSON.

The JSON serializer is selected if you set the serialization option method to json. This applies to both the old and the new serializer. To distinguish between the two while preserving backwards compatibility, we use the following convention:

  • if the sequence to serialize is a single XML element node, the old serializer is used
  • if the sequence contains more than one item or the single item is not an XML element, it will be passed to the new serializer

This convention allows us to run all the old code unchanged without violating the 3.1 specification too much (according to the specs, a single XML element would be serialized to a JSON string).

To see the serializer in action, use the fn:serialize function:

    xquery version "3.1";

    declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";

    let $array := map { "k1": array { "v1", "v2" }, "k2": "v3" }
    return
        serialize($array,
            <output:serialization-parameters>
                <output:method>json</output:method>
            </output:serialization-parameters>)

or save the query and define the serialization method as an output option:

    xquery version "3.1";

    declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";

    declare option output:method "json";
    declare option output:media-type "application/json";

    map { "k1": array { "v1", "v2" }, "k2": "v3" }

Other Functions Using Arrays

Some applications require calling a function dynamically without knowing the number of arguments it takes in advance. Without arrays, this had been rather difficult to solve: because sequences cannot be nested, passing arguments containing more than one item has been tricky. For example, we solved this in the templating module by using function items. The code becomes rather bloated though.

The newly added fn:apply function makes this straightforward. It takes a function item as first argument and an array containing the parameters as second:

fn:apply(sum#1, [(1, 2, 3)])

Availability

The new features will be available in the eXistdb 2.3 release, but we encourage users to help us test them now. We tried to preserve backwards compatibility with existing XQuery code, so most, if not all, apps should work as before.

To experiment with arrays and JSON, just build from source or use a nightly. You may look through the test cases for some inspiration.

Finally, I also recommend watching Dannes' presentation on Mongrel: the MongoDB extension driver for eXistdb, which will rely on the features described in this article.

eXist-db preconference day @ XML Prague 2015

Dear eXistentialist,

In case it has escaped you: in keeping with our tradition, there will be an eXist-db meetup @ XML Prague 2015. The preconference day is Friday, February 13th; see our eXist-db info at http://preconference.info/xmlprague2015/index.html.

We already have a handful of confirmed presenters wanting to share stuff with you, but the more the merrier, and time flies, so if you wish to speak, present, ask or demo something, please contact us at mailto:info@preconference.info so that I can put your contribution in the programme.

The preliminary programme is to be found on http://xmlprague2015.preconference.info/xmlprague2015/program.html

Since the preconference day is an official part of the XML Prague conference you need to register for the conference too at http://www.xmlprague.cz/conference-registration/ if you have not already done so.

Welcome!

Leif-Jöran