or ABESfosant for short.

And because I am a lazy bastard, autonomous components will be called autocomps from here on.

Batches, Events and Items

The word Batch has been overloaded, so it will no longer be used. In its place, we now use the word Item. An Item is a thing with a history, consisting of Events. Items are the only things an autocomp can trigger on. Almost everything in the DOMS can be an Item, but not all Objects are Items.

Each Item will have the content model ContentModel_Item. It defines (requires) the Item to have an Events datastream. Each Item will also have a "lastModified" timestamp, but this is something all Fedora Objects already have.
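As a rough illustration of the model described above, an Item can be pictured as an object with a lastModified timestamp and a list of named, timestamped Events. The class and field names below (`Item`, `Event`, `pid`, `latest`) are hypothetical and only meant as a sketch; the real representation would be a Fedora Object with an Events datastream.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Event:
    name: str            # e.g. the Event name an autocomp is associated with
    timestamp: datetime  # when the Event was recorded


@dataclass
class Item:
    pid: str                  # hypothetical identifier of the Fedora Object
    last_modified: datetime   # every Fedora Object already carries this
    events: List[Event] = field(default_factory=list)  # the Item's history

    def latest(self, event_name: str) -> Optional[Event]:
        """Return the most recent Event with the given name, or None."""
        matching = [e for e in self.events if e.name == event_name]
        return max(matching, key=lambda e: e.timestamp, default=None)
```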


Each autocomp is associated with an Event name.

An autocomp will have to query for all Items which have

  • a specific type/content model
  • a lastModified timestamp later than the timestamp of the named Event in the Item's history, or no EVENTS datastream at all
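The selection criteria above can be sketched as a simple predicate. This is only an illustration: the function name and argument layout are hypothetical, and treating an Item whose EVENTS datastream lacks the named Event as needing work is an assumption that the text above does not spell out.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional


def needs_work(
    content_models: Iterable[str],
    last_modified: datetime,
    events: Optional[Dict[str, datetime]],  # None means "no EVENTS datastream"
    wanted_model: str,
    event_name: str,
) -> bool:
    """Decide whether an autocomp should pick up this Item."""
    if wanted_model not in content_models:
        return False            # wrong type/content model
    if events is None:
        return True             # no EVENTS datastream at all
    stamp = events.get(event_name)
    if stamp is None:
        return True             # ASSUMPTION: Event never recorded, so work is due
    return last_modified > stamp  # modified after the Event was recorded
```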

Furthermore, one execution should be able to grab a large number of Items to work on. Currently we can select how many Items are worked on concurrently, but we will also need to be able to select how many should be locked and queued. This is necessary because we expect many of the autocomps to be very quick. An autocomp is executed at fixed intervals, usually something like every 5 minutes. For performance, it is vital that most executions take longer than this interval, so that a quick autocomp does not sit idle between runs.
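A minimal sketch of what separating the two numbers could look like: one setting for how many Items are locked and queued per execution, and another for how many are worked on concurrently. Everything here is hypothetical (`run_once`, `queue_size`, `concurrency`), and the in-memory `locked` set merely stands in for whatever locking mechanism is actually used.

```python
from concurrent.futures import ThreadPoolExecutor


def run_once(candidates, work, queue_size, concurrency, locked):
    """One execution: lock up to queue_size Items, process `concurrency` at a time."""
    batch = []
    for pid in candidates:
        if len(batch) >= queue_size:
            break
        if pid not in locked:   # skip Items another execution already holds
            locked.add(pid)
            batch.append(pid)
    try:
        # The pool drains the queued batch, never running more than
        # `concurrency` Items at the same time.
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            results = list(pool.map(work, batch))
    finally:
        for pid in batch:       # always release the locks taken above
            locked.discard(pid)
    return dict(zip(batch, results))
```

With a quick autocomp, `queue_size` would be set much higher than `concurrency`, so that a single execution has enough queued work to fill the interval.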


We will have to upgrade to a Solr-based search index. Actually, we do not NEED to do so, but that's another story.

Solr gives us the ability to compare two properties of a record within the search query. This is not possible in Lucene. The point is that we can then query for all Items that have lastModified > lastEventTimestamp.
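One way such a comparison might be expressed is with Solr's function range query parser (`frange`) together with the `ms(a,b)` date function, which yields a minus b in milliseconds. The sketch below only builds the request parameters. The field names `contentModel`, `lastModified` and `lastEventTimestamp` are assumptions about the index schema, and both date fields would have to be indexed as Solr date types for this to work.

```python
from urllib.parse import urlencode


def build_params(content_model, rows=1000):
    """Build Solr select parameters for Items modified after their last Event."""
    return {
        # Restrict to one content model (field name is an assumption).
        "q": f'contentModel:"{content_model}"',
        # Keep only records where lastModified - lastEventTimestamp > 0 ms.
        "fq": "{!frange l=0 incl=false}ms(lastModified,lastEventTimestamp)",
        "rows": rows,
    }


# The encoded string would be appended to the index's select URL.
query_string = urlencode(build_params("ContentModel_Item"))
```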

We will furthermore have to rework the code that retrieves the results from SBOI, as we currently just retrieve the first 1000 and forget about the rest. We will have to sort the results on lastModified; only then can we page through them.
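The paging could look roughly like the following sketch, which keeps requesting pages in lastModified order until a short or empty page signals the end. The `search` callable and its `start`, `rows` and `sort` parameters are hypothetical stand-ins for the real SBOI client, not its actual interface.

```python
def iter_results(search, page_size=1000):
    """Yield every result, fetching one sorted page at a time."""
    start = 0
    while True:
        # A stable sort order is what makes the page offsets meaningful.
        page = search(start=start, rows=page_size, sort="lastModified asc")
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return              # short page: nothing more to fetch
        start += len(page)
```

Offset-based paging like this can still miss or repeat records if Items are modified while the pages are being fetched, so the real implementation may need to guard against that.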

Process Monitor

We will have to consider whether the standard process monitor is the right way to display the work on the Items. We expect to have in the vicinity of 300k Items when we promote all Editions to Items.

Published Objects

It would be useful to allow the EVENTS datastream to be edited on Published Objects. Otherwise, there could end up being a number of calls made, after the event has been timestamped, to get the object published again. It is much simpler to say that EVENTS are not protected by the publishing.
