
This is a system for exporting recently updated (or newly created) DOMS entries as XML files that can be imported into LARM. The source code is maintained at https://github.com/statsbiblioteket/larm-doms-exporter .

Export Format

A sample XML export file, with clarification of the elements it contains, is kept in the codebase. See the file src/main/resources/CHAOS_envelope_template_test.xml .

DOMS Export Logic

The basic logic is based on that used in the Broadcast Transcoder Application (bta). It uses a database-backed object-store as a persistent queue/log. Each entry in the persistence layer is a tuple: (doms-pid, doms-timestamp, last-exported-timestamp, status). The status field takes one of the values PENDING, REJECTED, FAILED or COMPLETE. For objects in status COMPLETE, the two timestamps should be equal.
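
As an illustration only, the tuple described above could be modelled as follows. The class and field names are hypothetical and are not taken from the codebase; the default status of PENDING is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class Status(Enum):
    """The four status values listed above."""
    PENDING = "PENDING"
    REJECTED = "REJECTED"
    FAILED = "FAILED"
    COMPLETE = "COMPLETE"


@dataclass
class ExportRecord:
    """One store-entry: (doms-pid, doms-timestamp, last-exported-timestamp, status)."""
    doms_pid: str
    doms_timestamp: datetime
    last_exported_timestamp: Optional[datetime] = None  # null until first handled
    status: Status = Status.PENDING  # assumed default, not stated in the text

    def is_consistent(self) -> bool:
        """Check the invariant that COMPLETE entries have equal timestamps."""
        if self.status is Status.COMPLETE:
            return self.doms_timestamp == self.last_exported_timestamp
        return True
```

The `is_consistent` check simply restates the rule that the two timestamps agree for COMPLETE entries; entries in any other status are not constrained by it.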

The logic has producer and consumer phases.

Producer Phase:
  1. Find the most recent doms-timestamp among objects in status COMPLETE. This is the timestamp of the last consumed object.
  2. Query DOMS for all objects in the relevant Collection whose last-updated timestamp is later than this, together with those timestamps.
  3. For each entry retrieved from DOMS, if there is already a store-entry with this doms-pid, update its doms-timestamp and set its status to PENDING.
  4. Otherwise create a new store-entry for this object, with last-exported-timestamp set to null.
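
The four producer steps can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the project's implementation: the store is a plain in-memory dict, `query_doms_since` is a hypothetical callable standing in for the DOMS query, new entries are assumed to start as PENDING, and an empty store is assumed to mean "query everything".

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical in-memory stand-in for the object-store: doms-pid -> entry.
Store = Dict[str, dict]
DomsQuery = Callable[[Optional[datetime]], Iterable[Tuple[str, datetime]]]


def run_producer(store: Store, query_doms_since: DomsQuery) -> Optional[datetime]:
    """Run one producer pass and return the cut-off timestamp that was used."""
    # Step 1: most recent doms-timestamp among COMPLETE entries.
    completed = [e["doms_timestamp"] for e in store.values()
                 if e["status"] == "COMPLETE"]
    since = max(completed) if completed else None  # None: assumed "no cut-off"

    # Step 2: ask DOMS for (pid, last-updated timestamp) pairs after the cut-off.
    for pid, timestamp in query_doms_since(since):
        entry = store.get(pid)
        if entry is not None:
            # Step 3: known object, refresh its timestamp and re-queue it.
            entry["doms_timestamp"] = timestamp
            entry["status"] = "PENDING"
        else:
            # Step 4: new object, never exported so far.
            store[pid] = {
                "doms_timestamp": timestamp,
                "last_exported_timestamp": None,
                "status": "PENDING",  # assumption: new entries start as PENDING
            }
    return since
```

Note that step 3 leaves last-exported-timestamp untouched, so the consumer can later compare the state at the previous export with the current one.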
Consumer Phase:

Fetch from the object-store an ordered list of all PENDING exports and put them in a queue, oldest doms-timestamp first.

For each entry in the queue:
	if the object is not a radio program:
		mark as REJECTED and move to next object
	else if the object has no shard analysis:
		delete the object from the database
	else if the object is REJECTED or FAILED in the bta database:
		mark as REJECTED and move to next object
	else if the object is PENDING or missing(*) in the bta database:
		leave as PENDING in LDE database and move to next object
	else (i.e. the object is COMPLETE in the bta database):
		if there is a significant change in the DOMS object between doms-timestamp and last-exported-timestamp:
			export object
		update last-exported-timestamp to doms-timestamp
		mark as COMPLETE

(*) Note that the bta database is only consulted for objects that have a shard analysis. Therefore we do not usually expect to find such objects missing from the bta database.
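
The consumer decision tree above can be sketched as follows. This is an illustrative sketch only: the store is a plain dict, and the five callables (`is_radio_program`, `has_shard_analysis`, `bta_status`, `has_significant_change`, `export`) are hypothetical hooks standing in for the DOMS, bta and export integrations, none of which are specified here. `bta_status` is assumed to return `None` for an object missing from the bta database.

```python
from datetime import datetime
from typing import Callable, Dict, Optional


def run_consumer(
    store: Dict[str, dict],
    is_radio_program: Callable[[str], bool],
    has_shard_analysis: Callable[[str], bool],
    bta_status: Callable[[str], Optional[str]],
    has_significant_change: Callable[[str, Optional[datetime], datetime], bool],
    export: Callable[[str], None],
) -> None:
    """Process all PENDING entries, oldest doms-timestamp first."""
    queue = sorted(
        (pid for pid, e in store.items() if e["status"] == "PENDING"),
        key=lambda pid: store[pid]["doms_timestamp"],
    )
    for pid in queue:
        entry = store[pid]
        if not is_radio_program(pid):
            entry["status"] = "REJECTED"
        elif not has_shard_analysis(pid):
            del store[pid]  # drop the entry from the store
        else:
            state = bta_status(pid)  # only consulted when a shard analysis exists
            if state in ("REJECTED", "FAILED"):
                entry["status"] = "REJECTED"
            elif state is None or state == "PENDING":
                continue  # stays PENDING, retried on a later run
            else:  # COMPLETE in bta
                if has_significant_change(
                    pid, entry["last_exported_timestamp"], entry["doms_timestamp"]
                ):
                    export(pid)
                entry["last_exported_timestamp"] = entry["doms_timestamp"]
                entry["status"] = "COMPLETE"
```

Passing the checks in as callables keeps the ordering of the branches visible in one place, which is the part the pseudocode is really specifying.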

 

 

 


2 Comments

  1. Performance-wise, the producer part of this chain could be vastly improved by pulling information from the bta database instead of DOMS. In step 2 of the producer phase, we would then query bta for all objects in state COMPLETE with a doms-timestamp after that of the most recently exported object in lde.

  2. On second thoughts, we also need to take all objects in state PENDING from bta: because of the way bta processes its queues, there can be objects in state PENDING that are older than some in state COMPLETE.