The Bitrepository platform is used for long-term preservation of the newspaper data.

The bitrepository for newspapers will consist of:

In-house wiki description of the Bitrepository:


The bitrepository ingester takes care of archiving the jp2 files in the bitrepository archive. It does this by traversing the batch structure and performing the following steps for each jp2 file:

  1. Ensure that files are kept online for processing during the ingest process.
  2. Generate a unique FileID identifying the file in the repository.
  3. Ingest the file into the bitrepository, verifying the ingest using the checksum for the file.
  4. Register the archived file in the DOMS system.
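
The four steps above can be sketched roughly as follows. This is only an illustration, not the ingester's actual code: the callables `force_online`, `put_file` and `register_in_doms` are hypothetical stand-ins for the tape backend, bitrepository client and DOMS client, the FileID scheme (batch name plus relative path) is an assumption, and MD5 is assumed as the checksum algorithm. The sketch derives the FileID before the force-online call only because its hypothetical `force_online` is keyed by that ID.

```python
import hashlib
from pathlib import Path
from typing import Callable, List


def make_file_id(batch_root: Path, path: Path) -> str:
    """Hypothetical FileID scheme: batch directory name plus relative path."""
    return f"{batch_root.name}/{path.relative_to(batch_root).as_posix()}"


def md5_of(path: Path) -> str:
    """Compute the MD5 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest_batch(
    batch_root: Path,
    force_online: Callable[[str], None],
    put_file: Callable[[str, Path, str], str],
    register_in_doms: Callable[[str, str], None],
) -> List[str]:
    """Traverse the batch and archive every jp2 file; return the FileIDs."""
    ingested = []
    for path in sorted(batch_root.rglob("*.jp2")):
        # Step 2: generate a unique FileID for the file.
        file_id = make_file_id(batch_root, path)
        # Step 1: ask for the file to be kept online during processing.
        force_online(file_id)
        # Step 3: ingest, then compare the checksum reported back by the
        # repository (assumed return value of put_file) with the local one.
        checksum = md5_of(path)
        stored_checksum = put_file(file_id, path, checksum)
        if stored_checksum != checksum:
            raise ValueError(f"checksum mismatch for {file_id}")
        # Step 4: register the archived file in DOMS.
        register_in_doms(file_id, checksum)
        ingested.append(file_id)
    return ingested
```

Passing the three clients in as callables keeps the traversal logic independent of the real client libraries, so it can be exercised with simple fakes.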



The releaser handles the task of releasing files from their forced online state.

This should happen when a batch has been approved and all currently needed processing of the files has been completed.
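
The release condition described above could be expressed along these lines. This is a minimal sketch under stated assumptions: `BatchState`, its fields and the `release_online` callable are hypothetical names introduced for illustration, not part of the actual releaser.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class BatchState:
    """Hypothetical view of a batch as seen by the releaser."""
    approved: bool
    pending_jobs: Set[str] = field(default_factory=set)
    file_ids: List[str] = field(default_factory=list)


def release_if_done(batch: BatchState,
                    release_online: Callable[[str], None]) -> List[str]:
    """Release the batch's files only if it is approved and nothing is pending.

    Returns the FileIDs that were released (empty if the batch is not ready).
    """
    if not batch.approved or batch.pending_jobs:
        return []
    for file_id in batch.file_ids:
        release_online(file_id)
    return list(batch.file_ids)
```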

Data processing

As part of the general architecture, files are processed after they have been ingested into the repositories, i.e. the Bitrepository for data files (jpeg2000) and DOMS for metadata.

A tape-based pillar normally gives no guarantee of when files are online, which poses a problem when using it for processing. To solve this problem, the tape backend API (layer 2, Python) is being extended with additional methods so that:

The API methods for the tape backend are:

Additionally, the contract is that if a force-online call fails (i.e. the online files quota is exceeded), it will still be possible to ingest files into the bitrepository. They will, however, be rolled out to tape by the usual policy.
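
A caller could honour this contract roughly as shown below. This is a sketch, not the tape backend's real interface: the exception `OnlineQuotaExceeded` and the callables `force_online` and `put_file` are hypothetical names, since the actual API method names are not given here.

```python
from typing import Callable


class OnlineQuotaExceeded(Exception):
    """Hypothetical error raised when the online files quota is exceeded."""


def ingest_best_effort_online(
    file_id: str,
    force_online: Callable[[str], None],
    put_file: Callable[[str], None],
) -> bool:
    """Ingest a file whether or not it could be forced online.

    Returns True if the file is held online, False if the force-online call
    failed and the file will be rolled out to tape by the usual policy.
    """
    try:
        force_online(file_id)
        held_online = True
    except OnlineQuotaExceeded:
        # The contract still allows the ingest to proceed.
        held_online = False
    put_file(file_id)
    return held_online
```

Returning the outcome lets later processing steps know that a file may need to be fetched from tape first.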