Dashboard

Ah, I didn't notice that index() and indexFile() of course take different parameters, but I guess the file could still be retrieved so that line 65 becomes something like indexer.indexFile(warcFile);

The indexer is dynamically assigned here based on the file type, but the indexer variable is not dynamically assigned in the part at line 65 - any reason for this?
If dynamic assignment is used, the indexer fields at lines 29-30 could be replaced by a single field of the Indexer class, no?
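
A minimal sketch of what this comment suggests, assuming the two indexers can share a common type. The names `Indexer`, `WarcIndexer`, `ArcIndexer` and `forFile` are hypothetical and only illustrate the idea; they are not the classes under review.

```java
import java.io.File;
import java.util.Locale;

class IndexerSelection {
    /** Hypothetical common interface; the real code under review may differ. */
    interface Indexer {
        String indexFile(File file);
    }

    /** Stand-in for a WARC-specific indexer. */
    static class WarcIndexer implements Indexer {
        @Override
        public String indexFile(File file) {
            return "warc:" + file.getName();
        }
    }

    /** Stand-in for an ARC-specific indexer. */
    static class ArcIndexer implements Indexer {
        @Override
        public String indexFile(File file) {
            return "arc:" + file.getName();
        }
    }

    /** Picks the indexer from the file name, so callers hold a single Indexer variable. */
    static Indexer forFile(File file) {
        String name = file.getName().toLowerCase(Locale.ROOT);
        if (name.endsWith(".warc") || name.endsWith(".warc.gz")) {
            return new WarcIndexer();
        }
        if (name.endsWith(".arc") || name.endsWith(".arc.gz")) {
            return new ArcIndexer();
        }
        throw new IllegalArgumentException("Unsupported file type: " + file.getName());
    }

    public static void main(String[] args) {
        File archiveFile = new File("example.warc.gz");
        Indexer indexer = forFile(archiveFile); // one variable, assigned by file type
        System.out.println(indexer.indexFile(archiveFile));
    }
}
```

With this shape, both call sites could use the same `indexer` variable instead of two type-specific fields.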

Cleanup repository: remove unused docker stuff, rewrite README to reflect changes

Integration of Hadoop dedup indexing with GetMetadataArchiveMapper now works - still needs a few tweaks, though

Dedup Indexing of Metadata Files
Enable indexing of deduplication records in metadata files. This is done in the same Hadoop class as "ordinary" indexing, but with an alternative execution path chosen from the filename. The existing (batch job) indexing class is simply reused.
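
A minimal sketch of filename-based path selection as described above. The naming pattern for metadata files and the names `IndexingPathSketch`, `IndexingPath` and `choosePath` are assumptions made for illustration; the actual class may detect metadata files differently.

```java
import java.util.regex.Pattern;

class IndexingPathSketch {
    /** Assumed naming convention for metadata files, e.g. "42-metadata-1.warc.gz". */
    private static final Pattern METADATA_NAME =
            Pattern.compile("\\d+-metadata-\\d+\\.w?arc(\\.gz)?");

    /** The two execution paths inside the same (hypothetical) job class. */
    enum IndexingPath { ORDINARY, DEDUP_METADATA }

    /** Chooses the execution path from the file name alone. */
    static IndexingPath choosePath(String fileName) {
        return METADATA_NAME.matcher(fileName).matches()
                ? IndexingPath.DEDUP_METADATA
                : IndexingPath.ORDINARY;
    }

    public static void main(String[] args) {
        System.out.println(choosePath("42-metadata-1.warc.gz"));
        System.out.println(choosePath("42-117-20210101-00001-host.warc.gz"));
    }
}
```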

Remember to include ServiceUrl.class in list of service classes

Tiny settings change for NARK-1882 review

Added a conf flag to switch between standard indexing and dedup indexing for metadata files.
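
A minimal sketch of such a flag, using plain `java.util.Properties` as a stand-in for the job configuration. The key name `indexer.metadata.dedup` and the default of standard indexing are assumptions, not taken from the commit.

```java
import java.util.Properties;

class DedupFlagSketch {
    /** Hypothetical configuration key; the real flag name is not given in the commit message. */
    static final String DEDUP_FLAG = "indexer.metadata.dedup";

    /** Returns true when dedup indexing is requested; assumed to default to standard indexing. */
    static boolean useDedupIndexing(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(DEDUP_FLAG, "false"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DEDUP_FLAG, "true");
        String mode = useDedupIndexing(conf) ? "dedup indexing" : "standard indexing";
        System.out.println("Metadata files use " + mode);
    }
}
```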

test corrections excludes .gz

Fix inverted check for existing file (stupid bug introduced in review followup)

Added tests

Changed test to use paths relative to module root

Make it configurable which operation timeout is used in pillar-acceptance tests

added test methods for archive files and negative testing

Bump message-xml version to 31-SNAPSHOT

Merge branch 'getFileIT' of github.com:bitrepository/reference into getFileIT

Bump message-xml version to 31-SNAPSHOT

Update versioning in examples

Update definition of AllFileIDs to match the actual use of the protocol. This retains backwards compatibility while updating the specification. The element should eventually be redefined as a contentless element, but as that would be a breaking change, it will be grouped with other breaking changes.

Followup after review

Followup after review

Merge branch 'NARK-1882-hadoop-indexing' of https://github.com/netarchivesuite/netarchivesuite into NARK-1882-hadoop-indexing

Small refactor of ArchiveFile/HadoopUtils, a few touch-ups, and started on the metadata job