søren vejrup carlsen in NetarchiveSuite-Github

Set version to 5.3.2-RC2

Added the jobscheduler-synchronization fix to 5.2.3 branch

Work on NAS-1722 - finished implementing the modified WaybackIndexer. Some cleanup probably needed

Work related to NAS-2722. Made a tool to generate deduplication CDXes that looks up the deduplicationmigration entries, if they exist

Merge pull request #74 from netarchivesuite/NAS-2702

Nas 2702

NAS-2710 minor change after review

Spring-cleaning of the NetarchiveResourceStore class: removed unused code, and unused imports

Merge pull request #72 from aponb/NAS-2710

NAS-2696 - changing copyright line to end with -2018 instead of -2017

NAS-2712 - improve logging in cacheData method. Add url and mimepattern to the request to distinguish between the cdx request and the crawl-log request

NAS-2702 - improved logging, and some refactoring

NAS-2702 - changed default maxTotalSize to 8000000. Added a setting to switch between the two different ways of fetching domainconfigs for an iterative snapshot harvest. The old way is the default. Finally, changed a log.debug statement to log.trace in the Domain.setCrawlerTraps method

NAS-2712 - the fix for this is postponed to a future release. The changed logging clarifies the problem

Set pom version to 5.3.2-RC2

Update RawMetadataCache to be compliant with internal compression requirements for indexing

Make sure that BitarchiveMonitorTester doesn't break the unit tests

Follow-up - NAS-2704

Removed useless synchronization from method signature (NAS-2702)

Removed 5 instances of jobgenerationperiode from test data (NAS-2704)

NAS-2702 - added extra logging to checkSpecificAcceptConditions() method

Fixed NAS-2711 - minor improvements to our GUI-pages

NAS-2702 - 1) Changed some logging from log.debug to log.trace. 2) Corrected the DomainDBDAO.getHarvestInfoForDomainInHarvest() method to select the latest entry if there are multiple entries for the same domainconfig in the same harvest (this can happen if the job is resubmitted)

NAS-2702 - Added missing DISTINCT to DomainDBDAO#getDomainsInSnapshotHarvestOrder() method

Added more logging to the AbstractJobGenerator to log the two reasons why canAccept() returns false - NAS-2702

Merge pull request #71 from scheylord/master

Fixing NAS-2650 & NAS-2649

NAS-2702 - optimization of the algorithm: start by filtering out domains absent from the historyInfo for the previous harvest

Work on NAS-2702. Fixed missing synchronization in JobGeneratorTask. Reduced logging noise in DomainDBDAO.readKnown()

Fix minor issue with the setting settings.harvester.scheduler.jobgenerationperiode (NAS-2704)

NAS-2702 - work on fixing the job generation for a snapshot harvest continuing a previous harvest