NetarchiveSuite-Webdanica

Bumped version to 2.1-SNAPSHOT

Fix missing logging in rejectedLog of URLs rejected by UrlUtils.isRejectableURL - only duplicates were written to rejectedLog

WEBDAN-269: Tweaked the menu.

Removed references to old webdanica-version

WEBDAN-273 - remove existing disabled seeds in webdanicaseeds before updating it

Update on solution for WEBDAN-273 so that existing seeds in webdanica-seeds are also checked against the current defaultseedlist

Fixed critical bug WEBDAN-276

Pushed version to 2.0

Implemented WEBDAN-273

    • -8 +22 /scripts/cronjobs/check_apps_alive.sh
    • -0 +106 /webdanica-webapp/src/main/webapp/notice_master.html
Fix for WEBDAN-270 - changed the use of an ignoredProtocols list to an acceptedProtocols list

  1. … 8 more files in changeset.
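
The WEBDAN-270 change from a deny list to an accept list can be illustrated with a minimal sketch. The class name `ProtocolFilter`, the method `hasAcceptedProtocol`, and the accepted schemes `http` and `https` are assumptions made for illustration; the commit message does not state the actual list or where the check lives in Webdanica.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ProtocolFilter {
    // Hypothetical accept list; the real acceptedProtocols values are not given in the log.
    private static final Set<String> ACCEPTED_PROTOCOLS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("http", "https")));

    /**
     * Returns true only if the URL parses and its scheme is on the accept list.
     * Anything unknown (mailto:, javascript:, malformed input) is rejected by default,
     * which is the main difference from a deny list.
     */
    public static boolean hasAcceptedProtocol(String url) {
        if (url == null) {
            return false;
        }
        try {
            String scheme = new URI(url.trim()).getScheme();
            return scheme != null
                    && ACCEPTED_PROTOCOLS.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Unparseable input is treated as not accepted.
            return false;
        }
    }
}
```

With an accept list, a scheme nobody thought of is rejected without any extra configuration, whereas a deny list has to be extended for every new unwanted scheme.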
Follow-up after initial accept-test. loadDomains.sh now uses --accepted argument like loadSeeds. PHOENIX_CLIENT_JAR now added to setenv.sh and used by ingestTool.sh. Error in loadDomains.sh fixed. Updates to installation manual after being used to install webdanica in PROD

  1. … 6 more files in changeset.
Forgot to add the improved version of RunNetarchiveSuite.sh

Fixed WEBDAN-266 - now checks the number of NetarchiveSuite apps alive as well

WEBDAN-262 - removed the logging of the ignored cdx-lines. It gives too much noise

Bumped version to 2.0-RC4

Final fix for WEBDAN-262 - the logic in the SingleSeedHarvest.isRecordForJob() method was faulty. A unit test has been added for that method as well

    • -0 +288 /webdanica-core/src/test/resources/batch_result_for_jobid_15.txt
Described remedy for WEBDAN-263 in workflow install-guide, and removed output if no harvestlogs are found

    • -1 +5 /workflow-template/webdanica-analysis-cron.sh
Fix for WEBDAN-262 - changed the way we split up the record key to find the 'jobid' value

Fix for WEBDAN-262 - getReports returns too much data

Work on WEBDAN-261 - made a LocalArcRepositoryClientReadableOnly class that shouldn't have write privileges

Fixed critical bug WEBDAN-252 - now we check, that settings.common.tempDir exists and is writable by tomcat
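
The kind of startup check described for WEBDAN-252 might look like the sketch below. The class `TempDirCheck` and the method `isUsableTempDir` are hypothetical names; how Webdanica actually reads `settings.common.tempDir` and reports a failure is not stated in the log, so the path is simply passed in as a string here.

```java
import java.io.File;

public class TempDirCheck {
    /**
     * Returns true if the given path is an existing directory that the
     * current process (for example the Tomcat user) is allowed to write to.
     */
    public static boolean isUsableTempDir(String path) {
        if (path == null || path.isEmpty()) {
            return false;
        }
        File dir = new File(path);
        // isDirectory() is false for missing paths and for regular files.
        return dir.isDirectory() && dir.canWrite();
    }
}
```

Running such a check once at webapp startup lets a misconfigured temp directory fail fast with a clear message instead of surfacing later as an unrelated I/O error.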

Fixed WEBDAN-257 and updated the bundled crontabs in scripts/cronjobs

    • -2 +2 /workflow-template/verify_pig_bootup.sh
More corrections to the installation manual. Moved the sample crontabs to the cronjobs folder

    • -0 +12 /scripts/cronjobs/crontab.test
    • -0 +11 /scripts/cronjobs/crontab.webdanica
Simplified zipball layout. Moved 'scripts' folder including cronjobs to root-folder 'scripts'

    • -0 +11 /scripts/cassandra/create_blacklists.txt
    • -0 +68 /scripts/cassandra/create_criteriaresults.txt
    • -0 +19 /scripts/cassandra/create_harvests.txt
    • -0 +25 /scripts/cassandra/create_ingestlog.txt
    • -0 +2 /scripts/cassandra/create_keyspace.txt
    • -0 +28 /scripts/cassandra/create_seeds.txt
    • -0 +23 /scripts/cronjobs/check_apps_alive.sh
    • -0 +14 /scripts/cronjobs/cleanup_oldjobs.sh
    • -0 +2 /scripts/cronjobs/rebootTomcat
  1. … 68 more files in changeset.
Upgraded maven-war-plugin and maven-assembly-plugin to the latest version, 3.1.0

Removed obsolete templates folder

    • -20 +0 /templates/criteriaRun-combinedCombo-seq.pig
    • -32 +0 /templates/log4j_hadoop-pig.properties
    • -13 +0 /templates/parse-text-extraction.README
    • -13 +0 /templates/parse-text-extraction.sh
    • -20 +0 /templates/scripts/criteriaRun-combo-v1-seq.pig
    • -32 +0 /templates/scripts/criteriaRun-combo.pig
    • -20 +0 /templates/scripts/criteriaRun-comboNov-v1-seq.pig
  1. … 23 more files in changeset.