Changes (Success)

Summary

  1. dedupIndexer can now send progress info to hadoop and thus hopefully prevent timeouts (commit: 936944d28b6add356f10837d7f4b1f5f3f8efa39)
  2. Fixed bitmag getfileids and some cleanup (commit: 520e4de5e267e3af2d40fa6e78803c15a33df510)
  3. Writing direct to hdfs. (commit: 2ab46b46fece97462f8000d14ddcf3017c77dd2c)
  4. Added direct output streaming from hdfs (commit: 480e7411841e986bc48e2bc9fa7d5f759f5eead7)
  5. Fixed some issues with holding large hadoop result sets in memory (commit: c5d6c5d38bb0cfe52a15493ccde9fa24a07e6a8d)
Commit 936944d28b6add356f10837d7f4b1f5f3f8efa39 by Asger Askov Blekinge (abr)
dedupIndexer can now send progress info to hadoop and thus hopefully
prevent timeouts
(commit: 936944d28b6add356f10837d7f4b1f5f3f8efa39)
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/hadoop/DedupIndexer.java
The file was added wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/hadoop/ProgressableOutputStream.java
Commit 520e4de5e267e3af2d40fa6e78803c15a33df510 by Colin Rosenthal (csr)
Fixed bitmag getfileids and some cleanup
(commit: 520e4de5e267e3af2d40fa6e78803c15a33df510)
The file was modified common/common-core/src/main/java/dk/netarkivet/common/distribute/bitrepository/action/getfileids/GetFileIDsAction.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/WaybackIndexer.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/ArchiveFile.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/FileNameHarvester.java
The file was modified common/common-core/src/main/resources/dk/netarkivet/common/settings.xml
The file was modified common/common-core/src/main/java/dk/netarkivet/common/CommonSettings.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/IndexerQueue.java
The file was modified common/common-core/src/main/java/dk/netarkivet/common/distribute/bitrepository/action/getfileids/GetFileIDsEventHandler.java
The file was modified common/common-core/src/main/java/dk/netarkivet/common/utils/service/FileResolverRESTClient.java
Commit 2ab46b46fece97462f8000d14ddcf3017c77dd2c by Colin Rosenthal (csr)
Writing direct to hdfs.
(commit: 2ab46b46fece97462f8000d14ddcf3017c77dd2c)
The file was added wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/hadoop/CDXStrategy.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/ArchiveFile.java
Commit 480e7411841e986bc48e2bc9fa7d5f759f5eead7 by Colin Rosenthal (csr)
Added direct output streaming from hdfs
(commit: 480e7411841e986bc48e2bc9fa7d5f759f5eead7)
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/WaybackIndexer.java
The file was modified common/common-core/src/main/java/dk/netarkivet/common/utils/hadoop/HadoopJobTool.java
The file was modified common/common-core/src/main/java/dk/netarkivet/common/utils/hadoop/HadoopJobUtils.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/ArchiveFile.java
Commit c5d6c5d38bb0cfe52a15493ccde9fa24a07e6a8d by Colin Rosenthal (csr)
Fixed some issues with holding large hadoop result sets in memory
(commit: c5d6c5d38bb0cfe52a15493ccde9fa24a07e6a8d)
The file was modified harvester/harvester-core/src/main/java/dk/netarkivet/viewerproxy/webinterface/Reporting.java
The file was modified harvester/harvester-core/src/main/java/dk/netarkivet/harvester/indexserver/RawMetadataCache.java
The file was modified common/common-core/src/main/java/dk/netarkivet/common/utils/hadoop/HadoopJobUtils.java
The file was modified wayback/wayback-indexer/src/main/java/dk/netarkivet/wayback/indexer/ArchiveFile.java