Details
- New Feature
- Resolution: Fixed
- Minor
- None
- 3.8
- None
Description
It is currently possible to disable deduplication in harvesting jobs, either by removing the DeDuplicator from the default Heritrix template or by disabling the DeDuplicator in that template.
However, the deduplication indices are still requested and fetched from the index server in either case.
The quick fix is to fetch the deduplication index from the index server only if deduplication is enabled in the Heritrix template used by the harvesting job.
The more general solution is probably to add a deduplication on/off field to the 'harvestdefinitions' table.
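
The quick fix could be sketched roughly as follows. This is only an illustration, not the actual patch: the class and method names (DedupCheck, isDeduplicationEnabled) are hypothetical, and it assumes the template is a Heritrix 1.x order.xml in which the DeDuplicator appears as a newObject element named 'DeDuplicator' with a boolean child named 'enabled'. A template with no such element, or with 'enabled' set to false, is treated as having deduplication turned off.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper; not part of the existing code base. */
public class DedupCheck {

    // Assumed location of the DeDuplicator's "enabled" flag in the template.
    private static final String ENABLED_XPATH =
            "//newObject[@name='DeDuplicator']/boolean[@name='enabled']";

    /**
     * Returns true only if the template contains a DeDuplicator
     * whose "enabled" setting is true.
     */
    public static boolean isDeduplicationEnabled(String orderXml)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(orderXml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Evaluates to the empty string when no matching node exists,
        // i.e. when the DeDuplicator has been removed from the template.
        String value = (String) xpath.evaluate(
                ENABLED_XPATH, doc, XPathConstants.STRING);
        return "true".equalsIgnoreCase(value.trim());
    }
}
```

The code that prepares a harvesting job would then call isDeduplicationEnabled on the job's template and request the index from the index server only when the result is true.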