Details
Task
Resolution: Fixed
Minor
None
None
None
Uncertain
NetarchiveSuite-5.0-Alpha
Description
This task includes writing extensive documentation on how to migrate.
Beware that jobs which have already been generated and are in one of the states submitted, started or failed will not be resubmittable. It is therefore a good idea to minimise the number of such jobs, for instance by disabling all active harvest definitions.
Consider adding a versioning scheme to the "order_templates" table. The version number would then also be added to the jobs table, to supplement the fields "orderxml" and "orderxmldoc".
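As a rough illustration of how a template version number on each job could be used, the sketch below flags jobs that would not be resubmittable. It is only a sketch under stated assumptions: the class, enum and method names are hypothetical and not taken from NetarchiveSuite, and the integer version values (1 for Heritrix 1 templates, 3 for Heritrix 3 templates) are an assumed encoding.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the NetarchiveSuite code base. */
final class TemplateVersionSketch {

    /** The three states named in the description, plus DONE as a stand-in for any other state. */
    enum JobState { SUBMITTED, STARTED, FAILED, DONE }

    /** States in which an already generated job would otherwise be a candidate for resubmission. */
    static final Set<JobState> AFFECTED_STATES =
            EnumSet.of(JobState.SUBMITTED, JobState.STARTED, JobState.FAILED);

    /** Assumed version tags, stored with each template and copied onto each job. */
    static final int HERITRIX_1_TEMPLATE = 1;
    static final int HERITRIX_3_TEMPLATE = 3;

    private TemplateVersionSketch() {
    }

    /**
     * Returns true if a job is in an affected state and was generated from a
     * template whose version differs from the version currently in use.
     */
    static boolean blocksResubmission(JobState state, int jobTemplateVersion, int currentVersion) {
        return AFFECTED_STATES.contains(state) && jobTemplateVersion != currentVersion;
    }

    /** Counts the jobs that could not be resubmitted; the two arrays are parallel. */
    static int countBlocked(JobState[] states, int[] jobTemplateVersions, int currentVersion) {
        if (states.length != jobTemplateVersions.length) {
            throw new IllegalArgumentException("states and versions must have the same length");
        }
        int blocked = 0;
        for (int i = 0; i < states.length; i++) {
            if (blocksResubmission(states[i], jobTemplateVersions[i], currentVersion)) {
                blocked++;
            }
        }
        return blocked;
    }
}
```

A count like this, run before the migration, would show how many jobs are still in the affected states and so whether disabling harvest definitions has had the intended effect.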
Note that Heritrix 3.0.0 comes with a tool for converting Heritrix 1.X templates to Heritrix 3.X templates. The following is taken from the Heritrix 3 release notes.
Migration Tool Limited
The current migration utility, executable class org.archive.crawler.migrate.MigrateH1to3Tool, only works reliably for changed basic order.xml configuration values as reflected in our bundled default configuration. Also, it makes no effort to convert H1 per-domain/per-host settings overrides. By providing a model H3-based configuration with some values brought over, and reporting the values it cannot convert, it may still provide a useful base for other hand-conversion. Please let us know what advanced configuration in your Heritrix1 crawls is most important to support in the automated tool.