Description
Problem with the analysis workflow: a lot of seeds in state "klar til analyse" (ready for analysis) never exit that state.
One of them is:
https://twitter.com/intent/tweet?via=dmidk&url=http://beta.dmi.dk/en/learn/temaer/klima/fremtidens-ekstreme-vejr/
These cases should be looked into: what is actually harvested, if anything, and
why the analysis does not move the seed from state "klar til analyse" to one of the states
"seed færdigbehandlet" (seed fully processed), "Seed afventer kurator beslutning" (seed awaiting curator decision), or "Analysen på seeden gik galt" (the analysis of the seed failed).
Note also the high number of entries in state "Analysen på seeden gik galt" (2203), which could mean that something is wrong there.
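As a starting point for that investigation, the stuck seeds could be split by whether they appear in any harvest log at all. This is only a sketch: the class StuckSeedTriage, the method triage, and both inputs are hypothetical, and nothing is assumed here about the harvest log format or the webdanica database API. The caller is expected to supply the seed URLs already extracted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class StuckSeedTriage {

    // Splits the stuck seeds into those mentioned in some harvest log
    // ("harvested") and those mentioned in none ("notHarvested").
    // Both arguments are assumed to be prepared by the caller.
    static Map<String, List<String>> triage(List<String> stuckSeeds,
                                            Set<String> seedsInHarvestLogs) {
        Map<String, List<String>> result = new TreeMap<>();
        result.put("harvested", new ArrayList<>());
        result.put("notHarvested", new ArrayList<>());
        for (String seed : stuckSeeds) {
            String key = seedsInHarvestLogs.contains(seed) ? "harvested" : "notHarvested";
            result.get(key).add(seed);
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder URLs, not real seeds from the database.
        List<String> stuck = List.of("http://example.org/a", "http://example.org/b");
        Set<String> inLogs = Set.of("http://example.org/a");
        System.out.println(triage(stuck, inLogs));
    }
}
```

Seeds in the "notHarvested" group would point at the harvest step, while seeds in the "harvested" group would point at the analysis or ingest step.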
The tool responsible for ingesting the data is ingestTool.sh, which is the last step of the analysis workflow:
https://github.com/netarchivesuite/webdanica/blob/master/workflow-template/ingestTool.sh
The inputs to this script are a harvest log file and a criteria results directory (criteriaresultsdir), the directory where any criteria results generated from the data harvested by the harvests mentioned
in the harvest log are put.
The Java code is:
https://github.com/netarchivesuite/webdanica/blob/master/webdanica-core/src/main/java/dk/kb/webdanica/core/tools/CriteriaIngestTool.java
https://github.com/netarchivesuite/webdanica/blob/master/webdanica-core/src/main/java/dk/kb/webdanica/core/datamodel/criteria/CriteriaIngest.java
Note that if the harvested seed is not part of the data that has been analysed,
the seed is put in state "Analysen på seeden gik galt" (Status.ANALYSIS_FAILURE).
Extract from CriteriaIngest#processFile method:
    if (addToDatabase && !foundAnalysisOfSeed) {
        SeedsDAO sdao = daofactory.getSeedsDAO();
        Seed s = sdao.getSeed(seed);
        s.setStatus(Status.ANALYSIS_FAILURE);
        String harvestname = TextUtils.findHarvestNameInStatusReason(s.getStatusReason());
        if (harvestname == null) {
            harvestname = "N/A";
        }
        s.setStatusReason("Set to status '" + Status.ANALYSIS_FAILURE
                + "' as we have now no criteria analysis of the seed itself. harvestname is '"
                + harvestname + "'");
        sdao.updateSeed(s);
    }
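The rule in the extract can be tried out in isolation with a small sketch. The Status enum and Seed class below are simplified stand-ins written for this illustration, not the webdanica classes. READY_FOR_ANALYSIS is an assumed name for the "klar til analyse" state (only ANALYSIS_FAILURE appears in the extract), and deciding foundAnalysisOfSeed by an exact URL match is likewise an assumption about how the real code determines it.

```java
import java.util.Set;

public class AnalysisFailureSketch {

    // Stand-in enum; only ANALYSIS_FAILURE is taken from the extract.
    enum Status { READY_FOR_ANALYSIS, ANALYSIS_FAILURE }

    // Minimal stand-in for a seed record.
    static class Seed {
        final String url;
        Status status = Status.READY_FOR_ANALYSIS;
        String statusReason = "";

        Seed(String url) {
            this.url = url;
        }
    }

    // Mirrors the condition in the extract: when ingesting into the database
    // and no analysed URL is the seed itself, mark the seed as failed.
    static void markIfSeedNotAnalysed(Seed seed, Set<String> analysedUrls,
                                      boolean addToDatabase) {
        // Assumption: exact string match decides whether the seed was analysed.
        boolean foundAnalysisOfSeed = analysedUrls.contains(seed.url);
        if (addToDatabase && !foundAnalysisOfSeed) {
            seed.status = Status.ANALYSIS_FAILURE;
            seed.statusReason = "Set to status '" + Status.ANALYSIS_FAILURE
                    + "' as no criteria analysis of the seed itself was found";
        }
    }

    public static void main(String[] args) {
        // Placeholder URLs, not real seeds.
        Seed seed = new Seed("http://example.org/");
        markIfSeedNotAnalysed(seed, Set.of("http://example.org/other"), true);
        System.out.println(seed.status + ": " + seed.statusReason);
    }
}
```

In this sketch a seed is only touched when it is passed to the method. If the real ingest behaves the same way, a seed that is never reached by the ingest step would stay in "klar til analyse", which is one possibility worth checking for the stuck seeds.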