[NAS-2612] Jobs Receiving Second DONE Message Created: 29/Mar/17  Updated: 07/Jun/17  Resolved: 22/May/17

Status: Resolved
Project: NetarchiveSuite
Component/s: Harvest Definition
Affects Version/s: None
Fix Version/s: 5.3.1

Type: Bug Priority: Critical
Reporter: Colin Rosenthal Assignee: Søren Vejrup Carlsen (Inactive)
Resolution: Fixed  
Labels: None
Remaining Estimate: Not Specified
Time Spent: 8m
Original Estimate: Not Specified

Attachments: SendCrawlStatusMessage.java.txt
External reference:

https://sbprojects.statsbiblioteket.dk/jira/browse/NARK-1235

Sprint: NAS 5.3.1
Verification:

Tested using the test program attached to this issue (SendCrawlStatusMessage.java.txt), invoked as follows:
java -Ddk.netarkivet.settings.file=/path/to/settings_HarvestJobManagerApplication.xml testtools.SendCrawlStatusMessage $JOBID

where JOBID is the ID of a job in status DONE or FAILED.


 Description   

How does a job send a "DONE" message twice? And is it the correct behaviour to set such a job to "FAILED"?



 Comments   
Comment by Søren Vejrup Carlsen (Inactive) [ 07/Jun/17 ]

When using this program, the string "Received unexpected CrawlStatusMessage for job 1 with new status DONE, current state is DONE. Marking job as DONE. Reported harvestErrors on job: null" is appended to the harvest errors column of the job details page.

Comment by Søren Vejrup Carlsen (Inactive) [ 07/Jun/17 ]

Attached a test program that, given a job ID, looks the job up
and sends a CrawlStatusMessage to the HarvestJobManager with the same JobStatus and an empty DomainHarvestReport.

Comment by Søren Vejrup Carlsen (Inactive) [ 03/May/17 ]

The classes in question are:

./harvester/harvest-scheduler/src/main/java/dk/netarkivet/harvester/scheduler/HarvestSchedulerMonitorServer.java
./harvester/harvest-scheduler/src/test/java/dk/netarkivet/harvester/scheduler/HarvestSchedulerMonitorServerTest.java

Comment by Søren Vejrup Carlsen (Inactive) [ 02/May/17 ]

No, we shouldn't mark a job as failed when it is already marked as done.
This seems to be a case of the job being reported more than once by mistake.

When the job status is already DONE, we could argue either for ignoring the incoming data, or for keeping the data and sending a notification.
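
A minimal sketch of the handling proposed above, for illustration only. It is not the code in HarvestSchedulerMonitorServer: the class DuplicateStatusSketch, its Job and JobStatus types, and the method applyStatus are all hypothetical stand-ins, and the sketch only covers the case from this issue, where a finished job receives a second message carrying the status it already has. In that case the job keeps its status and a note is recorded instead of the job being marked as FAILED.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types; the real NetarchiveSuite Job and JobStatus differ.
class DuplicateStatusSketch {
    enum JobStatus { SUBMITTED, STARTED, DONE, FAILED }

    static final class Job {
        final long id;
        JobStatus status;
        final List<String> harvestErrors = new ArrayList<>();

        Job(long id, JobStatus status) {
            this.id = id;
            this.status = status;
        }
    }

    /**
     * Applies an incoming status to the job.
     * Returns true if the status was applied, false if the message was a
     * repeat for an already finished job and was only recorded as a note.
     */
    static boolean applyStatus(Job job, JobStatus newStatus) {
        boolean finished = job.status == JobStatus.DONE || job.status == JobStatus.FAILED;
        if (finished && newStatus == job.status) {
            // Repeated report: keep the current status and leave a trace for the operator.
            job.harvestErrors.add("Received unexpected status message for job " + job.id
                    + " with new status " + newStatus
                    + ", current state is " + job.status
                    + ". Keeping job as " + job.status + ".");
            return false;
        }
        // Any other combination is outside the scope of this sketch: just apply it.
        job.status = newStatus;
        return true;
    }

    public static void main(String[] args) {
        Job job = new Job(272651L, JobStatus.STARTED);
        applyStatus(job, JobStatus.DONE); // normal completion
        applyStatus(job, JobStatus.DONE); // redelivered message
        System.out.println(job.status + " " + job.harvestErrors.size()); // prints "DONE 1"
    }
}
```

Recording the repeat as a note rather than silently dropping it corresponds to the "keep the data but notify" option; dropping the harvestErrors line would give the "ignore" option instead.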

Comment by Søren Vejrup Carlsen (Inactive) [ 02/May/17 ]

Class:                  com.sun.messaging.jmq.jmsclient.ObjectMessageImpl
getJMSMessageID():      ID:4666-130.226.228.76(cc:89:1b:77:90:7)-36918-1490342357804
getJMSTimestamp():      1490342357804
getJMSCorrelationID():  null
JMSReplyTo:             null
JMSDestination:         PROD_COMMON_THE_SCHED
getJMSDeliveryMode():   PERSISTENT
getJMSRedelivered():    true
getJMSType():           null
getJMSExpiration():     0
getJMSPriority():       4
Properties:             null
09:09:49.699 INFO  d.n.h.scheduler.JobSupervisor - 0 jobs has been resubmitted.
09:09:49.734 TRACE d.n.common.distribute.JMSConnection - Unpacked message 'CrawlStatusMessage:
JobID: 272651
StatusCode: DONE
dk.netarkivet.harvester.harvesting.report.LegacyHarvestReport@1e8b8f8
ID:4666-130.226.228.76(cc:89:1b:77:90:7)-36918-1490342357804: To PROD_COMMON_THE_SCHED ReplyTo PROD_COMMON_ERROR OK'
09:09:50.702 WARN  d.n.h.s.HarvestSchedulerMonitorServer - Received CrawlStatusMessage for job 272651 with new status DONE, current state is DONE. Marking job as FAILED
09:09:50.703 WARN  d.n.h.s.HarvestSchedulerMonitorServer - Job 272651 failed: Received CrawlStatusMessage for job 272651 with new status DONE, current state is DONE. Marking job as FAILED
Generated at Wed Apr 24 06:35:59 CEST 2024 using Jira 9.4.15#940015-sha1:bdaa9cbecfb6791ea579749728cab771f0dfe90b.