Details
Bug
Resolution: Fixed
Major
5.2.2
None
NAS 5.3.1, NAS 5.4
Description
This is behaviour we have seen in production. A harvester finishes a long-running job and begins to send "ready" messages to signal that it can now accept new jobs. But HarvestJobManager then sends multiple messages, which build up on the JMS queue. Here is an example of how it can happen. A harvester finished a job at 15:01:35.255 and started sending ready messages every minute:
Message ID               Sent          Received
ID:15929-130.226.228.72  15:01:49.410  15:03:13.017
ID:15932-130.226.228.72  15:02:49.401  15:03:13.169
ID:15935-130.226.228.72  15:03:49.401  15:03:49.401
ID:15938-130.226.228.72  15:04:49.401  15:04:49.401
Note that HJM only began to receive messages after nearly one-and-a-half minutes, so it received the first two messages almost simultaneously. After that, messages were delivered almost instantaneously.
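To make the delays concrete, here is a small sketch that computes the delivery delay for each message from the sent and received timestamps quoted above. It is only an illustration for reading the numbers; the helper name `delay_seconds` is ours and is not part of NetarchiveSuite.

```python
from datetime import datetime

# Timestamps copied from the example above: (message ID, sent, received).
MESSAGES = [
    ("ID:15929-130.226.228.72", "15:01:49.410", "15:03:13.017"),
    ("ID:15932-130.226.228.72", "15:02:49.401", "15:03:13.169"),
    ("ID:15935-130.226.228.72", "15:03:49.401", "15:03:49.401"),
    ("ID:15938-130.226.228.72", "15:04:49.401", "15:04:49.401"),
]

TIME_FORMAT = "%H:%M:%S.%f"


def delay_seconds(sent: str, received: str) -> float:
    """Return the delay between sending and receiving, in seconds.

    Assumes both timestamps fall on the same day.
    """
    sent_at = datetime.strptime(sent, TIME_FORMAT)
    received_at = datetime.strptime(received, TIME_FORMAT)
    return (received_at - sent_at).total_seconds()


# Delay per message ID, plus the worst case seen in this sample.
DELAYS = {msg_id: delay_seconds(sent, received)
          for msg_id, sent, received in MESSAGES}
LONGEST_DELAY = max(DELAYS.values())

if __name__ == "__main__":
    for msg_id, delay in DELAYS.items():
        print(f"{msg_id}  {delay:.3f} s")
    print(f"longest delay: {LONGEST_DELAY:.3f} s")
```

In this sample the first message is delayed by about 84 seconds and the second by about 24 seconds, which is why the two arrive within a fraction of a second of each other.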
We don't fully understand the cause of this. The broker log shows the following:
[26/Mar/2017:15:01:49 CEST] Creating new Producer 1112770684487722240 on T:PROD_COMMON_HARVESTER_STATUS_TOPIC for connection 1112770635834038018
So the broker apparently starts work on the first message right away, but still takes 1.5 minutes to deliver it.
However, there is apparently a simple workaround. So long as the interval between ready messages is longer than the longest delay for receiving the first message, there should be no problem. The interval is controlled by the parameter settings.harvester.harvesting.sendReadyInterval. We are now setting this to 300 seconds in production. This should have no negative effect on performance: the first ready message, in the above case, would still be received after 1.5 minutes, exactly as now.
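The reasoning behind the workaround can be sketched with a deliberately simplified model. We assume ready messages are sent at fixed intervals starting at time zero, and that every message sent before the first one is delivered arrives together with it. The function name below is hypothetical and the model is ours, not something taken from the NetarchiveSuite code or the broker.

```python
import math


def ready_messages_delivered_together(first_delay_s: float,
                                      interval_s: float) -> int:
    """Count the ready messages sent before the first one is delivered.

    Simplified model (an assumption): messages go out at 0, interval_s,
    2 * interval_s, ... and all of them are held back until first_delay_s.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if first_delay_s < 0:
        raise ValueError("first_delay_s must not be negative")
    return math.floor(first_delay_s / interval_s) + 1


if __name__ == "__main__":
    observed_delay = 83.607  # seconds, first message in the example above
    for interval in (60, 300):
        count = ready_messages_delivered_together(observed_delay, interval)
        print(f"interval {interval:>3} s -> {count} message(s) at once")
```

Under this model the one-minute interval gives two messages arriving together, matching what we observed, while a 300-second interval gives only one as long as the first delay stays below 300 seconds.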