Work pulling: unexpected RequestNext messages

manuelp · December 2, 2020, 1:43pm

Despite the never failing workers, I can still reproduce the above situation. Given:

One node
One producer, backed by a durable queue
Two workers
One (long-running) job

When I:

Start the node
Submit the job
Stop the node when the submitted job is still in progress.

I get the following warning:

[WARN] [2020-12-02 14:21:15,217] | [saas-akka.actor.default-dispatcher-5] | a.a.CoordinatedShutdown: Coordinated shutdown phase [actor-system-terminate] timed out after 10000 milliseconds

So I think it’s reasonable to assume that the worker couldn’t send the ConsumerController.Confirmed message. Then, when I start again the node, the WorkPullingProducerController restarts, reloading the unconfirmed message. This is what I expected:

Both workers starts, registering themselves to the WorkPullingProducerController.
The WorkPullingProducerController receives two requests from them. Since there is an unconfirmed message to re-send, it sends the unconfirmed message to one of the workers, and sends a RequestNext to the producer for the other one. From what I understand, this shouldn’t change even if I had 100 workers: only one RequestNext at a time is actually sent to the producer, waiting for a reply to it from the producer before sending the next one.

However, after activating the LogMessagesInterceptor, I see the following messages exchange as soon as I start the node again:

[DEBUG] [2020-12-02 14:21:00,866] | [saas-akka.actor.default-dispatcher-21] | a.a.t.i.LogMessagesInterceptor: actor [akka://saas/system/singletonManagerjobs-manager/jobs-manager] received message: WrappedRequestNext(RequestNext(Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller/$$a-adapter#1182625933],Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller#580539556]))
[DEBUG] [2020-12-02 14:21:00,955] | [saas-akka.actor.default-dispatcher-19] | a.a.t.i.LogMessagesInterceptor: actor [akka://saas/user/rendition-worker-2] received message: WrappedDelivery(Delivery(GenerateRendition(ConversionJob("test"))),Actor[akka://saas/user/rendition-worker-2/consumer-controller#216793132],jobs-manager-dd7a608d-c606-4e6f-8244-d2efb0d7ca98,1))
[DEBUG] [2020-12-02 14:21:01,241] | [saas-akka.actor.default-dispatcher-21] | a.a.t.i.LogMessagesInterceptor: actor [akka://saas/system/singletonManagerjobs-manager/jobs-manager] received message: WrappedRequestNext(RequestNext(Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller/$$a-adapter#1182625933],Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller#580539556]))

Result: the producer receives first RequestNext, then another one before the first has been used to submit more jobs. So it fails, restarts and gets back both RequestNext messages one after the other:

[DEBUG] [2020-12-02 14:21:01,272] | [saas-akka.actor.default-dispatcher-21] | a.a.t.i.LogMessagesInterceptor: actor [akka://saas/system/singletonManagerjobs-manager/jobs-manager] received message: WrappedRequestNext(RequestNext(Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller/$$a-adapter#1783057446],Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller#1183803959]))
[DEBUG] [2020-12-02 14:21:01,302] | [saas-akka.actor.default-dispatcher-21] | a.a.t.i.LogMessagesInterceptor: actor [akka://saas/system/singletonManagerjobs-manager/jobs-manager] received message: WrappedRequestNext(RequestNext(Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller/$$a-adapter#1783057446],Actor[akka://saas/system/singletonManagerjobs-manager/jobs-manager/producer-controller#1183803959]))

Both the code in the documentation and the more extensive sample code assume that this should never happen. So my questions are:

Why this is happening?
How should I manage those cases?

Topic		Replies	Views
Akka work pull multiple producers support Akka Cluster akka , java , akka-cluster	2	501	January 19, 2023
Distributed Workers Sample - Confirmation to WorkManager leads to dead-letters for long-running work items Akka Libraries akka , akka-typed , scala , akka-cluster	2	777	July 3, 2020
Why is my source producing more than the buffer size? Akka Libraries scala , streams	1	1097	June 14, 2018
Akka streams graph stage stuck if onPull doesn't push Akka Streams & Alpakka	4	857	November 19, 2020
Google Pub/Sub stops pulling randomly Akka Streams & Alpakka	2	1121	April 26, 2019

Work pulling: unexpected RequestNext messages

Related topics