The documentation says order of messages between two actors is guaranteed because the connection between them is a TCP one. But is the order of messages between two actors with different nodes and Availability Zones guaranteed? Akka typed clusters use EntityRef. I understand this is not a direction connection? Is this mediated by the ShardCoordinator? So it doesn’t guarantee order messages because there is a mediator?
Thank you Patrik. But the initial communication goes through the Shard Coordinator and buffered in the Shard Region when the shard cannot be reached right? If we have Kubernetes setup and the cluster is active while undergoing rolling restart or a pod is destroyed for some reason, then actors are moved around in pods/nodes. In this case two messages of an actor A to actor B might be buffered in different Shard Regions as either or both actors A and B are moved into other pods/nodes. In this case the order cannot be guaranteed right?
I have this scenario please correct me if I’m wrong. We have 3 nodes in the cluster: N1, N2, N3 and their corresponding shard regions: SR1, SR2, SR3. We have 2 actors: A on N1 and B on N2. A sends message M1 to actor B in the cluster. Prior to this SC has informed SR1 that actor B is being hand-offed. So SR1 buffers M1. Then actor A is rebalanced to N2, then A send message M2 to actor B. Say Actor B is still not spawned, so M2 is buffered to SR2. Then we have M1 buffered in SR1 and M2 buffered in S2. If Actor B is rebalanced to N3, there is no guarantee M1 will be delivered prior to M2?
The ordering guarantee is for a single sending thread or actor instance, if the senders are different instances of the actor, or non-actor logic invoking send on different threads, the guarantee does not apply.
Sharding does however drop buffered messages in the case where the buffer could have caused re-ordering during rebalance, (when this happens you can see it logged as a warning Dropping [n] buffered messages to shard [X] during hand off to avoid re-ordering), and the sending instance of A on N1 will first have to shut down before it is started on N2 before sending.
Even if there is no guarantee in for the scenario of different sender actor instances it is at least unlikely that the message from the new instance of A arrives before the message from the old instance because of that buffer drop but also because of coordination around rebalance of A taking a little bit of time.
If you need a complete ordering guarantee for a sharded sender in the face of rebalance I think the sending entity would have to record/persist the intent to send and also an acknowledgement of delivery before it sends the next message.
Thank you Johan. Since I am talking about rebalancing causing the actor to be reinstantiated, would you say dropping the buffers in this case would prevent all the scenarios you know of that could cause reordering of messages between two actors?
Yes, that’s the expectation. The same as for local actors, messages may not arrive (at most once) but they will not be re-ordered. As described in the doc section here: Cluster Sharding concepts • Akka Documentation