Hello,
I’m using the Akka Cluster Sharding feature.
I have two nodes: one is a worker node and the other is a proxy node.
The configuration and the startup code for each node are as follows.
For the proxy:
ClusterSharding.get(actorSystem)
    .startProxy(
        "MyActor",
        Optional.empty(),
        new myShardMessageExtractor()
    );

akka.cluster {
  roles = ["MyProxy"]
  sharding {
    role = "MyManager"
  }
}
For the worker:
ClusterSharding.get(actorSystem)
    .start(
        myActor.SHARD,
        Props.create(GuiceInjectedActor.class, injector, myActor.class),
        ClusterShardingSettings.create(actorSystem),
        MyActor.shardExtractor(),
        new ShardCoordinator.LeastShardAllocationStrategy(1, 3),
        new StopActor()
    );

akka.cluster {
  roles = ["MyManager"]
  sharding {
    role = "MyManager"
  }
}
I usually start the worker first and the proxy later.
But when I try to replace the worker node, my service stops working properly.
This is what I did:
- There is 1 proxy and 1 worker.
- Deploy a new worker node. At that point there are 1 proxy and 2 workers.
- It works fine.
- Destroy the old worker.
- Now there is 1 proxy and 1 worker node, and there is no unreachable node.
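To make my expectation for the steps above explicit: as far as I understand, the shard coordinator is a cluster singleton that runs on the oldest node carrying the sharding role ("MyManager" here), so after the old worker is destroyed it should move to the new worker. Below is a toy model of that expectation only. It is plain Java, not Akka code, and the class name, node names, and join-order numbers are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

public class CoordinatorPlacementSketch {

    /** Hypothetical stand-in for a cluster member (not an Akka class). */
    static final class Node {
        final String name;
        final Set<String> roles;
        final int joinOrder; // lower value = joined earlier = older

        Node(String name, Set<String> roles, int joinOrder) {
            this.name = name;
            this.roles = roles;
            this.joinOrder = joinOrder;
        }
    }

    /** The cluster right after the new worker has been deployed (illustrative values). */
    static List<Node> initialMembers() {
        List<Node> members = new ArrayList<>();
        members.add(new Node("old-worker", Set.of("MyManager"), 1));
        members.add(new Node("proxy", Set.of("MyProxy"), 2));
        members.add(new Node("new-worker", Set.of("MyManager"), 3));
        return members;
    }

    /** Name of the oldest node with the given role, or "none" if no node has it. */
    static String coordinatorHost(List<Node> members, String role) {
        return members.stream()
                .filter(n -> n.roles.contains(role))
                .min(Comparator.comparingInt((Node n) -> n.joinOrder))
                .map(n -> n.name)
                .orElse("none");
    }

    public static void main(String[] args) {
        List<Node> members = initialMembers();
        System.out.println(coordinatorHost(members, "MyManager")); // prints "old-worker"

        // Destroy the old worker: the coordinator should now live on the new worker.
        members.removeIf(n -> n.name.equals("old-worker"));
        System.out.println(coordinatorHost(members, "MyManager")); // prints "new-worker"
    }
}
```

In this model the proxy never hosts the coordinator because it only has the "MyProxy" role, so I expected it to simply re-register with the coordinator on the new worker.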
But my service didn’t work. Something seemed to be wrong.
I found the log below as soon as I destroyed the old worker node.
logger=EmptyLocalActorRef, tn=myActor-akka.actor.default-dispatcher-13, message=Message [akka.cluster.sharding.ShardCoordinator$Internal$RegisterProxy] from Actor[akka://myActor/system/sharding/myProxy#-1187429952] to Actor[akka://myActor/system/sharding/myCoordinator/singleton/coordinator] was not delivered. [9] dead letters encountered. If this is not an expected behavior then Actor[akka://myActor/system/sharding/myCoordinator/singleton/coordinator] may have terminated unexpectedly. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.,
After that, the proxy node keeps trying to register with the coordinator but gets no acknowledgement, even though the new worker node exists.
logger=ShardRegion, tn=myActor-akka.actor.internal-dispatcher-4, message=my: Trying to register to coordinator at [ActorSelection[Anchor(akka://jejudo/), Path(/system/sharding/myCoordinator/singleton/coordinator)]], but no acknowledgement. Total [1] buffered messages. [Coordinator [Member(address = akka://myActor@10.26.93.192:2552, status = Up)] is reachable.],
To make my service healthy again, I have to restart the proxy node.
Is there a safe way to replace a worker node?