I’m using Cluster Sharding without persistent actors (I have only one shard region).
I first tested 2 nodes with 4 shards and it worked as I expected.
But when I ran a new test with 1000 shards, the Akka application didn’t work properly.
The reason I’m increasing the number of shards is that I need to build a scalable microservice.
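For context, here is a minimal sketch of this kind of setup using the classic ClusterSharding API with a hash-based shard id. The Device actor, DeviceEnvelope message, and shard count are simplified placeholders rather than the exact production code; the type name is only guessed from the coordinator path in the log below.

```scala
import akka.actor.{ Actor, ActorRef, ActorSystem, Props }
import akka.cluster.sharding.{ ClusterSharding, ClusterShardingSettings, ShardRegion }

// Placeholder message: routes a payload to one (non-persistent) device entity by id.
final case class DeviceEnvelope(deviceId: String, payload: String)

// Placeholder entity actor; no persistence, as described above.
class Device extends Actor {
  def receive: Receive = {
    case DeviceEnvelope(_, payload) =>
      println(s"handling $payload") // a real entity would do its work here
  }
}

object DeviceSharding {
  // The value being varied in the tests (4, then 1000).
  val numberOfShards = 1000

  val extractEntityId: ShardRegion.ExtractEntityId = {
    case env: DeviceEnvelope => (env.deviceId, env)
  }

  // Shard id is derived from the entity id hash, modulo the shard count.
  val extractShardId: ShardRegion.ExtractShardId = {
    case env: DeviceEnvelope =>
      (math.abs(env.deviceId.hashCode) % numberOfShards).toString
  }

  // Starts the single shard region for this entity type.
  def start(system: ActorSystem): ActorRef =
    ClusterSharding(system).start(
      typeName = "Device", // guessed from the coordinator path in the log
      entityProps = Props[Device](),
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId
    )
}
```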
Here is the log I got:
level=WARN , loggingId=, logger=ShardRegion, tn=jejudo-akka.actor.default-dispatcher-3, message=Retry request for shard [818] homes from coordinator at [Actor[akka://akka@10.16.20.226:2552/system/sharding/DeviceCoordinator/singleton/coordinator#-2030528096]]. [1] buffered messages.,
Why did this happen?
What is the maximum number of shards, and what would be a proper value when I have 100M actors?
What’s your concern? It’s just WARNing level, and it’s saying it had to retry (from the code, more than five retries) for a single shard, #818. Does the log message repeat? Even if it does repeat, does it eventually stop?
The recommended shard count is based more on cluster size than on actor count; the rule of thumb is 10x the maximum number of nodes (there’s a small sketch after the quote below):
As a rule of thumb, the number of shards should be a factor ten greater than the planned maximum number of cluster nodes. Less shards than number of nodes will result in that some nodes will not host any shards. Too many shards will result in less efficient management of the shards, e.g. rebalancing overhead, and increased latency because the coordinator is involved in the routing of the first message for each shard.
EDIT: source is the Classic Cluster Sharding page in the Akka documentation.
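To make that concrete, a minimal sketch (illustrative names, classic API assumed): if you plan for a maximum of, say, 10 nodes, you would pick roughly 100 shards, and the built-in ShardRegion.HashCodeMessageExtractor lets you encode that directly.

```scala
import akka.cluster.sharding.ShardRegion

// Illustrative message type; substitute whatever your entities actually receive.
final case class DeviceEnvelope(deviceId: String, payload: String)

// Rule of thumb: shards ~ 10 x planned maximum number of nodes,
// e.g. 10 planned nodes -> about 100 shards.
class DeviceMessageExtractor(plannedMaxNodes: Int)
    extends ShardRegion.HashCodeMessageExtractor(plannedMaxNodes * 10) {

  override def entityId(message: Any): String = message match {
    case DeviceEnvelope(id, _) => id
  }
}
```

The extractor can then be passed to ClusterSharding(system).start in place of the two extract functions, so the shard count lives in one place.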
We use 100 shards for 5 nodes with about 50M actors. So we’re a little over-provisioned on the shard count, but we haven’t had any issues with it.