I am using remember entities in Akka Cluster Sharding to restart actors automatically after a failure or a rebalance. I am using “eventsourced” as the remember entities store, with Cassandra as the event-sourcing database. I found that for every newly created actor a shard entry is created in Cassandra that never gets removed. Many actors are created and passivated, but their shard entries remain in the database. Is there any way to delete the shard entries for passivated actors?
I am using Akka 2.7.0.
There is no way around the fact that the event sourced remember entities store needs to store an event for each started entity, and then another once it stops. That is how it remembers which entities are running.
It does, however, store snapshots every 1000 events (configured through the setting akka.cluster.sharding.snapshot-after). It keeps 2 batches of events in between snapshots, for consistency in distributed databases like Cassandra (setting akka.cluster.sharding.keep-nr-of-batches), but older events are supposed to be deleted.
Note that with Cassandra there are some caveats to read up on and understand regarding deletion and tombstones, see the docs.
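As a rough illustration of the settings mentioned above, an application.conf fragment might look like the following. This is only a sketch: the values are the defaults described in this thread, and the two remember-entities lines are assumed from the question rather than taken from the poster's actual configuration.

```hocon
akka.cluster.sharding {
  # Assumed from the question: remember entities enabled,
  # backed by the event sourced store.
  remember-entities = on
  remember-entities-store = "eventsourced"

  # Each shard's remember entities store saves a snapshot
  # after this many events.
  snapshot-after = 1000

  # Number of snapshot-after sized batches of events to keep
  # behind the latest snapshot before older events are deleted.
  keep-nr-of-batches = 2
}
```

With these values, roughly the last 2 x 1000 events per shard should remain in the journal after a snapshot. Lowering snapshot-after would presumably reduce the number of rows kept per shard, at the cost of more frequent snapshots and deletes.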
Thanks for the clarification.
Does the snapshot setting mentioned above apply to the shards by default, or do we need to explicitly provide the values in application.conf to enable it? I ask because I can see lots of shard entries in Cassandra (more than 100k),
e.g.
sharding/AttemptActorShard/15
sharding/AttemptActorShard/16
.
.
.
so on
Yes, it is the default.
I think I missed the fact that it is 100k shards though, not many remembered entities for each shard.
There will be one shard store for each shard in your application, keeping track of the started and stopped entities in that shard. If you have configured your cluster to have 100k shards (using number-of-shards or a custom function that extracts the shard id and returns 100k unique shard ids), that will mean 100k entries.
There is no complete cleanup of shards, they are not expected to come and go like the entities themselves. Only the remembered entity entries for individual shards are snapshotted and deleted.
Note that going as high as 100 000 shards is not a good idea in general even without remember entities, because of the overhead it adds keeping track of so many shards.
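For reference, when the shard count comes from configuration rather than from a custom shard id extractor, it would be set roughly as below. The value 1000 is only an example for this sketch, not a recommendation; it should be chosen to suit the planned cluster size.

```hocon
akka.cluster.sharding {
  # Example value only. Each shard gets its own remember
  # entities store, so this also bounds how many
  # "sharding/<type name>/<shard id>" stores can exist.
  number-of-shards = 1000
}
```

Note that changing this value is not something that can be done with a rolling update, since all nodes need to agree on the number of shards.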
I have configured 1000 shards. The remember entities entries look like this:
sharding/AttemptActorShard/15 seq no 1
sharding/AttemptActorShard/15 seq no 2
sharding/AttemptActorShard/15 seq no 3
.
.
.
sharding/AttemptActorShard/16 seq no 1
sharding/AttemptActorShard/16 seq no 2
.
.
so on
This way the entries add up to 100k.
After that I changed the number of shards to 50, as I read it should be 10 times the number of nodes, and we are planning to have a 5-node cluster. Now there are not as many remember entities entries, but I want them to be cleaned up periodically so that they do not keep growing in Cassandra.
For some of the actors durable state looks like the right option, but the Akka documentation says durable state is not supported for Cassandra. Snapshots are the alternative I see for Cassandra.
What is recommended: staying with Cassandra and using snapshots, or moving to JDBC for durable state support? Is there a plan to support durable state for Cassandra?
I am running a performance test against an Akka cluster and see the issue below with the event sourced remember entities store when Akka Sharding tries to spawn an actor. Can someone suggest whether any tuning is required?
2023-06-13 09:00:55.991 [ERROR] [outbound-campaign-cluster-akka.actor.default-dispatcher-361-510] akka.actor.OneForOneStrategy – Async write timed out after 5.000 s
java.lang.RuntimeException: Async write timed out after 5.000 s
at akka.cluster.sharding.Shard$$anonfun$waitingForRememberEntitiesStore$1.applyOrElse(Shard.scala:685) ~[akka-cluster-sharding_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.Actor.aroundReceive(Actor.scala:537) ~[akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.Actor.aroundReceive$(Actor.scala:535) ~[akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.cluster.sharding.Shard.akka$actor$Timers$$super$aroundReceive(Shard.scala:410) ~[akka-cluster-sharding_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.Timers.aroundReceive(Timers.scala:52) ~[akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.Timers.aroundReceive$(Timers.scala:41) ~[akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.cluster.sharding.Shard.aroundReceive(Shard.scala:410) ~[akka-cluster-sharding_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:579) [akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.actor.ActorCell.invoke(ActorCell.scala:547) [akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270) [akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.dispatch.Mailbox.run(Mailbox.scala:231) [akka-actor_2.13-2.7.0.jar!/:2.7.0]
at akka.dispatch.Mailbox.exec(Mailbox.scala:243) [akka-actor_2.13-2.7.0.jar!/:2.7.0]
at java.util.concurrent.ForkJoinTask.doExec(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinPool.scan(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source) [?:?]
at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source) [?:?]
I am using Cassandra as the event sourced DB, and the remember entities store is also event sourced.