Hello, I am trying out Akka Persistence and have run into a problem regarding concurrent access to the same persistent actor. I have a simple persistent actor called CustomerActor with some simple lab functionality; for the sake of explanation, let’s say it has a property A whose state is altered via a command and persisted using event sourcing.
The application accepts concurrent input, meaning that my code may access the CustomerActor like this from several threads at once:
val customerActor = system.actorOf(Props(classOf[CustomerActor], "SE",id), CustomerActor.name("SE", id))
Since `val customerActor` is a locally scoped value, two concurrently executing calls will create two instances of CustomerActor, each of which is event sourced from the persistent storage. I was actually unaware of this until I discovered unexpected results when doing highly concurrent computations on the same entity.
Now, if I do
val customerActorFuture = system.actorSelection(s"user/customeractor.SE.$id").resolveOne()
val customerActor = Await.result(customerActorFuture, timeout.duration)
I will use the same instance, and everything is fine as long as no race condition occurs and the time spent looking up the CustomerActor is short enough.
My question is: is there a state-of-the-art standard for doing this that I am missing? I need to ensure that I use exactly one instance of the persistent actor, which is one of the core features we want from using actors for this. I therefore actually expected `system.actorOf` to have the semantics “create a new instance only if there is not already one”, safeguarding against the obvious race condition if I use `actorSelection` (two threads “simultaneously” discovering that there is no actor and therefore each creating one).
Grateful for a clarifying response to this question.
I made a silly mistake that ruined the application behaviour: of the two actor references obtained using `system.actorOf`, I forgot to use the name argument in the second one.
Of course this cannot work since then the name will be generated by Akka and there will be two distinct instances of the CustomerActor with the same persistence id.
Now I corrected the code to use the name parameter in both instances and then I get the expected behaviour: create the actor only if it does not already exist with the stated name.
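For reference, the corrected two-step lookup could be sketched roughly like this with classic Akka APIs. The `CustomerActor` stub, the name scheme, and the exception-based fallback are my assumptions for illustration, not code from the thread; note that `actorOf` signals a duplicate name by throwing `InvalidActorNameException`, which is what the fallback to `actorSelection` relies on:

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, InvalidActorNameException, Props}
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// Minimal stand-in for the real persistent actor (assumption for illustration).
object CustomerActor {
  def name(country: String, id: String): String = s"customeractor.$country.$id"
}
class CustomerActor(country: String, id: String) extends Actor {
  def receive: Receive = { case _ => () }
}

// Create the named actor if it does not exist yet, otherwise resolve the
// existing one. actorOf throws InvalidActorNameException for a duplicate
// name, so the catch branch handles the "already created" case.
def customerActorFor(system: ActorSystem, country: String, id: String)
                    (implicit timeout: Timeout): ActorRef =
  try {
    system.actorOf(
      Props(classOf[CustomerActor], country, id),
      CustomerActor.name(country, id))
  } catch {
    case _: InvalidActorNameException =>
      val selection = system.actorSelection(s"user/${CustomerActor.name(country, id)}")
      Await.result(selection.resolveOne(), timeout.duration)
  }
```

There is still a small window between the exception and the resolution in which the actor could stop, so this is best-effort rather than airtight.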
Good that you figured it out.
Akka Persistence is commonly used together with Akka Cluster Sharding, which takes care of never having more than one instance of each persistence id alive (usually across a multi-node cluster, but it can also be used with a single-node cluster), so that may also be interesting to look into: https://doc.akka.io/docs/akka/current/cluster-sharding.html#dependency
Yes, I am aware of the cluster sharding feature and have read through the documentation. However, I didn’t notice any information about this specific perspective that I ran across, i.e. that there are two distinct ways for a client to obtain an ActorRef (`system.actorOf` and `system.actorSelection`).
Conceptually, cluster sharding can be seen as an enabler for the property that if an actor exists, it exists in exactly one shard.
I don’t know all the details about cluster sharding, but hopefully the following question makes sense:
Will I use the same logic when using cluster sharding (i.e. first trying `system.actorOf` and, if the actor already exists, using `system.actorSelection`)? As far as I understand, using cluster sharding only changes the situation regarding actor location. Given that actor A0 is located in shard S0, the same question about creation vs. selection applies. `system.actorOf` will succeed in my code iff the actor has either never been created or needs to be restarted. If I take the system down and then restart it, I will need to use `system.actorOf` to load the persistent actor from storage, so the client obtaining the ActorRef can never know which of `system.actorOf` and `system.actorSelection` will succeed.
Hope this was understandable.
With cluster sharding you hand off the responsibility of spawning and finding the actors to the sharding tools and only provide a way to map messages to an identity and that identity to a shard id, so you would not use either of `actorOf` and `actorSelection` but will instead send the messages through the sharding infrastructure. Sharding will revive the sharded entity on the first message to it, unless you enable the `rememberEntities` feature, which keeps track of actors that were alive and restarts them without first having a message for them.
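As a rough sketch of what sending through the sharding infrastructure looks like with classic Cluster Sharding: the `Command` envelope, the shard count, the hashing scheme, and the stand-in entity below are illustrative assumptions; in the thread’s setup the entity `Props` would be the persistent `CustomerActor`:

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// Hypothetical envelope: every message carries the id of the entity it targets.
final case class Command(customerId: String, payload: String)

// Stand-in entity; in the thread this would be the persistent CustomerActor.
class Customer extends Actor {
  def receive: Receive = { case _ => () }
}

// Map an incoming message to the entity id plus the message to deliver.
val extractEntityId: ShardRegion.ExtractEntityId = {
  case cmd: Command => (cmd.customerId, cmd)
}

val numberOfShards = 100 // assumption; sized for the expected cluster
val extractShardId: ShardRegion.ExtractShardId = {
  case cmd: Command => (math.abs(cmd.customerId.hashCode) % numberOfShards).toString
}

// Started once per node; returns the shard region ActorRef that client code
// sends to instead of using actorOf/actorSelection.
def startCustomerRegion(system: ActorSystem): ActorRef =
  ClusterSharding(system).start(
    typeName = "Customer",
    entityProps = Props[Customer](),
    settings = ClusterShardingSettings(system),
    extractEntityId = extractEntityId,
    extractShardId = extractShardId)

// Usage (on a cluster-enabled ActorSystem):
//   val region = startCustomerRegion(system)
//   region ! Command("42", "do-something") // sharding starts/revives entity "42"
```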
I wouldn’t call `actorOf` a way to obtain a client, it is a way to start an actor; a client to the system should likely not start actors on its own or you will get a system that is hard to understand. `actorSelection` better matches “a way to obtain a reference”.
Note that it is often (but not always) better to avoid selections if possible, since they are string based and couple the implementation and the client a bit too tightly. An alternative would be passing messages through a single `ActorRef` parent that can take care of spawning children if needed before forwarding messages to the right child. This allows you to later change the hierarchy of actors without breaking the client code.
OK, thanks for the reply. Then it seems the cluster sharding infrastructure will solve this particular issue the way I expected.
“I wouldn’t call `actorOf` a way to obtain a client, it is a way to start an actor; a client to the system should likely not start actors on its own or you will get a system that is hard to understand.”
With “client to obtain an ActorRef” I meant any code that sends a message to the actor. I did not mean a client external to the application, so I believe we are aligned in this regard.
Just to make sure I don’t miss something important: do you mean that the parent actor would simply use `context.child(name: String)` instead of the cumbersome `actorOf` followed by `actorSelection`?
That indeed sounds like an attractive alternative to my two-step process; I just want to make sure I interpret you correctly.
Yes, `context.child(name)` returns an `Option`, and the parent can react to `None` by starting a child, if that is the behaviour you want. Since only one message is processed at a time, you know that there is no race in creating a child. Note that there could be one in stopping, though: you could get back a reference to a child that has stopped itself before you reach the point where you send it a message.
If the children need to be addressed through some other kind of key than a string, or you do not want to put that string in the path of the child, a common pattern is to keep a `Map[Key, ActorRef]` in the parent actor (watching the children and removing them from the map when they stop, etc.).
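Sketched out, with hypothetical `Key` and envelope types, the map bookkeeping with `context.watch` might look like this:

```scala
import akka.actor.{Actor, ActorRef, Props, Terminated}

// Hypothetical key and envelope types for illustration.
final case class Key(country: String, id: Long)
final case class Deliver(key: Key, msg: Any)

// Parent keeping a Map[Key, ActorRef] instead of relying on child names.
class Registry(childProps: Key => Props) extends Actor {
  private var children = Map.empty[Key, ActorRef]

  def receive: Receive = {
    case Deliver(key, msg) =>
      val child = children.getOrElse(key, {
        val ref = context.actorOf(childProps(key))
        context.watch(ref) // get a Terminated message when the child stops
        children += key -> ref
        ref
      })
      child.forward(msg)

    case Terminated(ref) =>
      // Drop stopped children so we never forward to a dead reference.
      children = children.filter { case (_, r) => r != ref }
  }
}
```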
OK, thanks a lot for this advice.