How is entityId passed to the entity in cluster sharding?

This snippet is taken from the Akka cluster sharding documentation. I am confused about how the actual value of the entityId parameter is passed to the entity context.

class HelloWorldService(system: ActorSystem[_]) {
  private val sharding = ClusterSharding(system)

  sharding.init(Entity(typeKey = HelloWorld.TypeKey) { entityContext =>
    HelloWorld(entityContext.entityId, PersistenceId(entityContext.entityTypeKey.name, entityContext.entityId))
  })

  def greet(worldId: String, whom: String): Future[Int] = {
    val entityRef = sharding.entityRefFor(HelloWorld.TypeKey, worldId)
    val greeting = entityRef ? HelloWorld.Greet(whom)
    greeting.map(_.numberOfPeople)
  }
}

object HelloWorld {
  sealed trait Command extends CborSerializable

  val TypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("HelloWorld")

  def apply(entityId: String, persistenceId: PersistenceId): Behavior[Command] = {
    Behaviors.setup { context =>
      context.log.info("Starting HelloWorld {}", entityId)
      EventSourcedBehavior(persistenceId, emptyState = KnownPeople(Set.empty), commandHandler, eventHandler)
    }
  }
}

sharding.init(Entity(entityKey)(createBehavior)) will create entities dynamically. I suppose that sharding.init() is a registration method, and that createBehavior will be called once whenever a node creates an entity. createBehavior is a factory that picks the entityId from the entityContext. And sharding.entityRefFor(entityKey, entityId) will trigger the call to createBehavior and pass the worldId as entityId to entityContext.

Is that right? Thanks.

sharding.init sets up sharding on a node for the given type of entity. It is run on each cluster node as part of bootstrapping.

Sharding will start (and passivate) entities on some node of the cluster, when needed, making sure there is at most one entity for a given persistence id at any point in time. By default the creation/start of the entity actor is only triggered by an incoming message for the entity.

Messaging entities running in sharding is done through a sharding proxy actor (usually the “shard region” actor) on the sending node. This proxy actor knows which node hosts the shard that the entity belongs to and forwards messages to that node. On the receiving node, sharding keeps track of which shards are alive and which entities are alive in them. If the entity is not yet started, your behavior factory method is invoked to get a Behavior to start.

The EntityRef is a thin facade for this proxy, making it look a bit like an ActorRef, but in fact it only tracks the entity id and how to construct an envelope containing the message and the entity id before handing that to the proxy. This envelope is unwrapped on the receiving side before the message is passed to the entity itself.
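To make the facade-and-envelope idea concrete, here is a minimal sketch in plain Scala. It is a toy model, not Akka's implementation: Envelope, RecordingProxy and EntityRefSketch are hypothetical names invented for illustration, and the proxy only records what it is handed instead of routing it across the cluster.

```scala
// Toy model only: these types imitate the idea, they are NOT Akka's classes.
object EnvelopeSketch {
  // Pairs a message with the id of the entity it is addressed to.
  final case class Envelope[M](entityId: String, message: M)

  // Stand-in for the local shard region / proxy: it only accepts envelopes
  // and remembers them so we can inspect what was sent.
  final class RecordingProxy[M] {
    private var received = Vector.empty[Envelope[M]]
    def deliver(envelope: Envelope[M]): Unit = received = received :+ envelope
    def delivered: Vector[Envelope[M]] = received
  }

  // Thin facade: remembers the entity id and wraps every message on send.
  // Creating it does not start or contact anything.
  final class EntityRefSketch[M](entityId: String, proxy: RecordingProxy[M]) {
    def tell(message: M): Unit = proxy.deliver(Envelope(entityId, message))
  }
}
```

In this sketch, constructing an EntityRefSketch has no side effects. Only tell produces an envelope, which mirrors the point that obtaining a reference and sending a message are separate steps.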

The PersistenceId will usually contain both the unique id of the entity and the type of entity, so that multiple types of entities can live in the same journal without entity id conflicts.
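As a rough illustration of why the type name is part of the persistence id, the sketch below builds ids from a type name and an entity id in plain Scala. The helper persistenceIdFor and the "|" separator are assumptions made for this example, not a statement about how any particular library formats its ids.

```scala
// Toy model only: a hypothetical helper, not a library API.
object PersistenceIdSketch {
  // Combines the entity type name and the entity id into one journal key.
  // The "|" separator is an assumption chosen for illustration.
  def persistenceIdFor(entityTypeName: String, entityId: String): String =
    s"$entityTypeName|$entityId"
}
```

With this scheme a HelloWorld entity and, say, a hypothetical Planet entity can both use the entity id "mars" without their events colliding in a shared journal, because the combined ids differ.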

Edit: added some clarifications about delivery and the envelope

@johanandren Thank you.

Based on the example offered by Lightbend on GitHub, I suppose the sequence of calls is, in simplified form, as follows.

  1. HelloWorldService registers the factory method of the HelloWorld entity via sharding.init() and exposes the API greet(worldId, whom).
  2. The first call to greet(worldId = "mars", whom) calls sharding.entityRefFor(HelloWorld.TypeKey, worldId = "mars").
  3. Because no entity corresponding to worldId = "mars" exists yet, sharding passes "mars" to the factory method from step 1.
  4. The factory passes "mars" to HelloWorld.apply(entityId = "mars", persistenceId) to construct a HelloWorld entity object corresponding to "mars" and shards it in the cluster.
  5. When sharding.entityRefFor("mars") returns the entityRef, entityRef ? HelloWorld.Greet(whom) asks the entity object with the command Greet(whom).
  6. After that, any call to greet(worldId = "mars", whom) locates the entity object sharded in step 4 directly via sharding.entityRefFor(), without constructing it again.

How about this? Thanks.

Almost correct. What is missing is that the start of the actor is only triggered once the shard receives the message for it, not when the sending side creates the EntityRef. On message send the EntityRef wraps the message in an envelope together with the entity id, and then hands it to the local shard region (or proxy).

The local shard region already knows, or if it doesn’t, communicates with the shard coordinator to find out, which node in the cluster the shard of the recipient belongs to, and forwards the message there.

When the shard receives a message for an entity id that is not currently running, it is started using the factory provided when initialising cluster sharding.
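The ordering described above can be sketched with a small toy model in plain Scala. This is only an illustration under simplifying assumptions (a single in-memory shard, no cluster, no coordinator, entities modelled as plain functions). ShardSketch, EntityContextSketch and Envelope are hypothetical names, not Akka types.

```scala
import scala.collection.mutable

// Toy model only: NOT Akka's implementation.
object LazyStartSketch {
  // Message plus the id of the entity it is addressed to.
  final case class Envelope[M](entityId: String, message: M)

  // Stand-in for the context handed to the factory; it only carries the id.
  final case class EntityContextSketch(entityId: String)

  // A running entity is modelled as a plain message handler.
  type Handler[M] = M => Unit

  // The factory is registered up front but invoked only when the first
  // message for a not-yet-running entity id arrives.
  final class ShardSketch[M](factory: EntityContextSketch => Handler[M]) {
    private val running = mutable.Map.empty[String, Handler[M]]
    var factoryCalls: Int = 0

    def receive(envelope: Envelope[M]): Unit = {
      val handler = running.getOrElseUpdate(envelope.entityId, {
        factoryCalls += 1
        // The entity id from the envelope is what ends up in the context.
        factory(EntityContextSketch(envelope.entityId))
      })
      // Unwrap the envelope: the entity only sees the message itself.
      handler(envelope.message)
    }

    def isRunning(entityId: String): Boolean = running.contains(entityId)
  }
}
```

In this model the entity id reaches the factory only through the envelope of the first message, and later messages for the same id reuse the already started handler.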


@johanandren Thank you again.

I focus on deploying microservices developed with the Akka framework using Docker and Kubernetes.

Basically, I use EventSourcedBehavior and an SQL database to implement the write side of the CQRS pattern, and publish the entities to the cluster. On the read side, I pull events through a projection with Kafka or something else, and materialize them in a NoSQL database.

The Akka framework reduces the cost of maintaining message queues, handling concurrency conflicts, persistence, etc. I really like it, but I didn’t find any book about Akka Typed, only the official online documentation. The documentation reads like a reference, so it takes me a lot of time to work through, and probably more later.

Best regards

Hi @Casperlet
I’m writing Akka in Action 2nd edition with Manning. It’s close to being finished and it has a section on Akka Sharding. We have released that chapter in an early release in case you are interested.

Kind regards,


Fran

@franciscolopezsancho That’s great news, thank you!

Akka in Action and Vaughn’s Reactive Messaging Patterns with the Actor Model are my favorite books about Akka.

I think CQRS and event sourcing are great solutions for microservices. I hope your book will offer more advice on using them with Akka, maybe some best practices.

By the way, Docker is a popular and helpful deployment tool. I would like to read more about using it with Akka too.

See you.