How do I load balance incoming messages onto the nodes of the Akka cluster?

How do I load balance incoming messages onto the nodes of the Akka cluster?

This topic is based on a conversation between an anonymous user of Akka and our “Ask AI about Akka” bot in the documentation.

Akka provides several strategies for load balancing messages across nodes in a cluster. Here are a few of them:

  1. Adaptive Load Balancing: The AdaptiveLoadBalancingPool or AdaptiveLoadBalancingGroup performs load balancing of messages to cluster nodes based on the cluster metrics data. It uses random selection of routees with probabilities derived from the remaining capacity of the corresponding node. It can be configured to use a specific MetricsSelector to produce the probabilities, a.k.a. weights. The selectors include heap, load, cpu, mix, and any custom implementation of akka.cluster.metrics.MetricsSelector. The collected metrics values are smoothed with exponential weighted moving average. You can adjust how quickly past data is decayed compared to new data in the Cluster configuration. Here is an example of how to use it:
int totalInstances = 100;
int maxInstancesPerNode = 3;
boolean allowLocalRoutees = false;
Set<String> useRoles = new HashSet<>(Arrays.asList("backend"));
ActorRef backend =
    getContext()
        .actorOf(
            new ClusterRouterPool(
                    new AdaptiveLoadBalancingPool(
                        SystemLoadAverageMetricsSelector.getInstance(), 0),
                    new ClusterRouterPoolSettings(
                        totalInstances, maxInstancesPerNode, allowLocalRoutees, useRoles))
                .props(Props.create(FactorialBackend.class)),
            "factorialBackendRouter3");

Source

  1. Cluster Aware Routers: Akka also provides cluster aware routers that can distribute messages to actors on different nodes in the cluster with routing strategies like round-robin and consistent hashing. The group routers rely on the Receptionist and will therefore route messages to services registered in any node of the cluster. Source
  2. Consistent Hashing: This strategy uses consistent hashing to select a routee based on the sent message. Consistent hashing delivers messages with the same hash to the same routee as long as the set of routees stays the same. When the set of routees changes, consistent hashing tries to make sure, but does not guarantee, that messages with the same hash are routed to the same routee. Source
  3. Akka Cluster Sharding: Akka Cluster Sharding provides stable routing and rebalancing of the routee actors. Source
  4. Akka Projections: You can also run your projections in a distributed fashion to spread the load over the different nodes in an Akka Cluster. Source

Remember to choose the strategy that best fits your use case.

How to connect to an Akka cluster to send messages?

To connect to an Akka cluster and send messages, you need to follow these steps:

  1. Enable Akka Cluster in your configuration: You need to specify the actor provider as “cluster” in your application.conf file. Here is an example:
akka {
  actor {
    provider = "cluster"
  }
  remote {
    artery {
      enabled = on
      canonical {
        hostname = "localhost"
        port = 2551
      }
    }
  }
  cluster {
    seed-nodes = [
      "akka://cluster-system@localhost:2551"
    ]
  }
}

Source

  1. Create an ActorSystem: An ActorSystem is a heavyweight structure that will allocate 1…N Threads, so create one per logical application.
Config config = ConfigFactory.load("application.conf");
ActorSystem system = ActorSystem.create("cluster-system", config);

Source

This example is using the classic actor api. To use the typed ActorSystem api, please have a look here.

  1. Send Messages: You can send messages to actors in the cluster using the tell method. Here is an example of how to send a message:
ActorRef publisher = system.actorOf(Props.create(SenderActor.class), "publisher");
publisher.tell("hello from microservice A ", null);

Source

  1. Serialization: You have to enable serialization to send messages between ActorSystems (nodes) in the Cluster. Serialization with Jackson is a good choice in many cases, and the recommended option if you don’t have other preferences or constraints.
    Source

  2. Distributed Publish Subscribe in Cluster: If you want to send a message to an actor without knowing which node it is running on, or send messages to all actors in the cluster that have registered interest in a named topic, you can use the DistributedPubSubMediator.
    Source

Remember, Akka Cluster is not a service discovery solution. New nodes need to know the address of at least one existing member of the cluster to join. This can be done with configuration, or Akka Management’s Cluster Bootstrap.

Are Akka HTTP endpoints load balanced?

Akka HTTP itself does not provide load balancing for endpoints. However, when used in a distributed system like Akka Cluster or when deployed on a platform like Kubernetes, you can leverage the load balancing features provided by these systems.

For example, in an Akka Cluster, you can use routers to distribute messages across multiple nodes. These routers can use several strategies like round-robin, random, smallest mailbox, etc., to balance the load across the nodes. Source

When deploying an Akka gRPC service on Kubernetes, you can use a Kubernetes Service of type LoadBalancer to expose a TCP endpoint outside the cluster. This allows you to use the load balancing features provided by Kubernetes. However, there are some limitations, such as being restricted to 1 service per IP and not having advanced features like TLS termination and path-based routing. Source

For gRPC clients, Akka supports client-side load balancing when using the default static or the grpc-dns discovery mechanism. You can set the load-balancing-policy client configuration option to round_robin to enable the round_robin client-side load balancing strategy provided by grpc-java. However, this feature is marked as experimental in grpc-java. Source

Remember, the choice of load balancing strategy depends on your specific use case and the characteristics of your system.