So, before addressing how to set up the Akka cluster, you probably have to start by better understanding your network.
I don’t mean to be argumentative, but your statement “the Kubernetes clusters can be within a flat network allowing ‘naturally’ pods from different clusters to communicate to each other, so you don’t need to install any additional ‘thingies’” just isn’t true. Kubernetes clusters cannot communicate with each other “naturally”: Kubernetes CNI is specifically designed to isolate the clusters. Now, there are many CNCF projects that are designed to allow for multi-cluster network interconnection. Here is an excellent summary of some of those projects. I mentioned Submariner, and Patrik mentioned Cilium. But there are several others in this article. (I’ve been doing a bit of investigation on Skupper recently, which is why I had this article so handy.)
You may not think there are additional ‘thingies’ in your K8S cluster allowing cluster inter-communication, but if intercommunication is happening, then something must be installed. And I mention this because those different projects handle service discovery (and network communication in general) very differently.
In Submariner, you specifically have to create a Kubernetes API object that explicitly allows for service export from one cluster to the other. In theory, once you’ve configured that, you should be able to use Akka Discovery to allow for intercommunication. (Because Submariner does utilize a flat network, and that Submariner API object will allow for service discovery in the same namespace. Note, I have not actually tried running an Akka Cluster over Submariner so there may be “gotchas” I’m not aware of.)
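To make that concrete: Submariner uses the Kubernetes Multi-Cluster Services API for exports, so the object looks something like the following. (The service name and namespace here are placeholders for whatever your deployment actually uses.)

```yaml
# Export a (hypothetical) service so it becomes resolvable
# from the other clusters in the Submariner ClusterSet.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-akka-service    # placeholder: your service's name
  namespace: my-namespace  # placeholder: must match the service's namespace
```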
Fundamentally, with Akka Kubernetes Discovery all you are doing is running a query against the Kubernetes API, with the query defined in `pod-label-selector` and run in the namespace `pod-namespace`. It is incredibly straightforward. If all of the K8S pods for your Akka nodes are returned when you run the query specified in `pod-label-selector`, and all of the pods can communicate with each other on the IPs provided by the K8S API, then the Akka cluster should form.
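For reference, the relevant Akka Management configuration looks something like this; the namespace and label-selector values are placeholders and must match the labels on your Akka pods:

```hocon
akka.management.cluster.bootstrap {
  contact-point-discovery {
    discovery-method = kubernetes-api
  }
}

akka.discovery.kubernetes-api {
  # Placeholders: use the namespace and labels of your own Akka pods.
  pod-namespace = "my-namespace"
  pod-label-selector = "app=my-akka-app"
}
```

The equivalent sanity check is simply: does listing pods with that label selector in that namespace, from inside the cluster, return all of your Akka nodes?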
I’m less confident that Skupper would work as easily, but that might be because of my own lack of around Skupper. (I don’t have a working install of Skupper yet, but it seems to presume separate namespaces and seems to work only at the service layer, not directly between pods.)
But even if we presume that you figure out your discovery issues, we then run into the question of “is this a good architecture?”. By its nature, (vanilla) Akka Cluster is fairly chatty. It utilizes consensus protocols that require frequent intercommunication, and partitions have to be detected quickly, so failure detection is aggressive. These tradeoffs are why Multi-DC Akka Cluster exists. It allows Akka nodes to be “location aware” and prioritize local communication over remote communication. As the docs mention, it’s not necessarily limited to scenarios of physical boundaries. And this seems like a scenario where it would be very applicable: the latency and overhead of inter-cluster communication would be much higher than for local communication. It’s exactly these kinds of asymmetric deployments where Multi-DC Cluster is appropriate.
However, Multi-DC Cluster isn’t particularly commonly used, because it has a lot of tradeoffs. In many ways, it functions as a federation of clusters rather than as a single cluster. For example, singletons are per data center (and therefore not really singletons anymore). The same goes for sharding. Which leads me back to my original question: “What is your real goal here?” In most stateful scenarios, I feel like you would be better off running separate clusters with Replicated Event Sourcing. It would require your application to be more aware of eventual consistency, but I think it would be much more scalable and reliable. But it would certainly depend a bit on your availability vs. consistency requirements.
For stateless scenarios, I feel like you have more options. I still don’t really think trying to run a flat cluster across the K8S clusters makes a lot of sense, even for something stateless. Because even brief interruptions in service between the clusters will have a significant impact on the cluster as it tries to avoid split-brain scenarios. A multi-DC Akka Cluster would make a lot more sense, but I’m not sure I see the benefit. If you are already in an environment like K8S where there is a lot of good service discovery, it seems like an Akka cluster per K8S cluster would be the typical architecture. But, since I don’t know your use case, it’s hard to be certain, as it would depend on which features of Akka Cluster you were using.
I’m also curious to hear how this is turning out, because I’m doing a lot of investigation on patterns for multi-K8S cluster architectures. But a single stretched Akka cluster really does seem to me like a very unusual (and probably unwise) architecture.