Hello, I would like to ask for some high-level cluster design advice.
I have been trying to figure out whether Akka would be a viable alternative to Quasar for writing a distributed Java application, even though Akka does not support Java as well as it supports Scala. Writing the application in Scala is simply not an option.
So I have been studying Akka for a few weeks, but I’m still having trouble deciding on the best design approach for my use case.
At the core of the design, I would like to run four independent processes on four different hosts on the same network. The purpose is not to distribute client load among the four processes, but rather to have each process perform its own dedicated job. The system does not require libraries to be shared across the four applications (so it is not a “distributed monolith”, as I understand the term).
As a requirement, I need actors in the four systems to be able to communicate with each other in order to update their dynamic state. The state updates may originate from any of the four processes. A state is a shallow POJO which contains around 40 primitive fields. Eventual consistency for the states of (let’s say ten thousand) client actors is not a problem, as long as an update takes no more than about 4 seconds to propagate to all four systems.
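To make the state and the consistency requirement concrete, here is a minimal sketch in plain Java (no Akka involved). The field names, the version counter, the origin-node tie-breaker, and the last-writer-wins merge rule are all my own assumptions for illustration; the real POJO has around 40 primitive fields rather than two.

```java
// Hypothetical stand-in for the real state POJO (two payload fields instead of ~40).
final class ClientState {
    final String clientId;   // which client actor this state belongs to
    final long version;      // assumed: increases with every update to this client
    final String originNode; // assumed: name of the process that produced the update
    final int balance;       // illustrative payload field
    final boolean active;    // illustrative payload field

    ClientState(String clientId, long version, String originNode,
                int balance, boolean active) {
        this.clientId = clientId;
        this.version = version;
        this.originNode = originNode;
        this.balance = balance;
        this.active = active;
    }

    // Last-writer-wins merge: the higher version wins; on a version tie,
    // the lexicographically larger origin node wins, so every process
    // picks the same winner regardless of the order in which updates arrive.
    static ClientState merge(ClientState a, ClientState b) {
        if (a.version != b.version) {
            return a.version > b.version ? a : b;
        }
        return a.originNode.compareTo(b.originNode) >= 0 ? a : b;
    }
}
```

The point of the sketch is only that, whichever mechanism carries the updates, each process needs a deterministic rule like `merge` so that concurrent updates from different processes converge to the same state.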
I have the impression that Akka clusters are geared more toward distributing client load than distributing functionality. So I am still struggling with the following questions:
1- Does the 4-second max latency requirement sound unrealistic for updating states in 10,000 actors?
2- Would Akka cluster distributed data be a good solution? Or would distributed publish-subscribe be a better choice for this use case?
3- Should I use Akka routing? Is it an alternative to distributed data or to pub-sub in this case? Or does it work in addition to either?
4- Would it make sense to make each system’s top-level actor a singleton?
5- One application.conf file with four roles would start four members in each cluster node (16 members in all four processes). Since I am not concerned with redundancy and I only need to assign one role per process, would it be better to have four different config files each with a single different role? But wouldn’t that cause four separate clusters to spin up independently and thus prevent the actors from communicating state updates between processes (even if I include a seed role and the cluster names are the same)?
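To illustrate question 5, this is roughly the per-process configuration I have in mind. It is only a sketch: the system name `ClusterSystem`, the host names, the port, and the role name `pricing` are placeholders, and I am assuming Artery remoting. Each of the four processes would use the same file except for its own `hostname` and its single role.

```hocon
akka {
  actor.provider = cluster

  remote.artery.canonical {
    hostname = "host1"   # placeholder: this process's own host
    port = 2551          # placeholder port
  }

  cluster {
    roles = ["pricing"]  # placeholder: the one role assigned to this process
    seed-nodes = [
      # placeholders: the same seed list in all four config files
      "akka://ClusterSystem@host1:2551",
      "akka://ClusterSystem@host2:2551"
    ]
  }
}
```

What I am unsure about is whether four processes configured like this, each with a different role, end up as members of one cluster or as four separate ones.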
It’s a big topic, so I’m not asking for definitive answers. But if you can point out which of these strategies have a better chance of success, it would be a big help.
Thank you so much.