Let’s say there is a frontend node built on top of Akka HTTP that sends commands to a backend node through Cluster Sharding. To avoid coupling the backend node to frontend actors for sending responses, the backend node publishes the response events for the commands to a topic. So, when a frontend request actor is created (one per request), it subscribes to that topic in preStart, then sends the command to the backend and waits for a response to be published to the topic. In summary, communication from frontend to backend happens through Cluster Sharding, and from backend to frontend through PubSub.
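The setup described above can be sketched roughly as follows. This is a simplified illustration using the classic actor API, not the actual code: `RequestActor`, `Command`, `ResponseEvent`, and the constructor parameters are hypothetical names, and sending the command only once `SubscribeAck` arrives is an assumption about the ordering.

```scala
import akka.actor.{Actor, ActorRef}
import akka.cluster.pubsub.DistributedPubSub
import akka.cluster.pubsub.DistributedPubSubMediator.{Subscribe, SubscribeAck}

// Hypothetical message types, for illustration only.
final case class Command(requestId: String, payload: String)
final case class ResponseEvent(requestId: String, result: String)

// One instance is created per HTTP request (hypothetical actor).
class RequestActor(
    command: Command,
    topic: String,
    shardRegion: ActorRef, // e.g. obtained via ClusterSharding(system).shardRegion(...)
    replyTo: ActorRef      // whoever completes the HTTP response
) extends Actor {

  private val mediator = DistributedPubSub(context.system).mediator

  // Subscribe to the response topic as soon as the actor starts.
  override def preStart(): Unit =
    mediator ! Subscribe(topic, self)

  def receive: Receive = {
    case SubscribeAck(Subscribe(`topic`, None, `self`)) =>
      // The ack confirms registration with the local mediator only;
      // mediators on other nodes learn about the subscriber later, via gossip.
      shardRegion ! command

    case event: ResponseEvent if event.requestId == command.requestId =>
      // The response published by the backend reached this subscriber.
      replyTo ! event
      context.stop(self)
  }
}
```

Even with this ordering, the command can reach the backend before the backend's mediator has heard about the new subscriber.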
The problem I have been facing is that sometimes a backend node publishes an event to a topic and the frontend actor does not receive it. I configured PubSub to forward the message to dead letters when there is no subscriber, and I have been seeing a message saying that this happened.
So I have a suspicion: the Distributed PubSub mediator takes some time to synchronize, across the whole cluster, the information about which subscribers exist for each topic. The problem happens when the backend tries to publish the response event to a topic but its mediator has not yet received the information that a subscriber exists for that topic. In other words, the backend responds faster than its mediator becomes aware of the new subscriber (the frontend request actor).
So, my question is: is Distributed PubSub a recommended approach for this scenario, where a backend node needs to send events to frontend nodes without creating tight coupling between them?
PS 1: there are a few topics, not just one for all responses.
PS 2: I know there is a gossip-interval setting that I could use to speed up the synchronization of new subscribers. However, I suppose that if I have to make this value much smaller, that in itself is a sign that Distributed PubSub is not a recommended approach for this use case.
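For reference, the two settings mentioned in this question live under `akka.cluster.pub-sub`. The fragment below is only a sketch: the `200ms` value is an illustrative guess at a "much smaller" interval, not something I have tuned or measured.

```hocon
akka.cluster.pub-sub {
  # How often mediators gossip their subscriber registry to other nodes.
  # 200ms is an illustrative value only.
  gossip-interval = 200ms

  # Forward published messages to dead letters when the topic has no known subscribers.
  send-to-dead-letters-when-no-subscribers = on
}
```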