I have a scenario where I am consuming log compacted Kafka topics and I would like to store each unique entity from Kafka in its own actor. I want to be able to distribute these actors by running multiple instances of the app with the same Kafka consumer group (using Alpakka Kafka).
Let's say I am consuming two different topics: when entity actor A receives an update from topic-1, it needs to be able to query entity actor B (fed from topic-2) so that the two entities can be merged and the result published to another topic.
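To make the scenario concrete, here is a minimal sketch of the merge step (the entity shapes and field names are placeholders, not my real model):

```scala
// Placeholder payloads for the entities coming from topic-1 and topic-2.
final case class EntityA(id: String, name: String)
final case class EntityB(id: String, region: String)
final case class Merged(id: String, name: String, region: String)

// The merge itself is a pure function; the hard part is locating
// entity actor B from inside entity actor A across the cluster.
def merge(a: EntityA, b: EntityB): Merged =
  Merged(a.id, a.name, b.region)
```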
I was looking at using Distributed PubSub to message actors without knowing where they are in the cluster, but the problem with that approach is eventual consistency. I would need a way to wait for the actor ref registered with Put to be replicated to every pub-sub mediator before continuing processing, which I don't believe is possible. The issue: if an entity actor is created shortly before a related one, the second needs to be able to query the first, but if the mediators are not yet consistent, it will conclude that the first actor doesn't exist and the target topic will end up in an invalid state.
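Roughly what I mean, using the classic Distributed PubSub mediator (message types and the actor naming scheme are placeholders; this assumes entity actors are named `entity-<id>`):

```scala
import akka.actor.{Actor, ActorRef}
import akka.cluster.pubsub.DistributedPubSub
import akka.cluster.pubsub.DistributedPubSubMediator.{Put, Send}

// Placeholder message types for illustration.
final case class Update(relatedId: String)
case object QueryState

class EntityActor extends Actor {
  private val mediator: ActorRef = DistributedPubSub(context.system).mediator

  // Register this actor with the local mediator; the registration is
  // gossiped to the mediators on other nodes asynchronously.
  mediator ! Put(self)

  def receive: Receive = {
    case Update(relatedId) =>
      // Query the related entity by its actor path. If its Put has not
      // yet been gossiped to this node's mediator, the Send goes to
      // dead letters -- this is the race I'm worried about.
      mediator ! Send(s"/user/entity-$relatedId", QueryState, localAffinity = false)
  }
}
```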
I have also looked into using a cluster-aware router, but I think that would only work if I could route using Kafka's partitioning strategy.
I'd rather not use Cluster Sharding with persistence, because the topics are log compacted: ideally I could rely on Kafka's partitioning and not have to persist the actors at all, since the entities already live in Kafka and always will.
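For reference, this is the kind of setup I'm imagining, built on Alpakka Kafka's partitioned source so that all records for a key stay on whichever node currently owns the partition (bootstrap address, group id, and the in-stream actor registry are all placeholders; this needs a running broker, so it's a sketch only):

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.StringDeserializer

// Minimal per-entity actor; purely illustrative.
class Entity extends Actor {
  def receive: Receive = { case value: String => () /* fold into state */ }
}

object PartitionedConsumer extends App {
  implicit val system: ActorSystem = ActorSystem("example")
  import system.dispatcher

  val settings = ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092") // placeholder address
    .withGroupId("entity-group")            // shared consumer group

  // One sub-stream per assigned partition: every entity on a partition
  // is owned by this node, so no cluster-wide actor lookup is needed
  // for entities that share a partition.
  Consumer
    .plainPartitionedSource(settings, Subscriptions.topics("topic-1", "topic-2"))
    .mapAsyncUnordered(8) { case (_, partitionSource) =>
      var actors = Map.empty[String, ActorRef] // per-partition registry
      partitionSource
        .map { record =>
          val ref = actors.getOrElse(record.key, {
            val created = system.actorOf(Props[Entity]())
            actors += record.key -> created
            created
          })
          ref ! record.value
        }
        .runWith(Sink.ignore)
    }
    .runWith(Sink.ignore)
}
```

The unsolved part is still the cross-topic case: if entity A and entity B land on different partitions (or different topics with different key spaces), they can end up on different nodes, which is where the lookup problem above comes back.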
Any ideas would be greatly appreciated!