Building actor system for comparing large immutable objects concurrently

omrigm · July 3, 2019, 12:54pm

Hi,
I would like to use the scatter/gather pattern to compare large objects as part of a system that uses neural networks.
I have an actor that represents a question.
I have a bunch of actors that represent the possible answers.
I would like to find the answer that matches the question.
The main problem is that the question data and the answer data are large (several GB) and I don’t want to clone them.
Any ideas on the best way to implement this?
If I were not using actors, I would keep the answers in an immutable data structure and share it across worker threads that perform the compare operation.
All the answers are immutable by nature.

TimMoore · July 3, 2019, 10:01pm

Akka doesn’t usually clone messages. If you are sending messages within a local ActorSystem, it will share references, so you should be fine.
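As a rough illustration of what "sharing references" means here, the sketch below pushes an object through an in-process mailbox and checks that the receiver gets the very same instance. It uses plain `java.util.concurrent` as a stand-in for a local actor, not Akka's API, and the names `LocalMailboxDemo`, `Answer`, and `receivedSameInstance` are made up for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Stand-in for a local actor: one thread draining a mailbox. This is NOT Akka code.
final class LocalMailboxDemo {

    // Hypothetical immutable message wrapping a large payload.
    // The array is not copied; the caller hands it over and must never mutate it.
    static final class Answer {
        private final float[] vector;
        Answer(float[] vector) { this.vector = vector; }
        int size() { return vector.length; }
    }

    // Sends `sent` through an in-process mailbox and reports whether the
    // receiver got the same object (identity comparison), not a copy.
    static boolean receivedSameInstance(Answer sent) {
        BlockingQueue<Answer> mailbox = new LinkedBlockingQueue<>();
        CompletableFuture<Answer> received = new CompletableFuture<>();
        Thread receiver = new Thread(() -> {
            try {
                received.complete(mailbox.take()); // blocks until a message arrives
            } catch (InterruptedException e) {
                received.completeExceptionally(e);
            }
        });
        receiver.start();
        mailbox.add(sent);             // only the reference is enqueued
        return received.join() == sent;
    }

    public static void main(String[] args) {
        Answer answer = new Answer(new float[1_000_000]);
        System.out.println("same instance: " + receivedSameInstance(answer));
        // prints "same instance: true"
    }
}
```

Because only the reference moves, the cost of a local send does not depend on the payload size. The flip side is that nothing stops the sender from mutating the object afterwards, which is why immutability matters.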

Large messages can be a problem when using Akka Cluster and sending messages between nodes, as it will require serializing the message at the sender and deserializing at the receiver. In that case, you are better off designing a way to colocate the question and answers on the same node.

omrigm · July 4, 2019, 11:18am

Thanks Tim for the reply!

Here is what I am thinking of doing: create an actor for each answer. Each answer actor will hold some kind of searchable NLP vector. When an answer actor receives a message asking it to compare itself with some question text, it will spawn a searcher actor, and when the searcher actor is asked to search, the answer will be passed to it “by reference”. This will work only because:
The answer data is completely immutable.
The searcher actors are always created on the same machine as their parent.
Doing it this way, I can ask my answer actor to run many compare tasks against different questions without it becoming a bottleneck, and it can reply to other messages faster because it is not tied up in a time-consuming “compare” task. I can also shard my answers across as many nodes as needed. As long as the searcher workers are always created on the same node as their parent, I can share the model state of the answer actor.
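A rough sketch of that scatter/gather flow on a single node, again with plain `java.util.concurrent` standing in for actors (not Akka's API). Each task plays the role of a searcher and reads the answer's vector by reference. The dot-product score and the names `ScatterGatherSketch`, `Answer`, `Match`, and `bestAnswerId` are assumptions made for the example, and the vectors are assumed to have the same length as the question.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class ScatterGatherSketch {

    // Hypothetical immutable answer; the vector is shared, never copied or mutated.
    static final class Answer {
        final String id;
        private final double[] vector;
        Answer(String id, double[] vector) { this.id = id; this.vector = vector; }

        // The "searcher" work. A dot product is an assumed placeholder metric.
        double score(double[] question) {
            double dot = 0;
            for (int i = 0; i < vector.length; i++) dot += vector[i] * question[i];
            return dot;
        }
    }

    // Small immutable reply sent back from a searcher.
    static final class Match {
        final String id;
        final double score;
        Match(String id, double score) { this.id = id; this.score = score; }
    }

    static String bestAnswerId(List<Answer> answers, double[] question, ExecutorService pool) {
        // Scatter: one short-lived task per answer, each holding only references.
        List<CompletableFuture<Match>> replies = answers.stream()
            .map(a -> CompletableFuture.supplyAsync(() -> new Match(a.id, a.score(question)), pool))
            .collect(Collectors.toList());

        // Gather: wait for every reply and keep the highest score.
        Match best = null;
        for (CompletableFuture<Match> reply : replies) {
            Match m = reply.join();
            if (best == null || m.score > best.score) best = m;
        }
        return best == null ? null : best.id;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Answer> answers = List.of(
                new Answer("a1", new double[] {1, 0, 0}),
                new Answer("a2", new double[] {0, 1, 0}),
                new Answer("a3", new double[] {0.5, 0.5, 0}));
            System.out.println(bestAnswerId(answers, new double[] {0.1, 0.9, 0}, pool));
            // prints "a2"
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the small `Match` replies are created per comparison; the large vectors stay where they are, which is the property the two conditions above are meant to guarantee.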

I would be very happy to hear your thoughts on this idea. Is this good practice, or will something come back and bite me?

TimMoore · July 15, 2019, 12:53am

That sounds good to me.
