Starting Lagom in development mode is extremely slow

Simiil · October 11, 2019, 1:38pm

Hi
I have a small-ish Lagom application with around 14 services. I am running this project using the sbt runAll task.

While starting, I get a whole lot of warnings that look like this:

akka.cluster.sharding.ShardRegion [sourceThread=service-application-akka.actor.default-dispatcher-28, akkaSource=akka.tcp://service-application@127.0.0.1:42367/system/sharding/ManufactureEventProcessor, sourceActorSystem=service-application, akkaTimestamp=13:17:44.687UTC] - MyEventProcessor: Trying to register to coordinator at [ActorSelection[Anchor(akka://service-application/), Path(/system/sharding/MyEventProcessorCoordinator/singleton/coordinator)]], but no acknowledgement. Total [4] buffered messages. [Coordinator [Member(address = akka.tcp://service-application@127.0.0.1:42367, status = Up)] is reachable.]

or this:

15:18:56.077 [warn] akka.cluster.sharding.ShardRegion [sourceThread=service-akka.actor.default-dispatcher-44, akkaSource=akka.tcp://service@127.0.0.1:37263/system/sharding/kafkaProducer-my-topic, sourceActorSystem=service, akkaTimestamp=13:18:56.077UTC] - kafkaProducer-my-topic: Retry request for shard [...] homes from coordinator at [Actor[akka://service/system/sharding/kafkaProducer-my-topicCoordinator/singleton/coordinator#409734202]]. [1] buffered messages.

These warnings completely flood the console; there are far too many of them to count.

There is also this occasional warning:

15:19:06.168 [warn] akka.cluster.Cluster(akka://service-application) [sourceThread=service-application-akka.actor.default-dispatcher-20, akkaTimestamp=13:19:06.165UTC, akkaSource=akka.cluster.Cluster(akka://service-application), sourceActorSystem=service-application] - Cluster Node [akka.tcp://service-application@127.0.0.1:40769] - Scheduled sending of heartbeat was delayed. Previous heartbeat was sent [2894] ms ago, expected interval is [1000] ms. This may cause failure detection to mark members as unreachable. The reason can be thread starvation, e.g. by running blocking tasks on the default dispatcher, CPU overload, or GC.

Furthermore, I also get errors during startup:

15:18:49.809 [error] akka.cluster.sharding.PersistentShardCoordinator [sourceThread=service-application-akka.actor.default-dispatcher-3, akkaSource=akka.tcp://service-application@127.0.0.1:43377/system/sharding/MyEventProcessorCoordinator/singleton/coordinator, sourceActorSystem=service-application, akkaTimestamp=13:18:49.784UTC] - Persistence failure when replaying events for persistenceId [/sharding/MyEventProcessorCoordinator]. Last known sequence number [0]
akka.persistence.RecoveryTimedOut: Recovery timed out, didn't get snapshot within 30000 milliseconds

or

com.lightbend.lagom.internal.persistence.cassandra.NoServiceLocatorException: Timed out after 2 seconds while waiting for a ServiceLocator. Have you configured one?

or

15:28:17.633 [error] akka.cluster.sharding.PersistentShardCoordinator [sourceThread=service-application-akka.actor.default-dispatcher-17, akkaTimestamp=13:28:12.779UTC, akkaSource=akka.tcp://service-application@127.0.0.1:44641/system/sharding/MyEntityCoordinator/singleton/coordinator, sourceActorSystem=service-application] - Persistence failure when replaying events for persistenceId [/sharding/MyEntityCoordinator]. Last known sequence number [0]
akka.pattern.CircuitBreaker$$anon$13: Circuit Breaker Timed out.

or

akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://service-application/user/cassandraOffsetStorePrepare-singleton/singleton/cassandraOffsetStorePrepare#-194060245]] after [60000 ms]. Message of type [com.lightbend.lagom.internal.persistence.cluster.ClusterStartupTaskActor$Execute$] was sent by [Actor[akka://service-application/user/cassandraOffsetStorePrepare-singleton/singleton/cassandraOffsetStorePrepare#-194060245]]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.

These errors are the most common ones during startup, and they are not service-specific, i.e. the service referenced in them is not always the same.

Even after the (Services started, press enter to stop and go back to the console...) message appears, these errors and warnings continue to occur, until they slowly stop over the next minute or so. After that, the application runs correctly. This whole process (from running runAll to the point where I am able to use the services) easily takes up to five minutes every time. The hot reload of services also fails fairly often, which forces me to restart the whole process.

In contrast, building the application in prod mode (either as Debian packages or Docker containers), deploying it, and starting it takes at most 30 seconds for the whole process.

More information: I am not using the embedded Cassandra and Kafka services, but rather have external ones running (though on the same local machine). I am using the lagom-sbt-plugin in version 1.5.1.

Is this the normal behavior of the runAll task? Can I speed it up somehow?

Thanks,
Samuel

patriknw · October 11, 2019, 8:00pm

What’s the memory usage on the machine? Is it swapping?

TimMoore · October 14, 2019, 2:42am

14 services are a lot for one project. In development mode, all services run in the same JVM as sbt, so you’ll need to launch it with plenty of heap allocated, and might need to adjust other JVM settings to get good performance.
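
As a rough sketch of what allocating more heap could look like with the standard sbt launcher script: the sizes below are assumptions for illustration, not values from this thread, and should be tuned to the RAM actually available on the machine.

```shell
# Assumed sizes for illustration only; pick values that fit your machine's RAM.
# The sbt launcher script passes SBT_OPTS on to the JVM that runs sbt, which,
# as described above, is also the JVM hosting the services in development mode.
export SBT_OPTS="-Xms1G -Xmx4G"

# Then start the services as usual from the same shell:
#   sbt runAll
```

The sbt launcher script can typically also read such flags from a `.jvmopts` file in the project root (one flag per line), which avoids having to export the variable in every shell.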

Simiil · October 15, 2019, 7:36am

I doubled the heap space, and now it seems to start far more reliably. Thank you!
