Cassandra is down and when it is up, services are not checking for Cassandra connectivity

vemisettipriyanka · April 9, 2019, 6:14am

The current behavior is:
When Cassandra is down for some reason and is restarting, any request in which a service tries to connect to the DB fails with a "Cassandra DB not found" error and the request is aborted.
We then have to restart the service manually, after which it is able to connect to Cassandra.

(This issue is not observed during container startup, since we handle that case using the Init Container feature of Kubernetes.
We see it only when Cassandra goes down intermittently for some reason and brings itself back up.)

Expected/Preferred behavior:
If Cassandra is not available, the service should keep checking, or wait until it is up, and then reconnect to it.
This will provide a graceful reconnection mechanism.

Can you please let us know whether there is any built-in Lagom feature that would enable this behavior,
or whether we should write our own retry mechanism?
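
For illustration, a hand-written retry mechanism of the kind asked about could look roughly like the sketch below. It is a generic helper, not a Lagom or akka-persistence-cassandra feature: the class name `Retry`, the method `withBackoff`, and its parameters are hypothetical, and the operation being retried (for example, opening a Cassandra session) is assumed to be supplied by the caller.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper: retries an operation with capped exponential backoff. */
final class Retry {
    private Retry() {}

    /**
     * Calls op until it succeeds or maxAttempts is reached.
     * Waits initialDelay after the first failure and doubles the wait
     * after each further failure, never exceeding maxDelay.
     * If every attempt fails, the last exception is rethrown.
     */
    static <T> T withBackoff(Callable<T> op, int maxAttempts,
                             Duration initialDelay, Duration maxDelay) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Duration delay = initialDelay;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                // Do not retry when the thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                Thread.sleep(delay.toMillis());
                Duration doubled = delay.multipliedBy(2);
                delay = doubled.compareTo(maxDelay) > 0 ? maxDelay : doubled;
            }
        }
        throw last;
    }
}
```

A call such as `Retry.withBackoff(() -> connect(), 10, Duration.ofSeconds(1), Duration.ofSeconds(30))` would then keep trying a hypothetical `connect()` for up to ten attempts. Note that this sketch blocks the calling thread while waiting, so inside an Akka-based service a non-blocking variant would be preferable.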

We observed that this reconnection mechanism already exists for Kafka: whenever Kafka goes down and comes back up, the services reconnect to it automatically.

Please advise and help us with this issue.

dpennell · April 9, 2019, 1:51pm

We are seeing the same behavior in k8s. If the Cassandra StatefulSet restarts a pod, the IP address changes, but Lagom/akka-persistence continues to use the old addresses. Note that we are binding to the Cassandra service, but akka-persistence is caching the initial IP addresses from the DNS service lookup.
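
To make the stale-address problem concrete: the sketch below resolves a host name to its current IP addresses each time it is called, rather than once at startup. It is only an illustration of the idea, not how Lagom or akka-persistence-cassandra resolves contact points; the class `ContactPoints` and the method `resolveNow` are hypothetical names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lagom or akka-persistence-cassandra. */
final class ContactPoints {
    private ContactPoints() {}

    /**
     * Resolves host to the IP addresses it maps to right now.
     * Intended to be called before each (re)connect attempt, so that a
     * changed pod IP is picked up instead of reusing the startup result.
     */
    static List<String> resolveNow(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }
}
```

Even with a fresh lookup per attempt, the JVM keeps its own DNS cache, controlled by the `networkaddress.cache.ttl` security property, so a long or infinite TTL there can still return old addresses.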

aklikic · April 9, 2019, 8:13pm

@vemisettipriyanka, @dpennell,

I’m not sure whether you are experiencing the same issue, but Cassandra endpoint discovery and access depend mainly on the ServiceLocator implementation used.
Depending on the implementation, the ServiceLocator is responsible for querying endpoints, caching and load-balancing.

For reference, the Lagom akka-persistence-cassandra session provider is ServiceLocatorSessionProvider.

Preferred ServiceLocator implementations:

reactive lib: Lagom 1.4.x (not supported by Lagom 1.5.x)
Lagom akka discovery: Lagom 1.4 and Lagom 1.5

Implementations are improving with every new version, so check for updates.

Hope this helps.

Br,
Alan

dpennell · April 10, 2019, 7:35pm

I’m using reactive lib. It appears that the service lookup for cassandra is done once and only once.

aklikic · April 11, 2019, 12:03am

@patriknw can you maybe comment on this?

patriknw · April 13, 2019, 9:57am

After the initial discovery and connection to the contact points, the connection pool in the Cassandra driver should reconnect by itself. I can see that this could be a problem if the entire Cassandra cluster is restarted with new IP addresses. I think there is a recently added issue about this: https://github.com/akka/akka-persistence-cassandra/issues/445

vemisettipriyanka · April 15, 2019, 6:31am

We are using the following configuration for Cassandra persistence.
When Cassandra is down, the service goes down, and it does not check for Cassandra connectivity once Cassandra is up and running again.

--START
// Enable dependency injection
play.modules.enabled += com.lightbend.rp.servicediscovery.lagom.javadsl.ServiceLocatorModule

cassandra-keyspace = test

cassandra.default {
  session-provider = akka.persistence.cassandra.ConfigSessionProvider

  # list the contact points here
  contact-points = [${?CASSANDRA_HOST}]
  port = ${?CASSANDRA_PORT}
}

cassandra-journal {
  keyspace = ${cassandra-keyspace}
  contact-points = ${cassandra.default.contact-points}
  port = ${cassandra.default.port}
  first-time-bucket = "20160225T00:00"
  session-provider = ${cassandra.default.session-provider}
}

cassandra-snapshot-store {
  keyspace = ${cassandra-keyspace}
  contact-points = ${cassandra.default.contact-points}
  port = ${cassandra.default.port}
  session-provider = ${cassandra.default.session-provider}
}

lagom.persistence.read-side.cassandra {
  keyspace = ${cassandra-keyspace}
  contact-points = ${cassandra.default.contact-points}
  port = ${cassandra.default.port}
  session-provider = ${cassandra.default.session-provider}
}

# Enable new sharding state store mode by overriding Lagom's default
akka.cluster.sharding.state-store-mode = ddata

# Enable serializers provided in Akka 2.5.8+ to avoid the use of Java serialization.
akka.actor.serialization-bindings {
  "akka.Done" = akka-misc
  "akka.actor.Address" = akka-misc
  "akka.remote.UniqueAddress" = akka-misc
}

# get seed nodes from environmental variables
akka.cluster.seed-nodes = [
  ${?SEED_NODES_0}
]

lagom.broker.kafka.service-name = ""
lagom.broker.kafka.brokers = ${?KAFKA_SERVICE_NAME} # this can be a comma-separated string if you have >1

--END
