Rolling update with Remember Entities: issue with newer writer

Hi everyone,

If I perform a rolling update with Remember Entities enabled, the new node gets plenty of errors from this line of code:

val errMsg = s"Invalid replayed event [sequenceNr=${r.persistent.sequenceNr}, writerUUID=${r.persistent.writerUuid}]. " +
            s"There was already a newer writer whose last replayed event was [sequenceNr=${seqNo}, writerUUID=${writerUuid}] for the same persistenceId [${r.persistent.persistenceId}]." +
            "Perhaps, the old writer kept journaling messages after the new writer created, or duplicate persistenceId for different entities?"

The errors look like this:

2025-05-06 17:32:35.340 - WARN - a.persistence.journal.ReplayFilter - akka://ts-ng@192.168.21.144:2552/system/akka.persistence.cassandra.journal/$mb - Invalid replayed event [sequenceNr=1580783, writerUUID=3fd53bf6-0ea6-4a1f-8027-ce31811f6da6]. There was already a newer writer whose last replayed event was [sequenceNr=1580782, writerUUID=883dce50-bfb8-4216-a451-68872b9aac95] for the same persistenceId [/sharding/IShard/3728].

Perhaps, the old writer kept journaling messages after the new writer created, or duplicate persistenceId for different entities?

What can cause this issue?

thanks,

Kevin

One reason that would happen is if you get a split brain, where two clusters exist, disconnected from each other, and both write to the remember entities journal.

If you look in the database at those events you will be able to see timestamps for when it happened which can help with further investigation.
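As a rough illustration of what to look for in those rows, here is a minimal sketch that flags events written by a writer that had already been superseded by another one. It is a simplified model, not the actual ReplayFilter implementation (which works on a bounded window), and the `Replayed` class, the `WriterOverlap` object, and the writer ids are made up for the example.

```scala
// Simplified stand-in for a journal row in replay order; not an Akka class.
final case class Replayed(sequenceNr: Long, writerUuid: String)

object WriterOverlap {

  /** Returns the events that come from a writer which was already replaced
    * by a different writer earlier in the replay order.
    */
  def staleWrites(events: Seq[Replayed]): Seq[Replayed] = {
    // State: (current writer, writers seen before it, stale events found so far)
    val start = (Option.empty[String], Set.empty[String], Vector.empty[Replayed])
    val (_, _, stale) = events.foldLeft(start) {
      case ((current, old, acc), ev) =>
        if (current.contains(ev.writerUuid)) (current, old, acc) // same writer, fine
        else if (old.contains(ev.writerUuid)) (current, old, acc :+ ev) // old writer came back
        else (Some(ev.writerUuid), old ++ current.toSet, acc) // a new writer took over
    }
    stale
  }
}

// Hypothetical data: writer-A writes again after writer-B has taken over.
val events = Seq(
  Replayed(1L, "writer-A"),
  Replayed(2L, "writer-B"),
  Replayed(3L, "writer-A")
)
val suspicious = WriterOverlap.staleWrites(events)
```

Comparing the timestamps of the rows flagged this way with the times the nodes were stopped and started during the rolling update should show whether two writers were alive at the same moment.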

Note that if you see this error for remember entities, there is also a big risk that you have duplicate or parallel writes for the actual entities running in sharding, if those are event sourced. That is probably worse: the remember entities journal can often be dropped without big problems, but the entity business state is more important.

min-nr-of-members is 50 and the cluster size is 50, so I don't think this is a split-brain scenario.
We use the Split Brain Resolver from Akka.

We use event sourced Remember Entities. Interestingly, this is a test system and we don’t introduce new state in this example. The only thing that could happen is that some data is outdated and we delete the actor and its state, which is not a problem if the error only occurs in these scenarios. But it looks scary.