I’m attempting to take advantage of Akka Projections (R2DBC) for processing IoT metrics collected in a TimescaleDB hypertable.
I was successful in creating a basic source provider (ticket) that works correctly when using Sequence OffsetType.
However, this only works if I have a global event sequence for all device events, which I would very much like to avoid. This seems to leave me with two options, neither of which seems viable:
Start a projection for every deviceId
Basically I could use what I have right now but with a slightly different SQL query. The big downside here is that this causes many more DB calls, since I need a new SourceProvider instance for each device (10k+), and it prevents me from using ShardedDaemonProcess easily.
Implement a TimestampOffset-based SourceProvider
I’ve done a sample implementation: I’ve extended SourceProvider with BySlicesSourceProvider and it works*, but there are a couple of issues:
- Since my event table does not contain a slice column, I calculate the slice as deviceId%totalNumberOfSlices. This inevitably causes an exception when the projection tries to store the offset. For the purpose of the test, I’ve just set the min and max slice in BySlicesSourceProvider to 0 and Int.MAX respectively, which circumvents the issue.
- Every event processed produces a record in akka_projection_timestamp_offset_store. The amount of data stored is roughly the size of the event itself, so if I want to run 10 projections, my storage requirements would effectively grow by an order of magnitude. (Maybe it’s a consequence of this sketchy approach to slices; I’m not sure if that is how it’s supposed to behave.)
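To make the slice calculation from the first bullet concrete, here is a minimal sketch in plain Java with no Akka dependency. It assumes deviceId is numeric and uses a fixed total of 1024 slices, which, to my knowledge, mirrors the slice count Akka Persistence uses. The names DeviceSlices, sliceForDevice and sliceRanges are hypothetical and only illustrate how a computed slice could be kept inside a bounded min/max range instead of 0 to Int.MAX. Whether this alone avoids the offset store exception is not confirmed.

```java
import java.util.ArrayList;
import java.util.List;

final class DeviceSlices {
    // Assumption: 1024 mirrors the fixed slice count of Akka Persistence.
    static final int NUMBER_OF_SLICES = 1024;

    // Maps a numeric device id to a slice in [0, NUMBER_OF_SLICES).
    // floorMod keeps the result non-negative even for negative ids.
    public static int sliceForDevice(long deviceId) {
        return (int) Math.floorMod(deviceId, (long) NUMBER_OF_SLICES);
    }

    // Splits all slices into equally sized, inclusive {min, max} ranges,
    // one range per projection instance.
    public static List<int[]> sliceRanges(int numberOfRanges) {
        if (numberOfRanges <= 0 || NUMBER_OF_SLICES % numberOfRanges != 0) {
            throw new IllegalArgumentException(
                "numberOfRanges must be a positive divisor of " + NUMBER_OF_SLICES);
        }
        int size = NUMBER_OF_SLICES / numberOfRanges;
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < numberOfRanges; i++) {
            ranges.add(new int[] {i * size, i * size + size - 1});
        }
        return ranges;
    }

    // True if the slice falls inside the inclusive {min, max} range.
    public static boolean inRange(int[] range, int slice) {
        return slice >= range[0] && slice <= range[1];
    }
}
```

With four projection instances, sliceRanges(4) yields 0-255, 256-511, 512-767 and 768-1023, and each instance would only query devices whose sliceForDevice value falls inside its own range.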
So I’m at a loss here. How would you approach this problem?
Do you think using Akka Projections in this scenario is even valid?
Thanks