I tried to upload the contents of a compressed tar file to S3 using Alpakka, but only 1-2 entries were copied and the rest were skipped.
When I increased the chunk size to a large number (double the file size in bytes) it worked, but I suspect it will fail again once the tar file gets too big. Is this expected, or have I missed something?
Below is my code:
lazy val fileUploadRoutes: Route =
  withoutRequestTimeout {
    withoutSizeLimit {
      pathPrefix("files") {
        post {
          path("uploads") {
            extractMaterializer { implicit materializer =>
              fileUpload("file") {
                case (metadata, byteSource) =>
                  val uploadFuture = byteSource.async
                    .via(Compression.gunzip(200000000)) // the "chunk size" mentioned above
                    .via(Archive.tarReader()).async
                    .runForeach { case (tarMetadata, entrySource) =>
                      // starts the per-entry upload; the materialized value is not awaited
                      entrySource.runWith(s3AlpakkaService.sink(
                        FileInfo(UUID.randomUUID().toString, tarMetadata.filePath, metadata.getContentType)))
                    }
                  onComplete(uploadFuture) {
                    case Success(result) =>
                      log.info("Uploaded file to: " + result)
                      complete(StatusCodes.OK)
                    case Failure(ex) =>
                      log.error(ex, "Error uploading file")
                      complete(StatusCodes.FailedDependency, ex.getMessage)
                  }
              }
            }
          }
        }
      }
    }
  }
I guess the issue is not in S3AlpakkaService, as I tried logging the file names before calling s3AlpakkaService.sink and it printed only 1-2 entries. I tried setting the chunk size to values from a few KB up to 10 MB, but that did not solve the issue. For a .gz file of around 100 MB, I had to set the chunk size to 200 MB.
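For reference, here is a sketch of the same pipeline with each tar entry drained before the next one is pulled, instead of starting the inner upload inside runForeach and dropping its result. It is untested on my side: the mapAsync(1) wiring and the assumption that s3AlpakkaService.sink materializes a Future (as an Alpakka S3 multipart upload sink would) are mine; the other names are the same as in the snippet above and are still in scope inside the fileUpload("file") block.

import java.util.UUID
import akka.stream.alpakka.file.scaladsl.Archive
import akka.stream.scaladsl.{Compression, Sink}

val uploadFuture = byteSource
  .via(Compression.gunzip())        // default maxBytesPerChunk
  .via(Archive.tarReader())
  .mapAsync(1) { case (tarMetadata, entrySource) =>
    // Drain this entry completely before the next (metadata, source) pair is pulled.
    // Assumes s3AlpakkaService.sink materializes a Future, e.g. an Alpakka S3 multipart upload sink.
    entrySource.runWith(s3AlpakkaService.sink(
      FileInfo(UUID.randomUUID().toString, tarMetadata.filePath, metadata.getContentType)))
  }
  .runWith(Sink.ignore)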
Interesting. What do you mean by chunk size? The value passed to Compression.gunzip? That shouldn't actually make a difference, but if it does it would point to a bug in Archive.tarReader. Are you on the latest version of Alpakka? Can you post your Alpakka versions?
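Just to make sure we are talking about the same thing, I am assuming "chunk size" refers to this parameter (200000000 is the value from your snippet); it only caps the size of each decompressed ByteString element pushed downstream, with a default of 64 KiB:

import akka.stream.scaladsl.Compression

// maxBytesPerChunk bounds how large each decompressed ByteString element can get;
// it does not decide how many tar entries are emitted further downstream.
val gunzipFlow = Compression.gunzip(maxBytesPerChunk = 200000000)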