# troubleshooting
l
Hey, on my Flink application (hosted on AWS managed Flink) I get these warnings:
AccessDenied: ...1722425503684/: Access Denied

Bulk delete operation failed to delete all objects; failure count = 1

org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport.translateDeleteException(MultiObjectDeleteSupport.java:92)
I thought this was to do with a dependency I wasn’t using, flink-s3-fs-hadoop, so I removed that and still see these thousands of warning logs. I can’t see where in my code I would be calling this operation or trying to perform a delete in this way.
a
Could you provide more context about the trace? Flink uses the S3 filesystem to manage checkpoints, savepoints, and HA metadata, so it doesn't specifically need to be part of your code.
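For example, you can see the service-managed checkpointing by running aws kinesisanalyticsv2 describe-application against your app. The relevant part of the response should look roughly like this (values shown are illustrative defaults, not taken from your app):
{
  "FlinkApplicationConfigurationDescription": {
    "CheckpointConfigurationDescription": {
      "ConfigurationType": "DEFAULT",
      "CheckpointingEnabled": true,
      "CheckpointInterval": 60000,
      "MinPauseBetweenCheckpoints": 5000
    }
  }
}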
l
yes!
{
  "content": {
    "host": "/aws/kinesis-analytics/flink",
    "message": "AccessDenied: 7b3d0a3681dfd0b918d30888438b019f-946374002659-1722425503684/: Access Denied",
    "attributes": {
      "messageType": "WARN",
      "service": "stream-deactivation-us-ash",
      "locationInformation": "org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport.translateDeleteException(MultiObjectDeleteSupport.java:107)",
      "logger": "org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport",
      "messageSchemaVersion": "1",
      "host": "/aws/kinesis-analytics/flink",
      "id": "38442495305608014481312845103217598507977785145102172165",
      "applicationVersionId": "4",
      "threadName": "s3a-transfer-3db4bd0e0168751d35dc925c1fa9414b79d097b8-unbounded-pool2-t86",
      "timestamp": 1723821108370
    }
  }
}
{
  "content": {
    "message": "Bulk delete operation failed to delete all objects; failure count = 1",
    "attributes": {
      "messageType": "WARN",
      "locationInformation": "org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport.translateDeleteException(MultiObjectDeleteSupport.java:92)",
      "logger": "org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport",
      "messageSchemaVersion": "1",
      "applicationVersionId": "4",
      "threadName": "s3a-transfer-3db4bd0e0168751d35dc925c1fa9414b79d097b8-unbounded-pool2-t98",
      "timestamp": 1723821288239
    }
  }
}
I believe I have to give s3:Delete* IAM permissions to my roles, but I’m not sure I understand which S3 buckets the Flink job is trying to write to 🤔
Confused why I am getting Bulk delete operation failed to delete all objects; failure count = 1 errors from the org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteSupport logger. As a test I gave my Flink job on AWS full delete permissions, but I am still getting the warning log every minute or so:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Action": [
        "s3:Delete*"
      ],
      "Resource": "*"
    }
  ]
}
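For reference, if it were one of my own buckets I'd scope that down to something like this (bucket name hypothetical; as I understand it, the bulk delete API authorizes each key against s3:DeleteObject):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ScopedDelete",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::my-flink-state-bucket/*"
    }
  ]
}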
Just wondering where this comes from, and which S3 bucket it is trying to write to…
a
Ideally we would have gotten a better stack trace to understand what exactly is failing to delete, but again this might be an operation syncing checkpoints or HA metadata. In that case the service probably doesn't use the service execution role you provide, because it wouldn't be targeting your buckets anyway. It is hard to confirm or deny with such a shallow stack trace.
l
That makes sense, I can try lowering the log level to INFO and seeing if there is anything interesting present there
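On Managed Flink I believe the log level is controlled by the MonitoringConfiguration, so something like this payload passed to aws kinesisanalyticsv2 update-application as --application-configuration-update should do it (a sketch only, I haven't verified the exact field names):
{
  "FlinkApplicationConfigurationUpdate": {
    "MonitoringConfigurationUpdate": {
      "ConfigurationTypeUpdate": "CUSTOM",
      "LogLevelUpdate": "INFO"
    }
  }
}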
It appears to be related to this log:
{
  "id": "AgAAAZF6LZOSU5DtngAAAAAAAAAYAAAAAEFaRjZMWm5CQUFCVFhEaUl6Z0VwTEFBQQAAACQAAAAAMDE5MTdhMmQtYWUwYS00YmNlLThlYTEtODg3ZTUzODI1Y2Jh",
  "content": {
    "timestamp": "2024-08-22T13:01:32.946Z",
    "message": "Committing 7b3d0a3681dfd0b918d30888438b019f-946374002659-1722425503684/checkpoints/7b3d0a3681dfd0b918d30888438b019f/chk-17/_metadata with MPU ID afQOxbeG84HLeNBL5Mi_nxzS046xQ.s7W9t2c9E7Hf5CaHpThEHvy_.fzikHtaCuKIILFFB.80XEYOK59WcEgboxanYI7ari20JWcN7.aR1jRqv3uRac56ZBzbnkCvsRpHb5KcRh6mhylUELdXBkNf._tDYq93PHF4C3qQCNej4-",
    "attributes": {
      "messageType": "INFO",
      "locationInformation": "org.apache.flink.fs.s3.common.writer.S3Committer.commit(S3Committer.java:67)",
      "logger": "org.apache.flink.fs.s3.common.writer.S3Committer",
      "messageSchemaVersion": "1",
      "id": "38453881722139690275353199166317417160391698146809085952",
      "applicationVersionId": "5",
      "threadName": "jobmanager-io-thread-1",
      "timestamp": 1724331692946
    }
  }
}
a
Yeah, that indeed looks like a checkpoint file
l
alright
do you know how I can fix the warning, is there some sort of internal permission I need to change?
a
You would probably need to contact the service team, since they manage the bucket and the configuration. Not sure if it could be fixed from your side or not. Also, I can't tell whether it affects the job in any way, but a WARN message every minute is pretty noisy.
l
Okay thanks
yeah, it is a lot of logs
I don’t have AWS support unfortunately, so I guess I will have to ask on re:Post