Jalil Alchy
04/06/2023, 8:49 PM
Sink: Committer (1/1)#0 (0c5abffd2f065e1edef3036b20ec42a5) switched from RUNNING to FAILED with failure cause: java.nio.file.AccessDeniedException: <path>/part-8ad796e1-3c85-4b54-9eed-eaaf06621185-0.snappy.parquet: initiate MultiPartUpload on <path>/part-8ad796e1-3c85-4b54-9eed-eaaf06621185-0.snappy.parquet: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: KC4SFHVXBGJC6XFW; S3 Extended Request ID: VQXppEZpXzKyDoc7u32AHkMAr608cMLxLOh4QIWwZVWIg0JBYgkMTxAT4M6+xmyVrzmMmnyMcUucHlBZmFBzSs/GrYJ+LmmlADwr4wjTjDs=; Proxy: null), S3 Extended Request ID: VQXppEZpXzKyDoc7u32AHkMAr608cMLxLOh4QIWwZVWIg0JBYgkMTxAT4M6+xmyVrzmMmnyMcUucHlBZmFBzSs/GrYJ+LmmlADwr4wjTjDs=:AccessDenied
My sink is configured like so:
return DeltaSink.forRowData(
        new Path("s3a://<bucket>/<path>"),
        new Configuration() {
            {
                set(
                    "spark.hadoop.fs.s3a.aws.credentials.provider",
                    "com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
            }
        },
        FULL_SCHEMA_ROW_TYPE)
    .withMergeSchema(true)
    .build();
Writes to my local filesystem work, but writes to S3 are failing. The credentials on the machine are admin for the account. Any thoughts?
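[Editor's note: the "spark.hadoop." prefix on that key is a Spark convention; Spark strips the prefix before copying the property into the Hadoop configuration, so a Hadoop Configuration handed straight to the Delta sink may never expose the key to s3a, which reads the un-prefixed name. A minimal sketch of the same setup under that assumption (method name and placeholders are illustrative, not the confirmed resolution of this thread):

import io.delta.flink.sink.DeltaSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.types.logical.RowType;
import org.apache.hadoop.conf.Configuration;

public static DeltaSink<RowData> createS3Sink(RowType rowType) {
    Configuration conf = new Configuration();
    // s3a looks up the key without the "spark.hadoop." prefix; only Spark
    // strips that prefix, Hadoop itself never sees the prefixed form.
    conf.set(
        "fs.s3a.aws.credentials.provider",
        "com.amazonaws.auth.DefaultAWSCredentialsProviderChain");
    return DeltaSink.forRowData(
            new Path("s3a://<bucket>/<path>"),
            conf,
            rowType)
        .withMergeSchema(true)
        .build();
}]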
Jeremy Ber
04/06/2023, 8:54 PM
Jalil Alchy
04/06/2023, 8:54 PM
Jalil Alchy
04/06/2023, 8:55 PM
$ $AWS_REGION
us-east-1: command not found
And the bucket is in us-east-1
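[Editor's note: "$ $AWS_REGION" runs the variable's value as a command; the "us-east-1: command not found" output does confirm the value, though echo $AWS_REGION is the direct way to print it. To see the region the Java SDK itself will resolve, a hedged check (class name is from the AWS SDK for Java v1) could be:

import com.amazonaws.regions.DefaultAwsRegionProviderChain;

public class RegionCheck {
    public static void main(String[] args) {
        // Resolves the region the same way SDK client builders do:
        // AWS_REGION env var, then the shared config file, then EC2 metadata.
        System.out.println(new DefaultAwsRegionProviderChain().getRegion());
    }
}]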
Jeremy Ber
04/06/2023, 8:55 PM
Jalil Alchy
04/06/2023, 8:56 PM
Jalil Alchy
04/06/2023, 8:56 PM
Jeremy Ber
04/06/2023, 8:56 PM
Jalil Alchy
04/06/2023, 8:56 PM
Jalil Alchy
04/06/2023, 8:57 PM
Jeremy Ber
04/06/2023, 8:58 PM
Jalil Alchy
04/06/2023, 9:01 PM
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}
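[Editor's note: with a wide-open identity policy like this, a 403 usually means the request is being signed by a different principal than expected. One way to check which identity the DefaultAWSCredentialsProviderChain actually resolves, using the AWS SDK for Java v1 STS API:

import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.GetCallerIdentityRequest;
import com.amazonaws.services.securitytoken.model.GetCallerIdentityResult;

public class WhoAmI {
    public static void main(String[] args) {
        // defaultClient() uses the default credentials provider chain,
        // so this reports the identity the Flink job would sign with.
        AWSSecurityTokenService sts =
            AWSSecurityTokenServiceClientBuilder.defaultClient();
        GetCallerIdentityResult id =
            sts.getCallerIdentity(new GetCallerIdentityRequest());
        System.out.println("Account: " + id.getAccount());
        System.out.println("ARN: " + id.getArn());
    }
}]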
Jalil Alchy
04/06/2023, 9:02 PM
Jalil Alchy
04/06/2023, 9:04 PM
Jalil Alchy
04/06/2023, 9:05 PM
Jeremy Ber
04/06/2023, 9:05 PM
Jeremy Ber
04/06/2023, 9:05 PM
Jeremy Ber
04/06/2023, 9:06 PM
Jalil Alchy
04/06/2023, 9:14 PM
Jeremy Ber
04/06/2023, 9:17 PM
Jalil Alchy
04/06/2023, 9:24 PM
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Dev Desk Access",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<other_aws_account>:root"
      },
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::<my_bucket>/*"
    }
  ]
}
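[Editor's note: the Resource in this bucket policy covers only object ARNs ("<my_bucket>/*"). Bucket-level actions such as s3:ListBucket and s3:ListBucketMultipartUploads attach to the bucket ARN itself, so if the writer runs as the other account, a variant listing both ARNs is generally needed. A generic sketch, not a confirmed fix for this thread:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Dev Desk Access",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<other_aws_account>:root"
      },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::<my_bucket>",
        "arn:aws:s3:::<my_bucket>/*"
      ]
    }
  ]
}]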
Jalil Alchy
04/06/2023, 9:28 PM
Krzysztof Chmielewski
04/06/2023, 9:35 PM
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, right?
Jalil Alchy
04/06/2023, 9:37 PM
I'm using spark.hadoop.fs.s3a.aws.credentials.provider instead, so that it can pick up my IAM creds using the DefaultAWSCredentialsProviderChain.
Jalil Alchy
04/06/2023, 9:37 PM
If I create an AmazonS3 client and write a file, that works directly.
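[Editor's note: a minimal version of that direct-write check, assuming the AWS SDK for Java v1 and a placeholder bucket name; it goes through the same default credentials chain, which isolates S3 permissions from the Flink/s3a configuration:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class DirectWriteTest {
    public static void main(String[] args) {
        // Builds a client via the default credentials and region providers.
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // If this succeeds while the Flink job gets a 403, the problem is
        // in how credentials reach s3a, not in the IAM permissions.
        s3.putObject("<my_bucket>", "connectivity-check.txt", "hello");
        System.out.println("direct putObject succeeded");
    }
}]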
Krzysztof Chmielewski
04/06/2023, 9:37 PM
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (RECOMMENDED, since they are recognized by all the AWS SDKs and the CLI except for .NET), or AWS_ACCESS_KEY and AWS_SECRET_KEY (only recognized by the Java SDK)?
Jalil Alchy
04/06/2023, 9:39 PM
My creds are in my ~/.aws/ config, and I can use the env value AWS_PROFILE to pick a profile from that config.
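[Editor's note: for reference, a hedged sketch of confirming what the chain picks up from ~/.aws/ under AWS_PROFILE (AWS SDK for Java v1); it prints only a key prefix to avoid leaking secrets:

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;

public class CredsCheck {
    public static void main(String[] args) {
        // The chain consults env vars, system properties, the ~/.aws/
        // profile files (honouring AWS_PROFILE), then container/EC2 metadata.
        AWSCredentials creds =
            DefaultAWSCredentialsProviderChain.getInstance().getCredentials();
        System.out.println("Access key starts with: "
            + creds.getAWSAccessKeyId().substring(0, 4));
    }
}]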
Jalil Alchy
04/06/2023, 9:39 PM
Krzysztof Chmielewski
04/06/2023, 9:39 PM
AWS_PROFILE
Krzysztof Chmielewski
04/06/2023, 9:40 PM
Jalil Alchy
04/06/2023, 9:40 PM
Jalil Alchy
04/06/2023, 9:51 PM
Jalil Alchy
04/06/2023, 9:53 PM
Jalil Alchy
04/06/2023, 9:55 PM
Krzysztof Chmielewski
04/07/2023, 12:09 PM
environment:
  - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
  - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
The DeltaSink setup is very simple:
public static DeltaSink<RowData> createDeltaSink(
        String deltaTablePath,
        RowType rowType) {
    Configuration conf = new Configuration();
    conf.set("delta.checkpointInterval", "1000");
    return DeltaSink
        .forRowData(
            new Path(deltaTablePath),
            conf,
            rowType)
        .withPartitionColumns("age")
        .build();
}
deltaTablePath is a path on S3.
I'm using a "fat jar" to submit my demo app.
My dependencies in pom.xml look like this (I'm using the aws profile when building the fat jar):
<dependencies>
    <dependency>
        <groupId>io.delta</groupId>
        <artifactId>delta-flink</artifactId>
        <version>${delta.version}</version>
    </dependency>
    <dependency>
        <groupId>io.delta</groupId>
        <artifactId>delta-standalone_${scala.main.version}</artifactId>
        <version>${delta.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-files -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-files</artifactId>
        <version>${flink.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-parquet</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-hadoop-fs</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-slf4j-impl</artifactId>
        <version>${log4j.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-api</artifactId>
        <version>${log4j.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.logging.log4j</groupId>
        <artifactId>log4j-core</artifactId>
        <version>${log4j.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>log4j</groupId>
        <artifactId>log4j</artifactId>
        <version>1.2.17</version>
        <scope>provided</scope>
    </dependency>
</dependencies>
<profiles>
    <profile>
        <id>aws</id>
        <properties>
            <flink.scope>compile</flink.scope>
        </properties>
        <dependencies>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-aws</artifactId>
                <version>3.1.0</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>${hadoop-version}</version>
            </dependency>
            <dependency>
                <groupId>org.apache.flink</groupId>
                <artifactId>flink-file-sink-common</artifactId>
                <version>${flink.version}</version>
            </dependency>
        </dependencies>
    </profile>
</profiles>
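[Editor's note: for completeness, a schematic job that wires the createDeltaSink method above into a stream; the single-row source, column name, and table path are illustrative only:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.types.logical.IntType;
import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.logical.RowType;

public class DeltaSinkDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Single INT column named "age", matching withPartitionColumns("age").
        RowType rowType = RowType.of(
            new LogicalType[] { new IntType() },
            new String[] { "age" });

        GenericRowData row = new GenericRowData(1);
        row.setField(0, 42);

        DataStream<RowData> rows = env.fromElements((RowData) row);
        rows.sinkTo(createDeltaSink("s3a://<bucket>/<table_path>", rowType));
        env.execute("delta-sink-demo");
    }
}]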
Krzysztof Chmielewski
04/07/2023, 12:17 PM
<hadoop-version>3.1.0</hadoop-version>
Krzysztof Chmielewski
04/07/2023, 12:18 PM
"Could this be caused by anything related to shading?" I don't think so; if there were a problem with shading/versions, you would get ClassNotFound, ClassDefNotFound, or MethodNotFound exceptions, I think.
Jalil Alchy
04/07/2023, 1:59 PM
Jalil Alchy
04/07/2023, 2:01 PM
Krzysztof Chmielewski
04/07/2023, 2:02 PM
Jalil Alchy
04/07/2023, 2:02 PM
Jalil Alchy
04/07/2023, 2:02 PM
Krzysztof Chmielewski
04/07/2023, 2:03 PM
Krzysztof Chmielewski
04/07/2023, 2:03 PM
Krzysztof Chmielewski
04/07/2023, 2:04 PM
Jalil Alchy
04/07/2023, 2:06 PM
Jalil Alchy
04/07/2023, 2:35 PM
Krzysztof Chmielewski
04/07/2023, 2:40 PM
Jalil Alchy
04/07/2023, 2:41 PM
Jalil Alchy
04/07/2023, 2:41 PM
Krzysztof Chmielewski
04/07/2023, 2:41 PM
Krzysztof Chmielewski
04/07/2023, 2:42 PM
Jalil Alchy
04/07/2023, 2:47 PM
Jalil Alchy
04/07/2023, 2:47 PM
Jalil Alchy
04/07/2023, 2:56 PM
Jalil Alchy
04/07/2023, 7:08 PM
Krzysztof Chmielewski
04/07/2023, 7:12 PM
Krzysztof Chmielewski
04/13/2023, 3:36 PM
Jalil Alchy
04/13/2023, 3:37 PM
Jalil Alchy
04/13/2023, 3:37 PM
Jalil Alchy
04/13/2023, 3:38 PM