# troubleshooting
r
Hi everyone, has anyone encountered the following error while trying to create a catalog with an Iceberg table sink using a Glue catalog implementation?
org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'iceberg' that implements 'org.apache.flink.table.factories.CatalogFactory' in the classpath
I am trying to connect to an Iceberg table hosted on the AWS Glue catalog from a Flink application running on AWS Managed Service for Flink. The SQL statement I am running:
CREATE CATALOG glue_catalog WITH (
        'type'='iceberg',
        'warehouse'='<s3://my-bucket/prefix>',
        'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
        'io-impl'='org.apache.iceberg.aws.s3.S3FileIO'
);
I bundled the two necessary jar files (`iceberg-flink-runtime` and `iceberg-aws-bundle`, as per the iceberg-aws integration docs) into an uber jar and checked that the `iceberg` identifier exists in that uber jar:
Available factory identifiers are:

blackhole
datagen
filesystem
kinesis
iceberg
print
But when I try to run the Flink app, it doesn't seem to find the `iceberg` identifier's implementation and I get the error above. Does anyone know how I can resolve this?
d
First, check the release notes carefully and make sure your Iceberg and AWS Glue versions are compatible with your Flink version.
Also check that everything is actually on the classpath. If you're using a managed service, additional configuration might be needed. For AWS Managed Service for Flink, make sure the deployment configuration includes the needed jars on the classpath.
Make sure the IAM role associated with your Flink app has permission to interact with AWS Glue, including creating, describing, and updating catalogs and databases.
Bump the log level up to DEBUG; this might provide more information about why the factory is not loaded.
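One way to act on the classpath advice is to look inside the uber jar itself. Flink discovers table and catalog factories through Java's service loader, which reads the file `META-INF/services/org.apache.flink.table.factories.Factory`. The sketch below lists the classes registered in that file. The helper name `list_factory_classes` and the jar path are hypothetical, and `org.apache.iceberg.flink.FlinkCatalogFactory` is the entry I would expect from `iceberg-flink-runtime`, not something confirmed in this thread.

```python
import zipfile

# Service file that Flink reads (via Java's ServiceLoader) to discover factories.
SERVICE_FILE = "META-INF/services/org.apache.flink.table.factories.Factory"


def list_factory_classes(jar_path):
    """Return the factory class names registered in the jar's service file.

    Returns an empty list if the jar has no such service file.
    """
    with zipfile.ZipFile(jar_path) as jar:
        if SERVICE_FILE not in jar.namelist():
            return []
        text = jar.read(SERVICE_FILE).decode("utf-8")

    classes = []
    for line in text.splitlines():
        # Service files allow '#' comments and blank lines; skip both.
        line = line.split("#", 1)[0].strip()
        if line:
            classes.append(line)
    return classes


# Hypothetical usage (the path is a placeholder):
#   for name in list_factory_classes("target/my-app-uber.jar"):
#       print(name)
# An entry such as org.apache.iceberg.flink.FlinkCatalogFactory (assumed name)
# would indicate the Iceberg catalog factory is registered in this jar.
```

If the list is empty or lacks the Iceberg entry, one possible cause is that the build overwrote the service files from the individual jars instead of merging them when producing the uber jar.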
r
OK, it turns out I was a bit stupid. I was trying to run my application code locally first to see if I would have any issues before deploying it to Managed Flink, but I guess the local Flink Python driver is different from the AWS one, because when I ran it on AWS Managed Flink I did not get this error.
But thank you for all the help! 🙏
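For anyone hitting the same error on a local PyFlink run: one thing worth checking is whether the local session actually has the uber jar on its classpath, since a local run does not pick it up from the deployment configuration. The thread does not confirm that this was the cause here, so treat the following as a sketch under that assumption. It builds a value for Flink's `pipeline.jars` option (a semicolon-separated list of jar URLs); the helper name `pipeline_jars_value` and the jar path are hypothetical.

```python
from pathlib import Path


def pipeline_jars_value(jar_paths):
    """Build a `pipeline.jars` value: semicolon-separated file:// URLs."""
    # resolve() makes each path absolute, which as_uri() requires.
    return ";".join(Path(p).resolve().as_uri() for p in jar_paths)


# Hypothetical usage in a local PyFlink script (assumes a PyFlink version
# where TableConfig.set is available; the jar path is a placeholder):
#
#   from pyflink.table import EnvironmentSettings, TableEnvironment
#
#   t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
#   t_env.get_config().set("pipeline.jars", pipeline_jars_value(["lib/my-app-uber.jar"]))
#   # ...then run the CREATE CATALOG statement with t_env.execute_sql(...)
```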
d
Yeah, there can be some differences when moving onto AWS.
You might try LocalStack for dev. It provides an environment very similar to AWS but runs locally.