# troubleshooting
Damian Fiłonowicz
Hi, Flink SQL 1.19.1. I put the JDBC connector + Postgres driver jars into the ./lib folder in the official flink:1.19.1 image, as explained in the docs. When I want to:
• create a catalog of type 'jdbc'
• create a table with connector 'jdbc'
• insert data into the table
• select data from the table
I always get
java.lang.IllegalStateException: No ExecutorFactory found to execute the application.
Do you know what may be missing in the setup? I see the postgres table in the catalog, but I cannot execute any operations on it. When I want to create a new table in the catalog in sql-client, I additionally get
org.apache.flink.table.gateway.api.utils.SqlGatewayException: Failed to fetchResults.
D. Draco O'Brien
The error message “No ExecutorFactory found to execute the application” typically indicates that Flink is unable to find the components it needs to execute your job, which might be due to classpath issues or configuration problems related to the JDBC connector you’re trying to use. Given that you’ve already placed the JDBC connector and PostgreSQL driver JARs into the ./lib directory:
• Check that the versions of these jars are compatible with Flink 1.19.1.
• Double-check that Flink is indeed picking up ./lib (see the sketch below).
• Check that flink-conf.yaml is configured correctly.
• Make sure the Flink process has permission to read ./lib.
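For example, something along these lines (a rough sketch; the paths assume the official flink:1.19.1 image layout and that unzip is available in the container):
```
# List what actually ended up on the classpath inside the container
ls -l /opt/flink/lib /opt/flink/lib/hive

# "No ExecutorFactory found" comes from Flink's ServiceLoader lookup, so it can be
# worth confirming the executor factory service entries are present in flink-dist
# (they normally are) and that the JDBC connector registers its table factory:
unzip -p /opt/flink/lib/flink-dist-1.19.1.jar \
  META-INF/services/org.apache.flink.core.execution.PipelineExecutorFactory
unzip -p /opt/flink/lib/hive/flink-connector-jdbc-3.2.0-1.19.jar \
  META-INF/services/org.apache.flink.table.factories.Factory
```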
Note: The additional error “Failed to fetchResults” from the SQL Gateway indicates a possible issue with the SQL Client Gateway service. So check that the gateway is running without errors and has the necessary permissions to access the catalog and the database.
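One quick way to confirm the gateway process is up and reachable (the 8083 port and the /v1/info path are the defaults as far as I know, so treat them as assumptions):
```
# Should return a small JSON blob with the product name and Flink version
curl -s http://localhost:8083/v1/info
```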
Lastly, enable DEBUG logging and check the logs to see if there is additional information.
rmoff
I think @D. Draco O'Brien has nailed all the salient points of things to check here. In case it helps, here's a general Flink SQL troubleshooting talk that I did last week at Current that might be of interest: https://talks.rmoff.net/9GpIYA/the-joy-of-jars-and-other-flink-sql-troubleshooting-tales
Damian Fiłonowicz
Thanks @D. Draco O'Brien for the quick response.
• flink-conf.yaml is alright
• ./lib jars are loaded correctly
I forgot to mention that a Table API app (pyflink) was able to save to postgres without any problems; only sql-client returns these errors.
So I checked sql-gateway. The process was indeed not running. Even after I started it, the same problem still appears (inside sql-client only).
@rmoff, could this problem appear because I didn't fetch all the Hadoop jars into my container image? And is it expected that sql-client requires something more than the TableEnvironment does?
rmoff
@Damian Fiłonowicz can you share a copy of the files you're using? then maybe we can reproduce or more easily diagnose the issue
Damian Fiłonowicz
Sure
```
lib:
total 199M
drwxr-xr-x 1 flink flink 4.0K Sep 23 12:23 ..
drwxr-xr-x 1 root  root  4.0K Sep 22 10:12 hive
drwxr-xr-x 1 flink flink 4.0K Sep 21 12:26 .
-rw-r--r-- 1 flink flink 121M Jun  6 14:32 flink-dist-1.19.1.jar
-rw-r--r-- 1 flink flink  15M Jun  6 14:31 flink-table-api-java-uber-1.19.1.jar
-rw-r--r-- 1 flink flink  21M Jun  6 14:31 flink-scala_2.12-1.19.1.jar
-rw-r--r-- 1 flink flink  37M Jun  6 14:31 flink-table-planner-loader-1.19.1.jar
-rw-r--r-- 1 flink flink 100K Jun  6 14:27 flink-csv-1.19.1.jar
-rw-r--r-- 1 flink flink 199K Jun  6 14:26 flink-json-1.19.1.jar
-rw-r--r-- 1 flink flink 546K Jun  6 14:24 flink-connector-files-1.19.1.jar
-rw-r--r-- 1 flink flink 3.4M Jun  6 14:22 flink-table-runtime-1.19.1.jar
-rw-r--r-- 1 flink flink 194K Jun  6 14:22 flink-cep-1.19.1.jar
-rw-r--r-- 1 flink flink 204K Feb  2  2023 log4j-1.2-api-2.17.1.jar
-rw-r--r-- 1 flink flink 295K Feb  2  2023 log4j-api-2.17.1.jar
-rw-r--r-- 1 flink flink 1.8M Feb  2  2023 log4j-core-2.17.1.jar
-rw-r--r-- 1 flink flink  24K Feb  2  2023 log4j-slf4j-impl-2.17.1.jar

lib/hive:
total 465M
drwxr-xr-x 1 root  root  4.0K Sep 22 10:12 .
drwxr-xr-x 1 flink flink 4.0K Sep 21 12:26 ..
-rw-r--r-- 1 root  root  371M Sep  4 20:53 aws-java-sdk-bundle-1.12.771.jar
-rw-r--r-- 1 root  root  658K Aug 24 19:06 commons-lang3-3.17.0.jar
-rw-r--r-- 1 root  root  1.1M Aug 22 13:59 postgresql-42.7.4.jar
-rw-r--r-- 1 root  root  1.6M Jul  5 17:15 jackson-databind-2.17.2.jar
-rw-r--r-- 1 root  root  569K Jul  5 17:03 jackson-core-2.17.2.jar
-rw-r--r-- 1 root  root   77K Jul  5 16:50 jackson-annotations-2.17.2.jar
-rw-r--r-- 1 root  root   50M Jun  6 13:39 flink-sql-connector-hive-3.1.3_2.12-1.19.1.jar
-rw-r--r-- 1 root  root  6.5M Jun  6 13:32 flink-sql-parquet-1.19.1.jar
-rw-r--r-- 1 root  root  209K May  6 22:22 delta-flink-3.2.0.jar
-rw-r--r-- 1 root  root   11M May  6 22:22 delta-standalone_2.12-3.2.0.jar
-rw-r--r-- 1 root  root   25K May  6 22:21 delta-storage-3.2.0.jar
-rw-r--r-- 1 root  root  379K Apr 18 10:15 flink-connector-jdbc-3.2.0-1.19.jar
-rw-r--r-- 1 root  root  192K Oct 10  2023 stax2-api-4.2.2.jar
-rw-r--r-- 1 root  root  941K Jul 29  2022 hadoop-aws-3.3.4.jar
-rw-r--r-- 1 root  root  1.6M Jul 29  2022 hadoop-mapreduce-client-core-3.3.4.jar
-rw-r--r-- 1 root  root  6.0M Jul 29  2022 hadoop-hdfs-3.3.4.jar
-rw-r--r-- 1 root  root  4.3M Jul 29  2022 hadoop-common-3.3.4.jar
-rw-r--r-- 1 root  root  102K Jul 29  2022 hadoop-auth-3.3.4.jar
-rw-r--r-- 1 root  root  112K Jun 30  2022 re2j-1.7.jar
-rw-r--r-- 1 root  root  3.3M May 26  2021 hadoop-shaded-guava-1.1.1.jar
-rw-r--r-- 1 root  root  3.2M Apr  9  2021 shapeless_2.12-2.3.4.jar
-rw-r--r-- 1 root  root  511K Jul 15  2019 woodstox-core-5.3.0.jar
-rw-r--r-- 1 root  root  2.7M Oct 18  2018 guava-27.0-jre.jar
-rw-r--r-- 1 root  root  603K Feb  5  2017 commons-configuration2-2.1.1.jar
-rw-r--r-- 1 root  root   61K May 16  2013 commons-logging-1.1.3.jar
```
rmoff
ah, i meant more how you're invoking it - you mentioned a container, and pyflink? basically everything someone would need to reproduce the error you're seeing
D. Draco O'Brien
Do you see a log file for sql-gateway, such as sql-gateway.log or flink-sql-gateway-*.log? The log location is set in the Flink config (env.log.dir in flink-conf.yaml) and by default I think it would be in the /opt/flink/log directory. If the process ran long enough to produce output, it might have written a log file with some useful information about the cause of the stoppage.
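Something like this (paths assumed from the official image; the exact file-name pattern may differ in your setup):
```
# See which log files the gateway produced, then check the newest ones
ls -lt /opt/flink/log/
tail -n 200 /opt/flink/log/flink-*sql-gateway*.log
tail -n 50  /opt/flink/log/flink-*sql-gateway*.out
```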
Damian Fiłonowicz
Ah, I run a pyflink batch app with the official operator in native mode. The image is built similarly to what the docs suggest. After the job finishes I ssh'd into the "jobmanager" pod and wanted to execute some SQL queries. I already know that I was missing sql-gateway (I thought it was optional, tbh). I start the process inside the container with the command below, and I see it's running.
```
./bin/sql-gateway.sh start -Dsql-gateway.endpoint.rest.address=localhost
```
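(For context, I start sql-client in its default embedded mode below. If it's supposed to attach to that standalone gateway instead, I assume it would look roughly like this, 8083 being the gateway's default REST port:)
```
# sketch only -- gateway mode and the default port 8083 are assumptions from the docs
./bin/sql-client.sh gateway --endpoint localhost:8083
```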
Then I start sql-client and execute 3 statements:
```
./bin/sql-client.sh

CREATE CATALOG my_catalog WITH (
    'type' = 'jdbc',
    'default-database' = 'benchmark_results',
    'username' = 'postgres',
    'password' = 'postgres',
    'base-url' = 'jdbc:postgresql://postgres-postgresql.tpcds.svc.cluster.local:5432'
);
USE CATALOG my_catalog;
```
then select * from mytable (show tables shows the table from the public schema in postgres). And that's how I get:
ERROR org.apache.flink.table.gateway.service.SqlGatewayServiceImpl - Failed to fetchResults.
I attached the full stack trace. The log/flink--sql-gateway [...] .out file contains only:
```
ERROR StatusLogger Reconfiguration failed: No configuration found for '5a2e4553' at 'null' in 'null'
ERROR StatusLogger Reconfiguration failed: No configuration found for '4a668b6e' at 'null' in 'null'
```
D. Draco O'Brien
The full stack trace clearly points to a problem with the executor factory not being found to execute the Flink application. The root cause is identified here:
```
Caused by: java.lang.IllegalStateException: No ExecutorFactory found to execute the application.
        at org.apache.flink.core.execution.DefaultExecutorServiceLoader.getExecutorFactory(DefaultExecutorServiceLoader.java:88)
```
This usually happens when Flink cannot determine how to execute the submitted job, typically because of a mismatch or misconfiguration in how the execution environment is set up. So it’s probably how the SQL Gateway is initialized or configured, rather than a missing JAR or classpath issue. Check that the flink-conf.yaml your SQL Gateway reads has the correct execution settings; in particular, see whether you need to explicitly set execution.runtime-mode to batch or streaming depending on your operations. Also check whether anything in your setup overrides how executor factories are discovered (they are looked up via Java’s ServiceLoader).
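For instance, a sketch (whether this is actually the missing piece here is just a guess, but execution.runtime-mode itself is a standard Flink option):
```
# Set the runtime mode explicitly in the config the SQL Gateway / SQL Client read.
# Valid values are STREAMING, BATCH or AUTOMATIC.
echo "execution.runtime-mode: BATCH" >> /opt/flink/conf/flink-conf.yaml

# Or per-session inside sql-client:
#   SET 'execution.runtime-mode' = 'batch';
```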
Confirm that your environment variables and system properties are correctly set up when launching the SQL Gateway. Sometimes, these can influence how Flink sets up its execution environment.
I would recheck the compatibility between Flink 1.19.1 and the versions of the JDBC connector and PostgreSQL driver you’re using. This is less likely given that your other applications work, but you might want to double-check it.
Also make sure the Java version used matches the requirements of Flink 1.19.1 and does not cause any compatibility issues. I think JDK 11 should work.
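Something like this inside the container (paths and file names assumed from the listing you shared earlier):
```
# Java runtime in the image; Flink 1.19 supports JDK 11 and 17
java -version

# Connector/driver builds that are actually on the classpath
ls /opt/flink/lib/hive/flink-connector-jdbc-*.jar /opt/flink/lib/hive/postgresql-*.jar
```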
Increase the log level to DEBUG for org.apache.flink.core.execution and org.apache.flink.table.gateway.service.operation to get more details. I think you may need to check the log4j settings; in Flink 1.19.1 those are the log4j*.properties files under /opt/flink/conf. The StatusLogger “Reconfiguration failed” errors in your .out file look like log4j not finding its configuration at all, so that’s worth sorting out first.
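A sketch of what that could look like (the file name and the restart step are assumptions; use whichever log4j file your gateway actually loads):
```
# Append DEBUG loggers for the two packages above to the gateway's log4j config.
# /opt/flink/conf/log4j.properties is an assumption; the logger ids are arbitrary.
cat >> /opt/flink/conf/log4j.properties <<'EOF'
logger.flink_exec.name = org.apache.flink.core.execution
logger.flink_exec.level = DEBUG
logger.sql_gateway_op.name = org.apache.flink.table.gateway.service.operation
logger.sql_gateway_op.level = DEBUG
EOF

# Restart the gateway so it picks up the new logging config
./bin/sql-gateway.sh stop
./bin/sql-gateway.sh start -Dsql-gateway.endpoint.rest.address=localhost
```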