# troubleshooting
  • Jark Wu (05/19/2022, 1:39 PM)
    set the channel description: Asking for helps on using Apache Flink
  • Jark Wu (05/19/2022, 1:57 PM)
    set the channel topic: Best place to ask for help or discuss any troubleshooting-related problems!
  • Jark Wu (05/19/2022, 1:57 PM)
    set the channel description: This is the best place to ask for help or discuss any troubleshooting-related problems!
  • Jark Wu (05/19/2022, 1:57 PM)
    set the channel topic: Asking for helps on using Apache Flink
  • Jark Wu (05/19/2022, 1:57 PM)
    set the channel topic: Asking for helps on using Apache Flink!
  • Xintong Song (05/20/2022, 11:12 AM)
    set the channel topic: Asking for helps on using Apache Flink! (Do not mention specific people.)
  • Jeremy Ber (06/02/2022, 5:35 PM)
    Hey there, an architectural Flink question for Flink <= 1.13: I need a way to replicate the side output of late events in a window in the Table / SQL API. What are the downsides to writing a UDF that compares the current (processing) time to a record’s event time and marks the record as “late” if it is older than some predetermined watermark interval (e.g. 5 minutes)?
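A minimal sketch, in plain Python rather than the actual Flink UDF API, of the comparison such a UDF would perform; the 5-minute threshold and field names are illustrative assumptions from the question above:

```python
from datetime import datetime, timedelta

# Illustrative threshold from the question: records whose event time lags
# processing time by more than 5 minutes are marked late.
ALLOWED_LATENESS = timedelta(minutes=5)

def is_late(event_time: datetime, processing_time: datetime,
            threshold: timedelta = ALLOWED_LATENESS) -> bool:
    """Return True if the record's event time lags processing time
    by more than the configured threshold."""
    return processing_time - event_time > threshold

now = datetime(2022, 6, 2, 17, 35, 0)
print(is_late(now - timedelta(minutes=6), now))  # True: older than 5 min
print(is_late(now - timedelta(minutes=1), now))  # False: within threshold
```

One downside the question hints at: processing time is not the watermark, so if the pipeline itself falls behind, on-time records can be misclassified as late.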
  • Benjamin Carlton (06/02/2022, 10:38 PM)
    Greetings all. I hope I’m not out of bounds by posting here. Please let me know if I am and I will delete this and seek assistance elsewhere. I’m a member of IBM’s IT support and we have found that IBM cannot access Flink’s URL on the domain: nightlies.apache.org After investigation, we believe that we may have been mistakenly placed on a deny list, and we need to request that IBM’s addresses be placed on an allow list or verified as already belonging to an allow list for the above listed domain. I have already emailed the Apache.org webmaster and created a Flink Jira ticket (https://issues.apache.org/jira/browse/FLINK-27887) but we were hoping to verify the correct process for our request. If you could help us gain access to the above domain from the attached list of addresses, it would be most appreciated. If there is another process I need to follow, please let me know and I will be happy to follow that process instead. Or, if there’s a specific person to whom I should make my request, I would appreciate their name.
  • Jeff Levesque (06/03/2022, 8:34 PM)
    Hello All -- I'm working on a school project where I'm trying to integrate AWS PyFlink examples using the `print` connector. My intention is to develop Flink applications locally and test the sliding-window functionality with the `print` connector; if all goes well, I'd promote the PyFlink application to AWS. However, during my testing the application just hangs, as if the `blackhole` connector were used instead of the `print` connector. I've created a StackOverflow post to hopefully better link the corresponding script/resources: https://stackoverflow.com/questions/72311800/implement-print-connector-with-sliding-window-sink
  • HKB (06/04/2022, 2:31 AM)
    Hey guys, how do you bundle your code to create the Flink tgz distribution? Details: I am trying to test out some changes I made to the code and want to create a Docker image to test them. I found a Dockerfile to dockerize Flink, but the missing piece is that it pulls in a tgz using wget; this tgz is the Flink distribution. However, mvn packaging does not create this tar (only the jars are created). Is there a tool/script to create the tgz?
  • Marco Villalobos (06/04/2022, 3:32 AM)
    Hi everybody. I am trying to restore from a checkpoint. I am having two problems. On this page, https://nightlies.apache.org/flink/flink-docs-release-1.12/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint I am not sure what "A job may be resumed from a checkpoint just as from a savepoint by using the checkpoint’s meta data file instead" means because I cannot find a specification of what that meta data file is. Does anybody know?
  • Marco Villalobos (06/04/2022, 3:33 AM)
    My checkpoints were deleted, but we have them versioned in s3. So, I am trying to figure out which files to restore in the "shared" directory. Is there an easy way to figure it out? I fear that it will be thousands of files.
  • Marco Villalobos (06/04/2022, 5:28 AM)
    So, I did find the _metadata file, converted it to text, and grepped for what I assume are the shared files; there are 22,000 of them. This will be rough.
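The approach described above (treating _metadata as text and pulling out path-like strings) can be sketched in a few lines of Python. The binary layout of the file is a Flink-internal serialization format, so this is a heuristic string scan, not a real parser, and the path pattern below is an assumption:

```python
import re

def extract_shared_paths(metadata_bytes: bytes) -> list:
    """Heuristically pull printable strings that look like checkpoint
    'shared' state handles out of a raw _metadata file."""
    # Runs of 8+ printable ASCII chars; keep those mentioning 'shared/'
    strings = re.findall(rb"[ -~]{8,}", metadata_bytes)
    return sorted({s.decode() for s in strings if b"shared/" in s})

# Synthetic stand-in for real _metadata content (illustrative only).
blob = (b"\x00\x01s3://bucket/ckpt/shared/abc-123\x00\xff"
        b"junk\x02s3://bucket/ckpt/shared/def-456\x00")
for p in extract_shared_paths(blob):
    print(p)
```

The resulting path list could then be diffed against the versioned objects in S3 to decide which files to restore.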
  • Sucheth Shivakumar (06/05/2022, 4:47 PM)
    Hi everybody, I'm working on a Kafka-to-file sink and I see my part files are not rolling over: https://stackoverflow.com/questions/72496963/apache-flink-filesink-not-rolling-over-part-files
  • Sucheth Shivakumar (06/05/2022, 4:47 PM)
    Can someone please point out what I'm doing wrong here?
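For context: with row formats, the FileSink's DefaultRollingPolicy rolls a part file when it exceeds a size threshold, has been open longer than a rollover interval, or has been inactive too long, and pending files are only finalized on checkpoint (with bulk formats such as Parquet, files roll only on checkpoint), so disabled checkpointing is a common cause of files never finishing. A plain-Python sketch of the size/time decision, using what I believe are the default thresholds:

```python
MAX_PART_SIZE = 128 * 1024 * 1024   # default max part size, 128 MiB
ROLLOVER_INTERVAL_MS = 60_000       # default rollover interval, 60 s
INACTIVITY_INTERVAL_MS = 60_000     # default inactivity interval, 60 s

def should_roll(part_size, now_ms, created_ms, last_write_ms):
    """Mirror of the size/time checks DefaultRollingPolicy performs."""
    if part_size >= MAX_PART_SIZE:
        return True                  # rolled on size
    if now_ms - created_ms >= ROLLOVER_INTERVAL_MS:
        return True                  # open longer than rollover interval
    if now_ms - last_write_ms >= INACTIVITY_INTERVAL_MS:
        return True                  # idle too long
    return False

# A small, recently written file does not roll:
print(should_roll(4 * 1024, 10_000, 5_000, 9_000))   # False
# The same file a minute later does:
print(should_roll(4 * 1024, 70_000, 5_000, 9_000))   # True
```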
  • czchen (06/06/2022, 12:45 AM)
    We are trying to upgrade the Flink operator from 0.1.0 to 1.0.0 and have the following problems: • The new field `spec.serviceAccount` is required; this is not mentioned in the documentation. • The `initialSavepointPath` in https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/docs/operations/upgrade/#upgrading-with-existing-flinkdeployments points to `/.../_metadata`. However, we always point to the `_metadata` parent directory. Are both supported?
  • sri hari kali charan Tummala (06/06/2022, 2:17 AM)
    Guys, to start reverse-engineering the Flink open-source code (https://github.com/apache/flink), where in the repo should I begin?
  • sri hari kali charan Tummala (06/06/2022, 2:19 AM)
    Can someone point me to the main method in the Flink source code on the GitHub repo?
  • Jeesmon Jacob (06/06/2022, 12:53 PM)
    Hi there, nice to see the release of flink operator v1.0.0. Congrats!!! I have a question on the validation webhook for CRDs. We are operating in a highly restricted multi-tenant environment where cluster-scoped resources are not permitted, except via a service-desk ticket to deploy CRDs and the RBACs to create those resources for the appropriate auth group. We are running the operator namespace-scoped, but as `ValidatingWebhookConfiguration` is cluster-scoped, it is not allowed. So I'm wondering what we will miss if we don't run the validation webhook and instead add some checks in our CI/CD pipelines. I see the `DefaultValidator` is called from both the webhook and the controller. So even if a CR is admitted without validation through the webhook, will the controller still validate it as part of the Reconcile logic? Thanks.
  • Jeesmon Jacob (06/06/2022, 1:56 PM)
    Just a suggestion: it would be nice to get release announcements in the #C03FR2EDUF4 channel too, like the operator v1.0.0 release that happened yesterday 🙂
  • Jeesmon Jacob (06/07/2022, 12:20 AM)
    In a `FlinkDeployment` CR, is there an option to resolve a `flinkConfiguration` value from a ConfigMap, or from an env var loaded from a ConfigMap? For example, on our side the s3 bucket is created by another pipeline, and the endpoint and bucket name are exposed in a ConfigMap. We would like to load that ConfigMap into the JM/TM pods using podTemplate and use the bucket name for `state.savepoints.dir`, `state.checkpoints.dir` and `high-availability.storageDir` instead of hardcoding it in the `FlinkDeployment`. Thanks.
  • czchen (06/07/2022, 6:28 AM)
    We are trying to add Java options in native Kubernetes application mode with the following command line:

    ```
    /opt/flink/bin/flink run-application --target kubernetes-application -Denv.java.opts.taskmanager='-XX:+UseG1GC -verbose:class'
    ```

    However, when the application starts, the log indicates that it does not load this option correctly:

    ```
    2022-06-07 06:24:13,214 INFO  org.apache.flink.configuration.GlobalConfiguration           [] - Loading configuration property: env.java.opts.taskmanager, '-XX:+UseG1GC
    ```

    Any idea how to solve this problem?
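One way to narrow down where the value gets truncated: the shell delivers the single-quoted option as one argv entry, which suggests the split at the space happens later, when Flink parses the dynamic property. A quick plain-Python check of the shell-level tokenization (`shlex` follows POSIX quoting rules):

```python
import shlex

# The command line from the message above.
cmd = ("/opt/flink/bin/flink run-application --target kubernetes-application "
       "-Denv.java.opts.taskmanager='-XX:+UseG1GC -verbose:class'")
argv = shlex.split(cmd)

# The single quotes are consumed by the shell; both JVM flags arrive in one
# argument, so any truncation at the space happens inside Flink's own
# configuration handling, not in the shell.
print(argv[-1])
```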
  • czchen (06/07/2022, 10:13 AM)
    We found that `flink-s3-fs-hadoop-1.15.0.jar` and `flink-gs-fs-hadoop-1.15.0.jar` are incompatible in Flink 1.15. Does anyone know how to solve this issue? (Original discussion on the mailing list: https://www.mail-archive.com/user@flink.apache.org/msg47950.html) The details: we got the following error when migrating to Flink 1.15:

    ```
    Caused by: java.lang.ClassCastException: class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem cannot be cast to class org.apache.hadoop.fs.FileSystem (com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem and org.apache.hadoop.fs.FileSystem are in unnamed module of loader 'app')
    ```

    Adding `-verbose:class` to the Java options shows the following:

    ```
    [2.074s][info][class,load] org.apache.hadoop.fs.FileSystem source: file:/opt/flink/opt/flink-s3-fs-hadoop-1.15.0.jar
    ...
    [8.094s][info][class,load] com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem source: file:/opt/flink/opt/flink-gs-fs-hadoop-1.15.0.jar
    ```

    It looks like Flink is trying to cast `com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem` from `flink-gs-fs-hadoop-1.15.0.jar` to `org.apache.hadoop.fs.FileSystem` from `flink-s3-fs-hadoop-1.15.0.jar`, without success.
  • aromal (06/07/2022, 10:37 AM)
    We have Apache Flink Kafka-to-Kafka pipelines, and we are now moving towards “Avro Confluent Schema” for both the Kafka key and value. What I'm finding difficult is implementing a Kafka source (KafkaSource<OUT>) and sink (KafkaSink<IN>) with the Confluent schema for both key and value; the value alone being Confluent Avro is straightforward. #C03G7LJTS2G
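For reference, the Confluent wire format that both the key bytes and the value bytes follow is simple: a zero magic byte, a 4-byte big-endian schema id, then the Avro payload. A plain-Python sketch of that framing (this is not the Flink serialization API, and the schema ids and payloads are made up):

```python
import struct

MAGIC_BYTE = 0  # Confluent wire-format magic byte

def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prepend the Confluent Schema Registry header to an Avro payload."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload

def unframe(record: bytes):
    """Split a Confluent-framed record back into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", record[:5])
    assert magic == MAGIC_BYTE, "not Confluent wire format"
    return schema_id, record[5:]

# Key and value are framed independently, each under its own schema id:
key_bytes = frame(7, b"\x02id-1")      # hypothetical key schema id 7
value_bytes = frame(9, b"\x02hello")   # hypothetical value schema id 9
print(unframe(key_bytes))              # (7, b'\x02id-1')
```

In a KafkaSink this framing would be produced per-field by the key and value serializers inside a custom KafkaRecordSerializationSchema.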
  • Veeramani Moorthy (06/07/2022, 11:17 AM)
    In Flink SQL, is it possible to materialise an aggregated result into a sink table (here the connector is filesystem with csv format)? I am running the below query and getting the error below.
  • Veeramani Moorthy (06/07/2022, 11:17 AM)
    INSERT INTO dept_count SELECT dept, count(emp_id) AS emp_count FROM employee GROUP BY dept;
  • Veeramani Moorthy (06/07/2022, 11:17 AM)
    Error: [ERROR] Could not execute SQL statement. Reason: org.apache.flink.table.api.TableException: Table sink 'default_catalog.default_database.dept_count' doesn't support consuming update changes which is produced by node GroupAggregate(groupBy=[dept], select=[dept, COUNT(emp_id) AS emp_count])
  • Veeramani Moorthy (06/07/2022, 11:18 AM)
    I don't need streaming outcome here. I am good with batch outcome.
  • Veeramani Moorthy (06/07/2022, 11:19 AM)
    Both the source & sink tables work with the filesystem connector here.
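The error above is about changelog semantics: on an unbounded stream, GROUP BY re-emits an updated count for every incoming row (retract/upsert changes), which an append-only filesystem/csv sink cannot consume, whereas batch execution produces one final row per group. A plain-Python simulation of the changelog such a streaming GroupAggregate would produce (illustrative, not actual Flink output; the department/employee rows are made up):

```python
from collections import Counter

rows = [("sales", "e1"), ("sales", "e2"), ("eng", "e3")]

counts = Counter()
changelog = []
for dept, _emp in rows:
    if counts[dept]:
        changelog.append(("-U", dept, counts[dept]))  # retract old count
    counts[dept] += 1
    op = "+U" if counts[dept] > 1 else "+I"
    changelog.append((op, dept, counts[dept]))

print(changelog)     # streaming: an update per row -> sink must support them
print(dict(counts))  # batch: one final row per group -> append-only is fine
```

This is why running the same INSERT in batch execution mode (bounded input) avoids the error: only the final counts are written.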
  • liwei li (06/07/2022, 12:29 PM)
    Hi guys, I have a question: we want to extend Flink's SQL syntax in a non-invasive way, without modifying the source code, to add some custom syntax. Is there a good way to do this? Thank you.