Abolfazl Ghahremani
03/27/2023, 8:56 AM
Michael Helmling
03/27/2023, 9:09 AM
java.lang.ArrayIndexOutOfBoundsException: Index -2147483648 out of bounds for length 5
at org.apache.flink.streaming.runtime.watermarkstatus.HeapPriorityQueue.removeInternal(HeapPriorityQueue.java:155)
at org.apache.flink.streaming.runtime.watermarkstatus.HeapPriorityQueue.remove(HeapPriorityQueue.java:100)
at org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve$InputChannelStatus.removeFrom(StatusWatermarkValve.java:300)
at org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve$InputChannelStatus.access$200(StatusWatermarkValve.java:266)
at org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve.markWatermarkUnaligned(StatusWatermarkValve.java:222)
at org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve.inputWatermarkStatus(StatusWatermarkValve.java:140)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:153)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor.processInput(StreamMultipleInputProcessor.java:85)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.base/java.lang.Thread.run(Unknown Source)
This is a simple job (one Kafka input, one Kafka output, stateless) with watermark alignment disabled. Is this a bug?
Tsering
03/27/2023, 10:15 AM
Zhiyu Tian
03/27/2023, 11:00 AM
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: statemachine
  labels:
    type: flink-native-kubernetes
spec:
  image: zhiyut/statemachine:2.0
  # image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  podTemplate:
    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-template
    spec:
      containers:
        # Do not change the main container name
        - name: flink-main-container
          volumeMounts:
            - mountPath: /opt/flink/log
              name: flink-logs
          env:
            - name: MT_TOKEN
              value: <###>
      volumes:
        - name: flink-logs
          emptyDir: { }
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
  job:
    jarURI: local:///opt/flink/usrlib/StateMachineExample.jar
    entryClass: org.apache.flink.streaming.examples.statemachine.StateMachineExample
    parallelism: 2
    upgradeMode: stateless
Jashwanth S J
03/27/2023, 11:37 AM
高志翔
03/27/2023, 12:29 PM
Christophe Bornet
03/27/2023, 12:54 PM
flink-avro deserializes to SQL tables using the position in the Avro record.
Since with Avro we have the schema and the field names, is there a possibility to map by field names instead?
The current converter leads to issues when the schema evolves.
Tsering
03/27/2023, 3:33 PM
Abolfazl Ghahremani
03/27/2023, 7:13 PM
craig fan
03/28/2023, 3:52 AM
org.apache.flink.runtime.io.network.partition.PartitionNotFoundException: Partition 512d1b5e6860de484fc7792beb466764#0@0b1dc40d062b680b0258b6b4cb90041d not found.
at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.createSubpartitionView(ResultPartitionManager.java:70)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.requestSubpartition(LocalInputChannel.java:135)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel$1.run(LocalInputChannel.java:185)
at java.base/java.util.TimerThread.mainLoop(Unknown Source)
at java.base/java.util.TimerThread.run(Unknown Source)
My Flink job is on 1.15.
Deepyaman Datta
03/28/2023, 4:54 AM
pyiceberg, and I have a configuration file like:
# ~/.pyiceberg.yaml
catalog:
  default:
    uri: http://localhost:8189
    s3.endpoint: http://localhost:9100
    s3.access-key-id: admin
    s3.secret-access-key: password
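For what it's worth, a hedged sketch of how these same settings might be expressed with PyIceberg's environment-variable convention; the exact naming scheme (a PYICEBERG_ prefix, double underscores between nesting levels, and dashes in property names written as underscores) is an assumption to verify against the PyIceberg configuration docs:

```shell
# Hypothetical env-var equivalents of the YAML above; verify the exact
# mapping (prefix and "__" separators) against the PyIceberg docs.
export PYICEBERG_CATALOG__DEFAULT__URI=http://localhost:8189
export PYICEBERG_CATALOG__DEFAULT__S3__ENDPOINT=http://localhost:9100
export PYICEBERG_CATALOG__DEFAULT__S3__ACCESS_KEY_ID=admin
export PYICEBERG_CATALOG__DEFAULT__S3__SECRET_ACCESS_KEY=password
```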
I actually wanted to use environment variables instead, but is it possible to set the S3-related keys using the pyiceberg environment variable syntax?
Ashwin Kolhatkar
03/28/2023, 6:07 AM
FROM apache/flink:1.16.0-scala_2.12
RUN mkdir -p /opt/flink/plugins/flink-s3-fs-hadoop
RUN mv /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/plugins/flink-s3-fs-hadoop/.
When I try to submit a job (using the new Delta Lake read connector), here is some of the code to give an idea:
DeltaSource<RowData> deviceDeltaTableStream = DeltaSource
        .forContinuousRowData(
            new Path("s3a://my-delta-table-location"),
            new Configuration())
        .columnNames("id", "state")
        .build();
Since the Delta Lake files are stored in S3, I need to be able to access them from there. But when I submit this job, I get this error:
org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute application.
at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(JarRunHandler.java:113)
at java.base/java.util.concurrent.CompletableFuture.uniHandle(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.postComplete(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(Unknown Source)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(Unknown Source)
... 2 more
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not execute application.
at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(DetachedApplicationRunner.java:88)
at org.apache.flink.client.deployment.application.DetachedApplicationRunner.run(DetachedApplicationRunner.java:70)
at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$0(JarRunHandler.java:107)
... 2 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:98)
at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(DetachedApplicationRunner.java:84)
... 4 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2592)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3320)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
at io.delta.standalone.internal.DeltaLogImpl$.apply(DeltaLogImpl.scala:260)
at io.delta.standalone.internal.DeltaLogImpl$.forTable(DeltaLogImpl.scala:241)
at io.delta.standalone.internal.DeltaLogImpl.forTable(DeltaLogImpl.scala)
at io.delta.standalone.DeltaLog.forTable(DeltaLog.java:164)
What could be going wrong here?
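One possibility (an assumption based on the stack trace, not confirmed): the Delta standalone reader resolves s3a:// paths through Hadoop's own FileSystem loading, which does not see jars installed via Flink's plugins/ mechanism, so S3AFileSystem would also need to be on the main classpath. A hedged sketch of that workaround, placing hadoop-aws and the AWS SDK bundle in lib/ (the version numbers are placeholders and must match your Hadoop version):

```dockerfile
FROM apache/flink:1.16.0-scala_2.12
# Put the S3A filesystem classes on the main classpath so that non-Flink
# code (e.g. the Delta standalone reader) can load them too. The exact
# hadoop-aws / aws-java-sdk-bundle versions must match your Hadoop version.
RUN wget -P /opt/flink/lib/ \
      https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/3.3.2/hadoop-aws-3.3.2.jar && \
    wget -P /opt/flink/lib/ \
      https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.1026/aws-java-sdk-bundle-1.11.1026.jar
```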
Also, attaching jobmanager logs in thread.
Jasper Dunker
03/28/2023, 7:46 AM
Yubin Li
03/28/2023, 8:04 AM
Caused by: org.apache.flink.util.FlinkException: An OperatorEvent from an OperatorCoordinator to a task was lost. Triggering task failover to ensure consistency. Event: '[NoMoreSplitEvent]', targetTask: Source: paimon-f2b6dae9-74c2-40bc-895e-a616a551f409.default.ts_table -> Calc(select=[dt, k, v], where=[(dt < _UTF-16LE'2023-01-17')]) -> NotNullEnforcer(fields=[dt, k]) -> TableToDataSteam(type=ROW<`dt` STRING NOT NULL, `k` INT NOT NULL, `v` INT> NOT NULL, rowtime=false) -> Map (1/1) - execution #0
... 33 more
Caused by: org.apache.flink.runtime.operators.coordination.TaskNotRunningException: Task is not running, but in state FINISHED
at org.apache.flink.runtime.taskmanager.Task.deliverOperatorEvent(Task.java:1475)
at org.apache.flink.runtime.taskexecutor.TaskExecutor.sendOperatorEventToTask(TaskExecutor.java:1249)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:316)
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:314)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)
... 21 more
Abolfazl Ghahremani
03/28/2023, 8:11 AM
Mali
03/28/2023, 9:44 AM
Yemson Rose
03/28/2023, 1:49 PM
Yemson Rose
03/28/2023, 1:50 PM
Liad Shachoach
03/28/2023, 4:01 PM
Dheeraj Panangat
03/28/2023, 4:59 PM
Caused by: java.lang.ClassCastException: class org.apache.flink.table.data.vector.heap.HeapBytesVector cannot be cast to class org.apache.flink.table.data.vector.LongColumnVector (org.apache.flink.table.data.vector.heap.HeapBytesVector and org.apache.flink.table.data.vector.LongColumnVector are in unnamed module of loader 'app')
at org.apache.flink.table.data.vector.VectorizedColumnBatch.getLong(VectorizedColumnBatch.java:86) ~[flink-table_2.12-1.14.6.jar:1.14.6]
Getting this when executing:
tableEnv.createStatementSet()
    .addInsertSql("stmt1")
    .addInsertSql("stmt2")
    .addInsertSql("stmt3")
    .addInsertSql("stmt4")
    .execute();
Vignesh Venkataraman
03/28/2023, 7:04 PM
Users can now leverage the Java API from any Scala version
I'm finding it a little hard to understand the above message; please let me know if it's possible to use Flink with Scala. And I would highly appreciate it if the community could also share some Scala-Flink tutorials.
Lydian Lee
03/29/2023, 6:38 AM
HighAvailabilityServicesFactory to help me choose which JobManager to use. However, when it comes to TaskManagers, is there a way to specify that only TMs in the same region as the JM will be used to run the job? Thanks
Ron Ben Arosh
03/29/2023, 8:14 AM
claim
via flink-conf (running Flink 1.16 as a standalone application; checkpoints are stored in an S3 bucket).
Whenever Flink loads a checkpoint, a new checkpoint dir is created in addition to the existing one, so I think the default mode is being used (and not claim).
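For reference, a hedged sketch of the flink-conf entry that selects the restore mode in Flink 1.15+; the key name and values are worth double-checking against the Flink documentation for the exact version in use:

```yaml
# flink-conf.yaml: controls ownership of the restored checkpoint/savepoint.
# Possible values (per Flink 1.15+ docs): NO_CLAIM (default), CLAIM, LEGACY.
execution.savepoint-restore-mode: CLAIM
```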
Any assistance will be appreciated 🙂
Mali
03/29/2023, 11:18 AM
Slackbot
03/29/2023, 12:07 PM
Jashwanth S J
03/29/2023, 12:22 PM
Jashwanth S J
03/29/2023, 12:24 PM
Ha Long Do
03/29/2023, 2:18 PM
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | 2023-03-29 13:50:34,061 INFO org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Start job leader service.
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | 2023-03-29 13:50:34,066 INFO org.apache.flink.runtime.filecache.FileCache [] - User file cache uses directory /tmp/flink-dist-cache-705fdca5-c285-4875-9b68-556ccd1b56c3
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | 2023-03-29 13:50:34,073 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Connecting to ResourceManager <akka.tcp://flink@jobmanager:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000)>.
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | 2023-03-29 13:50:34,420 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Resolved ResourceManager address, beginning registration
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | 2023-03-29 13:55:34,086 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Fatal error occurred in TaskExecutor <akka.tcp://flink@10.0.1.14:6127/user/rpc/taskmanager_0>.
flink_taskmanager.1.e7sxy43lsb49@workernode.gcp | org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration PT5M. This indicates a problem with this instance. Terminating now.
About firewall rules: these instances are deployed in the same VPC and the same subnet, so I suppose they can communicate without any trouble. In fact, I can ping or curl from both instances.
Before deploying to Google Cloud, I had successfully deployed the same setup on my private Microstack cloud without any problem. Here is the docker-compose file that I used with docker stack deploy:
version: '3.8'
services:
  jobmanager:
    image: halo93/fixed-ports-flink-docker:1.16.1-scala_2.12-java11-custom
    deploy:
      replicas: 1
      placement:
        constraints: [node.hostname == managernode.gcp]
    ports:
      - "8081:8081"
      - "6123:6123"
      - "6124:6124"
      - "6125:6125"
    command: jobmanager
    environment:
      - FLINK_PROPERTIES=${FLINK_PROPERTIES}
    networks:
      - flink-network
  taskmanager:
    image: halo93/fixed-ports-flink-docker:1.16.1-scala_2.12-java11-custom
    deploy:
      replicas: 1
      placement:
        constraints: [node.hostname == workernode.gcp]
    depends_on:
      - jobmanager
    ports:
      - "6121:6121"
      - "6122:6122"
      - "6126:6126"
      - "6127:6127"
      - "6128:6128"
      - "5005:5005/udp"
    command:
      - taskmanager
    environment:
      - FLINK_PROPERTIES=${FLINK_PROPERTIES}
    networks:
      - flink-network
networks:
  flink-network:
    driver: overlay
    attachable: true
FLINK_PROPERTIES
FLINK_PROPERTIES=$'\njobmanager.rpc.address: jobmanager\nparallelism.default: 2\n'
I am using a customized flink docker image to fix taskmanager.data.port and taskmanager.rpc.port to 6126 and 6127.
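The port pinning described above might be sketched in flink-conf as follows; note that the metrics key is an assumption on my part, since Flink's internal metrics query service picks a random port by default, which can break metrics across Swarm nodes where only published ports are reachable (verify the key name against the Flink configuration docs for your version):

```yaml
# flink-conf.yaml (sketch): fixed ports for a Docker Swarm overlay network
jobmanager.rpc.address: jobmanager
taskmanager.rpc.port: 6127
taskmanager.data.port: 6126
# Assumed fix for metrics: pin the internal metrics query service port
# (random by default) to one of the published ports.
metrics.internal.query-service.port: 6128
```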
I have tried changing jobmanager.rpc.address to the private IP and to the zonal DNS, and the ResourceManager can then register the taskmanager. However, by doing so, flink-metrics stops working. Does anyone know how to fix this issue? I would really appreciate it.
czchen
03/29/2023, 2:44 PM
Reme Ajayi
03/29/2023, 3:35 PM