Oscar Perez
12/20/2022, 10:30 AM

Oscar Perez
12/20/2022, 1:33 PM
testHarness.processElement(2L, 100L);
Does this push the element with processing time 100L?
Then I see that the processWatermark method is used, but that seems to refer to event-time timers. Is it needed in this case?
//trigger event time timers by advancing the event time of the operator with a watermark
testHarness.processWatermark(100L);
In many examples I see that only the processWatermark method is used, and not setProcessingTime, like here:
https://github.com/apache/flink/blob/master/flink-streaming-java/src/test/java/org[…]k/streaming/runtime/operators/windowing/WindowOperatorTest.java
I am a bit confused about how to glue all these methods together and on what basis to use one or the other. Thank you!
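A minimal sketch of how the harness methods map onto Flink's two time domains (the operator and function below are hypothetical): the second argument of processElement is the record's event-time timestamp, processWatermark advances event time and fires event-time timers, and setProcessingTime advances the processing-time clock and fires processing-time timers.

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.operators.KeyedProcessOperator;
import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness;

// MyKeyedProcessFunction is a hypothetical KeyedProcessFunction<Long, Long, Long>.
KeyedOneInputStreamOperatorTestHarness<Long, Long, Long> testHarness =
        new KeyedOneInputStreamOperatorTestHarness<>(
                new KeyedProcessOperator<>(new MyKeyedProcessFunction()),
                (Long value) -> value,   // key selector
                Types.LONG);
testHarness.open();

// Second argument = the record's *event-time* timestamp, not processing time.
testHarness.processElement(2L, 100L);

// Advance event time: fires event-time timers registered at or before 100.
testHarness.processWatermark(100L);

// Advance the processing-time clock: fires processing-time timers instead.
testHarness.setProcessingTime(100L);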
kingsathurthi
12/20/2022, 2:05 PM

Sergio Morales
12/20/2022, 3:09 PM
FlinkKafkaProducer011 from
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.11_2.12</artifactId>
<version>1.11.6</version>
</dependency>
Now, when executing, it fails at runtime with:
switched from INITIALIZING to FAILED with failure cause: java.lang.NoClassDefFoundError: org/apache/flink/shaded/guava18/com/google/common/collect/Lists
Checking the code, it still references guava18, but other classes/dependencies seem to refer to guava30 instead. Is it a bug, or do you have any advice for this case?
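The flink-connector-kafka-0.11 artifact was built against Flink 1.11's shaded guava18, so running it on a newer Flink runtime (which shades guava30) produces exactly this NoClassDefFoundError; the 0.11 connector was also dropped after Flink 1.11. Assuming a Flink 1.14+ runtime, a minimal sketch of the replacement universal Kafka sink (topic and bootstrap values are hypothetical):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

KafkaSink<String> sink = KafkaSink.<String>builder()
        .setBootstrapServers("localhost:9092")          // hypothetical
        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("my-topic")                   // hypothetical
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
        .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
        .build();
// stream is a hypothetical DataStream<String>.
stream.sinkTo(sink);

The key point is that the connector artifact must come from the same Flink release as the runtime; mixing 1.11 connector jars with a newer cluster is what surfaces the shaded-guava mismatch.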
Tsering
12/20/2022, 3:13 PM
Exception type is USER from filter results [UserClassLoaderExceptionFilter -> NONE, UserAPIExceptionFilter -> NONE, UserSerializationExceptionFilter -> USER, UserFunctionExceptionFilter -> SKIPPED, OutOfMemoryExceptionFilter -> NONE, TooManyOpenFilesExceptionFilter -> NONE, KinesisServiceExceptionFilter -> NONE].
Can someone please enlighten me on the reason for this, or where things went wrong?
Jason Politis
12/20/2022, 5:46 PM
[ERROR] Failed to execute goal on project flink-connector-hive_2.12: Could not resolve dependencies for project org.apache.flink:flink-connector-hive_2.12:jar:1.15.1: Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Failed to read artifact descriptor for org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde: Could not transfer artifact org.pentaho:pentaho-aggdesigner-algorithm:pom:5.1.5-jhyde from/to maven-default-http-blocker (http://0.0.0.0/): Blocked mirror for repositories: [repository.jboss.org (http://repository.jboss.org/nexus/content/groups/public/, default, disabled), conjars (http://conjars.org/repo, default, releases+snapshots), apache.snapshots (http://repository.apache.org/snapshots, default, snapshots)] -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project flink-connector-hive_2.12: Could not resolve dependencies for project org.apache.flink:flink-connector-hive_2.12:jar:1.15.1: Failed to collect dependencies at org.apache.hive:hive-exec:jar:2.3.9 -> org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde
Is anyone else getting this?
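For context: Maven 3.8+ ships a default mirror (maven-default-http-blocker) that blocks plain-HTTP repositories, and this pentaho artifact only lives in such repos. One workaround that has worked for others (a sketch, not an official fix; the mirror URL is the HTTPS home of the old conjars.org repository) is to route the blocked repository through an HTTPS mirror in ~/.m2/settings.xml:

<mirrors>
  <mirror>
    <id>conjars-https</id>
    <mirrorOf>conjars</mirrorOf>
    <url>https://conjars.wensel.net/repo/</url>
  </mirror>
</mirrors>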
Sylvia Lin
12/20/2022, 11:30 PM

raghav tandon
12/21/2022, 5:24 AM
org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.
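For context: Flink's Kafka sources commit offsets back to Kafka only on checkpoint completion, and that commit is best-effort; on restore Flink uses the offsets stored in its own checkpoint state, not the consumer group's committed offsets, so this retriable failure is usually just noise for external lag monitoring. A minimal sketch with the newer KafkaSource (all names are hypothetical):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("localhost:9092")   // hypothetical
        .setTopics("my-topic")                   // hypothetical
        .setGroupId("my-group")                  // hypothetical
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(new SimpleStringSchema())
        // the commit back to Kafka only feeds monitoring tools; failures like
        // RetriableCommitFailedException do not affect Flink's guarantees
        .setProperty("commit.offsets.on.checkpoint", "true")
        .build();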
chankyeong won
12/21/2022, 5:58 AM
io.confluent.kafka.serializers.subject.TopicRecordNameStrategy is set in the Schema Registry properties because I have multiple Avro message schemas in the same topic.
But the error java.lang.IllegalStateException: Expecting type to be a PojoTypeInfo occurs.
Since I can't stick to one schema, I used the SpecificRecordBase type.
How can I deserialize Avro messages in my use case?
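One approach, as a sketch under assumptions (all names below are hypothetical): the PojoTypeInfo error suggests Flink's type extraction fell back to POJO analysis on the abstract SpecificRecordBase. Deserializing with Confluent's KafkaAvroDeserializer inside a custom DeserializationSchema and mapping every record type to a single concrete class keeps one type visible to Flink:

import java.util.Map;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.serialization.AbstractDeserializationSchema;
import io.confluent.kafka.serializers.KafkaAvroDeserializer;

// MyEvent is a hypothetical concrete class unifying the different record types.
public class MultiSchemaDeserializer extends AbstractDeserializationSchema<MyEvent> {
    private transient KafkaAvroDeserializer inner;

    @Override
    public void open(InitializationContext context) {
        inner = new KafkaAvroDeserializer();
        inner.configure(
                Map.of("schema.registry.url", "http://registry:8081"),  // hypothetical URL
                false /* isKey */);
    }

    @Override
    public MyEvent deserialize(byte[] message) {
        GenericRecord record = (GenericRecord) inner.deserialize("my-topic", message);
        // Branch on the writer schema's full name to handle each record type.
        return MyEvent.from(record);  // hypothetical mapping
    }
}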
习羞羞
12/21/2022, 6:06 AM

Slackbot
12/21/2022, 6:22 AM

kingsathurthi
12/21/2022, 7:06 AM

Alexis Josephides (Contractor)
12/21/2022, 9:58 AM

Gerald Schmidt
12/21/2022, 11:51 AM

Tony Yeung
12/21/2022, 11:57 AM

Otto Remse
12/21/2022, 2:16 PM
country string,
address string
If I do a distinct collect of addresses grouped by country, I get a multiset back. Is there a way, without UDFs, to get an array back?
SELECT country, COLLECT(DISTINCT address) AS addresses FROM myTable GROUP BY country
Emmanuel Leroy
12/21/2022, 4:39 PM

Rishabh Kedia
12/21/2022, 11:49 PM

Sai Sharath Dandi
12/22/2022, 1:14 AM
select ROW(1 as a) as b from myTable
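Aliasing inside the ROW constructor is not valid Flink SQL; a workaround that should name the nested field (a sketch, untested against this exact setup) is casting the row to a named ROW type:

tableEnv.executeSql(
        "SELECT CAST(ROW(1) AS ROW<a INT>) AS b FROM myTable");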
Walker L
12/22/2022, 1:46 AM
Caused by: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Could not acquire the minimum required resources.
An example of Occlum running Flink 1.11 (only the taskmanager runs in Occlum) can be found here: https://github.com/occlum/occlum/tree/v0.29.3/demos/cluster_serving; I just changed Flink 1.11 to Flink 1.14. I tried increasing the values of parameters such as jobmanager.memory.process.size and taskmanager.memory.process.size, but I still get the error. Since I am also new to Flink, I would like to ask if there are other settings that could resolve the above error?
Tsering
12/22/2022, 2:57 AM
java.io.IOException: Could not perform checkpoint 66 for operator Combine Broadcasted Rules with Events -> Timestamps/Watermarks (1/6)#65.
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1055)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:135)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:250)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:431)
at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:61)
at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:227)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:180)
at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:98)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:784)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:571)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 66 for operator Combine Broadcasted Rules with Events -> Timestamps/Watermarks (1/6)#65. Failure reason: Checkpoint was declined.
at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:264)
at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:169)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:371)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:706)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:627)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:590)
at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:312)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$8(StreamTask.java:1099)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1083)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1039)
... 19 more
Caused by: java.lang.ClassCastException: class com.xxx.RateRule cannot be cast to class java.lang.String (com.xxx.RateRule is in unnamed module of loader org.apache.flink.util.ChildFirstClassLoader @6165164; java.lang.String is in module java.base of loader 'bootstrap')
at org.apache.flink.api.common.typeutils.base.StringSerializer.copy(StringSerializer.java:31)
at org.apache.flink.api.common.typeutils.base.MapSerializer.copy(MapSerializer.java:110)
at org.apache.flink.runtime.state.HeapBroadcastState.<init>(HeapBroadcastState.java:69)
at org.apache.flink.runtime.state.HeapBroadcastState.deepCopy(HeapBroadcastState.java:84)
at org.apache.flink.runtime.state.HeapBroadcastState.deepCopy(HeapBroadcastState.java:40)
at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy.syncPrepareResources(DefaultOperatorStateBackendSnapshotStrategy.java:88)
at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy.syncPrepareResources(DefaultOperatorStateBackendSnapshotStrategy.java:36)
at org.apache.flink.runtime.state.SnapshotStrategyRunner.snapshot(SnapshotStrategyRunner.java:77)
at org.apache.flink.runtime.state.DefaultOperatorStateBackend.snapshot(DefaultOperatorStateBackend.java:230)
at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:226)
... 29 more
To simplify: it was caused by a ClassCastException. Can I know how to resolve this issue? I have been stuck here for 3 days already 🙏
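The trace shows a StringSerializer copying a RateRule during the broadcast-state snapshot, which points at a MapStateDescriptor whose declared types don't match what is actually put into the state. A minimal sketch of what the descriptor would need to look like, assuming the state maps a String rule id to a RateRule (names are hypothetical):

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;

MapStateDescriptor<String, RateRule> ruleDescriptor =
        new MapStateDescriptor<>(
                "rules",                             // hypothetical state name
                Types.STRING,                        // key type actually stored
                TypeInformation.of(RateRule.class)); // value type actually stored

// The same descriptor must be used both when broadcasting the rule stream and
// when calling ctx.getBroadcastState(ruleDescriptor) inside the function.
BroadcastStream<RateRule> rulesBroadcast = ruleStream.broadcast(ruleDescriptor);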
Abdelhakim Bendjabeur
12/22/2022, 11:07 AM
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.TableException: StreamPhysicalOverAggregate doesn't support consuming update and delete changes which is produced by node ChangelogNormalize(key=[id, accountId])
I did a simple join without any window functions, and it works "fine". I checked the changelog of the resulting table (when simply joining), and it does include update and delete (-D) operations, even though my topics are append-only by design. Is this normal behaviour in Flink?
How can I use window functions such as
ROW_NUMBER() OVER (PARTITION BY accountId, ticketId ORDER BY ticketMessageCreatedDatetime ASC) AS rownum,
COUNT(*) OVER (PARTITION BY accountId, ticketId) AS count_messages
on a joined stream?
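The ChangelogNormalize node in the error is produced by upsert-style sources (e.g. upsert-kafka, or a kafka table declared with a primary key), and OVER aggregates only accept append-only input. If the topics really are append-only, one sketch of a way around it is to declare the sources with the plain kafka connector and no primary key, so the join input stays insert-only (table name, columns, and connector options below are hypothetical):

tableEnv.executeSql(
        "CREATE TABLE ticketMessages (" +
        "  accountId STRING," +
        "  ticketId STRING," +
        "  ticketMessageCreatedDatetime TIMESTAMP(3)" +
        ") WITH (" +
        "  'connector' = 'kafka'," +                  // append-only source
        "  'topic' = 'ticket-messages'," +
        "  'properties.bootstrap.servers' = 'localhost:9092'," +
        "  'scan.startup.mode' = 'earliest-offset'," +
        "  'format' = 'json'" +
        ")");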
Tiansu Yu
12/22/2022, 11:47 AM
someStream.sinkTo(FileSink.from…)
java.lang.NoSuchMethodError: 'org.apache.flink.streaming.api.datastream.DataStreamSink org.apache.flink.streaming.api.datastream.DataStream.sinkTo(org.apache.flink.api.connector.sink2.Sink)'
Could this be some dependency issue?
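Most likely, yes: DataStream.sinkTo(org.apache.flink.api.connector.sink2.Sink) only exists from Flink 1.15 on, so a FileSink compiled against 1.15+ running against an older flink-streaming-java (or an older cluster) fails exactly like this. A sketch of keeping everything on one version via a shared pom property (the version number is hypothetical; use the cluster's):

<properties>
  <flink.version>1.15.1</flink.version>
</properties>

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-files</artifactId>
  <version>${flink.version}</version>
</dependency>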
Suriya Krishna Mariappan
12/22/2022, 11:57 AM

Oscar Perez
12/22/2022, 1:10 PM
SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(2))
I want to insert events at times from 0 to 4 seconds, and I don't want the ones inserted after 2 seconds to be processed in my first window. For that I do the following:
testHarness.processElement(event1, 1000);
testHarness.processElement(event2, 1500);
testHarness.processElement(event3, 1600);
testHarness.processElement(event4, 1700);
testHarness.processElement(event5, 1800);
testHarness.processElement(event6, 2000);
testHarness.processElement(event7, 3000);
testHarness.processElement(event8, 3100);
testHarness.processElement(event9, 3500);
testHarness.processWatermark(2001); // set event time in order to trigger the window
testHarness.setProcessingTime(2001);
According to my understanding, setting the processing time and watermark to 2001 will trigger the window, but only six events will be processed, right? event7, having its timestamp at 3 seconds, will not be consumed.
The problem I am facing is that all events up to event9 are being processed, even though I have set the processing time to just 2.001 seconds. What am I doing wrong? Thanks!
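A likely explanation, with a sketch of the assumed fix: with SlidingProcessingTimeWindows, the window an element is assigned to is determined by the operator's processing-time clock at the moment processElement(...) is called; the second argument is the event-time timestamp and plays no role in processing-time window assignment. The harness clock starts at 0 and is only advanced at the end, so all nine elements land in the same windows. Advancing the clock before inserting the later elements separates them:

testHarness.setProcessingTime(1000);
testHarness.processElement(event1, 1000);
// ... events 2-6, advancing setProcessingTime(...) alongside if desired ...
testHarness.processElement(event6, 2000);

// Crossing 2000 fires the window ending at 2000, containing events 1-6 only.
testHarness.setProcessingTime(2001);

// Advance the clock *before* the late elements so they join later windows.
testHarness.setProcessingTime(3000);
testHarness.processElement(event7, 3000);
testHarness.processElement(event8, 3100);
testHarness.processElement(event9, 3500);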
Suparn Lele
12/22/2022, 3:32 PM

Jason Politis
12/22/2022, 6:15 PM

Claudia Kesslau
12/23/2022, 9:58 AM
DataStream<Row> inputStream = StreamExecutionEnvironment.getExecutionEnvironment()
        .fromSource(getKafkaSource(flinkJob.getKafkaTopic(),
                environment.getKafkaBootstrapUrl()), WatermarkStrategy.noWatermarks(), "kafka")
.map(flinkJob.getMapper());
FlinkSink.forRow(inputStream, FlinkSchemaUtil.toSchema(flinkJob.getSchema()))
.tableLoader(getIcebergTableLoader(environment.getIcebergMetaStoreUrl(),
flinkJob.getIcebergTable()))
.append();
env.execute(job);
If those jobs are run individually, they run just fine. If multiple jobs run simultaneously on the cluster (1 jobmanager, 3 taskmanagers, 12 task slots, running on k8s), some of the jobs (most often the one with the most columns and the highest data rate) get the following error and fail to create any successful checkpoints:
com.esotericsoftware.kryo.KryoException: java.lang.ArrayStoreException
See thread for the full stacktrace.
I had no luck searching for this error online. Neither adding more resources to the cluster nor changing the checkpoint interval had any effect on the error. Can you shed some light on the problem? Is this a problem with the cluster configuration/resources or with the Flink job itself?
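One avenue to narrow this down (a sketch under assumptions, not a confirmed fix): Row fields whose types Flink cannot analyze fall back to Kryo, and a Kryo ArrayStoreException during snapshots usually indicates a type/serializer mismatch on that fallback path. Disabling generic types makes the job fail fast at the operator that needs Kryo, and giving the mapper's output explicit type information avoids the fallback entirely (getRowTypeInfo() below is a hypothetical helper returning a RowTypeInfo for the job's schema):

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Row;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.getConfig().disableGenericTypes();  // fail fast wherever Kryo would be used

DataStream<Row> inputStream = env
        .fromSource(getKafkaSource(flinkJob.getKafkaTopic(),
                environment.getKafkaBootstrapUrl()), WatermarkStrategy.noWatermarks(), "kafka")
        .map(flinkJob.getMapper())
        .returns(flinkJob.getRowTypeInfo());  // hypothetical: explicit RowTypeInfo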
Emmanuel Leroy
12/23/2022, 5:56 PM

饶俊
12/24/2022, 3:22 PM