# troubleshooting

    Sucheth Shivakumar

    05/11/2023, 3:22 AM
I was checking the CoProcessFunction example mentioned here -- https://nightlies.apache.org/flink/flink-docs-stable/docs/learn-flink/etl/#connected-streams For some reason it doesn't filter for me and prints everything out. But if I set env.setParallelism(1); it works as expected. Can someone point out what I am missing here? Why doesn't it work if parallelism is not 1?
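A likely explanation, assuming the job mirrors the docs example: Flink gives no ordering guarantee between the two inputs of a connected stream. With parallelism 1 the small control stream happens to be read first, so the blocked state is already set when the words arrive; with higher parallelism, data records can race ahead of the control record for the same key and slip through. A sketch of the docs-style function with the race called out in comments (class and field names follow the docs example):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

// Keyed co-flatmap that drops words previously seen on the control stream.
public class ControlFunction extends RichCoFlatMapFunction<String, String, String> {
    private ValueState<Boolean> blocked;

    @Override
    public void open(Configuration config) {
        blocked = getRuntimeContext()
                .getState(new ValueStateDescriptor<>("blocked", Boolean.class));
    }

    @Override
    public void flatMap1(String controlValue, Collector<String> out) throws Exception {
        blocked.update(Boolean.TRUE); // block this key from now on
    }

    @Override
    public void flatMap2(String dataValue, Collector<String> out) throws Exception {
        // RACE: with parallelism > 1, this can run before flatMap1 has seen the
        // control record for the same key, so the word is emitted anyway.
        if (blocked.value() == null) {
            out.collect(dataValue);
        }
    }
}
```
If the filtering has to be deterministic, the ordering must be made explicit, e.g. by buffering data records in keyed state until the control input is exhausted, or by distributing the rules via the broadcast state pattern.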

    Abhinav Ittekot

    05/11/2023, 5:46 AM
    hello, is it possible to change the compression algorithm for RocksDB state via flink config? I believe it's snappy by default.
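As far as I know there is no single flink-conf key for this; compression is a RocksDB column-family option, set through an options factory. A sketch, assuming the RocksDB state backend is in use (LZ4 here is just an example choice):

```java
import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.CompressionType;
import org.rocksdb.DBOptions;

// Sketch: change RocksDB's per-column-family compression (Snappy by default).
public class CompressionOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(
            DBOptions current, Collection<AutoCloseable> handlesToClose) {
        return current; // leave DB-level options untouched
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions current, Collection<AutoCloseable> handlesToClose) {
        return current.setCompressionType(CompressionType.LZ4_COMPRESSION);
    }
}
```
Then point `state.backend.rocksdb.options-factory` in flink-conf.yaml at the class's fully qualified name.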

    Michael Kreis

    05/11/2023, 11:06 AM
Hi everyone, I'm trying to set up Flink in standalone mode on OpenShift with the k8s operator. However, I'm getting errors on the job manager when starting a job:
....is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on
According to this documentation an additional permission on
deployments/finalizers
is needed, which is not mentioned in your documentation. Is that just missing from the documentation, or am I doing something wrong?
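For reference, the Role rule implied by that error would look roughly like this (a sketch derived from the error message; the apiGroup and verbs may need adjusting for your operator version):

```yaml
# RBAC fragment granting the finalizers subresource the error complains about
- apiGroups: ["apps"]
  resources: ["deployments/finalizers"]
  verbs: ["update"]
```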

    Amenreet Singh Sodhi

    05/11/2023, 11:20 AM
Hi team, is Flink compatible with JDK 17?

    Kirill Lyashko

    05/11/2023, 11:33 AM
Hi everyone. While investigating the Flink Operator's functionality I've run into the following issue: the upgrade mode is set to
last-state
. However, when I changed the parallelism, the job master failed once during startup because it couldn't reschedule state from the latest checkpoint, and the job was then started by the operator without any state. I don't see anything in the logs that would explain this behaviour. Has anyone faced anything similar? I would be curious to know whether it's a feature or a bug. And can the behaviour be configured to fail the job in such cases?

    Adam Fleishaker

    05/11/2023, 1:52 PM
Hi all! My team has some M1 Macs for our developers, and we're using Flink 1.13 for our applications. We use the Flink Docker images for testing, but they crash intermittently on the M1s, which have to run the AMD64 image through emulation. 1.15 is the ultimate goal since it ships ARM images among other improvements, but it would be a lift to cut our applications over. Is there any way to get a native ARM image for 1.13, or has anyone made their own custom one that we can use?
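One possible workaround, entirely untested: the official images are built from the apache/flink-docker repo, so a self-built linux/arm64 image might work. The directory name below is an assumption; pick the 1.13 variant you actually use:

```shell
# Sketch: cross-build a linux/arm64 Flink 1.13 image from the official Dockerfiles
git clone https://github.com/apache/flink-docker.git
cd flink-docker/1.13/scala_2.12-java11-debian   # hypothetical path, check the repo
docker buildx build --platform linux/arm64 -t my-registry/flink:1.13-arm64 --load .
```
Whether all 1.13 native libraries (e.g. RocksDB) work on arm64 is a separate question worth verifying.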

    El Houssine Talab

    05/11/2023, 2:35 PM
Hey folks, we are looking into performing deduplication + over aggregation (Flink v1.15). We tried the query below but ran into this error:
    StreamPhysicalOverAggregate doesn't support consuming update and delete changes which is produced by node Deduplicate(keep=[LastRow], key=[home, room, sn], order=[ROWTIME])
    Any thoughts? Thanks.
    WITH t1 AS (
        /* DEDUPLICATION */
        SELECT 
            home, 
            room, 
            sn, 
            temp,
            event_ts, 
            arrival_ts
        FROM ( 
            SELECT 
                *, 
                ROW_NUMBER() OVER (PARTITION BY home, room, sn ORDER BY event_ts DESC) AS row_num
            FROM telemetry
            WHERE 
                event_ts >= CURRENT_WATERMARK(event_ts)
                AND ABS(TIMESTAMPDIFF(SECOND, event_ts, arrival_ts)) <= 60
        )
        WHERE row_num = 1
    )
    
    /* OVER AGGREGATION */
    SELECT 
        home, 
        room, 
        sn, 
        event_ts, 
        arrival_ts, 
        AVG(temp) OVER ( 
            PARTITION BY home, room
            ORDER BY event_ts
            RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
        ) as avg_temp
    FROM t1
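The error arises because a last-row deduplication ordered by event time emits updates and retractions, which OVER aggregation cannot consume. One workaround suggested in the Flink SQL deduplication docs (a sketch, not verified on 1.15, and note the semantic change: it keeps the first arrival per key rather than the latest event) is a first-row deduplication ordered by processing time, which produces an append-only stream:

```sql
-- Assumes `proctime AS PROCTIME()` is declared on the telemetry table.
WITH t1 AS (
    SELECT home, room, sn, temp, event_ts, arrival_ts
    FROM (
        SELECT *,
            ROW_NUMBER() OVER (
                PARTITION BY home, room, sn ORDER BY proctime ASC) AS row_num
        FROM telemetry
    )
    WHERE row_num = 1
)
SELECT home, room, sn, event_ts, arrival_ts,
    AVG(temp) OVER (
        PARTITION BY home, room
        ORDER BY event_ts
        RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW
    ) AS avg_temp
FROM t1
```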

    Sucheth Shivakumar

    05/11/2023, 2:56 PM
https://apache-flink.slack.com/archives/C03G7LJTS2G/p1683775361922889 Can someone please help me understand what is happening here?

    Conor McGovern

    05/11/2023, 3:02 PM
Hi everyone, the Flink docs for Kafka state the following regarding upgrading the connector:
    Do not upgrade Flink and the Kafka Connector version at the same time.
    What is the problem with upgrading both at the same time? Thanks for your help. (Link to the docs in question: https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#upgrading-to-the-latest-connector-version)

    Mingliang Liu

    05/11/2023, 11:09 PM
    Hi team, currently the
    Job Manager -> Configuration
page shows different configurations under different sections. This is good for categorizing them, but the height of each section is too short, so we need to scroll a lot to find a specific configuration. Also, searching does not show all of the matches unless we hit
    Enter
multiple times or scroll. Would it be a good idea to put the different sections of the JM configuration on different tabs, similar to
    Running Jobs -> Checkpoints
    ?

    Adam Richardson

    05/12/2023, 4:45 AM
Hi there -- I was wondering if there are any roadmap items, design docs, or general thinking around support for Iceberg merge-on-read positional deletes in Flink. If I understand correctly, Flink supports only equality deletes for Iceberg merge-on-read updates (whereas Spark supports only positional deletes). I'm trying to understand whether that's just a feature gap in Flink's current state or a more fundamental platform-level limitation. I'm investigating a solution for streaming ingestion into an Iceberg-based data lake, but have concerns about the read-side performance of equality deletes.

    El Houssine Talab

    05/12/2023, 8:06 AM
    Any insights on how to do this?

    Dheeraj Panangat

    05/12/2023, 10:47 AM
Hi Team, getting the following error when trying to join two different tables with the same table.
    Hash collision on user-specified ID "uid_split_monitor_tableRoot". Most likely cause is a non-unique ID. Please check that all IDs specified via `uid(String)` are unique.
    	at org.apache.flink.streaming.api.graph.StreamGraphHasherV2.generateNodeHash(StreamGraphHasherV2.java:185)
    	at org.apache.flink.streaming.api.graph.StreamGraphHasherV2.traverseStreamGraphAndGenerateHashes(StreamGraphHasherV2.java:110)
Our example:
    table1 = tableEnv.sqlQuery("");
    table2 = tableEnv.sqlQuery("");
    
    tableRoot = tableEnv.sqlQuery("");
    
table3 = table1.join(tableRoot);
table4 = table2.join(tableRoot);
Has anyone faced this? Is there anything we are doing wrong here? Appreciate any help. Thanks

    Sumit Singh

    05/12/2023, 11:11 AM
How do I register a JDBC MySQL catalog? I am using the Flink SQL Client.
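If the flink-connector-jdbc jar plus the MySQL driver are on the SQL Client's classpath, the JDBC catalog can be registered with DDL like this (host/credentials are placeholders; MySQL support in JdbcCatalog requires Flink 1.15+ as far as I know):

```sql
CREATE CATALOG my_mysql WITH (
    'type' = 'jdbc',
    'default-database' = 'mydb',
    'username' = 'user',
    'password' = 'secret',
    'base-url' = 'jdbc:mysql://localhost:3306'
);

USE CATALOG my_mysql;
SHOW TABLES;
```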

    aswin jose

    05/12/2023, 11:59 AM
Apache Flink's tumbling window (or any window) always applies grouping and sub-grouping as part of aggregation, but I have a situation where I need to collect data in a tumbling window without any grouping or aggregation, in the Flink SQL client. Is this possible in Apache Flink?
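One option worth trying (a sketch; assumes a recent Flink, since in older versions window TVFs could only be used with aggregation, and assumes a table named `telemetry` with a watermarked `event_ts` column): window TVFs can be used without GROUP BY, simply tagging each row with its window:

```sql
-- Assigns every row to a 1-minute tumbling window, no aggregation involved;
-- window_start / window_end appear as extra columns on each row.
SELECT *
FROM TABLE(
    TUMBLE(TABLE telemetry, DESCRIPTOR(event_ts), INTERVAL '1' MINUTE));
```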

    Kaushalya Samarasekera

    05/12/2023, 2:06 PM
Hi, this is a very novice question; a basic operation is failing for me, so I'm seeking guidance. I'm creating the following table in Flink with Kafka as a sink. I'm running the SQL statements in sql-client in interactive mode. The topic
    transactions
    (1 partition, 1 replica) exists in kafka, running on a wurstmeister/kafka docker container.
    CREATE TABLE transactions
    (
        `account_id` BIGINT,
        `amount`     BIGINT
    ) WITH (
          'connector' = 'kafka',
          'topic' = 'transactions',
          'properties.bootstrap.servers' = 'localhost:59092',
          'properties.group.id' = 'testGroup',
          'scan.startup.mode' = 'earliest-offset',
          'format' = 'json');
    Then I run
    insert into transactions  values (1, 23);
    After the above insert, I expected the record to appear as an event in the
    transactions
topic, but it does not. If I run select * from transactions, I can see the record in the table, just not in Kafka. Any pointers would be much appreciated.

    Simon Dahlbacka

    05/12/2023, 2:13 PM
I have a problem with the Flink SQL function
    to_date
    , docs say
    TO_DATE(string1[, string2])
    , I am calling it using
    TO_DATE(CAST(`OAORDT` as string), 'yyyyMMdd') AS `orderDate`,
Invalid function call:\nTO_DATE(STRING, STRING NOT NULL) Expected signatures are:\nTO_DATE(*). Any ideas what is wrong? Flink 1.16.1

    Nizar Hejazi

    05/14/2023, 6:38 AM
Hello, I have a few questions about Flink's dynamic tables:
• Flink's dynamic tables can be converted into an upsert stream (requires a unique key).
  ā—¦ But the docs mention: ā€œPlease note that only append and retract streams are supported when converting a dynamic table into a `DataStream`ā€.
• Flink's dynamic tables can be written to an external system.
  ā—¦ From slide 17 here, it looks like we can save ~50% of traffic if the downstream system supports upserting.
  ā—¦ WITH ('connector'='kafka', 'format' = 'debezium-json') supports insertion, update before, update after, and delete.
  ā—¦ What does the community use as an external system for dynamic tables?
• There are query restrictions (state size for continuous queries evaluated on unbounded streams that need to update previously emitted results).
  ā—¦ Does Flink have to maintain this state even when the dynamic table is written to an external system?
  ā—¦ Any benchmark for state size for different queries?
Thanks

    Nicholas Erasmus

    05/15/2023, 8:11 AM
Hi everyone, I have a stateless application (Flink 1.16.1) that consumes about 12 records per second and uses a flatMap to write n records to one (Delta Table) sink and m records to another (Delta Table) sink. My Task Manager's heap memory usage seems incredibly high for such a simple task. Does anyone know why this would be, or how I can go about decreasing it? I would think that a stateless application would have a much smaller memory footprint. Here's a pic of the memory usage

    Barisa Obradovic

    05/15/2023, 9:21 AM
We are having problems upgrading from Flink 1.13 to Flink 1.17, since we are running an older version of ZooKeeper (3.4.x). Starting with Flink 1.15, Flink upgraded its Apache Curator library ("a Java/JVM client library for Apache ZooKeeper") from 2.x to 5.2. What that means is described in https://curator.apache.org/breaking-changes.html, but the short version is that ZooKeeper 3.4.x is no longer supported from Flink 1.15. Did anyone manage to use an older version of Apache Curator in the latest Flink, so we can still use our old ZooKeeper (which we can't upgrade at this time)?

    Ron Ben Arosh

    05/15/2023, 10:47 AM
Hi šŸ™‚ I'm running Flink 1.17 and trying to use both reactive mode and unaligned checkpoints. When Flink restarts itself, it fails to start with a job manager exception:
    Cannot rescale the given pointwise partitioner.\nDid you change the partitioner to forward or rescale?
The Flink job currently contains only 'forward' edges between operators. The job contains no broadcasts and no explicit shuffles. How can this issue be solved? Or more broadly: how can Flink be both autoscaled and keep stable checkpoints?
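Not a full answer, but since the exception is specifically about rescaling with unaligned checkpoints, one experiment (a sketch; key name as I recall it from the Flink docs) is to fall back to aligned checkpoints, which support rescaling of pointwise connections:

```yaml
# flink-conf.yaml sketch: disable unaligned checkpoints so reactive-mode
# rescaling restores from an aligned (rescalable) checkpoint
execution.checkpointing.unaligned: false
```
A middle ground may be keeping unaligned checkpoints enabled but configuring an aligned-checkpoint timeout, so barriers only become unaligned under backpressure.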

    Abhishek Joshi

    05/15/2023, 11:27 AM
Hi Team, we are using the Flink Table API to create a temporary table in Flink using the JDBC connector for the connection with Postgres. We came across a scenario where we want to update that temporary Flink table at runtime as soon as we add a new entry into the underlying Postgres table, and then perform some operations on that table. Can anyone help us with this? Thanks

    Rashmin Patel

    05/15/2023, 12:09 PM
Hi all, I want to understand savepoints vs. checkpoints in a little more detail. What extra things are stored in a savepoint compared to a checkpoint? I can see a significant difference in their sizes.

    ē”°ę˜Žåˆš

    05/15/2023, 12:11 PM
    Hi all

    ē”°ę˜Žåˆš

    05/15/2023, 12:14 PM
Hi all: I checkpoint data to HDFS; after running for 2 months: Caused by: org.apache.flink.util.SerializedThrowable: The directory item limit of /flink/checkpoints/da0e3d92f6eb46a18190463abafe1af2/9ea0a05029b62b35e2b98eebb53e4bd9/shared is exceeded: limit=1048576 items=1048576 at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:1307)

    Jean-Baptiste PIN

    05/15/2023, 1:42 PM
Hi, is it possible that the flink-connector-mongodb is not published on the Maven repo?

    Jean-Baptiste PIN

    05/15/2023, 1:42 PM
    I got 404 https://repo.maven.apache.org/maven2/org/apache/flink/flink-connector-mongodb/1.0.0-1.17/flink-connector-mongodb-1.0.0-1.17.pom

    David Wisecup

    05/15/2023, 3:07 PM
Is doing local development with Kafka possible? When I have the Flink dependencies in my pom marked as provided, I get this runtime error:
    Caused by: java.lang.NoClassDefFoundError: org/apache/flink/api/common/serialization/DeserializationSchema
When I change the Flink deps to normal scope I get further, but then hit this runtime error:
    Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make field private final java.lang.Object[] java.util.Arrays$ArrayList.a accessible: module java.base does not "opens java.util" to unnamed module @6366ebe0
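Two separate issues, most likely. The provided-scope error is expected when running from an IDE: provided dependencies are not on the runtime classpath, so either enable your IDE's "add provided dependencies to classpath" option for the run configuration or keep a dev profile with compile scope. The InaccessibleObjectException is the JDK 16+ module system blocking reflective access during serialization; the usual workaround (a sketch, flags illustrative rather than an exhaustive list) is to open the offending packages to the unnamed module via JVM args:

```
# JVM options for running the job locally on a modular JDK (16+)
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
```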

    Jiahao Wang Zhuo

    05/15/2023, 4:12 PM
    Hi, I am trying to create a
    FileSink
using the parquet BulkFormat (on 1.16). I wanted to create a custom rolling policy based on file size and processing time. I see in FLINK-13027 that this has been supported for
    StreamingFileSink
    since Flink-1.10. However, when I try to extend the
    CheckpointRollingPolicy
    , it still rolls only on checkpoint times or
    org.apache.flink.util.SerializedThrowable: java.lang.UnsupportedOperationException: Bulk Part Writers do not support "pause and resume" operations.
    (if I set
    shouldRollOnCheckpoint
    to false). Would I be correct to say that custom rolling policies are not supported for
    FileSink
    but only for
    StreamingFileSink
instead (which the docs mark as deprecated)? If so, is there any plan to support that? Is there any blocker/challenge to making that possible?

    Trystan

    05/15/2023, 5:29 PM
    if i’m creating a source for use in batch mode, do i need a
    Split
    as well as a
    SplitState
    ? if i understand correctly, the
    SplitState
    would be the thing maintaining state as the
    SourceReader
    reads through the split… but if we’re in Batch execution mode, would it matter? if the source is not checkpointed (batch mode), wouldn’t the
    SourceReader
    need to resume from zero and reread the entire split? i feel like i’m missing something. concrete example: a dynamo scan. segments map to splits, but within a segment the results may be paginated. i could store that pagination key… but does it matter? if the task fails, wouldn’t all data in the immediate downstream persistence layer be wiped anyway? and if so i’d need to resume from zero, not the partially paginated result. the connection between batch recovery and the persisted intermediate storage isn’t super clear to me in this mode