Apache Flink #random

Tejansh Rana

12/04/2024, 4:43 PM

Hi everyone, we have a few openings in Autodesk for the Data Streaming team in Dublin, Ireland. If you are interested in learning more about it, you can check out this post. I would be happy to answer any questions as well so feel free to reach out.

👏 3

Masha A

12/11/2024, 4:10 PM

The CFP for Current Bengaluru 2025: The Data Streaming Event is officially OPEN. If you’re deep in the world of Flink, building real-time data processing magic, or have a cool story to share, Current is a great opportunity to get on stage and inspire the community. It can be cutting-edge tech, lessons learned, an exciting FLIP, or creative solutions; your voice matters! 🔗 Submit your proposal: https://current.confluent.io/bengaluru 📅 Deadline: December 19, 2024. Hope to hear from you soon!

👏 1

Anirudh

01/25/2025, 3:47 AM

Slightly off topic question, let me know if this belongs in some other channel. I am a newcomer to Flink, and would like to contribute to the code. I found a recently accepted FLIP that I was interested in. What is the best way to try and get involved in the development process?

Hunter

01/31/2025, 7:19 PM

Has anyone ever seen a way to work with Broadcast State that's larger than working memory? Would it be terrible to use RocksDB directly if I'm not too concerned about data redundancy/durability?

Ron Ben Arosh

02/17/2025, 12:18 PM

Hi, is Flink 1.19.2 out? I can find it in the Maven repo, but I can't find the release docs.

Chiara

03/04/2025, 4:46 PM

Hi all 👋! Do you have an open source project that you think could turn into a business? You might like this talk that is taking place tomorrow, Wednesday 5 March. It takes examples from three companies (Percona, DBeaver, and Altinity) that built profitable businesses selling, supporting, and running open source software. Register here: https://altinity.com/events/build-a-great-business-on-open-source-without-selling-your-soul

rmoff

03/06/2025, 10:41 AM

is anyone using Zeppelin and Flink? Love the idea but seems it doesn't support recent versions, e.g. https://github.com/apache/zeppelin/pull/4864

Raghavendra Rao

03/19/2025, 4:40 PM

Hey everyone! We’re evaluating different vendor solutions to run stateful IoT data processing workloads on Flink in our GCP environment. We’re a small team with limited ops experience, so a managed or low-overhead option would be ideal. Does anyone have insights or experiences with the following (or other) solutions?
• Google Dataproc for Flink (and any tips on managing stateful workloads)
• Ververica BYOC (with upcoming GCP support)
• Cloudera Stream Processing
• Confluent Cloud - Flink
How well do these options integrate with GCP services, and how much operational overhead do they typically require? Any guidance on the best fit (preferably a managed service that could integrate with GCP) would be greatly appreciated! Thanks

Jacob Jona Fahlenkamp

03/28/2025, 12:17 PM

Hi, is it a bad idea to have a large accumulator when doing a windowed aggregation with an AggregateFunction? Is the accumulator serialized/deserialized on every event that comes in, or only on checkpoints?

Maciej Tułaza

04/04/2025, 7:01 AM

hey 👋 is the JDBC connector supported for Flink 1.20.1? If not, is this planned? If it is not yet supported, is it ok to use Flink 1.20.1 with flink-connector-jdbc, e.g. 3.2.0-1.19? Are they compatible? Thanks!
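The version string in the question follows the naming used by Flink's externalized connectors: the part before the dash is the connector release, and the suffix names the Flink minor version the artifact was built for. A minimal sketch of reading that suffix is below; the helper names `parse_connector_version` and `targets_flink` are hypothetical, and a mismatch only tells you the artifact was not built for that Flink line, not whether it will fail at runtime.

```python
def parse_connector_version(version):
    """Split a string like '3.2.0-1.19' into (connector release, Flink minor version)."""
    connector, _, flink = version.partition("-")
    if not connector or not flink:
        raise ValueError(f"expected '<connector>-<flink>', got {version!r}")
    return connector, flink


def targets_flink(connector_version, flink_version):
    """True when the connector's suffix matches the major.minor of the Flink version."""
    _, target = parse_connector_version(connector_version)
    return flink_version.split(".")[:2] == target.split(".")[:2]


if __name__ == "__main__":
    # The combination asked about: connector built for 1.19, cluster on 1.20.1.
    print(targets_flink("3.2.0-1.19", "1.20.1"))
    print(targets_flink("3.2.0-1.19", "1.19.2"))
```

A check like this can run in a build script so that a Flink upgrade flags connectors that still target the previous minor version.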

Gert Humphris

04/10/2025, 1:29 AM

Hi everyone, does anyone have examples of managing secrets in Flink SQL? Connectors like Kafka generally require secrets for connecting, and I want to avoid placing them in SQL scripts. For example, can you reference an env var inside a SQL script? Thanks
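One client-side way to approach the question above is to keep the SQL script as a template and fill in secrets from environment variables just before the statement is submitted. The sketch below assumes nothing about Flink SQL expanding variables itself; the `${KAFKA_USER}` and `${KAFKA_PASSWORD}` placeholders are Python `string.Template` syntax, and the table name, topic, and broker address are hypothetical.

```python
import os
from string import Template

# Hypothetical DDL template. The ${...} placeholders are resolved in Python,
# not by Flink, so the secret never has to be written into the script file.
DDL_TEMPLATE = Template("""
CREATE TABLE orders (
    id STRING
) WITH (
    'connector' = 'kafka',
    'topic' = 'orders',
    'properties.bootstrap.servers' = 'broker:9092',
    'properties.security.protocol' = 'SASL_SSL',
    'properties.sasl.mechanism' = 'PLAIN',
    'properties.sasl.jaas.config' = 'org.apache.kafka.common.security.plain.PlainLoginModule required username="${KAFKA_USER}" password="${KAFKA_PASSWORD}";',
    'format' = 'json'
)
""")


def render_sql(template, env=None):
    """Fill the template from a mapping (defaults to the process environment).

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing secret fails loudly instead of producing broken SQL.
    """
    env = os.environ if env is None else env
    return template.substitute(env)


if __name__ == "__main__":
    sql = render_sql(DDL_TEMPLATE, {"KAFKA_USER": "demo", "KAFKA_PASSWORD": "demo-secret"})
    print(sql)
```

In a PyFlink job the rendered string could then be passed to `t_env.execute_sql(...)`. Note that the secret still ends up in the statement text sent to the cluster, so this keeps it out of version control rather than out of the job definition.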

Kaiqi Dong

04/17/2025, 3:16 PM

Hi everyone, I wonder if anyone has bought an early-bird ticket for Flink Forward in Barcelona? My purchase from Ververica went through, but I haven't received any confirmation, invoice, or ticket from them. Is that normal? 🤔

L P V

04/25/2025, 7:31 AM

hi, does anyone know about Arroyo https://www.arroyo.dev/ ? Looks like another stream processing engine

Sandeep Devarapalli

04/25/2025, 1:48 PM

And this is why OLake (Open Source) is fast! Here's something for your weekend read: Exploring OLake's Architecture. If you're diving into real-time data replication or building modern data lakehouse architectures with Apache Iceberg, we've just shared an in-depth look at how OLake actually works behind the scenes. Whether your stack includes MongoDB, PostgreSQL, or MySQL, and you're targeting formats like Apache Iceberg or Parquet, this article has practical insights on designing scalable, efficient data pipelines. OLake is an open-source tool specifically built for high-speed data ingestion.
Key Highlights:
⚡ Speed: Load data 4x to 10x faster compared to traditional ETL tools.
🕒 Real-Time CDC: Minimal-lag Change Data Capture from MongoDB, PostgreSQL, and MySQL.
🧩 Plug-and-Play Architecture: Cleanly separated core, drivers, and writers make extending OLake straightforward.
📊 Schema Flexibility: Seamlessly handles schema evolution and type changes compatible with Apache Iceberg.
🔄 Reliable Syncs: Built-in state management means your sync operations can resume effortlessly if interrupted.
https://olake.io/blog/olake-architecture-deep-dive

George Leonard

04/30/2025, 3:38 PM

hi hi all, has anyone used the Flink/Prometheus connector who can assist? I need the required jar file, and then guidance on how to package data for it using Flink SQL. I've got JSON payloads inbound that I need to reshape into the correct format and send out to the sink connector.

https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/prometheus/

Derya Aydede

05/05/2025, 10:50 PM

there's a lot of talk of AI stuff in Flink, but the only PyTorch or TensorFlow connector I can find is the old Alibaba dl-on-flink repo, which is not maintained / out of date: it hasn't been updated in 3 years and wants you to use a similarly old PyTorch version, Flink version, etc. Flink's own ML library doesn't have these features and also isn't on Flink 2.0. Is there something out there I don't know about, or what?

MohammadReza Shahmorady

05/10/2025, 3:26 PM

Hi everyone, I have a scenario I want to implement in Flink, but I'm not sure Flink has a solution for it. I want to consume a topic that contains rules, join it with our products, change each product's price based on those rules, and return the product. The issue is that when we run Flink for the first time, we can't be sure we have received all the rules before any product arrives; some products may arrive before their rules, and we don't know how to handle that. Is there any way to make sure we have read all the messages in the rules topic before we start processing the product stream and joining it with the rules?
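The ordering problem described above can be pictured as a buffering join: a product that arrives before its rule is parked, and released once the rule shows up. The sketch below is a plain-Python toy model, not Flink code; the class `RuleJoiner`, the `category` join key, and the price multiplier are all hypothetical. In Flink the two dictionaries would roughly correspond to state held by a function over two connected streams. It only handles "product before its rule"; it does not detect that the whole rules topic has been read.

```python
from collections import defaultdict


class RuleJoiner:
    """Toy model of a two-input join that buffers products until a rule exists."""

    def __init__(self):
        self.rules = {}                    # join key -> price multiplier
        self.pending = defaultdict(list)   # join key -> products waiting for a rule

    def on_rule(self, key, multiplier):
        """Store (or update) a rule and release any products that were waiting for it."""
        self.rules[key] = multiplier
        waiting = self.pending.pop(key, [])
        return [self._apply(product, multiplier) for product in waiting]

    def on_product(self, product):
        """Emit the repriced product if its rule is known, otherwise buffer it."""
        multiplier = self.rules.get(product["category"])
        if multiplier is None:
            self.pending[product["category"]].append(product)
            return []
        return [self._apply(product, multiplier)]

    @staticmethod
    def _apply(product, multiplier):
        # Return a copy so the buffered input record is never mutated.
        return {**product, "price": round(product["price"] * multiplier, 2)}


if __name__ == "__main__":
    joiner = RuleJoiner()
    print(joiner.on_product({"id": 1, "category": "books", "price": 10.0}))  # buffered
    print(joiner.on_rule("books", 0.9))                                      # releases id 1
    print(joiner.on_product({"id": 2, "category": "books", "price": 20.0}))  # emitted at once
```

A real job would also need a policy for products whose rule never arrives, for example a timeout after which they are emitted unchanged or sent to a side channel.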

rmoff

05/15/2025, 8:59 AM

Who's going to be at Current London 2025 next week? Would be great to say hi if you're there 🙂

🙋 2

George Leonard

05/17/2025, 10:00 AM

hi hi all. I installed the jupyter packages as per below into my flink container.

RUN echo "--> Install apache-flink && jupyter package" \
    && /usr/bin/pip3 install jupyter

no errors. I then start my cluster and execute 'jupyter notebook', resulting in the output below

docker compose exec jobmanager /bin/bash
flink@jobmanager:~$ jupyter notebook
[I 2025-05-17 09:56:20.988 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2025-05-17 09:56:20.990 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2025-05-17 09:56:20.993 ServerApp] jupyterlab | extension was successfully linked.
[I 2025-05-17 09:56:20.995 ServerApp] notebook | extension was successfully linked.
[I 2025-05-17 09:56:20.996 ServerApp] Writing Jupyter server cookie secret to /opt/flink/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2025-05-17 09:56:21.166 ServerApp] notebook_shim | extension was successfully linked.
[I 2025-05-17 09:56:21.181 ServerApp] notebook_shim | extension was successfully loaded.
[I 2025-05-17 09:56:21.183 ServerApp] jupyter_lsp | extension was successfully loaded.
[I 2025-05-17 09:56:21.184 ServerApp] jupyter_server_terminals | extension was successfully loaded.
[I 2025-05-17 09:56:21.186 LabApp] JupyterLab extension loaded from /usr/local/lib/python3.10/dist-packages/jupyterlab
[I 2025-05-17 09:56:21.186 LabApp] JupyterLab application directory is /usr/local/share/jupyter/lab
[I 2025-05-17 09:56:21.186 LabApp] Extension Manager is 'pypi'.
[I 2025-05-17 09:56:21.243 ServerApp] jupyterlab | extension was successfully loaded.
[I 2025-05-17 09:56:21.246 ServerApp] notebook | extension was successfully loaded.
[I 2025-05-17 09:56:21.247 ServerApp] Serving notebooks from local directory: /opt/flink
[I 2025-05-17 09:56:21.247 ServerApp] Jupyter Server 2.16.0 is running at:
[I 2025-05-17 09:56:21.247 ServerApp] http://localhost:8888/tree?token=df91905d3836954ab07a4b60691c6b6dd367c9bfba57d841
[I 2025-05-17 09:56:21.247 ServerApp]     http://127.0.0.1:8888/tree?token=df91905d3836954ab07a4b60691c6b6dd367c9bfba57d841
[I 2025-05-17 09:56:21.247 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 2025-05-17 09:56:21.250 ServerApp] No web browser found: Error('could not locate runnable browser').
[C 2025-05-17 09:56:21.251 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///opt/flink/.local/share/jupyter/runtime/jpserver-1461-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/tree?token=df91905d3836954ab07a4b60691c6b6dd367c9bfba57d841
        http://127.0.0.1:8888/tree?token=df91905d3836954ab07a4b60691c6b6dd367c9bfba57d841
[I 2025-05-17 09:56:21.262 ServerApp] Skipped non-installed server(s): bash-language-server, dockerfile-language-server-nodejs, javascript-typescript-langserver, jedi-language-server, julia-language-server, pyright, python-language-server, python-lsp-server, r-languageserver, sql-language-server, texlab, typescript-language-server, unified-language-server, vscode-css-languageserver-bin, vscode-html-languageserver-bin, vscode-json-languageserver-bin, yaml-language-server

Issue: when I go to the above URL I can't get to the page. And yes, I have

ports:
  - "8888:8888"

as part of my docker-compose in the jobmanager service. I do have 'EXPOSE 8888' as part of the Dockerfile.

George Leonard

05/17/2025, 12:22 PM

got it working. created the following file with nano ~/.jupyter/jupyter_notebook_config.py

c.NotebookApp.ip = '0.0.0.0'  # Listens on all network interfaces
c.NotebookApp.open_browser = False

George Leonard

05/17/2025, 1:02 PM

able to paste python commands in notebook up to the below.

t_env.execute_sql("""
    CREATE CATALOG fluss_catalog WITH (
        'type'              = 'fluss',
        'bootstrap.servers' = 'coordinator-server:9123'
    )
""")

this fails badly. error stack pretty much says don't know about type = fluss.

umar farooq

05/19/2025, 2:37 PM

Hi everyone, is there a way to deploy batch jobs using the Flink Kubernetes operator?

Sandeep Devarapalli

05/22/2025, 5:15 AM

Hi all, I want to understand this: does Flink CDC use Debezium internally?

André Casimiro

06/03/2025, 2:51 PM

Hi, is there any initiative aiming at running Flink with Java 21 virtual threads? My particular use case is running Flink in a single JVM, not a distributed cluster, and it seems thread waiting/switching is the next bottleneck for me.

Masha A

06/03/2025, 5:37 PM

Your story belongs on stage. The Call for Papers for Current 2025 – New Orleans is officially open! If you’ve wrangled data in production, shipped something cool under pressure, or helped a team unlock real-time Flink magic! ✅ Engineers, architects, OSS contributors — this stage is yours. 📅 CFP Deadline: June 15, 2025 🧠 Office Hours: June 13, 8–9 AM PDT — come get feedback or bounce ideas at #speakers-office-hours No fluff. Just honest talks from people doing the work. 👉 Submit your story through sessionize

Chiara

06/06/2025, 2:56 PM

Hi all, got a cool open-source analytics project? Come speak at OSA Con 2025! We’re looking for devs & data folks building with open source tools. Last year we had 2,000+ devs register for the conference, this year we are hoping for more. Join us :) 👉 Submit your talk: https://sessionize.com/osacon-2025/

Pedro Mázala

06/12/2025, 3:27 PM

Hey there, folks! 👋 I’m on a mission to learn how cool kids weave metadata + policy enforcement into their data stacks. If you’ve ever built or run stuff like:
• A metadata catalog that actually talks to a policy engine
• Data-lineage views that spell out where a rule was applied (or ignored 🥸)
• Any clever tricks that make auditors smile instead of sigh
…I’d love to swap stories. Drop a 👍 below or ping me in DMs if you’re up for a quick chat.

George Leonard

06/15/2025, 1:20 PM

guys, is there a specific channel for connectors?

George Leonard

06/25/2025, 5:31 AM

curious: has anyone ever built an API endpoint "connector" in Flink? I mean an endpoint running as one or more jobs on a Flink cluster, each on its own port, to which data is pushed.