# random
v
Hi Team, I am trying to use both the DataStream and Table APIs: I convert a DataStream to a table and then back to a DataStream. This is a real-time stream, so data will be continuously inserted into the table. Will this cause a memory leak problem? https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/data_stream_api/
j
There shouldn't be a memory leak if you are handling the conversion properly. Have you encountered any specific problems?
v
Thanks for the response @Jane Chan. I haven't encountered any issues yet, but we are only testing at a local level for now; once we go to prod it might cause problems. I don't know how Flink stores tables. I have about 5M records per day in the pipeline, and the table will continuously store this data, so won't it cause a memory issue eventually?
Because it will build up when we re-deploy without a checkpoint.
Can we apply a window to the table?
j
> i dont know how flink is storing tables?
Flink is primarily a compute engine and does not store data itself. The actual storage for tables is typically managed by connectors to external systems integrated with Flink, such as Apache Kafka, Apache Cassandra, or other databases. For stateful computation, Flink uses an internal state backend to store intermediate results. The state backend can be configured to store state in various ways; see https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/
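If state size on the JVM heap is the worry, one common option is to switch the state backend to RocksDB so state spills to local disk. A minimal `flink-conf.yaml` sketch (the checkpoint path is a placeholder, and the `state.backend.type` key assumes a recent Flink version; older releases use `state.backend` instead):

```yaml
# Keep state in embedded RocksDB on local disk rather than entirely on the heap.
state.backend.type: rocksdb
# Durable location for checkpoints (placeholder path).
state.checkpoints.dir: file:///tmp/flink-checkpoints
# Only upload changed files at each checkpoint instead of the full state.
state.backend.incremental: true
```

With checkpoints enabled, state also survives a re-deploy that restores from the latest checkpoint or savepoint.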
v
Yeah, I get that Flink stores data through the state backend, but in the end this state backend will consume a lot of memory, I think, because we previously implemented a process function with state values and it caused a memory issue.
For how long will it store the state values for a table? I am processing 1 million records of 10KB each within 6 hours. Is there any way to define/change the state retention time for tables so that it behaves like a sliding window?
j
For window-based stateful operations, Flink uses watermarks to keep the state relatively small. You can take a look at https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#state-management
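As a rough sketch of both points raised above: idle-state retention can be bounded with the `table.exec.state.ttl` option, and a sliding window can be expressed with the HOP windowing TVF in Flink SQL. The table name `events`, its time attribute `event_time`, and the 6-hour figure below are assumptions based on the numbers in this thread, not part of any real schema:

```sql
-- Expire operator state that has been idle for 6 hours (illustrative value).
SET 'table.exec.state.ttl' = '6 h';

-- Sliding (HOP) window: 6-hour windows advancing every 1 hour.
-- `events` and its `event_time` time attribute are hypothetical.
SELECT window_start, window_end, COUNT(*) AS cnt
FROM TABLE(
  HOP(TABLE events, DESCRIPTOR(event_time), INTERVAL '1' HOUR, INTERVAL '6' HOUR))
GROUP BY window_start, window_end;
```

With windowed aggregation, the watermark lets Flink drop a window's state once the window closes, so state stays bounded even on an unbounded stream.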