# random
m
I have a use case where I am moving data from source Kafka topics to Postgres tables. I created two Flink jobs: Job1 takes the JOIN of topic A and topic B and sinks it into TABLE1, and Job2 takes the JOIN of topic A and topic C and sinks it into TABLE2. I want to ensure that Job2 only processes and sinks events once Job1 is done processing the incoming events. Is that possible?
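For context, here is a minimal sketch of what a job like Job1 could look like as a Flink SQL job driven from the Java TableEnvironment, assuming the Kafka and JDBC connectors; the topic names, field names, join key, and connection settings are placeholder assumptions, not details from this thread:

```java
// Minimal sketch of JOB1: JOIN of topic A and topic B, sunk into Postgres TABLE1.
// All names and connection settings below are placeholders.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class Job1Sketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Source: topic A (Kafka)
        tEnv.executeSql(
            "CREATE TABLE topic_a (id STRING, a_details STRING) WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'topic-a'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Source: topic B (Kafka)
        tEnv.executeSql(
            "CREATE TABLE topic_b (id STRING, b_details STRING) WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'topic-b'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Sink: Postgres TABLE1 via the JDBC connector
        tEnv.executeSql(
            "CREATE TABLE table1 (" +
            "  id STRING, a_details STRING, b_details STRING," +
            "  PRIMARY KEY (id) NOT ENFORCED" +
            ") WITH (" +
            "  'connector' = 'jdbc'," +
            "  'url' = 'jdbc:postgresql://localhost:5432/mydb'," +
            "  'table-name' = 'table1'," +
            "  'username' = 'user'," +
            "  'password' = 'secret')");

        // Continuously JOIN topic A with topic B and write the result to TABLE1
        tEnv.executeSql(
            "INSERT INTO table1 " +
            "SELECT a.id, a.a_details, b.b_details " +
            "FROM topic_a a JOIN topic_b b ON a.id = b.id");
    }
}
```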
a
Kafka is an unbounded stream of data; when would Job1 be considered finished (done processing) in this case?
m
Both Job1 and Job2 will be running in streaming mode. I want to control that, when events arrive in topic A, they are processed by Job1 first and only later by Job2. For a clearer example:

- Topic 1 ---> applications (application ID, application details)
- Topic 2 ---> students (application ID, student ID, student details)
- Topic 3 ---> payment (application ID, payment details)

JOB1 joins "applications" and "students" and sinks into APPLICATION_STUDENT_TABLE (a_s_id, details). JOB2 joins "applications" and "payment" and sinks into APPLICATION_PAYMENT_DETAILS (a_s_id, payment_details).

Now, I want to control that when new applications arrive in the "applications" topic, they are processed by JOB1 first and only later by JOB2, since the a_s_id produced by JOB1 will be used in JOB2. Something like a data pipeline that has a sequential component.
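One possible way to express that sequential component (this is only a sketch, not something confirmed in this thread): have JOB1 also publish its joined application-student records to an intermediate Kafka topic, and have JOB2 join that intermediate topic with "payment" instead of the raw "applications" topic. Then JOB2 can only see an application after JOB1 has already processed it, and the a_s_id it needs is carried along. The hypothetical topic, field, and connection names below are assumptions:

```java
// Hypothetical sketch: JOB2 consumes the intermediate topic written by JOB1, so an
// application is always processed by JOB1 before JOB2 sees it. Names are placeholders.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SequentialPipelineSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Intermediate topic assumed to be written by JOB1 (applications JOIN students)
        tEnv.executeSql(
            "CREATE TABLE application_student (" +
            "  a_s_id STRING, application_id STRING, details STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'application-student'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Payments topic
        tEnv.executeSql(
            "CREATE TABLE payment (" +
            "  application_id STRING, payment_details STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'payment'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Postgres sink APPLICATION_PAYMENT_DETAILS (a_s_id, payment_details)
        tEnv.executeSql(
            "CREATE TABLE application_payment_details (" +
            "  a_s_id STRING, payment_details STRING," +
            "  PRIMARY KEY (a_s_id) NOT ENFORCED" +
            ") WITH (" +
            "  'connector' = 'jdbc'," +
            "  'url' = 'jdbc:postgresql://localhost:5432/mydb'," +
            "  'table-name' = 'application_payment_details'," +
            "  'username' = 'user'," +
            "  'password' = 'secret')");

        // JOB2: only applications already enriched by JOB1 can join with a payment
        tEnv.executeSql(
            "INSERT INTO application_payment_details " +
            "SELECT s.a_s_id, p.payment_details " +
            "FROM application_student s " +
            "JOIN payment p ON s.application_id = p.application_id");
    }
}
```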