# random
m
I have a use case where I am moving data from source Kafka topics to Postgres tables. I created two Flink jobs: Job1 takes the JOIN of topic A and topic B and sinks it into TABLE1, and Job2 takes the JOIN of topic A and topic C and sinks it into TABLE2. I want to ensure that Job2 only processes and sinks events once Job1 is done processing the incoming events. Is that possible?
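For context, here is a minimal sketch of what a job like Job1 could look like as a Flink SQL job driven from the Java TableEnvironment, assuming the Kafka and JDBC connectors; the topic names, field names, join key, and connection settings are placeholder assumptions, not details from this thread:

```java
// Minimal sketch of JOB1: JOIN of topic A and topic B, sunk into Postgres TABLE1.
// All names and connection settings below are placeholders.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class Job1Sketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Source: topic A (Kafka)
        tEnv.executeSql(
            "CREATE TABLE topic_a (id STRING, a_details STRING) WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'topic-a'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Source: topic B (Kafka)
        tEnv.executeSql(
            "CREATE TABLE topic_b (id STRING, b_details STRING) WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'topic-b'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Sink: Postgres TABLE1 via the JDBC connector
        tEnv.executeSql(
            "CREATE TABLE table1 (" +
            "  id STRING, a_details STRING, b_details STRING," +
            "  PRIMARY KEY (id) NOT ENFORCED" +
            ") WITH (" +
            "  'connector' = 'jdbc'," +
            "  'url' = 'jdbc:postgresql://localhost:5432/mydb'," +
            "  'table-name' = 'table1'," +
            "  'username' = 'user'," +
            "  'password' = 'secret')");

        // Continuously JOIN topic A with topic B and write the result to TABLE1
        tEnv.executeSql(
            "INSERT INTO table1 " +
            "SELECT a.id, a.a_details, b.b_details " +
            "FROM topic_a a JOIN topic_b b ON a.id = b.id");
    }
}
```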
a
Kafka is an unbounded stream of data; when would Job1 be considered finished (done processing) in this case?
m
Both Job1 and Job2 will be running in streaming mode. I want to control that, when events arrive in topic A, they are processed by Job1 first and only later by Job2. For a clearer example:

- Topic 1 ---> applications (application ID, application details)
- Topic 2 ---> students (application ID, student ID, student details)
- Topic 3 ---> payment (application ID, payment details)

JOB1 joins "applications" and "students" and sinks into APPLICATION_STUDENT_TABLE (a_s_id, details). JOB2 joins "applications" and "payment" and sinks into APPLICATION_PAYMENT_DETAILS (a_s_id, payment_details).

Now, I want to control that when new applications arrive in the "applications" topic, they are processed by JOB1 first and only later by JOB2, since the a_s_id produced by JOB1 will be used in JOB2. Something like a data pipeline that has a sequential component.
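One possible way to express that sequential component (this is only a sketch, not something confirmed in this thread): have JOB1 also publish its joined application-student records to an intermediate Kafka topic, and have JOB2 join that intermediate topic with "payment" instead of the raw "applications" topic. Then JOB2 can only see an application after JOB1 has already processed it, and the a_s_id it needs is carried along. The hypothetical topic, field, and connection names below are assumptions:

```java
// Hypothetical sketch: JOB2 consumes the intermediate topic written by JOB1, so an
// application is always processed by JOB1 before JOB2 sees it. Names are placeholders.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SequentialPipelineSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Intermediate topic assumed to be written by JOB1 (applications JOIN students)
        tEnv.executeSql(
            "CREATE TABLE application_student (" +
            "  a_s_id STRING, application_id STRING, details STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'application-student'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Payments topic
        tEnv.executeSql(
            "CREATE TABLE payment (" +
            "  application_id STRING, payment_details STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'payment'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json'," +
            "  'scan.startup.mode' = 'earliest-offset')");

        // Postgres sink APPLICATION_PAYMENT_DETAILS (a_s_id, payment_details)
        tEnv.executeSql(
            "CREATE TABLE application_payment_details (" +
            "  a_s_id STRING, payment_details STRING," +
            "  PRIMARY KEY (a_s_id) NOT ENFORCED" +
            ") WITH (" +
            "  'connector' = 'jdbc'," +
            "  'url' = 'jdbc:postgresql://localhost:5432/mydb'," +
            "  'table-name' = 'application_payment_details'," +
            "  'username' = 'user'," +
            "  'password' = 'secret')");

        // JOB2: only applications already enriched by JOB1 can join with a payment
        tEnv.executeSql(
            "INSERT INTO application_payment_details " +
            "SELECT s.a_s_id, p.payment_details " +
            "FROM application_student s " +
            "JOIN payment p ON s.application_id = p.application_id");
    }
}
```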