# troubleshooting
i
Hi team, I have a question I couldn't find an answer to. Assume we have a Flink SQL job in streaming mode with a table defined on top of Kafka, and we then use this table multiple times in the query. For instance:
SELECT * FROM myKafkaStream WHERE col1 = 'a'
UNION ALL
SELECT * FROM myKafkaStream WHERE col1 = 'b'
Is this valid usage? I was under the impression that it's not valid because it would effectively move the Kafka pointer twice. However, I tried it recently and it seems to work, so I'm a bit confused.
i
That doesn't answer the question: all the examples use different tables, not the same one. And UNION ALL is just an example. The question is about using the same stream multiple times in subqueries of the same query.
I mean, I can create multiple tables on top of Kafka, of course (this is what we're doing now). The question is what happens when we use the same table.
m
If you run an EXPLAIN PLAN you will see the generated query plan. My expectation is that it will include both filters and then union these results.
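For the query from the question, that could look roughly like the sketch below. It assumes the myKafkaStream table is already defined in the session, and it uses single-quoted string literals, as Flink SQL expects.

```sql
-- Sketch only: assumes myKafkaStream has already been created
-- with a Kafka connector in the current session.
EXPLAIN PLAN FOR
SELECT * FROM myKafkaStream WHERE col1 = 'a'
UNION ALL
SELECT * FROM myKafkaStream WHERE col1 = 'b';
```

The thing to look for in the output is how many times the source scan on myKafkaStream appears, and whether both filters read from the same scan.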
i
Hmmm, interesting. So this is the snippet of the query plan. These represent the 2 SELECTs from the same Kafka table: the first one is TableSourceScan, but the second one is Reused(reference_id=[1]). So it looks like it scans the stream once but passes the events to both subqueries independently, which means referencing the same table in the subqueries is correct. Is that the right understanding, @Martijn Visser? I don't remember why, but I have a strong memory that it didn't work before, and that's why we ended up creating multiple table definitions on top of the same source (and, correspondingly, using multiple independent consumer groups). Could it be that this behavior was fixed or changed in recent Flink versions?