Hi Team, i am using datastream to table api bridge...
# random
v
Hi Team, i am using datastream to table api bridge. i amusing group by in query and converting it to datastream. there are three functions are there "toRetractStream", "toDataStream" and "toChangelogStream" which should i use to convert back to data stream? i've tried toChangelogStream and toRetractStream these functions are returning multiple outputs and toDataStream is not working on operation like count.
@Jane Chan can you please help me with that?
j
You can take a reference to the Javadoc: https://nightlies.apache.org/flink/flink-docs-master/api/java/. The reason why
toDataStream
not work is that
Since the DataStream API does not support changelog processing natively, this method assumes append-only/insert-only semantics during the table-to-stream conversion.
While the group-by-and-count operation under streaming mode itself does produce retractions.
v
can we do something for deduplication ?
j
Not sure I understand your issue correctly, but if your requirement is to group by some key and count the value under streaming processing, it's necessary for retraction to happen to ensure the final result is correct. I suggest you read the doc of Dynamic Table. It's by design that "i've tried toChangelogStream and toRetractStream these functions are returning multiple outputs" because the outputs contain (-U, +U) to refine the previously computed result. If you don't want to receive -U, you can use the overloaded
toChangelogStream
to specify the
ChangelogMode
to be upsert.