Another question about r2o task. I found count(*) ...
# troubleshooting
a
Another question about the r2o task. When the r2o task is enabled and "segmentPushFrequency" is set to "HOURLY", count(*) is much larger than the actual number of rows in my table. For example, I ran the following SQL twice: the first run was before the offline segment was generated and the second run was after it was generated. Where does the difference come from? select count(*), count(request_id), distinctcount(request_id) from table_test where "timestamp" >= FromDateTime('2022-06-29T141500', 'yyyy-MM-dd''T''HHmmss') and "timestamp" < FromDateTime('2022-06-29T142300', 'yyyy-MM-dd''T''HHmmss')
m
Do you have upsert enabled? If so, that could explain the additional counts, since the offline table does not support upsert and sees those records as additional rows.
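To make the upsert hypothesis concrete, here is a minimal toy sketch in plain Python. It is not Pinot code: the row layout, the helper names (`upsert_view`, `append_only_view`, `count_star_and_distinct`), and the sample data are all hypothetical. It only illustrates how an append-only copy of the same records would raise count(*) while leaving distinctcount(request_id) unchanged.

```python
from typing import Dict, List, Tuple

# Hypothetical row shape for illustration: (request_id, value).
Row = Tuple[str, int]


def upsert_view(rows: List[Row]) -> List[Row]:
    """Keep only the latest row per request_id, as an upsert-enabled table would."""
    latest: Dict[str, Row] = {}
    for request_id, value in rows:
        latest[request_id] = (request_id, value)  # later rows overwrite earlier ones
    return list(latest.values())


def append_only_view(rows: List[Row]) -> List[Row]:
    """Keep every ingested row, as a table without upsert would."""
    return list(rows)


def count_star_and_distinct(rows: List[Row]) -> Tuple[int, int]:
    """Return (count(*), distinctcount(request_id)) for the given rows."""
    return len(rows), len({request_id for request_id, _ in rows})


# Sample data (made up): req-1 is ingested twice with different values.
ingested: List[Row] = [("req-1", 10), ("req-2", 20), ("req-1", 11)]

realtime_counts = count_star_and_distinct(upsert_view(ingested))
offline_counts = count_star_and_distinct(append_only_view(ingested))

print("with upsert:   ", realtime_counts)
print("without upsert:", offline_counts)
```

If count(*) grows after the offline segment is generated while distinctcount(request_id) stays the same, that pattern would be consistent with the same records being counted more than once.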
a
No upsert in the realtime table config, and dedup is enabled in the r2o task.