Hi team! I am using Flink TableAPI to build live f...
# troubleshooting
e
Hi team! I am using Flink TableAPI to build live feature. The feature consists of hopping window to calculate last order state and then grouping on top of windows. Roughly it looks like this:
Copy code
with group_window_get_last_state AS (
        SELECT
        COUNTRY_CODE, KEY_ORDER_ID, CITY_ID, 
        last_value_by_ts(DELIVERY_STATUS, `TIMESTAMP`) as last_status,
        window_start, window_end
        FROM TABLE(HOP(TABLE delivery_events, DESCRIPTOR(`TIMESTAMP`), INTERVAL '5' MINUTE, INTERVAL '5' HOURS))
        group by COUNTRY_CODE, KEY_ORDER_ID, CITY_ID,  window_start, window_end
    )
    SELECT 
    COUNTRY_CODE,
    CITY_ID,
    WINDOW_END,
    COUNT(*) as total_cnt, 
    COUNT(1) FILTER (WHERE last_status='pending') AS pending_cnt,
    COUNT(1) FILTER (WHERE last_status='scheduled') AS scheduled_cnt,
    COUNT(1) FILTER (WHERE last_status='queued') AS queued_cnt,
    COUNT(1) FILTER (WHERE last_status='courier_notified') AS courier_notified_cnt,
    COUNT(1) FILTER (WHERE last_status='accepted') AS accepted_cnt,
    COUNT(1) FILTER (WHERE last_status='near_pickup') AS near_pickup_cnt,
    COUNT(1) FILTER (WHERE last_status='picked_up') AS picked_up_cnt,
    COUNT(1) FILTER (WHERE last_status='left_pickup') AS left_pickup_cnt,
    COUNT(1) FILTER (WHERE last_status='near_dropoff') AS near_dropoff_cnt,
    COUNT(1) FILTER (WHERE last_status='completed') AS completed_cnt,
  
    FROM group_window_get_last_state
    GROUP BY  COUNTRY_CODE, CITY_ID window_end
So problem is that this query produces upsert stream instead of append-only stream due to group aggregation in the end. My question: is there a way to transform this to append-only stream? What is a general rule to write feature so that it will be append-only? Problem with upsert stream for us is that it generate too many writes to kafka which we don’t need.