# troubleshooting
Flink SQL join performance. For testing, I created two tables, sampledata and sampledata2, both with the same schema:
create temporary table sampledata (
  id INT primary key not enforced,
  my_value STRING,
  updated_at TIMESTAMP
) with (
  'connector' = 'filesystem',
  'path' = 'file:///home/sharath/data2/sample_data.csv',
  'format' = 'csv'
);
Each table contains the same data: 1 million rows of 2 KB each. I am trying to measure join speed with the following query:
create table sampledatasink as
select s1.id, s2.my_value, s1.updated_at from sampledata s1 join sampledata2 s2 on (s1.id = s2.id);
Some more details about my setup:
• single-node deployment: 2 cores, 8 GB RAM, 40 GB SSD
• RocksDB backend, incremental checkpoint mode
• checkpointing interval set to 10 mins and MinPauseBetweenCheckpoints = 10 mins
• mode = streaming

The join speed is very slow (on the order of 100 rows per second). I read some blog posts from Ververica which seemed to indicate join speeds of ~89,000 rows per second. Please let me know if I am doing anything wrong. What speeds can I expect here (ballpark)?
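
For reference, the setup listed above corresponds roughly to the following SQL client settings. This is only a sketch. The option keys are the standard Flink configuration names as I understand them, but they are an assumption about the Flink version in use (newer releases use `state.backend.type` in place of `state.backend`), and the same values may have been set in the cluster configuration file rather than per session.

```sql
-- Sketch of the setup described above, expressed as SQL client settings.
-- Assumption: option keys match the Flink version in use.

-- mode = streaming
SET 'execution.runtime-mode' = 'streaming';

-- RocksDB state backend with incremental checkpoints
SET 'state.backend' = 'rocksdb';
SET 'state.backend.incremental' = 'true';

-- checkpoint every 10 minutes, with at least 10 minutes between checkpoints
SET 'execution.checkpointing.interval' = '10 min';
SET 'execution.checkpointing.min-pause' = '10 min';
```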