Axel Waserman
03/22/2023, 7:13 AM
2023-03-21 20:07:06 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 1000 (1 MB)
2023-03-21 20:09:57 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 2000 (3 MB)
2023-03-21 20:12:48 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 3000 (5 MB)
2023-03-21 20:15:39 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 4000 (7 MB)
2023-03-21 20:18:34 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 5000 (9 MB)
2023-03-21 20:21:30 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 6000 (11 MB)
Just to make sure it wasn’t a scaling issue, I scaled up my nodegroup to 3 nodes and tried syncing again.
Even with the Lever worker pod being the only pod running on an m5.large instance, the issue was still there.
I think I identified the root cause. I caught this in the logs:
2023-03-22 01:19:40 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 81000 (86 MB)
2023-03-22 01:20:47 source > Backing off _send(...) for 5.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: https://api.lever.co/v1/opportunities/783f1562-1f8a-4d98-96bc-d09a1ef960b3/applications?limit=50, Response Code: 500, Response Text: Internal Server Error)
2023-03-22 01:20:47 source > Caught retryable error 'Request URL: https://api.lever.co/v1/opportunities/783f1562-1f8a-4d98-96bc-d09a1ef960b3/applications?limit=50, Response Code: 500, Response Text: Internal Server Error' after 1 tries. Waiting 5 seconds then retrying...
The URL has limit=50 in it, so the connector seems to be pulling records 50 at a time. Therefore, pulling 1,000 records requires sending 20 HTTP requests.
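To make the pagination concrete, here is a rough plain-requests sketch of what such a loop looks like. This is not the actual connector code; the hasNext/next cursor fields and the basic-auth style are my assumptions about the Lever API, not something taken from the logs.

import time
import requests

LIMIT = 50  # page size seen in the request URLs above

def get_with_retry(url, params, auth, max_retries=5):
    # GET with exponential backoff on HTTP 500, similar to the
    # "Backing off _send(...)" behaviour in the source logs.
    for attempt in range(max_retries):
        resp = requests.get(url, params=params, auth=auth)
        if resp.status_code != 500:
            resp.raise_for_status()
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"Still getting 500s after {max_retries} attempts: {url}")

def fetch_applications(opportunity_id, api_key):
    # Pull the applications for one opportunity, 50 records per request,
    # so 1,000 records costs 20 round trips.
    url = f"https://api.lever.co/v1/opportunities/{opportunity_id}/applications"
    params = {"limit": LIMIT}
    while True:
        page = get_with_retry(url, params, auth=(api_key, "")).json()
        yield from page.get("data", [])
        if not page.get("hasNext"):
            break
        params["offset"] = page["next"]  # cursor for the next page of 50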
3 min = 180 s, so one call takes 180/20 = 9 s on average.
That’s still pretty bad performance …
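Sanity-checking those numbers against the log lines above (the 81,000 figure is the last "Records read" count before the 500s started):

# Back-of-the-envelope check of the numbers above.
RECORDS_PER_REQUEST = 50       # limit=50 in the request URL
RECORDS_PER_LOG_LINE = 1000    # the "Records read" counter increments by 1000
SECONDS_PER_LOG_LINE = 3 * 60  # roughly 3 minutes between those log lines

requests_per_log_line = RECORDS_PER_LOG_LINE // RECORDS_PER_REQUEST  # 20
seconds_per_request = SECONDS_PER_LOG_LINE / requests_per_log_line   # 9.0

records_so_far = 81_000  # last count before the 500 errors appeared
hours_so_far = records_so_far / RECORDS_PER_REQUEST * seconds_per_request / 3600
print(seconds_per_request, hours_so_far)  # 9.0 s per call, 4.05 hours of syncing so far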
Has anyone had a similar issue with any source connector?