# ask-community-for-troubleshooting
My source-lever-hiring connector is reading data at about 1000 records every 5 minutes. Is there any way to speed this up?
I am running this setup on k8s. I tried increasing the worker replicas from 1 to 3, and the CPU and memory limits are around 3 GB. But I still see 1000 records being read every 5 minutes. The destination is Redshift, using the COPY strategy via S3.
Has anyone used this lever-hiring connector before? I'm not really able to find the bottleneck. Same result on my local Docker setup as well.
@Saj Dider (Airbyte) do you know someone who can help here?
2022-11-06 10:45:27 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 61000 (42 MB)
2022-11-06 11:06:43 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 62000 (43 MB)
2022-11-06 11:34:24 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 63000 (44 MB)
2022-11-06 11:56:04 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 64000 (44 MB)
2022-11-06 12:18:02 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 65000 (45 MB)
2022-11-06 12:43:54 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 66000 (46 MB)
2022-11-06 12:53:18 INFO i.a.w.g.DefaultReplicationWorker(lambda$getReplicationRunnable$6):354 - Records read: 67000 (47 MB)
These are some ridiculous timestamps…
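To put a number on it, the effective throughput can be computed from any two "Records read" lines. A minimal sketch in Python, assuming the timestamp and record-count layout shown in the excerpt above (the two sample lines are copied from that log):

```python
from datetime import datetime

# Two "Records read" samples taken from the log excerpt above
# (assumed layout: "YYYY-MM-DD HH:MM:SS ... Records read: N").
samples = [("2022-11-06 10:45:27", 61000),
           ("2022-11-06 12:53:18", 67000)]

fmt = "%Y-%m-%d %H:%M:%S"
(t0, n0), (t1, n1) = samples
elapsed = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
rate = (n1 - n0) / elapsed  # records per second

print(f"{n1 - n0} records in {elapsed / 60:.0f} min "
      f"-> {rate:.2f} records/s (~{rate * 300:.0f} records per 5 min)")
```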
Hey there, did you check out the Scaling Airbyte document? If you attach the full sync log I can look into any bottlenecks.
Untitled.txt
@Sunny Hashmi (Airbyte) this is the log ^^
I have read the scaling doc, and based on it I increased the worker replicas and the memory/CPU for each worker.
Hi folks, I am still trying to debug this issue, any help would be appreciated :)
Hi Team, I’m running Airbyte OSS on EKS and have the exact same problem as @Shivam Kapoor with the Lever hiring connector. The worker pod takes almost 3 minutes to pull 1000 records. See logs below:
2023-03-21 20:07:06 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 1000 (1 MB)
2023-03-21 20:09:57 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 2000 (3 MB)
2023-03-21 20:12:48 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 3000 (5 MB)
2023-03-21 20:15:39 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 4000 (7 MB)
2023-03-21 20:18:34 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 5000 (9 MB)
2023-03-21 20:21:30 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 6000 (11 MB)
Just to make sure it wasn’t a scaling issue, I scaled my node group up to 3 nodes and tried syncing again. Even with the Lever worker pod being the only pod running on an m5.large instance, the issue was still there. I think I identified the root cause. I caught this in the logs:
2023-03-22 01:19:40 INFO i.a.w.g.DefaultReplicationWorker(lambda$readFromSrcAndWriteToDstRunnable$7):385 - Records read: 81000 (86 MB)
2023-03-22 01:20:47 source > Backing off _send(...) for 5.0s (airbyte_cdk.sources.streams.http.exceptions.DefaultBackoffException: Request URL: <https://api.lever.co/v1/opportunities/783f1562-1f8a-4d98-96bc-d09a1ef960b3/applications?limit=50>, Response Code: 500, Response Text: Internal Server Error)
2023-03-22 01:20:47 source > Caught retryable error 'Request URL: <https://api.lever.co/v1/opportunities/783f1562-1f8a-4d98-96bc-d09a1ef960b3/applications?limit=50>, Response Code: 500, Response Text: Internal Server Error' after 1 tries. Waiting 5 seconds then retrying...
The URL has limit=50 in it, so it seems to be pulling records 50 at a time. Pulling 1000 records therefore takes 20 HTTP requests. 3 min = 180 s, so one call takes roughly 180/20 = 9 s. That’s still pretty bad performance… Did anyone have a similar issue with any source connector?
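One way to check whether those ~9 s are spent in the Lever API itself rather than in the Airbyte worker is to time a single page fetch outside Airbyte. A rough sketch, reusing the opportunities endpoint and limit=50 page size seen in the log above; the API key is a placeholder, and the basic-auth scheme and "data" wrapper are assumptions about Lever's REST API to verify against your own setup:

```python
import time
import requests

# Placeholder credentials; Lever's API is assumed to use HTTP basic auth
# with the API key as the username and an empty password.
API_KEY = "YOUR_LEVER_API_KEY"
URL = "https://api.lever.co/v1/opportunities"  # base endpoint seen in the sync log

start = time.monotonic()
resp = requests.get(URL, params={"limit": 50}, auth=(API_KEY, ""), timeout=60)
elapsed = time.monotonic() - start
resp.raise_for_status()

# Assumes the response wraps results in a "data" array.
records = resp.json().get("data", [])
print(f"Fetched {len(records)} records in {elapsed:.1f}s "
      f"(~{len(records) / max(elapsed, 1e-9):.1f} records/s for one page)")
```

If a bare request like this also takes several seconds, the bottleneck is the API latency and the 50-record page size rather than Airbyte's CPU/memory or replica count.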