Danielle Murdock
02/10/2025, 2:48 PM
05:02:05.134
Starting pod sweeper cycle:
05:02:05.137
Will sweep succeeded pods from before 2025-02-10T04:52:05,135555682+00:00
05:02:05.140
Will sweep unsuccessful pods from before 2025-02-10T03:02:05,138479147+00:00
05:02:05.150
10.50.26.30 - - [10/Feb/2025:05:02:05 +0000] "GET /index.html HTTP/1.1" 200 25279 "-" "kube-probe/1.30+" "-"
05:02:05.232
Completed pod sweeper cycle. Sleeping for 60 seconds...
05:02:06.495
2025-02-10 05:02:06,495 [scheduled-executor-thread-1] INFO i.a.c.j.WorkloadMonitor(cancelRunningForTooLongNonSyncWorkloads):137 - Checking for workloads running for too long with timeout value PT4H
05:02:06.495
2025-02-10 05:02:06,495 [scheduled-executor-thread-13] INFO i.a.c.j.WorkloadMonitor(cancelRunningForTooLongSyncWorkloads):157 - Checking for sync workloads running for too long with timeout value PT720H
05:02:06.495
2025-02-10 05:02:06,495 [scheduled-executor-thread-9] INFO i.a.c.j.WorkloadMonitor(cancelNotHeartbeatingWorkloads):110 - Checking for non heartbeating workloads.
05:02:06.495
2025-02-10 05:02:06,495 [scheduled-executor-thread-16] INFO i.a.c.j.WorkloadMonitor(cancelNotClaimedWorkloads):84 - Checking for not claimed workloads.
05:02:06.496
2025-02-10 05:02:06,495 [scheduled-executor-thread-8] INFO i.a.c.j.WorkloadMonitor(cancelNotStartedWorkloads):59 - Checking for not started workloads.
kapa.ai
02/10/2025, 2:49 PM
Danielle Murdock
02/10/2025, 2:52 PM
kapa.ai
02/10/2025, 2:52 PM
kapa.ai
02/10/2025, 2:53 PM
`INIT_RETRY_TIMEOUT_MINUTES`
if this is the case.
2. Resource constraints: If your Kubernetes cluster is under-resourced, pod creation and initialization can be delayed, which could lead to timeouts. Ensure that your cluster has sufficient resources to handle the workload.
3. Pod sweeper frequency: While the pod sweeper shouldn't prevent jobs from starting, if it's running too frequently it might be consuming resources needed for job execution. You mentioned that the sweeping cycles are running continuously, which is not the intended behavior. The pod sweeper is typically configured to run every 2 hours, as mentioned in [GitHub issue #38187](https://github.com/airbytehq/airbyte/issues/38187).
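Before changing anything, it can help to confirm whether the cluster is actually under resource pressure and why job pods are stuck. Below is a minimal sketch using standard kubectl commands; it assumes Airbyte runs in the `airbyte` namespace and that metrics-server is available for `kubectl top`, so adjust names to your install:

```bash
# Node-level capacity and pressure (requires metrics-server).
kubectl top nodes

# Look for job pods stuck in Pending / Init / CrashLoopBackOff.
kubectl get pods -n airbyte

# Recent events often reveal scheduling or image-pull delays behind init timeouts.
kubectl get events -n airbyte --sort-by=.lastTimestamp | tail -n 30

# Describe a stuck job pod to see which init step it is waiting on.
kubectl describe pod <stuck-job-pod-name> -n airbyte
```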
To address these issues, you could try the following:
1. Check your Airbyte configuration, particularly the pod sweeper settings. Ensure it's not set to run too frequently.
2. Increase `INIT_RETRY_TIMEOUT_MINUTES`, as suggested in [GitHub issue #22907](https://github.com/airbytehq/airbyte/issues/22907).
3. Review your Kubernetes cluster resources and scaling settings to ensure they can handle the Airbyte workload.
4. Check the full logs of a failing sync job to see if there are more specific error messages that could point to the root cause.

If these steps don't resolve the issue, you may want to open a new issue on the [Airbyte GitHub repository](https://github.com/airbytehq/airbyte) with detailed logs and your configuration settings. The continuous running of pod sweeping cycles and the failure of sync jobs to start is not normal behavior and may require further investigation by the Airbyte team.
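As a hedged illustration of steps 2 and 4 above: the snippet below assumes a Helm install named `airbyte` in the `airbyte` namespace and that your chart version exposes a `worker.extraEnv` list for extra environment variables; treat those names as assumptions and verify them against your values.yaml before applying.

```bash
# Raise the init retry timeout on the worker (worker.extraEnv is an assumed values key).
helm upgrade airbyte airbyte/airbyte -n airbyte --reuse-values \
  --set-string 'worker.extraEnv[0].name=INIT_RETRY_TIMEOUT_MINUTES' \
  --set-string 'worker.extraEnv[0].value=10'

# Grab the full logs of a failing sync attempt (pod names vary per job).
kubectl get pods -n airbyte | grep -iE 'replication|sync'
kubectl logs <failing-sync-pod-name> -n airbyte --all-containers --tail=500
```

If your chart version doesn't have `worker.extraEnv`, set the same variable through whichever env mechanism your values.yaml exposes for the worker.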