Samikshan Bairagya
09/13/2022, 2:50 PM
[...] (2h 12m 37s) to complete the data sync. The Slack source was configured with Join all channels set to true and one channel name in the Channel name filter list.
From the logs we could see that it took around 1 hour for the Syncing stream: channel_messages step to be completed:
...
2022-09-13 10:03:40 source > Syncing stream: channel_messages
2022-09-13 11:04:52 destination > 2022-09-13 11:04:52 INFO i.a.i.d.r.SerializedBufferingStrategy(lambda$addRecord$0):55 - Starting a new buffer for stream channel_members (current state: 0 bytes in 0 buffers)
2022-09-13 11:04:52 destination > 2022-09-13 11:04:52 INFO i.a.i.d.r.SerializedBufferingStrategy(lambda$addRecord$0):55 - Starting a new buffer for stream channel_messages (current state: 0 bytes in 1 buffers)
2022-09-13 11:05:06 source > Read 66 records from channel_messages stream
2022-09-13 11:05:06 source > Finished syncing channel_messages
2022-09-13 11:05:06 source > SourceSlack runtimes:
...
After this, it took a further hour for the Syncing stream: threads step to error out (503 Service Unavailable), before retrying and completing the threads stream in a total of about 1h 10m. You can see a portion of the logs here:
...
2022-09-13 11:05:06 source > Finished syncing channels
2022-09-13 11:05:06 source > SourceSlack runtimes:
Syncing stream channel_members 0:00:00.564468
Syncing stream channel_messages 1:01:25.944632
Syncing stream channels 0:00:00.049431
2022-09-13 11:05:06 source > Syncing stream: threads
2022-09-13 11:05:06 source > Syncing replies {'channel': <channel_id>, 'oldest': 341884800.0, 'latest': 341971200.0}
2022-09-13 11:05:06 source > Syncing replies {'channel': <channel_id>, 'oldest': 341971200.0, 'latest': 342057600.0}
2022-09-13 11:05:07 source > Syncing replies {'channel': <channel_id>, 'oldest': 342057600.0, 'latest': 342144000.0}
...
<logs snipped>
...
2022-09-13 12:05:41 source > Syncing replies {'channel': <channel_id>, 'oldest': 1480291200.0, 'latest': 1480377600.0}
2022-09-13 12:05:41 source > Syncing replies {'channel': <channel_id>, 'oldest': 1480377600.0, 'latest': 1480464000.0}
2022-09-13 12:05:42 source > Syncing replies {'channel': <channel_id>, 'oldest': 1480464000.0, 'latest': 1480550400.0}
2022-09-13 12:05:44 source > Retry-after header not found. Using default backoff value
2022-09-13 12:05:44 source > Backing off _send(...) for 0.0s (airbyte_cdk.sources.streams.http.exceptions.UserDefinedBackoffException: Request URL: https://slack.com/api/conversations.history?limit=100&channel=<channel_id>&oldest=1480464000.0&latest=1480550400.0, Response Code: 503, Response Text: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Unavailable</title>
</head><body>
<h1>Service Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
</body></html>)
2022-09-13 12:05:50 source > Retrying. Sleeping for 5 seconds
2022-09-13 12:05:50 source > Syncing replies {'channel': <channel_id>, 'oldest': 1480550400.0, 'latest': 1480636800.0}
2022-09-13 12:05:50 source > Syncing replies {'channel': <channel_id>, 'oldest': 1480636800.0, 'latest': 1480723200.0}
...
<snipped>
...
2022-09-13 12:15:56 source > Syncing replies {'channel': <channel_id>, 'oldest': 1662940800.0, 'latest': 1663027200.0}
2022-09-13 12:15:56 source > Syncing replies {'channel': <channel_id>, 'oldest': 1663027200.0, 'latest': 1663113600.0}
2022-09-13 12:15:56 source > Read 71 records from threads stream
2022-09-13 12:15:56 source > Finished syncing threads
2022-09-13 12:15:56 source > SourceSlack runtimes:
Syncing stream channel_members 0:00:00.564468
Syncing stream channel_messages 1:01:25.944632
Syncing stream channels 0:00:00.049431
Syncing stream threads 1:10:50.295063
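As a rough check on the logs above, the sketch below counts how many one-day windows the threads stream appears to walk through. The timestamps and the runtime are copied from the log lines. The helper names (count_windows, to_utc_date) are only illustrative, and the idea that each "Syncing replies" window corresponds to at least one API request is an assumption based on the single failed request URL shown in the log, not something confirmed from the connector code.

```python
from datetime import datetime, timezone

# Values copied from the "Syncing replies" log lines above.
FIRST_OLDEST = 341884800.0   # 'oldest' of the first logged window
LAST_LATEST = 1663113600.0   # 'latest' of the last logged window
WINDOW_SECONDS = 86400       # each logged window spans 'latest' - 'oldest' = 1 day

# Runtime reported for the threads stream: 1:10:50.295063
THREADS_RUNTIME_SECONDS = 1 * 3600 + 10 * 60 + 50.295063


def count_windows(oldest: float, latest: float, step: int = WINDOW_SECONDS) -> int:
    """Number of consecutive fixed-size windows needed to cover [oldest, latest)."""
    return int((latest - oldest) // step)


def to_utc_date(ts: float) -> str:
    """Format a Unix timestamp as a UTC calendar date."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")


windows = count_windows(FIRST_OLDEST, LAST_LATEST)
print("first window starts:", to_utc_date(FIRST_OLDEST))
print("last window ends:   ", to_utc_date(LAST_LATEST))
print("daily windows:      ", windows)
print("avg seconds/window: ", round(THREADS_RUNTIME_SECONDS / windows, 3))
```

With these numbers the windows run from 1980-11-01 to 2022-09-14 (UTC), which is 15292 daily windows, or roughly 0.28 seconds per window over the 1h 10m runtime. If the one-request-per-window assumption holds, most of the time may be going into windows from long before the channel had any messages.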
It would be great if anyone could help us figure out why a sync on a single Slack channel is taking so much time. Thanks!