#troubleshooting

Leonardo de Almeida

03/24/2022, 9:21 PM
Is this your first time deploying Airbyte: yes
OS Version / Instance: Mac M1
Deployment: Kubernetes
Airbyte Version: 0.35.30-alpha
Source/version: PostgreSQL 12.8 ==> Airbyte source connector version 0.4.4 (downgraded from 0.4.9 to show schemas)
Description: Hello guys, I'm testing an Airbyte sync from Postgres to S3, and it is taking too long to finish. We noticed there is an issue on GitHub discussing making fetchSize configurable, since it is hard-coded to 1000. We want to change this value, but the only way we found to do that is to change the source code and build our own image, and we don't want to maintain a custom image. The GitHub issue has not received any updates since 11/08/2021. Do you have any update on this feature? Or any alternative to make this sync faster?
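As a back-of-the-envelope illustration of why the hard-coded fetch size matters for a table this large, the sketch below counts driver round trips to Postgres per full read. The 600M row count comes from later in this thread; the alternative fetch size of 50,000 is a hypothetical value, not an Airbyte setting:

```python
import math

def round_trips(total_rows: int, fetch_size: int) -> int:
    # Each fetch pulls at most fetch_size rows from the server-side cursor,
    # so a full table scan costs roughly ceil(total_rows / fetch_size) trips.
    return math.ceil(total_rows / fetch_size)

rows = 600_000_000
print(round_trips(rows, 1_000))    # 600000 round trips at the hard-coded default
print(round_trips(rows, 50_000))   # 12000 with a hypothetical larger fetch size
```

Round trips shrink linearly as the fetch size grows, at the cost of more memory held per batch, which is presumably why the connector picks a conservative default.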

Augustin Lafanechere (Airbyte)

03/25/2022, 4:01 PM
Hi @Leonardo de Almeida, you're right, you have to build the image yourself if you want to customize this setting. You can definitely comment on the issue to share that it's a concern for you. I can also share that we'll focus on making the database connectors better in Q2.

Leonardo de Almeida

03/25/2022, 4:15 PM
Thank you for your answer @Augustin Lafanechere . I have another question: can I build only the Postgres source image, or do I have to build the entire Airbyte image? If I can build just the Postgres source, where can I set the image? Sorry, but I couldn't find this by searching the docs.
I've also been facing an issue with the sync I mentioned above. There's a large table with 600M+ records that has been syncing for about 24 hours. 30 minutes ago I noticed that the sync had an error, but I can't see the logs in the UI (probably because there are too many files). Searching for logs in the MinIO storage, I found this log:
==> 20220325191840_airbyte-worker-659595d44b-hf75l_8ad9b3dbb16742709addf3143b05ca46 <==
2022-03-25 19:18:04 source > 2022-03-25 19:18:04 INFO i.a.i.s.p.PostgresSource(main):358 - completed source: class io.airbyte.integrations.source.postgres.PostgresSource
2022-03-25 19:18:34 destination > 2022-03-25 19:18:34 INFO i.a.i.b.FailureTrackingAirbyteMessageConsumer(close):65 - Airbyte message consumer: succeeded.
2022-03-25 19:18:34 destination > 2022-03-25 19:18:34 INFO i.a.i.d.s.w.BaseS3Writer(close):113 - Uploading remaining data for stream 'billing_entry'.
2022-03-25 19:18:34 destination > 2022-03-25 19:18:34 INFO a.m.s.MultiPartOutputStream(close):158 - Called close() on [MultipartOutputStream for parts 1 - 10000]
2022-03-25 19:18:34 destination > 2022-03-25 19:18:34 INFO a.m.s.MultiPartOutputStream(close):158 - Called close() on [MultipartOutputStream for parts 1 - 10000]
2022-03-25 19:18:34 destination > 2022-03-25 19:18:34 WARN a.m.s.MultiPartOutputStream(close):160 - [MultipartOutputStream for parts 1 - 10000] is already closed
2022-03-25 19:18:36 destination > 2022-03-25 19:18:36 INFO a.m.s.StreamTransferManager(uploadStreamPart):558 - [Manager uploading to prd-ifood-data-products-bronze/logistics_driver_income/2driver_income_ifood_fleet_driver_income/billing_entry/2022_03_24_1648147061634_0.jsonl with id 4aZ9YMbAk...HU1b8sA--]: Finished uploading [Part number 2878 containing 65.38 MB]
2022-03-25 19:18:36 destination > 2022-03-25 19:18:36 INFO a.m.s.StreamTransferManager(complete):395 - [Manager uploading to prd-ifood-data-products-bronze/logistics_driver_income/2driver_income_ifood_fleet_driver_income/billing_entry/2022_03_24_1648147061634_0.jsonl with id 4aZ9YMbAk...HU1b8sA--]: Completed
2022-03-25 19:18:36 destination > 2022-03-25 19:18:36 INFO i.a.i.d.s.w.BaseS3Writer(close):115 - Upload completed for stream 'billing_entry'.
2022-03-25 19:18:36 destination > 2022-03-25 19:18:36 INFO i.a.i.b.IntegrationRunner(runInternal):154 - Completed integration: io.airbyte.integrations.destination.s3.S3Destination

==> 20220325191936_airbyte-worker-659595d44b-hf75l_706f4a9e551b4183bdaf10359fa4a4a1 <==
        at io.airbyte.workers.process.KubePodProcess.getReturnCode(KubePodProcess.java:677)
        at io.airbyte.workers.process.KubePodProcess.exitValue(KubePodProcess.java:704)
        at java.base/java.lang.Process.hasExited(Process.java:584)
        at java.base/java.lang.Process.isAlive(Process.java:574)
        at io.airbyte.workers.protocols.airbyte.DefaultAirbyteSource.isFinished(DefaultAirbyteSource.java:98)
        at io.airbyte.workers.DefaultReplicationWorker.lambda$getReplicationRunnable$5(DefaultReplicationWorker.java:281)
        ... 4 more
,retryable=<null>,timestamp=1648235914333]]]
2022-03-25 19:19:04 INFO i.a.w.t.TemporalUtils(withBackgroundHeartbeat):234 - Stopping temporal heartbeating...
2022-03-25 19:19:04 INFO i.a.c.p.ConfigRepository(updateConnectionState):545 - Updating connection b435ec25-bec9-41b5-8a41-baef9baa780f state: io.airbyte.config.State@4f016904[state={"cdc":false,"streams":[{"stream_name":"billing_entry","stream_namespace":"ifood_fleet_driver_income","cursor_field":["last_modified_date"],"cursor":"2022-03-24T18:37:45Z"}]}]
There's no ERROR in the logs, but there is a stack trace, and a line showing that the replication completed; right after the failure, another sync started. Is there another way to find logs about this?
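Since the UI wouldn't render the logs, one stdlib-only way to triage log files pulled down from MinIO is to scan them for WARN/ERROR lines and bare stack-trace frames ("at ..." lines). This is just a sketch; the line layout assumed here matches the excerpt above, not a guaranteed Airbyte log format:

```python
import re

# A leveled log line starts with a timestamp and mentions WARN or ERROR.
LEVEL = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}.*\b(WARN|ERROR)\b")
# A bare Java stack-trace frame is indented and starts with "at <class>(".
FRAME = re.compile(r"^\s+at \S+\(")

def suspicious_lines(text: str):
    """Return (line_number, line) pairs worth a closer look."""
    hits = []
    for n, line in enumerate(text.splitlines(), start=1):
        if LEVEL.search(line) or FRAME.match(line):
            hits.append((n, line))
    return hits
```

Running this over each file downloaded from the bucket surfaces the WARN line and the `KubePodProcess` frames from the excerpt above without scrolling through the INFO noise.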