# ask-community-for-troubleshooting

    Mike Braden

    11/10/2025, 6:22 PM
    I seem to have fixed the resource requests/limits for my replication job source container so it doesn't OOM; however, I think the previous "silent failures" have left things in a bad state. The syncs all say they are successful, but there is no data in the target namespace. Furthermore, only 3 streams are getting queued/run; the rest say "Queued for next sync". Finally, the airbyte_internal.namespace_raw__stream_* tables are persisting after the sync says it is complete and no replication job is running. These still have data in them and end up getting about the same number of rows added to them. I tried clearing the data and then doing a sync, but it again only does 3 of 7 streams. The first sync took 1 hr 17 min, and the second sync is still "running" going on 1.5 hours; it seems to be stuck with the three streams still in the "syncing" state, saying "X records loaded", while the number of rows in the raw stream tables is not changing. Should I be manually wiping any (other) tables that the "Clear Data" button didn't, to get it to properly sync everything?

    Simon Schmitke

    11/10/2025, 6:57 PM
    @kapa.ai For EKS, I found the sizings of the pods to be:
    resources:
      limits:
        cpu: 500m
        memory: 900Mi
      requests:
        cpu: 300m
        memory: 600Mi
    I'm trying to sync multiple tables with multi billion rows. Are these resource sizes too small?
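    For context, job-container resources like these are usually raised instance-wide through Helm values rather than by editing pods directly. A minimal sketch, assuming the chart's documented global.jobs.resources block; the sizes below are placeholders, not a recommendation for multi-billion-row tables:
    # hypothetical values.yaml override – adjust sizes to your workload
    global:
      jobs:
        resources:
          requests:
            cpu: "1"
            memory: 2Gi
          limits:
            cpu: "2"
            memory: 4Gi
    Per-connector and per-connection resource overrides also exist, so the largest tables can be given more memory without raising the defaults for every job.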

    Mike Braden

    11/10/2025, 7:18 PM
    Should an incremental stream constantly grow the raw stream table by the total number of rows? For example, I have a stream "molecules" that is set up to be (async) incremental. It has 18424 rows in the target table. The raw stream table, however, has 773695 rows (!). Is that raw stream table going to continue to grow every time the sync is run?

    Harsh Kumar

    11/10/2025, 7:49 PM
    Hello all, we want to get the connectionId and connectionName in the destination code (replication pod). Can someone please help us with this?

    Zack Roberts

    11/10/2025, 8:45 PM
    Hey guys... I'm on the Airbyte Cloud Standard plan but cannot create more than 1 workspace?

    Lucas Segers

    11/10/2025, 8:59 PM
    Hey guys! After a bit of a painful upgrade from 1.7.2 to 2.0.1 yesterday, I'm noticing that every 10 seconds or so the "airbyte-cron" deployment logs "RESOURCE_EXHAUSTED" when trying to "listClosedWorkflowExecutions":
    io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: namespace rate limit exceeded
    	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:351)
    	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:332)
    	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:174)
    	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.listClosedWorkflowExecutions(WorkflowServiceGrpc.java:5903)
    	at io.airbyte.commons.temporal.WorkflowServiceStubsWrapped.blockingStubListClosedWorkflowExecutions$lambda$0(WorkflowServiceStubsWrapped.kt:38)
    	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:243)
    	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
    	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:74)
    	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:187)
    	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
    	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
    	at io.airbyte.commons.temporal.RetryHelper.withRetries(RetryHelper.kt:57)
    	at io.airbyte.commons.temporal.WorkflowServiceStubsWrapped.withRetries(WorkflowServiceStubsWrapped.kt:63)
    	at io.airbyte.commons.temporal.WorkflowServiceStubsWrapped.blockingStubListClosedWorkflowExecutions(WorkflowServiceStubsWrapped.kt:37)
    	at io.airbyte.commons.temporal.TemporalClient.fetchClosedWorkflowsByStatus(TemporalClient.kt:137)
    	at io.airbyte.commons.temporal.TemporalClient.restartClosedWorkflowByStatus(TemporalClient.kt:113)
    	at io.airbyte.cron.jobs.SelfHealTemporalWorkflows.cleanTemporal(SelfHealTemporalWorkflows.kt:39)
    	at io.airbyte.cron.jobs.$SelfHealTemporalWorkflows$Definition$Exec.dispatch(Unknown Source)
    	at io.micronaut.context.AbstractExecutableMethodsDefinition$DispatchedExecutableMethod.invoke(AbstractExecutableMethodsDefinition.java:456)
    	at io.micronaut.inject.DelegatingExecutableMethod.invoke(DelegatingExecutableMethod.java:86)
    	at io.micronaut.context.bind.DefaultExecutableBeanContextBinder$ContextBoundExecutable.invoke(DefaultExecutableBeanContextBinder.java:152)
    	at io.micronaut.scheduling.processor.ScheduledMethodProcessor.lambda$scheduleTask$2(ScheduledMethodProcessor.java:160)
    	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
    	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
    	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305…
    Everything seems to be kind of fine, but is there any way to fiddle with the Temporal params? Has anyone noticed the same after upgrading to v2? We have some frequent syncs in here (maybe 10 simultaneous max).

    Fabrizio Spini

    11/11/2025, 9:12 AM
    Ciao everyone, I'm deploying Airbyte v2.0.1 on Kubernetes via Terraform. The deploy goes smoothly, but I'm not able to set the internal DB disk size. I've tried the different keys reported here:
    1. `postgresql.persistence.size` - standard Helm chart key (used in airbyte-values.yaml)
    2. `postgresql.primary.persistence.size` - direct key used by the Bitnami PostgreSQL sub-chart
    3. `postgresql.volumeClaimTemplates[0].spec.resources.requests.storage` - direct Kubernetes template path (discovered in live GKE YAML)
    setting those keys in the following section:
    resource "helm_release" "airbyte" {
      name             = "airbyte-v2"
      repository       = "https://airbytehq.github.io/charts"
      chart            = "airbyte"
      namespace        = "airbyte"
      create_namespace = true

      set {
        name  = "<<Keys reported above>>"
        value = "50Gi"
      }
    }
    But all of them failed to set 50Gi of disk space for the internal Postgres, and this will soon lead to a "no disk space left" error on Postgres. I have tried reading the documentation but haven't found any indication of which key I have to use. Has anyone had success setting the storage for the internal DB?

    Victor K.

    11/11/2025, 1:43 PM
    Hi team, I am getting the following error on my Airbyte connection: HTTP Status Code: 404. Error: Not found. The requested resource was not found on the server. I tested everything, and the source and builders are working just fine.

    Rob Kwark

    11/11/2025, 4:53 PM
    After doing some testing in order to use the S3 Parquet destination, I found I need an incredibly high amount of RAM (I set the request to 6 GB and the limit to 10 GB), otherwise I'd get a PIPE BROKEN error just to load a 1.3 GB file from Snowflake to S3 Parquet. Is this normal??

    Matt Monahan

    11/11/2025, 5:26 PM
    Hey y'all, we have a ClickHouse instance that is exposed over HTTPS and I cannot for the life of me get the ClickHouse destination to connect to it. We have various other tools connecting to it and inserting data, but even with admin credentials I just get the following error:
    Could not connect with provided configuration. Error: Failed to insert expected rows into check table. Actual written: 0
    Our server is behind a load balancer (Traefik), so it's exposed on port 443, but I specified that to no avail 🤔

    Rob Kwark

    11/11/2025, 7:00 PM
    Is it normal for airbyte to read all data into memory? I have these settings:
    2025-11-11 09:36:00 platform INFO [source] image: airbyte/source-snowflake:1.0.8 resources: ResourceRequirements(claims=[], limits={memory=2Gi, cpu=2}, requests={memory=1Gi, cpu=2}, additionalProperties={})
    and basically the job fails with a PIPE BROKEN error (Java OOMKilled) because it can't load the entire table into RAM. When I set the source connection to have higher limits, I notice that it has to pull the ENTIRE table into memory instead of reading it in chunks. This seems like a big issue - does this mean that worker resources need to scale with table size?

    Mike Braden

    11/11/2025, 7:08 PM
    I think incremental sync query parameter injection for async streams is broken in 1.8.5. While it seems like the `{{ stream_interval['start_time'] }}` and `{{ stream_interval['end_time'] }}` variables are getting set correctly (seen in the logs), it doesn't appear to actually be inserting these into the query parameters, and every sync always pulls everything. If I have both incremental sync enabled and configured with query parameter injection AND configure additional query parameters under Request Options, then it does appear to work:
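    For reference, a minimal Builder/low-code manifest sketch of an incremental sync block with query-parameter injection; the cursor field, parameter names, and datetime formats below are hypothetical placeholders, not the actual connector config:
    incremental_sync:
      type: DatetimeBasedCursor
      cursor_field: updated_at              # placeholder cursor field
      cursor_datetime_formats:
        - "%Y-%m-%dT%H:%M:%SZ"
      datetime_format: "%Y-%m-%dT%H:%M:%SZ"
      start_datetime:
        type: MinMaxDatetime
        datetime: "{{ config['start_date'] }}"
        datetime_format: "%Y-%m-%dT%H:%M:%SZ"
      end_datetime:
        type: MinMaxDatetime
        datetime: "{{ now_utc().strftime('%Y-%m-%dT%H:%M:%SZ') }}"
        datetime_format: "%Y-%m-%dT%H:%M:%SZ"
      # These two blocks are what map the stream_interval start/end values
      # onto the outgoing request's query parameters.
      start_time_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: start_time              # placeholder query parameter name
      end_time_option:
        type: RequestOption
        inject_into: request_parameter
        field_name: end_time                # placeholder query parameter name
    If the start_time_option / end_time_option blocks are missing or mis-typed, the interval is still computed and logged but never sent with the request, which matches the symptom described above.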

    Pranay S

    11/12/2025, 7:19 AM
    Hello, I am running the sync endpoint for my job:
    https://api.airbyte.com/v1/jobs
    Now the sync has started, but I also want a loading indicator or at least an estimated timer, or anything of that sort to keep the UI dynamic. Is there a way I can do that? I'm aware of the endpoint
    https://api.airbyte.com/v1/jobs/jobId
    which gives me the current status of the job, but calling it again and again until it shows completed is a bit hectic. Is there a more elegant way to deal with it?

    Komal Kumari

    11/12/2025, 12:42 PM
    2025-11-12 165940.474 | 2025-11-12 112940,473 [io-executor-thread-10] ERROR i.a.c.c.ConfigReplacer(getAllowedHosts):70 - All allowedHosts values are un-replaced. Check this connector's configuration or actor definition - [${subdomain}.atlassian.net]
    Does anyone face this error?

    bollo

    11/12/2025, 3:36 PM
    Hello, I'm using version 1.5.3 of the S3 destination connector. We are using the path format to partition our ingestion, like
    ingested_at=${YEAR}-${MONTH}-${DAY}-${HOUR}/stream=${STREAM_NAME}/client=${NAMESPACE}/
    and then using ingested_at as the bookmark to process the data in our pipeline. The problem is that Airbyte puts all the data from an ingestion in the same partition, no matter whether it takes 1 hour or several, so our pipeline drops data. Is there a workaround for this? Is it a known bug?

    Mahmoud Khaled

    11/12/2025, 4:26 PM
    Has anyone used Airbyte to read data from Google Play and the App Store, such as the number of app installs?

    aidatum

    11/12/2025, 4:51 PM
    Hi, I am facing a challenge deploying the Airbyte 1.8.5 workload-launcher deployment on OpenShift. I generated the manifest file, yet it's failing.
    • Platform: OpenShift Container Platform 4.18 (Kubernetes 1.31.11)
    • Airbyte Version: 1.8.5 OSS (Open Source)
    Problem: the workload-launcher pod fails to start with authentication errors despite running the OSS version, which shouldn't require authentication. Error messages:
    1. "Could not resolve placeholder ${DATAPLANE_CLIENT_ID}"
    2. "Could not resolve placeholder ${CONTROL_PLANE_TOKEN_ENDPOINT}"
    3. "Failed to heartbeat - baseUrl is invalid"
    4. "Failed to obtain or add access token - token request failed"
    Any idea?

    Steve Ma

    11/12/2025, 9:47 PM
    Hi, I am getting an error when setting up the Postgres source connector:
    We detected XMIN transaction wraparound in the database.
    Looks like it was introduced in this PR: https://github.com/airbytehq/airbyte/pull/38836/files. I understand the concern here about XMIN transaction wraparound, but could we consider raising a warning message instead of throwing an error? In my case, I am just planning to sync some regular tables, not those really large tables.

    Akshata Shanbhag

    11/13/2025, 7:59 AM
    I am noticing a behaviour where there are more job attempts per job than usual on Airbyte version 2.0.8 compared to the previous version, 1.7.1. How this affects us: the cursor progresses in some of the failed attempts, and the final successful attempt usually picks up the updated cursor. Since the cursor already progressed in a previous attempt, we lose the records emitted during that attempt. Here is a screenshot where records are emitted but not committed and the cursor has progressed in this failed attempt. Any recommendations on how to ensure this does not happen?

    Santoshi Kalaskar

    11/13/2025, 1:20 PM
    Hi #C021JANJ6TY, I recently started using Airbyte as a data integration platform to fulfill a client requirement. Our objective is to transfer data from SharePoint (specifically from site pages or users’ personal drives) to a Google Cloud Storage (GCS) bucket as the destination. During the configuration of the SharePoint source in Airbyte, we encountered issues while performing OAuth authentication. The error occurred during the redirect URL step, even though we had configured the redirect URL in the Azure App Registration as per the available documentation. I need some help setting up the redirect URL. Thanks in advance! https://docs.airbyte.com/integrations/sources/microsoft-sharepoint

    aidatum

    11/13/2025, 2:41 PM
    I found several issues while deploying Airbyte Community Edition 1.8.5 on OpenShift, as it has several Helm chart issues. Has anyone faced similar issues? The server works fine, yet the worker and workload-launcher have serious config issues.

    Valeria Tapia

    11/13/2025, 4:25 PM
    Hi! Quick question — does anyone know if it’s possible to do an incremental sync (append) when using Airtable as a source? So far we’ve only been able to get it working with full refresh, but that’s not what we’re aiming for. Any ideas? 🙏

    yanndata

    11/13/2025, 6:15 PM
    Hi! I want to install Airbyte with abctl on my droplet (DigitalOcean with Ubuntu), but it failed.

    Pranay S

    11/14/2025, 6:35 AM
    Hello, I have made a connection between Airbyte and my Shopify store, but while syncing I'm getting this error. Can anyone please help me? What I've tried:
    1. using a new PostgreSQL connection
    2. using a new store
    3. clearing Airbyte data and syncing again
    None of these helped.

    Alejandro De La Cruz LĂłpez

    11/14/2025, 9:24 AM
    Hey! I am ingesting HubSpot data with Airbyte. Since the 7th of November, our full-refresh syncs don't ingest all the data available in HS. Nothing changed, but our daily run ingests a random number of rows instead of the total that appears in HS. Any ideas?

    Alessio Darmanin

    11/14/2025, 11:02 AM
    Hi. I was successfully using Airbyte version 1.8.4. Today I upgraded to 2.0.1 and now I get
    Airbyte is temporarily unavailable. Please try again. (HTTP 502)
    when trying to retest a previously working source. From the pods view, the control-plane looks healthy; everything is shown as Running. Going through the server.log I can see the contents shown below, but SSH Tunnel Method is set to "No Tunnel" in the source definition, and the JSON also reflects this.
    JSON schema validation failed.
    errors: $.tunnel_method: must be the constant value 'SSH_KEY_AUTH',
    required property 'tunnel_host' not found,
    required property 'tunnel_port' not found,
    required property 'tunnel_user' not found,
    required property 'ssh_key' not found
    Source JSON:
    "tunnel_method": { "tunnel_method": "NO_TUNNEL" }
    What could the issue be please?

    Diego Quintana

    11/14/2025, 11:34 AM
    Hi! I'm getting a weird error on my clickhouse -> postgres connection
    source connector ClickHouse v0.2.6
    destination Postgres v2.2.1
    airbyte version 0.50.31 (I know, I know)
    The error appears after a bit of syncing incremental models, and it seems to be:
    java.sql.SQLException: java.io.IOException: Premature EOF
    A full refresh takes around 1.22H and it does not disconnect, though. I've set socket_timeout=300000 in my connection with no success. What could it be?

    Ashok Pothireddy

    11/14/2025, 11:35 AM
    Hello team, a question regarding the SharePoint connector. Currently we have multiple Excel files in a folder which we are trying to ingest, but we need to ingest them into separate S3 buckets. The first problem is that it shows as one Excel file when choosing the file for ingestion, and all columns from all files are clubbed into one big file. The second is that even though we select only a few columns, the destination still has multiple columns and sometimes all of them. How can we fix this?

    Rafael Santos

    11/14/2025, 12:34 PM
    Hello, guys. Currently I'm facing an issue while trying to build the connection shown in this tutorial, where I get a Broken Pipe error with not many further details during the replication phase. Also, is there a way to find Pinecone's environment value as of today? The interface for their indexes' information changed and no longer includes such a field, which is required in Airbyte's destination, although the initial test to check whether Pinecone is working seems to be successful. I don't know exactly what the source of the issue is, but since I get some logs with a generic DestinationWriter error, the mismatch on this env value could be the culprit. The error logs start like this:
    replication-orchestrator INFO Stream status TRACE received of status: STARTED for stream issues
    replication-orchestrator INFO Sending update for issues - null -> RUNNING
    replication-orchestrator INFO Stream Status Update Received: issues - RUNNING
    replication-orchestrator INFO Creating status: issues - RUNNING
    replication-orchestrator INFO Stream status TRACE received of status: RUNNING for stream issues
    replication-orchestrator INFO Workload successfully transitioned to running state
    destination INFO Writing complete.
    replication-orchestrator ERROR DestinationWriter error:
    replication-orchestrator ERROR DestinationWriter error:
    replication-orchestrator INFO DestinationReader finished.
    replication-orchestrator WARN Attempted to close a destination which is already closed.
    replication-orchestrator INFO DestinationWriter finished.
    replication-orchestrator INFO MessageProcessor finished.
    replication-orchestrator ERROR SourceReader error:
    replication-orchestrator INFO SourceReader finished.
    replication-orchestrator ERROR runJobs failed; recording failure but continuing to finish.
    replication-orchestrator INFO Closing StateCheckSumCountEventHandler