# ask-community-for-troubleshooting

    Sergei Kapochkin

    03/30/2023, 3:01 PM
    Hi all - I’m trying to run my dbt transformation on ClickHouse. I set my local container image URL, and it is found correctly:
    2023-03-30 14:56:55 INFO i.a.c.i.LineGobbler(voidCall):114 - container was found locally.
    But when it starts running the model, it fails:
    2023-03-30 14:56:55 dbt > docker: invalid reference format.
    My dbt run command is simple:
    run --models runfolders.raw.raw_network
    How can this be fixed, and what could be the possible reason for this error?

    Saul Burgos

    03/30/2023, 3:32 PM
    Is there a way to test Airbyte's default normalization before running the sync? I mean, I want to test whether a CSV passes the normalization rules before it is uploaded to S3. I know that by default this isn't possible... I am trying to find out if there is some hacky way.

    Mike B

    03/30/2023, 4:24 PM
    Is there a recommended workaround for the Temporal crash loop? I've had Airbyte core running for a couple of months now, and just went to install it on another server. I followed the new directions for getting started (pull git repo, then run the bootstrap shell script). Most of the containers start up nicely, but the Temporal container doesn't seem to -- I get constant exceptions complaining that the Temporal container didn't respond in time, and the UI doesn't start up.

    Walker Philips

    03/30/2023, 4:47 PM
    I have a custom connector built in Python; I currently emit the records from a dataframe using the to_dict("record"...) command. Two of the streams generate a tremendous amount of log data but fail to emit any records into the actual destination, whereas the third stream generates no "RECORD" logs at all but successfully lands all of its data in the destination. All streams use the same custom read_data method, and I use the boilerplate read_streams, read_full, etc. methods. Any ideas as to why Airbyte is sending the emitted records to the log as opposed to the destination?
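A side note on the dataframe-to-records step described above: pandas' orient for one-dict-per-row output is "records" (plural). A minimal sketch, with a hypothetical function name and columns, of producing plain dicts that a stream could yield as records rather than print into the log:

```python
import pandas as pd

def rows_from_dataframe(df: pd.DataFrame):
    """Yield one plain dict per DataFrame row, suitable for yielding
    from a stream's read method so the framework wraps each one as a
    RECORD message (printing them yourself only produces log output)."""
    # orient="records" (plural) produces a list of {column: value} dicts.
    for row in df.to_dict(orient="records"):
        yield row

df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
records = list(rows_from_dataframe(df))
```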

    Umar H

    03/30/2023, 5:17 PM
    Hello, has anyone had errors using the shopify source with postgres? I get intermittent failures for larger loads of historical data.

    brett

    03/30/2023, 5:33 PM
    Can the S3 source process files that are gzipped? E.g. my_data.gz is posted to S3, but we want to process a .csv inside the gzip.
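In case the source's built-in decompression doesn't cover this case, here is a stdlib-only sketch (gzip + csv; the `payload` bytes stand in for an S3 object body) of unpacking a .gz and parsing the CSV inside it:

```python
import csv
import gzip
import io

def read_gzipped_csv(raw_bytes: bytes):
    """Decompress gzipped bytes (e.g. an S3 object body) and parse the
    contained CSV into a list of dicts keyed by the header row."""
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", newline="") as fh:
        return list(csv.DictReader(fh))

# Round-trip demo: compress a small CSV, then read it back.
payload = gzip.compress(b"id,name\n1,a\n2,b\n")
rows = read_gzipped_csv(payload)
```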

    James Rosenthal

    03/30/2023, 6:39 PM
    Is anyone else having sync issues? I am having several syncs (Postgres CDC --> Snowflake) end in a warning and retry, but the log says Completed Successfully:
    Done. PASS=244 WARN=0 ERROR=0 SKIP=0 TOTAL=244
    Is this a false warning?

    Gabriel Medina Braga

    03/30/2023, 9:10 PM
    What's the best way to change the webserver's port exposed to the host? Right now I'm changing the docker-compose.yaml file directly, but it would be better if I could just change an env variable.
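One hedged option, since Compose substitutes ${VAR} references from the shell or a .env file: parameterize the port mapping itself instead of hardcoding it. The sketch below uses WEBAPP_HOST_PORT, an invented name - check whether your Airbyte version's .env already defines an equivalent variable before adding one:

```yaml
# docker-compose.yaml (sketch): Compose resolves ${VARS} from the
# environment or a .env file, with :- supplying a default.
services:
  webapp:
    ports:
      - "${WEBAPP_HOST_PORT:-8000}:80"   # WEBAPP_HOST_PORT is a made-up name
```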

    Gabriel Medina Braga

    03/30/2023, 9:11 PM
    Also, I'm using Octavia to automate the creation of Airbyte resources. It's great; however, I need to fill the port value with an environment variable, which I can't do since the value comes in as a string. Is there any way I can cast the env variable to an integer?
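A possible workaround, sketched under the assumption that you can render the YAML yourself before running octavia apply: substitute the variable in a pre-processing step and cast it there, so the numeric field is emitted unquoted and YAML parses it as an integer. The OCTAVIA_DB_* names and the two fields are hypothetical:

```python
import os
from string import Template

# Hypothetical pre-processing step: render the Octavia YAML ourselves so
# numeric fields end up unquoted (YAML then reads them as integers).
template = Template("""\
configuration:
  host: $OCTAVIA_DB_HOST
  port: $OCTAVIA_DB_PORT
""")

rendered = template.substitute(
    OCTAVIA_DB_HOST=os.environ.get("OCTAVIA_DB_HOST", "localhost"),
    # int() fails fast if the variable isn't numeric.
    OCTAVIA_DB_PORT=int(os.environ.get("OCTAVIA_DB_PORT", "5432")),
)
```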

    Albert Wong

    03/31/2023, 2:22 AM
    Anyone able to get Airbyte running as a systemd service? I'm getting this, not sure why:
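Without the actual error output it's hard to say, but a frequently used shape is a oneshot unit wrapping docker compose; all paths and names below are assumptions to adapt:

```ini
# /etc/systemd/system/airbyte.service (sketch -- paths are assumptions)
[Unit]
Description=Airbyte (docker compose)
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/opt/airbyte
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target
```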

    Sudarshana T

    03/31/2023, 4:29 AM
    Hello guys, I started doing CDC from Postgres (a VM instance in GCP) to BigQuery using Airbyte. I have seen two docs for CDC, and in both of them GCS Staging is used as the loading method. Is it necessary to use that for CDC, or can I use Standard Inserts?

    Jake Yoon

    03/31/2023, 6:19 AM
    Hey there, how are you all doing? My company heavily relies on airbyte for our ETL process and we're happy with its performance. However, it's a single point of failure in our system, so we're trying to implement HA on a Kubernetes cluster. Our current deployment structure is listed below and I'm wondering how to change the number of replicas for each pod. I would be very grateful if you could assist me with this.
    NAME                                                READY   STATUS      RESTARTS        AGE
    airbyte-airbyte-bootloader                          0/1     Completed   0               11d
    airbyte-connector-builder-server-68c644d47b-sdcqn   1/1     Running     0               18d
    airbyte-cron-78b45856c8-cxhf5                       1/1     Running     0               18d
    airbyte-db-0                                        1/1     Running     0               18d
    airbyte-pod-sweeper-pod-sweeper-674cfd76d-nq7dz     1/1     Running     0               18d
    airbyte-server-74d8ff4d4b-vl75f                     1/1     Running     1 (2d13h ago)   18d
    airbyte-temporal-67b89d5d66-5m8md                   1/1     Running     0               18d
    airbyte-webapp-6c44d5459b-c2t6m                     1/1     Running     0               18d
    airbyte-worker-56f449c796-r6x84                     1/1     Running     0               18d
    Thanks in advance.
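On the replica question: many versions of the Airbyte Helm chart expose a per-component replicaCount in values.yaml. A sketch - key names vary by chart version, so verify against `helm show values airbyte/airbyte` before applying:

```yaml
# values.yaml (sketch -- confirm key names for your chart version)
server:
  replicaCount: 2
worker:
  replicaCount: 2
webapp:
  replicaCount: 2
```

Note that the stateful pieces in the listing above (airbyte-db, temporal with the default setup) generally cannot be made highly available just by raising replicas; they need externally managed HA backends.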

    Muhammad Bilal Anees

    03/31/2023, 7:49 AM
    Hi there, can anyone share a Grafana dashboard for Airbyte?

    Guillaume Desbuquois

    03/31/2023, 7:52 AM
    Hello, I set up a Mixpanel -> Bigquery connection to retrieve all my history (1 billion events). I was able to retrieve 100 million events in 2 days, but since last night, no more lines have been added (see screenshot), and the CPU usage on my EC2 dropped to 3%. However, the connection still shows "Sync Running." I have a parallel connection with Zendesk that worked very well last night. Can you please help me?

    Simon Thelin

    03/31/2023, 8:51 AM
    Is there any way to trigger a manual sync via octavia or the airbyte api? I can’t seem to find it in the docs. Is this what I should use? https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/connections/sync When I try this I get
    <html>
    <head><title>301 Moved Permanently</title></head>
    <body>
    <center><h1>301 Moved Permanently</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
    Is there a requirement on the airbyte version for this to work?
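For reference, Airbyte OSS deployments have typically exposed a sync trigger on the internal Config API at POST /api/v1/connections/sync, taking a connectionId. A 301 from nginx usually means the request got redirected (e.g. http:// against an https-only host, or a proxy rewriting the path) before it reached the API, so check the final URL. A sketch that only builds the request - the base URL and connection id are placeholders:

```python
import json
from urllib import request

def build_sync_request(base_url: str, connection_id: str) -> request.Request:
    """Build (but do not send) a manual-sync request against the
    Config API endpoint POST /api/v1/connections/sync."""
    payload = json.dumps({"connectionId": connection_id}).encode()
    return request.Request(
        url=f"{base_url}/api/v1/connections/sync",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_sync_request("http://localhost:8000",
                         "11111111-2222-3333-4444-555555555555")
# urllib.request.urlopen(req) would send it; if the server answers 301,
# resolve the redirect target and POST there directly.
```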

    Shruti Kulkarni

    03/31/2023, 11:08 AM
    Hi Team, I am trying to deploy Airbyte onto EKS and I am facing issues at the last stage, which is "helm install %release_name% airbyte/airbyte":
    [cloudshell-user@ip-10-2-55-21 ~]$ helm install airbyte airbyte/airbyte
    Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition

    Professor Shabangu

    03/31/2023, 12:06 PM
    Hi team, I'm trying to manually define a nested schema, but I'm not sure of the format. Can someone help? From this: {"column_1": "number", "column_2": "string", "column_3": "array", "column_4": "object", "column_5": "boolean"}, how do I add the properties of the object?
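If the connector accepts full JSON Schema (Airbyte stream schemas are JSON Schema under the hood), a nested object column is declared with its own "properties" block rather than the flat name-to-type map. A sketch - the nested field names under column_4 are invented:

```python
# JSON Schema sketch: the object column carries its own "properties" map.
schema = {
    "type": "object",
    "properties": {
        "column_1": {"type": "number"},
        "column_2": {"type": "string"},
        "column_3": {"type": "array", "items": {"type": "string"}},
        "column_4": {
            "type": "object",
            "properties": {  # nested field names here are hypothetical
                "inner_id": {"type": "number"},
                "inner_name": {"type": "string"},
            },
        },
        "column_5": {"type": "boolean"},
    },
}
```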

    Yeong Heo

    03/31/2023, 1:35 PM
    Hello, I am getting this error message when I run python main.py check --config secrets/config.json:
    {
      "type": "CONNECTION_STATUS",
      "connectionStatus": {
        "status": "FAILED",
        "message": "Config validation error: 'Client' was expected"
      }
    }
    I have auth_type = "Service" in the config.json file for my Google Sheets connection, so I am not sure why it’s expecting 'Client'. Here is the config:
    {
      "spreadsheet_id": "https://docs.google.com/spreadsheets/d/173zPC-SQS_loqSBChXV9DP9Mvzmo5_dKrCHGrHdNI/edit#gid=0",
      "credentials": {
        "auth_type": "Service",
        "service_account_info": {
          "type": "service_account",
          "project_id": "xxxxxx",
          "private_key_id": "xxxxxx",
          "private_key": "xxxxxx",
          "client_email": "xxxxxx",
          "client_id": "xxxxxx",
          "auth_uri": "xxxxxx",
          "token_uri": "xxxxxx",
          "auth_provider_x509_cert_url": "xxxxxx",
          "client_x509_cert_url": "xxxxxx"
        }
      }
    }

    Shreepad Khandve

    03/31/2023, 1:54 PM
    Hi team, is this really a bug, or are we doing something wrong during data sync? https://github.com/airbytehq/airbyte/issues/16641
    cursor = column_names[self.cursor_field[0]][0]
    2023-03-31 12:27:50 normalization > KeyError: 'since_updated_at'

    Thiago Villani

    03/31/2023, 2:02 PM
    Hello, when I'm running a sync, there are times when my local server's CPU usage exceeds 100%. I'm using a VM with 2 vCPUs, running via Docker. Does anyone have tips on the recommended settings, or on limiting CPU usage? The sync gave an error on the first attempt, and in the end it has this error in the log:
    2023-03-31 13:51:46 WARN i.t.i.w.ActivityWorker$TaskHandlerImpl(logExceptionDuringResultReporting):365 - Failure during reporting of activity result to the server. ActivityId = 7de1a63f-dbf0-3c74-81f2-57170468ce4d, ActivityType = Replicate, WorkflowId=sync_143, WorkflowType=SyncWorkflow, RunId=9a4e11d9-a94b-46da-afb3-e506cf5aad11
    io.grpc.StatusRuntimeException: NOT_FOUND: workflow execution already completed
      at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271) ~[grpc-stub-1.51.1.jar:1.51.1]
      at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252) ~[grpc-stub-1.51.1.jar:1.51.1]
      at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165) ~[grpc-stub-1.51.1.jar:1.51.1]
      at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.respondActivityTaskFailed(WorkflowServiceGrpc.java:3866) ~[temporal-serviceclient-1.17.0.jar:?]
      at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.lambda$sendReply$1(ActivityWorker.java:320) ~[temporal-sdk-1.17.0.jar:?]
      at io.temporal.internal.retryer.GrpcRetryer.lambda$retry$0(GrpcRetryer.java:52) ~[temporal-serviceclient-1.17.0.jar:?]
      at io.temporal.internal.retryer.GrpcSyncRetryer.retry(GrpcSyncRetryer.java:67) ~[temporal-serviceclient-1.17.0.jar:?]
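On capping sync resource usage, Airbyte's Docker .env has typically included job-container knobs like the ones below; the values are illustrative and the exact set depends on your Airbyte version, so check the .env shipped with your release:

```shell
# .env (sketch): resource limits Airbyte applies to job containers.
JOB_MAIN_CONTAINER_CPU_LIMIT=1
JOB_MAIN_CONTAINER_CPU_REQUEST=0.5
JOB_MAIN_CONTAINER_MEMORY_LIMIT=1g
JOB_MAIN_CONTAINER_MEMORY_REQUEST=512m
```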

    Pol Monsó

    03/31/2023, 2:14 PM
    Hello everybody, a question about auth here. How can I deal with an API that uses a short-lived token but no OAuth2? That is, token = login(user, password)?

    Edward J. Stembler

    03/31/2023, 3:47 PM
    Anyone have experience connecting Airbyte to Google Cloud SQL proxy?

    Sean Zicari

    03/31/2023, 3:50 PM
    General worker process question: …how does one keep it stable? How much RAM does it need, and at what point will it reach an equilibrium of memory usage? I’ve got Airbyte running on 2 different clusters, 2 different hosting providers (both Kubernetes), and in both, the worker process ends up being restarted regularly. One of the instances has an “every minute” sync job that never starts again after the worker is restarted.

    JPG

    03/31/2023, 5:13 PM
    When I delete a workspace, what really happens behind the scenes? What happens to all the sources, destinations, and connections previously set up?

    Mike B

    03/31/2023, 5:31 PM
    Asked yesterday but didn't see any answers: are there any recommendations for getting around the Temporal timeouts/crashes when starting up the plain Docker-hosted version of Airbyte core? I saw some issues related to it on GitHub, and am getting a crash loop when following the basic "Getting Started" guide on an Ubuntu server:
    259 - Waiting for namespace default to be initialized in temporal...
    airbyte-server | 2023-03-31 17:27:44 WARN i.a.c.t.TemporalUtils(getTemporalClientWhenConnected):269 - Ignoring exception while trying to request Temporal namespace:
    airbyte-server | io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline CallOptions will be exceeded in 9.999128307s.
    The only solutions I saw listed on Discourse were to make sure the server had enough resources. I'm running this on one that's at around ~4% utilization for both memory and CPU, so that shouldn't be the issue. There are no hard caps set in the Docker settings either. Any other thoughts on how to get it running?

    Albert Wong

    03/31/2023, 6:16 PM
    Anyone know how to filter Airbyte streams at the source? One sync job is failing due to a timeout: too many records streaming out of the db kills the connection after some period of time. I would like to filter the records by date. Is that possible?

    anni

    03/31/2023, 7:38 PM
    Hi there, I have been having some issues pulling Google Ads data through Airbyte since March 29.
    • With Google Ads source connector version 0.2.12, I keep getting the error message:
    Failure Origin: source, Message: Checking source connection failed - please review this connection's configuration to prevent future syncs from failing
    • But if I upgrade the Google Ads source connector to version 0.2.13, I get this error message instead:
    Failure Origin: normalization, Message: Normalization failed during the dbt run. This may indicate a problem with the data itself.
    Database Error in model ad_groups_stg (models/generated/airbyte_incremental/google_ads_raw/ad_groups_stg.sql)
      syntax error at or near "."
      LINE 6:                add column ad_group.optimized_targeting_enabl...
                                                ^
    Database Error in model campaigns_stg (models/generated/airbyte_incremental/google_ads_raw/campaigns_stg.sql)
      syntax error at or near "."
      LINE 6:                add column campaign.target_cpm.__quency_goal....
                                                ^
    Database Error in model ad_group_ads_stg (models/generated/airbyte_incremental/google_ads_raw/ad_group_ads_stg.sql)
      syntax error at or near "."
      LINE 8:                 drop column ad_group_ad.ad.gmail_ad.header_i...
                                                     ^

    Dharshan Viswanathan

    03/31/2023, 7:50 PM
    Last year when I tried Postgres-to-Postgres CDC (logical replication), it didn't work well. Does anyone use Postgres logical replication in a prod environment now?

    Peter Doherty

    03/31/2023, 10:03 PM
    I'm also seeing an issue with Postgres (CDC) albeit while using the Weaviate connector. Full refresh (both variants) works without issue but Incremental | Append results in:
    ...
    "message":"Internal Server Error: Cannot invoke \"io.airbyte.protocol.models.AirbyteStateMessage.getGlobal()\" because the return value of \"io.airbyte.config.StateWrapper.getGlobal()\" is null"
    ...
    Incremental | Append seems to work fine when using a local JSON destination. So, I think the issue is with the Weaviate destination. Hopefully it's just bad config on my end (happy to share) but I'd be curious to know if anyone else has this setup working.