# ask-community-for-troubleshooting
  • h

    Human

    12/23/2022, 12:45 AM
    Issue: Check_connection (and Sync) fails for File source
    Copy code
    requests.exceptions.SSLError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /covid19-open-data/v2/latest/epidemiology.csv (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1129)')))
    Cause: self-signed certificate in the chain for the File HTTPS source. Ask: how do I add the CA cert on the worker that runs the source?
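    One possible approach, assuming the File source ultimately fetches the URL with Python requests: point the worker/connector container at your CA bundle (for example via the REQUESTS_CA_BUNDLE environment variable) and confirm the bundle actually fixes the handshake. A minimal sketch with a hypothetical bundle path:
    Copy code
    # Minimal sketch: confirm the custom CA bundle fixes the TLS failure.
    # The bundle path is hypothetical; mount your own CA chain there.
    import requests

    CA_BUNDLE = "/usr/local/share/ca-certificates/corp-ca.pem"  # hypothetical

    resp = requests.get(
        "https://storage.googleapis.com/covid19-open-data/v2/latest/epidemiology.csv",
        verify=CA_BUNDLE,  # same effect as exporting REQUESTS_CA_BUNDLE=<path>
        timeout=30,
    )
    print(resp.status_code)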
  • h

    Hai Huynh

    12/23/2022, 12:59 AM
    Hi everyone, I am a Postgres newbie. Can anyone help me, please? I have a question: I have one connection between a Postgres source and a Postgres destination, and my sync mode is incremental dedup. Can I wipe the historical data in the tables with the _stg and _scd prefixes? The records in those two tables grow with every job, but I don't need the historical data in them.
  • m

    Michael

    12/23/2022, 5:39 AM
    Hi Team, I've submitted a PR to resolve a bug in the Source Okta stream [https://github.com/airbytehq/airbyte/pull/20833]. Can someone help me with what to do next? (First time submitting a PR.)
  • ö

    Özgür Sallancı

    12/23/2022, 6:13 AM
    Hi guys. Thanks for the great app. I followed the tutorial to create a custom connection, but I can't get it working. I joined the office hours but still couldn't make it work.
  • v

    Vu Le Hoang

    12/23/2022, 10:05 AM
    Hi all, my Airbyte node's disk is getting full, so I want to clean the airbyte_workspace volume. Is it safe to clear all its content? If not, what kind of files should I delete? Thank you!
  • n

    Nahid Oulmi

    12/23/2022, 10:28 AM
    • Is this your first time deploying Airbyte?: No
    • OS Version / Instance: Debian
    • Memory / Disk: 32GB memory / 50GB disk
    • Deployment: docker-compose
    • Airbyte Version: 0.40.26
    • Source name/version: elastic-search custom connector
    • Destination name/version: bigquery
    • Step: schema discovery in webapp
    • Description: I developed a custom source connector for Elasticsearch using the Python CDK for version 0.39.42 because the standard one did not fit my use case (I needed incremental updates). I created a static schema file located at catalog/configured_catalog.json that looks like this:
    Copy code
    {
      "streams": [
        {
          "stream": {
            "name": "my_stream",
            "json_schema": {},
            "supported_sync_modes": [
              "full_refresh",
              "incremental"
            ],
            "source_defined_cursor": "True",
            "default_cursor_field": [
              "date"
            ]
          },
          "sync_mode": "incremental",
          "destination_sync_mode": "overwrite"
        }
      ]
    }
    I deployed it on Airbyte version 0.39.42 and it was working fine. Now we have updated our Airbyte version to 0.40.26 and the connector no longer works. The problem is at the schema discovery step. When I go to “Replication” to see my schema:

    (screenshot of the Replication view)

    I get this error message:

    (screenshot of the error message)

    There is no error log on the server side. The only error log I get is on the browser side, which says (as in the screenshot):
    Copy code
    TypeError: Cannot convert undefined or null to object
        at Function.keys (<anonymous>)
        at or (CatalogSection.tsx:101:41)
        at sa (react-dom.production.min.js:157:137)
        at qa (react-dom.production.min.js:180:154)
        at Ba (react-dom.production.min.js:178:169)
        at ja (react-dom.production.min.js:177:178)
        at Gs (react-dom.production.min.js:274:126)
        at Au (react-dom.production.min.js:250:347)
        at Ou (react-dom.production.min.js:250:278)
        at Cu (react-dom.production.min.js:250:138)
    ls @ react-dom.production.min.js:216
    n.payload @ react-dom.production.min.js:217
    ho @ react-dom.production.min.js:130
    Wa @ react-dom.production.min.js:184
    Gs @ react-dom.production.min.js:269
    Au @ react-dom.production.min.js:250
    Ou @ react-dom.production.min.js:250
    Cu @ react-dom.production.min.js:250
    _u @ react-dom.production.min.js:243
    (anonymous) @ react-dom.production.min.js:123
    t.unstable_runWithPriority @ scheduler.production.min.js:18
    Vi @ react-dom.production.min.js:122
    Ki @ react-dom.production.min.js:123
    M @ scheduler.production.min.js:16
    b.port1.onmessage @ scheduler.production.min.js:12
    The local tests (check, spec, discover, read) work fine. Is there anything I need to modify/update in the connector Python code to get it to work with version 0.40.26? I am sure it is a version issue since it works fine on 0.39.42. Thanks,
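    A hedged guess at the cause: the webapp error ("Cannot convert undefined or null to object" in CatalogSection) looks like it trips over the empty "json_schema": {} in the configured catalog, which the 0.39.x UI tolerated. One sketch of returning a concrete schema from a Python CDK stream instead (stream and field names are hypothetical, not taken from the actual connector):
    Copy code
    # Hedged sketch: give the stream a non-empty JSON schema so discover
    # returns concrete properties. Names below are placeholders.
    from airbyte_cdk.sources.streams import Stream


    class MyStream(Stream):
        primary_key = None
        cursor_field = "date"

        def get_json_schema(self) -> dict:
            return {
                "$schema": "http://json-schema.org/draft-07/schema#",
                "type": "object",
                "properties": {
                    "date": {"type": ["string", "null"], "format": "date"},
                    "value": {"type": ["number", "null"]},
                },
            }

        def read_records(self, sync_mode, cursor_field=None, stream_slice=None, stream_state=None):
            yield from []  # placeholder; the real stream reads from Elasticsearch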
  • b

    Bruno Agresta González

    12/23/2022, 1:34 PM
    Hello everyone, I have a connection from Postgres (AWS) to BigQuery. In this connection I am experiencing problems with some tables that have “Incremental Deduped + History” as the sync configuration, with cursor field “Updated_at” and primary key “id”. The problem is that in the destination table the rows that are updates appear duplicated. This is not the behavior I expect for the deduped configuration. I’m using Airbyte 0.40.24, BigQuery connector 1.2.9, and Postgres connector 0.3.26. Is anyone experiencing the same?
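    A quick way to narrow this down (a sketch, with hypothetical project/dataset/table names): check whether the duplicates are in the final deduped table or only in the _scd table, since the _scd history table is expected to keep one row per version of a record.
    Copy code
    # Sketch: list primary keys that appear more than once in the final table.
    from google.cloud import bigquery

    client = bigquery.Client()
    query = """
        SELECT id, COUNT(*) AS copies
        FROM `my_project.my_dataset.my_table`
        GROUP BY id
        HAVING COUNT(*) > 1
        ORDER BY copies DESC
        LIMIT 20
    """
    for row in client.query(query).result():
        print(row.id, row.copies)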
  • n

    Nivedita Baliga

    12/23/2022, 4:36 PM
    Hello everyone. I am an Airbyte newbie (started using the open-source version just 3 days ago!). As a test, I was trying to get data from a 145M-row table in BigQuery into Snowflake and I got a "responseTooLarge" error. From my searching on the internet, this is an open issue that can only be resolved by creating a view on the source DB to chunk the data into the destination. Is that true?
  • n

    Nivedita Baliga

    12/23/2022, 4:44 PM
    Another question - can't I pick and choose which columns from the source I want in the destination?
  • d

    Dany Chepenko

    12/23/2022, 5:43 PM
    Any hints on passing the verification process for Facebook Ads? I feel quite confused describing the app functionality, as it's hardly an app from my point of view. This is the feedback for business_management:
    Copy code
    We were unable to approve your request for this permission because the explanation of your app's use case was unclear.
    To resolve this issue, please provide a valid use case with a revised screencast or notes that explain the following items:
    1. Which app function requires the requested permission.
    2. How the requested permission will enhance your app's functionality and integration.
    3. How the requested permission will enhance the end user's experience.
    You should also make sure that the screencast submitted is the correct video for the app before you re-submit for review.
    For more information, you can also view our App Review introduction video and App Review Rejection Guide.
    and this is the feedback for ads_read:
    Copy code
    We found that your app's test credentials did not allow us to fully review the content of the app or there were no test credentials provided for us to use during our review.
    To resolve this issue:
    - If your test credentials do allow access, check that the account is set up properly to provide us with full access and to allow us to reproduce your use case steps.
    - Otherwise, please consider including any applicable test credentials and passwords for our team to use. If a non-facebook user account is required to log into your app, please include those credentials when you re-submit.
    For more information, please visit our App Review Rejection Guides.
    Notes from your reviewer:
    Unfortunately we have not been able to verify the permissions due to unclear use cases and being unable to link ad account.
    
    For the ads_read permission, please for the next submission, include that the app is internal in the use case, and show ads metrics in the screencast.
    
    The use cases for business_management and leads_retrieval are unclear. Please, for the next submission, include the word 'leads' along with a relevant use case. Please rectify these matters for the next submission.
  • t

    Tamas Foldi

    12/23/2022, 7:06 PM
    When I try to use octavia apply, I get the following error message:
    Copy code
    airbyte_api_client.exceptions.ApiTypeError: Invalid type for variable 'non_breaking_changes_preference'. Required value type is NonBreakingChangesPreference and passed type was str at ['non_breaking_changes_preference']
    The CLI and server versions are the same; I'm trying to apply an export back to another Airbyte server. Any clue what could be wrong?
  • r

    Rishabh Jain

    12/23/2022, 11:16 PM
    I am trying to set up a replication slot using the wal2json plugin. I have created the slot and the publication in Postgres and provided the values to Airbyte, but when I try to test the connection I get an error saying “Expected exactly one replication slot but found 0”. I do have a replication slot created in Postgres, so I'm not sure why Airbyte is unable to find it. The pgoutput plugin works perfectly fine with Airbyte. Screenshot below is for wal2json.
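    For reference, a sketch of creating the slot and publication and then listing what Postgres actually has (connection details and names are hypothetical). One thing worth checking: logical replication slots are tied to the database they were created in, so the slot has to exist in the same database the Airbyte source is configured to use.
    Copy code
    # Sketch: create a wal2json slot plus publication, then list existing slots.
    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=airbyte host=localhost password=secret")
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("SELECT pg_create_logical_replication_slot('airbyte_slot', 'wal2json');")
        cur.execute("CREATE PUBLICATION airbyte_publication FOR ALL TABLES;")
        cur.execute("SELECT slot_name, plugin, database FROM pg_replication_slots;")
        for slot_name, plugin, database in cur.fetchall():
            print(slot_name, plugin, database)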
  • s

    Shay Rubach

    12/25/2022, 7:45 AM
    Hello all and happy Xmas. A question: can I run a query on a returned catalog from a source (say MySQL)? Or directly query the source and get a "queried" (filtered) catalog? [edit] I've run into this post and this post and realized it is not supported. What could be my alternatives? Is there a way to write a generic Transformation that would take a query and return a queried catalog? Thanks.
  • m

    Mickaël Andrieu

    12/26/2022, 4:32 AM
    Hi, I have the "so famous" java.sql.SQLException: YEAR error (using the MySQL connector). I know you won't or can't fix it, but I'm wondering how I can reproduce it with more information: I want to know which line(s) and which column(s) are responsible for this error. Any idea? (My skills in Java are ... close to none.)
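    A possible way to narrow it down without touching Java, assuming the error comes from zero or out-of-range values in YEAR columns: list the YEAR columns in the schema and count suspicious values per table. Connection details and the schema name are hypothetical.
    Copy code
    # Sketch: find YEAR columns and count zero/NULL values in each.
    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="root", password="secret", database="mydb")
    cur = conn.cursor()
    cur.execute(
        "SELECT TABLE_NAME, COLUMN_NAME FROM information_schema.COLUMNS "
        "WHERE TABLE_SCHEMA = %s AND DATA_TYPE = 'year'",
        ("mydb",),
    )
    for table, column in cur.fetchall():
        cur.execute(f"SELECT COUNT(*) FROM `{table}` WHERE `{column}` = 0 OR `{column}` IS NULL")
        (count,) = cur.fetchone()
        print(f"{table}.{column}: {count} zero/NULL values")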
  • k

    Kevin Noguera

    12/26/2022, 8:30 AM
    Has anyone here set up the Zendesk Support connector and faced issues with the incremental streams actually doing a full refresh? My current hypothesis is that some streams (the ones working) do increments as expected because their cursor_field is the correct data type (timestamp), while the failing ones do not (integer, string).
  • p

    Pablo Morales

    12/26/2022, 4:07 PM
    Hi everyone! We started a thread about Shopify on GitHub about a month ago: https://github.com/airbytehq/airbyte/issues/19348 The problem is that in the Orders stream, each struct in the line_items array has an attribute called discount_allocations, which is always empty. We are connecting with BigQuery (denormalized). Is anyone having the same problem, or can anyone offer us a solution? Thanks!
  • t

    Temidayo Azeez

    12/26/2022, 4:30 PM
    That's the log file. Thank you!
  • t

    Timam

    12/26/2022, 6:18 PM
    Hi Everyone, I am just getting started with Airbyte. I installed Airbyte on my EKS cluster following https://docs.airbyte.com/deploying-airbyte/on-kubernetes-via-helm/. Where can I find the default values.yaml for the Helm chart?
  • i

    Ignacio Alasia

    12/26/2022, 6:52 PM
    Hi team! We deployed Airbyte (v0.40.17) on Kubernetes using Helm and ran some tests with S3 --> Snowflake and PG --> Snowflake, and they worked. When we try to transfer a big table from PG to SF using CDC (PG connector v1.0.34), the first batch of 324.43 GB and 54,933,278 rows flowed well. But when the connector runs again, the workers fail:
    Copy code
    ERROR i.a.w.g.DefaultReplicationWorker(run):196 - Sync worker failed.
    java.util.concurrent.ExecutionException:io.airbyte.workers.general.DefaultReplicationWorker$DestinationException: Destination process message delivery failed.
    And this other log:
    Copy code
    2022-12-24 01:05:55 ERROR i.a.w.g.DefaultReplicationWorker(run):196 - Sync worker failed.
    java.util.concurrent.ExecutionException: io.airbyte.workers.general.DefaultReplicationWorker$SourceException: Source cannot be stopped
    We are running this on an m6i.xlarge. So, first, does anyone have an idea of what's wrong? Second, I would like to know how Airbyte works behind the scenes when doing CDC: how it uses the workers and how it compares the data already in the destination with the new data. Best, Ignacio.
  • i

    Igor Safonov

    12/26/2022, 7:22 PM
    Hi, I am new to Airbyte and created a toy example of copying data between google ads and databricks lakehouse (deployed on minikube with Helm, v0.40.25). Unfortunately, it causes an error while executing an SQL statement because the column types are absent (I edited it a bit for readability):
    Copy code
    CREATE TABLE <table> (_airbyte_ab_id string, _airbyte_emitted_at string, `campaign.id` , `metrics.clicks` , `segments.date`) USING csv LOCATION '<location>' options ("header" = "true", "multiLine" = "true")
    Is there something wrong with my configuration? Could you please advise what I could look into? UPD: I've done some research. Here is the JSON schema from my logs:
    Copy code
    Json schema for stream usr_igsaf.google_ads_test: {"type":"object","$schema":"http://json-schema.org/draft-07/schema#","properties":{"campaign.id":{"type":["integer","null"]},"campaign.name":{"type":["string","null"]},"segments.date":{"type":["string","null"],"format":"date"},"metrics.clicks":{"type":["integer","null"]},"metrics.conversions":{"type":["number","null"]},"metrics.cost_micros":{"type":["integer","null"]},"metrics.impressions":{"type":["integer","null"]},"user_location_view.country_criterion_id":{"type":["integer","null"]}},"additionalProperties":true}
    Looks like the code in the Databricks integration is not ready to see arrays in the type field:
    Copy code
    final String type = node.get("type").asText();
          schemaString.append(", `").append(header).append("` ").append(type.equals("number") ? "double" : type);
    If I contribute a fix, would it take long until it gets released?
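    For illustration only (Python, not the connector's Java): the handling that seems to be missing is picking the non-null member when the JSON-schema type is an array such as ["integer", "null"] before mapping it to a SQL type.
    Copy code
    # Sketch of the needed logic: accept either a string or a list for "type".
    def sql_type(json_schema_type) -> str:
        if isinstance(json_schema_type, list):
            # drop "null" and take the first concrete type
            json_schema_type = next(t for t in json_schema_type if t != "null")
        return "double" if json_schema_type == "number" else json_schema_type

    print(sql_type(["integer", "null"]))  # -> integer
    print(sql_type("number"))             # -> double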
  • t

    Timam

    12/26/2022, 7:34 PM
    Hi everyone, hope you are doing great. I am new to Airbyte and just installed Airbyte on EKS with Helm. How do we manage users and authentication on Airbyte?
  • s

    Sujith Kumar.S

    12/27/2022, 6:00 AM
    Any plan in the pipeline for the Kafka connector to reach GA? If so, is there an expected time frame?
  • g

    Georges Stephan

    12/27/2022, 7:20 AM
    Hey everyone, I am trying to connect to a GitLab repo as a data source. Unfortunately, the repo I am connecting to uses HTTP, not HTTPS. I edited the streams.py file and changed the return string of the function def url_base(self) -> str: to return a URL that starts with http instead of https. However, by examining the logs, I see that Airbyte still uses HTTPS to connect, although the URL starts with http. Is there anything else I need to change? I am using Airbyte version 0.40.26 running under Docker Compose. Thank you!
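    Two things worth checking, hedged as guesses: the CDK builds request URLs from the stream's url_base property, and any edit to streams.py only takes effect once the connector's Docker image is rebuilt and the source is pointed at the new image tag; otherwise Airbyte keeps running the previously built image. A minimal sketch of an HTTP-only stream (host and paths are hypothetical, not the real GitLab connector code):
    Copy code
    # Sketch: CDK HttpStream whose url_base uses plain HTTP.
    from airbyte_cdk.sources.streams.http import HttpStream


    class ProjectsStream(HttpStream):
        primary_key = "id"

        @property
        def url_base(self) -> str:
            return "http://gitlab.example.com/api/v4/"  # plain HTTP, hypothetical host

        def path(self, **kwargs) -> str:
            return "projects"

        def parse_response(self, response, **kwargs):
            yield from response.json()

        def next_page_token(self, response):
            return None  # no pagination in this sketch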
  • a

    Akilesh V

    12/27/2022, 7:41 AM
    Hi All, I am having an issue upgrading Airbyte from v0.40.3 to v0.40.23. After upgrading, sync is not working and the workspace doesn't show all the connectors belonging to the workspace.
  • n

    Nils de Bruin

    12/27/2022, 8:16 AM
    Hey everyone! I have a Postgres source with incremental syncing (no CDC), which failed after updating the source connector to a version greater than 1.0.30. I am seeing this message in the log:
    Copy code
    Stack Trace: org.postgresql.util.PSQLException: ERROR: syntax error at or near "FROM"
      Position: 30
    and
    Copy code
    "failureOrigin" : "source",
      "failureType" : "system_error",
      "internalMessage" : "org.postgresql.util.PSQLException: ERROR: syntax error at or near \"FROM\"\n  Position: 30",
      "externalMessage" : "Something went wrong in the connector. See the logs for more details.",
      "metadata" : {
        "attemptNumber" : 2,
        "jobId" : 392,
        "from_trace_message" : true,
        "connector_command" : "read"
      },
    I can revert to version 1.0.30 and then the error disappears. Does anyone have the same issue or know what this could be? Thanks!
  • l

    laila ribke

    12/27/2022, 8:27 AM
    Hi all, I'm still working with the Nordigen API. This is an example of the data I will receive as a response from the transactions endpoint:
    Copy code
    "transactions": {
        "booked": [
          {
            "transactionId": "string",
            "debtorName": "string",
            "debtorAccount": {
              "iban": "string"
            },
            "transactionAmount": {
              "currency": "string",
              "amount": "328.18"
            },
            "bankTransactionCode": "string",
            "bookingDate": "date",
            "valueDate": "date",
            "remittanceInformationUnstructured": "string"
          },
          {
            "transactionId": "string",
            "transactionAmount": {
              "currency": "string",
              "amount": "947.26"
            },
            "bankTransactionCode": "string",
            "bookingDate": "date",
            "valueDate": "date",
            "remittanceInformationUnstructured": "string"
          }
        ],
        "pending": [
          {
            "transactionAmount": {
              "currency": "string",
              "amount": "float"
            },
            "valueDate": "date",
            "remittanceInformationUnstructured": "string"
          }
        ]
      }
    }
    There are two objects, "booked" and "pending", each of which contains an array of objects where each object is a transaction. I think I'll start only with the booked ones, but I'm curious what the schema should look like.
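    A hedged sketch of what the schema for the booked transactions could look like, written as the dict a Python CDK stream's get_json_schema() would return (or, equivalently, the contents of a schemas/<stream_name>.json file). Field names follow the sample payload; everything else is an assumption:
    Copy code
    # Sketch: JSON schema for a "booked_transactions" stream (hypothetical name).
    booked_transactions_schema = {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "transactionId": {"type": ["string", "null"]},
            "debtorName": {"type": ["string", "null"]},
            "debtorAccount": {
                "type": ["object", "null"],
                "properties": {"iban": {"type": ["string", "null"]}},
            },
            "transactionAmount": {
                "type": ["object", "null"],
                "properties": {
                    "currency": {"type": ["string", "null"]},
                    "amount": {"type": ["string", "null"]},
                },
            },
            "bankTransactionCode": {"type": ["string", "null"]},
            "bookingDate": {"type": ["string", "null"], "format": "date"},
            "valueDate": {"type": ["string", "null"], "format": "date"},
            "remittanceInformationUnstructured": {"type": ["string", "null"]},
        },
    }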
  • d

    Dimitriy Ni

    12/27/2022, 12:15 PM
    Hi Everyone, hope you are having a great Christmas time 🙂 I have a question regarding the Facebook Marketing connector in Airbyte Cloud. The consumption seems a bit too high. I set up an incremental load, yet it extracts 14k rows with 37 MB every time. It seems like it's looking back much further than just the recent days. Does anyone have experience with this and know how I could change it? Thanks in advance!
  • n

    Nandhakumar M

    12/27/2022, 1:40 PM
    Hi team,
  • n

    Nandhakumar M

    12/27/2022, 1:42 PM
    Hi Team, I am looking to create a pipeline from MySQL to S3/Blob (with Hive catalog). Does Airbyte support dedup with incremental sync, or is CDC the way to go for such use cases? Thanks in advance!
  • n

    Noah Selman

    12/27/2022, 2:18 PM
    Reposting because I think this may have gotten buried over the holiday… Hello! A couple of weeks ago we had a very strange occurrence with our sync from MySQL 5.6 to BigQuery. After a particular “full refresh - overwrite” connection failed to sync 3 consecutive times, subsequent syncs of the same connection began taking double the time to complete. Looking at the logs, the extra time was due to an unexplained gap during normalization: after all the SQL scripts were written, there would be an hour of nothing before any of them started executing. However, dbt would eventually complete successfully. Even more strangely, a later sync of the same connection failed 3 consecutive times - afterward, subsequent syncs once again took the original amount of time. The sync in question is relatively large (5.73 GB) and copies >200 tables. I’ve included a copy of the logs for one run during the period when there was an hour gap in normalization. We’re on Airbyte version 0.40.17 running on GKE. We’re happy that this problem fixed itself, but we would like help identifying what happened here and how we can prevent it from happening again. Thanks!! https://files.slack.com/files-pri/T01AB4DDR2N-F04FX3KT9E3/download/0df738fa_6de8_432d_9aa9_a77e48a47a5f_logs_111_txt.txt?origin_team=T01AB4DDR2N
    0df738fa_6de8_432d_9aa9_a77e48a47a5f_logs_111_txt.txt