# ask-community-for-troubleshooting

    Rachel Bryant

    09/03/2021, 10:09 PM
Hey Airbyte Team, I am trying to use Airbyte to bring in several HubSpot tables, but no matter how I configure the connector I get errors — the connector has tried to run three different times (3 attempts each, so 9 times total) and not one has succeeded. Here are the most recent failure logs. Am I missing something? Is there anyone who can give some guidance on this? Thank you in advance for your help! P.S. I originally tried to normalize the data by using that option in the configuration and got normalization errors, so I removed it. But if normalization is possible, that is what I would prefer.
    logs-347-2.txt

    Daniil

    09/04/2021, 6:23 AM
Hello everyone. Does anyone know what the plans are for schema evolution automation? This feature is very important for us because we have frequent changes in our internally developed source, with weekly releases. It would be very important not to do a full refresh, especially on CDC and SCD2 types of tables. https://github.com/airbytehq/airbyte/issues/4891

    gunu

    09/05/2021, 9:27 AM
I am using the API to add a new table to an existing CDC connection without applying a reset. However, when I go to sync, it reads in this new table from the timestamp of the last sync. Can I override this? Can each table have its own pointer, or does the connection have one timestamp from which to read all tables on each sync?
    ✅ 1

    Osinachi Chukwujama

    09/05/2021, 1:59 PM
Hey folks! Does Airbyte require an external staging area for files before carrying out a migration?
    ✅ 1

    Vyacheslav Voronenko

    09/06/2021, 7:21 AM
Good day! Any hints on how to troubleshoot a Postgres A => Postgres B flow, where I see new records appearing in a system _airbyte table on the target but not appearing in the target table itself? Latest Airbyte.

    Jonas Bolin

    09/06/2021, 7:45 AM
Hi, new to Airbyte, doing a feasibility study for a client. Apologies if this has been asked a 1000 times or is in the wrong channel. Can anyone comment on what the monthly cost would be for running it on the
n1-standard-2
instance recommended in the documentation? The GCP cost estimator says around $50 a month. Does this sound correct to you? We plan to run 15-20 daily jobs pulling marketing data from Google Ads, Analytics, Bing, Facebook, etc., and dumping it into BigQuery.
    ✅ 1
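As a back-of-envelope check of that ~$50/month figure, the numbers below assume an n1-standard-2 on-demand rate of about $0.095/hour (us-central1) and GCP's full sustained-use discount for an instance running the whole month; both rates are assumptions to verify against current GCP pricing:

```python
# Rough monthly cost for a 24/7 n1-standard-2 instance.
# Both rates are assumptions -- check current GCP pricing.
HOURLY_RATE = 0.0950           # USD/hour, on-demand, us-central1 (assumed)
SUSTAINED_USE_DISCOUNT = 0.30  # max discount for a full month of uptime (assumed)
HOURS_PER_MONTH = 730

on_demand = HOURLY_RATE * HOURS_PER_MONTH
with_discount = on_demand * (1 - SUSTAINED_USE_DISCOUNT)
print(f"on-demand:      ${on_demand:.2f}/month")      # roughly $69/month
print(f"sustained use:  ${with_discount:.2f}/month")  # roughly $49/month
```

Under those assumptions the sustained-use price lands near the ~$50/month the cost estimator reports; the job count matters less than instance uptime, since the VM is billed whether or not syncs are running.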

    Jonas Bolin

    09/06/2021, 8:03 AM
Also, can anyone comment on whether the Google Analytics connector supports filters, segments, etc. out of the box? I was looking through the pipelinewise connector's GitHub and couldn't find any mention of filters or segments.
    ✅ 1

    Osinachi Chukwujama

    09/06/2021, 8:20 AM
Hello folks! Could someone please point me to some documentation on migrating partitioned tables from Postgres using Airbyte?

    Phil Marius

    09/06/2021, 10:11 AM
    Where does Airbyte store its logs? It’s maxed out the storage on the EC2 cluster and I would like to clear them out

    Marc

    09/06/2021, 2:36 PM
Hi, we need to extract data from our MariaDB instance, which is configured with mutual SSL authentication (client and server certificates). The MySQL connector works well but fails to connect with SSL activated. The connector UI does not provide any inputs for SSL certificates (unlike, for instance, DBeaver, MySQL Workbench, or ODBC). Is there a way to handle this SSL connection? Maybe by manually adding the certificates to the keystore of the source connector?

    Jiyuan Zheng

    09/06/2021, 6:27 PM
Hi, our company (Bolt) is considering using Airbyte to replicate data from Postgres to BigQuery on an ongoing basis (roughly 500 GB of data, ~50 tables). Ideally, I would love to keep the destination as up to date as possible, so I am setting the sync frequency to 5 minutes. So far I have been testing Airbyte with one connection for the entire job and noticed a few pain points:
• The sync takes a long time: 25 GB/h, so 500 GB would take more than 20 h, I assume.
• I don't have the flexibility to add and change the tables to replicate. Every time I make any change to the existing connection, I need to restart the whole replication from scratch.
• Sometimes sync jobs fail without a clear reason. I find it very hard to debug in general; resetting the connection fixes the issue most of the time, but it again takes a very long time to bring the destination up to date.
Because of the above, I am thinking about setting up multiple connections, but my concerns are:
• Operations: I need to manage a lot of replication slots myself. To create a new connection, I need to manually go to the production DB, create a new replication slot, and use that slot during setup. Any good solution for this (Jenkins, CI/CD, Terraform, or anything else)?
• Monitoring: when replication lags and the WAL logs build up in the source DB, it is hard to identify which connection is problematic. Also, only Slack webhooks are supported now (ideally on-call engineers should be alerted). What/how do you do monitoring?
Thank you so much for spending time reading my long question, and please let me know your thoughts and recommendations. 🙏

    valentinmk

    09/06/2021, 8:53 PM
Hi all! I believe this is a very noob question, but how do I add some "constant"/"fixed" value to data from a source before storing it in a destination (e.g. from a REST API to Postgres)? I receive from the REST API:
    [{"a":"1"},{"b":"2"}]
I want to persist:
    {"a":"1", "super": "hot"}
    {"b":"2", "super": "hot"}
Or maybe Airbyte is not a good place to do this? Is there a cleaner option than hardcoding it in
parse_response
?
    ✅ 1
    👍 1
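One lightweight pattern for this (a plain-Python sketch, not an Airbyte-specific API; the constant values and function name are illustrative) is to merge a fixed dict into every record as the response is parsed:

```python
import json

# Constant fields to inject into every record (illustrative values).
CONSTANT_FIELDS = {"super": "hot"}

def parse_records(response_body: str) -> list:
    """Parse a JSON array and merge the constant fields into each record."""
    return [{**record, **CONSTANT_FIELDS} for record in json.loads(response_body)]

records = parse_records('[{"a": "1"}, {"b": "2"}]')
print(records)  # [{'a': '1', 'super': 'hot'}, {'b': '2', 'super': 'hot'}]
```

In a custom CDK source this merge would typically live in the stream's parse_response, as the question suggests; the alternative is to leave the records untouched and add the constant column downstream with a custom dbt transformation.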

    Osinachi Chukwujama

    09/06/2021, 11:15 PM
Hi folks! What happens when the destination does not support a particular data type from the source? E.g. Snowflake lacks support for PostgreSQL user-defined types. How can one migrate such data?

    Davin Chia (Airbyte)

    09/07/2021, 12:42 AM
    @Cesarvspr here is the documentation for using Airflow + Airbyte: https://docs.airbyte.io/operator-guides/using-the-airflow-airbyte-operator

    Reddy Reddy

    09/07/2021, 5:22 AM
Can anyone help me with user management and the login screen in Airbyte?

    Peter Petrik

    09/07/2021, 8:52 AM
For transformations, is it possible to use a private repo (so I have to use a token/user+pass)?

    Peter Petrik

    09/07/2021, 8:52 AM
Second question: do you have a marketplace with common transformations (e.g. Mailchimp, GitHub, etc.)?

    Sergey

    09/07/2021, 10:48 AM
Can Airbyte combine data from different sources by key?

    Vyacheslav Voronenko

    09/07/2021, 11:31 AM
Following my initial experiment, let me ask a dumb question. Is it possible with Airbyte to get a result similar to native DB replication for a set of tables, i.e. exact table definitions with data types for the main fields, exact foreign keys, etc.? Or should I look into native database replication instead?

    Gleidson Campos

    09/07/2021, 1:12 PM
Hi everyone, I'm starting with Airbyte, collecting data from HubSpot to Postgres (and tested MySQL as well). With Postgres I'm getting an error on 3 tables, related to time zones:
invalid input syntax for type timestamp with time zone: ""
I'm only using basic normalization. Is there a way to resolve this without writing my own normalization?
    ✅ 1
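That error usually means the source is emitting empty strings where the destination column is timestamptz: Postgres rejects '' for timestamp with time zone but accepts NULL. As a hedged sketch of the kind of pre-cleaning that avoids it (the function and field names here are illustrative, not part of Airbyte's normalization):

```python
# Replace empty strings with None for timestamp fields, since Postgres
# rejects '' for "timestamp with time zone" but accepts NULL.
TIMESTAMP_FIELDS = {"createdate", "closedate"}  # illustrative field names

def clean_record(record: dict) -> dict:
    """Return a copy of the record with empty timestamp strings nulled out."""
    return {
        key: None if key in TIMESTAMP_FIELDS and value == "" else value
        for key, value in record.items()
    }

cleaned = clean_record({"dealname": "Acme", "closedate": ""})
print(cleaned)  # {'dealname': 'Acme', 'closedate': None}
```

The same NULLIF-style fix can also be applied in the warehouse with a custom dbt model instead of touching the sync itself.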

    Osinachi Chukwujama

    09/08/2021, 5:30 AM
Hi folks! I'm getting repeated failed syncs from PostgreSQL to Snowflake. The logs do not show any error. Can you please tell me what's wrong? Attached is the log file.
    failing-sync-logs.txt
    👀 1
    ✅ 1

    Osinachi Chukwujama

    09/08/2021, 5:35 AM
Also, when adding new tables to be synced, is it advisable to leave the previous tables' sync mode as
    Full refresh | Append
    ?
    ✅ 1

    Osinachi Chukwujama

    09/08/2021, 6:36 AM
Just to add: when a new sync occurs, the tables are created on Snowflake, but no data transfer occurs.
    ✅ 1

    Andrey Morskoy

    09/09/2021, 6:52 AM
Dear team, when I am in the k8s dashboard inspecting logs for worker pods (in this case the normalization worker, but the same happens with the File CSV reader worker pod), there is just a single line:
Using existing AIRBYTE_ENTRYPOINT: /airbyte/entrypoint.sh
Is there any chance to find more logs for the worker?
    ✅ 1

    Devon Seitz

    09/09/2021, 10:34 PM
Hey there, I think I remember reading at some point that Airbyte does not dedup data based on a key, and strictly loads the necessary columns to dedup later using something like dbt. I tried to find that reference again in the docs but failed to. Am I remembering that wrong?
    ✅ 1

    Xander Porterfield

    09/10/2021, 8:51 PM
Hello! I am trying to figure out the best way to grab JSON data from an HTTP API. The API requires the URL query parameters date_from and date_to. Would I need to create a custom source connector, or is there another method I am missing?
    ✅ 1
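A custom CDK source can attach query parameters to each request (via the stream's request_params hook). Whether inside a connector or in a one-off script, the core logic is splitting the date range into windows and encoding the parameters; a stdlib-only sketch, with the base URL and function names purely illustrative:

```python
from datetime import date, timedelta
from urllib.parse import urlencode

def daily_windows(start: date, end: date):
    """Yield (date_from, date_to) pairs, one per day, for incremental pulls."""
    current = start
    while current <= end:
        yield current.isoformat(), current.isoformat()
        current += timedelta(days=1)

def request_url(base: str, date_from: str, date_to: str) -> str:
    """Encode the API's required query parameters into a request URL."""
    return f"{base}?{urlencode({'date_from': date_from, 'date_to': date_to})}"

urls = [request_url("https://api.example.com/events", f, t)
        for f, t in daily_windows(date(2021, 9, 1), date(2021, 9, 3))]
print(urls[0])
# https://api.example.com/events?date_from=2021-09-01&date_to=2021-09-01
```

Day-sized windows also give a natural cursor for incremental syncs: the last date_to fetched can be persisted as state and used as the next run's starting point.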

    Brad

    09/11/2021, 10:27 AM
    Hey - I'm just getting started with airbyte and can't quite seem to figure something out. I am using a custom dbt transform with a private git repo. In this repo I have updated the packages.yml file to use a git repo resembling the following (- git: "https://{{env_var('PERSONAL_ACCESS_TOKEN')}}@dev.azure.com/myproject/_git/awesome_repo"). What I can't figure out is how I can pass the PERSONAL_ACCESS_TOKEN as an environment variable to the docker image that is created to do the dbt transform. I really don't want to embed the token within my git repo. Any suggestions would be appreciated!

    Martin Larsson

    09/11/2021, 11:00 PM
I'm stuck. I can't install dependencies for the Python CDK (https://docs.airbyte.io/connector-development/tutorials/cdk-tutorial-python-http/2-install-dependencies). I get an error:
(virt_env) martin@MacBook-Pro source-provet % pip install -r requirements.txt
    Obtaining file:///Users/martin/Repositories/airbyte/airbyte-integrations/bases/source-acceptance-test (from -r requirements.txt (line 1))
    Obtaining file:///Users/martin/Repositories/airbyte/airbyte-integrations/connectors/source-provet (from -r requirements.txt (line 2))
    Collecting airbyte-cdk~=0.1 (from source-acceptance-test==0.0.0->-r requirements.txt (line 1))
      Could not find a version that satisfies the requirement airbyte-cdk~=0.1 (from source-acceptance-test==0.0.0->-r requirements.txt (line 1)) (from versions: )
    No matching distribution found for airbyte-cdk~=0.1 (from source-acceptance-test==0.0.0->-r requirements.txt (line 1))
I have upgraded Airbyte from 0.16 to 0.29 and deleted all images and volumes, then generated a brand-new connector from the Python REST API Source. I am using Visual Studio Code and running the pip install command in a terminal inside VS Code. I have not touched the requirements.txt file after generating the connector.
    ✅ 1

    Anatoliy Zhyzhkevych

    09/12/2021, 5:09 AM
I just installed Airbyte 0.29.17-alpha on an Azure Linux VM running RHEL 8.4, using podman-compose. The pod with the services started and the UI works, but I cannot define any source or destination. I'm getting errors similar to: io.airbyte.workers.WorkerException: Error while getting spec from image airbyte/source-mssql:0.3.5. "podman images" shows that image from docker.io. Any advice?
    👀 1

    mukul singla

    09/13/2021, 5:55 AM
OS Version / Instance: GKE
Memory / Disk: 2 nodes of 2 CPU / 4 GB
Deployment: Kubernetes
Airbyte Version: 0.29.17-alpha
Source name/version: File
Destination name/version: BigQuery
Step: Using dbt transformation
Description: I need help sorting out a dbt transformation issue. I have just started using Airbyte and was successful in using connectors to send data from CSV to BigQuery, but I am facing an issue with the dbt transformation. Please find the log file attached; I am not able to understand the issue. Any help will be highly appreciated.