# ask-community-for-troubleshooting

  • Nelson Rafael Perez
    10/24/2022, 3:47 PM
    Hello Team, we are starting with Airbyte and we are stuck on parallelism. We deployed Airbyte on Kubernetes and, following this doc, we added those variables to the worker deployment. Despite this, we still have one pod for the source and one pod for the destination, and the data transfer always takes the same time; there is no improvement. Our question is: how is parallelism handled by Airbyte? Should we have more pods for the source/destination?

  • Geoffrey Garcia
    10/24/2022, 5:21 PM
    Hi team. I am currently using Airbyte with the File source module to populate a Postgres destination. The sync mode chosen is 'Full refresh | Overwrite'. I notice after each file sync that the views I have created on the db tables are removed. It makes me wonder if this sync mode somehow proceeds with a drop table with cascade: if so, how can I proceed to get the job done ('cancel & replace') without dropping the tables and my views? Or is this a feature that would have to be implemented? Thx 👍
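    If the destination's overwrite step does drop and recreate the raw table with CASCADE (which would explain the disappearing views, though that is a guess), the usual workaround is to re-create the dependent views after each sync rather than relying on the sync to preserve them. A minimal sketch, with made-up table/view names and connection details:

    # Hypothetical post-sync step: re-apply view definitions that a cascading
    # DROP TABLE would have removed. All names and the DSN are placeholders.
    import psycopg2

    VIEW_DDL = """
    CREATE OR REPLACE VIEW analytics.active_users AS
    SELECT id, email
    FROM public.users
    WHERE is_active;
    """

    def recreate_views(dsn: str) -> None:
        # psycopg2's connection context manager commits on successful exit.
        with psycopg2.connect(dsn) as conn:
            with conn.cursor() as cur:
                cur.execute(VIEW_DDL)

    if __name__ == "__main__":
        recreate_views("postgresql://user:password@localhost:5432/warehouse")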

  • Kevin Phan
    10/24/2022, 6:32 PM
    Hey folks, if we do S3 as a destination via IRSA, which deployment would need access to S3? Intuition says airbyte-worker.

  • Rahul Borse
    10/24/2022, 7:17 PM
    Hi, if a record has been deleted in the source (Postgres), can we normalize that in the destination (S3)? Just wanted to understand how this works.

  • Dan Siegel
    10/24/2022, 7:38 PM
    What are the specific permissions the database user for the Redshift target needs? I am able to connect with a superuser, but my created user, which has create schema and create table privileges, is timing out with no log of what happened. It runs Redshift queries but ultimately never connects. It's 100% user-based, because the superadmin works.
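    The exact requirements depend on the destination version, but a dedicated (non-superuser) Redshift load user typically needs rights to create schemas, use/create in the target schema, and create temp tables. A hedged sketch, with placeholder database, schema, and user names:

    # Illustrative grants for a dedicated Redshift load user; names are
    # placeholders and the exact set may differ for your destination version.
    import psycopg2

    GRANTS = [
        "GRANT CREATE ON DATABASE analytics TO airbyte_user;",        # create schemas
        "GRANT USAGE, CREATE ON SCHEMA airbyte_raw TO airbyte_user;",  # write to target schema
        "GRANT TEMP ON DATABASE analytics TO airbyte_user;",          # staging/temp tables
    ]

    with psycopg2.connect(
        host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="analytics", user="admin", password="****",
    ) as conn, conn.cursor() as cur:
        for grant in GRANTS:
            cur.execute(grant)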

  • Abhaya Shrestha
    10/24/2022, 7:40 PM
    Hello there, based on the documentation I can see that using alpha connectors in production is strongly discouraged. However, I also see that these alpha releases rely on early feedback and issues reported by early adopters. Given that, I'm wondering if we can use some of the alpha connectors in production, essentially acting as the guinea pig? Also, is anyone using the alpha connectors this way? Thank you.

  • Aazam Thakur
    10/24/2022, 8:11 PM
    Hi folks, I was going through the Low Code Tutorial and got stuck at connecting to the Exchange Rates API. I do not understand what the error relates to exactly, as I cross-checked my YAML files. Any help would be greatly appreciated!

  • Rahul Borse
    10/24/2022, 8:56 PM
    Hi Team, in the destination S3 bucket I had a file where the last project id in the list is 81. The connection is an incremental refresh on the creation_date field of the source table. Later on I added another project, so when I sync the connection again the new bucket file should only contain project 82. But somehow when I sync again it contains the previous project 81 along with the new project 82. Please find below screenshots of both files: the first is the existing bucket file and the second is the new file. The same is happening with subsequent files; each new file contains the previous project id record. Please help me understand this issue.
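    If this matches Airbyte's documented incremental behavior, the cursor comparison is inclusive (cursor_field >= the last saved cursor value), so the most recently synced record is intentionally re-emitted on the next run and de-duplication is left to the destination side. A toy illustration of that semantics (not Airbyte code, data made up):

    from datetime import date

    # State saved after the first sync, when project 81 was the newest record.
    saved_cursor = date(2022, 10, 20)

    rows = [
        {"project_id": 81, "creation_date": date(2022, 10, 20)},
        {"project_id": 82, "creation_date": date(2022, 10, 24)},
    ]

    # An inclusive ">=" filter keeps project 81 in the next sync alongside 82.
    next_batch = [r for r in rows if r["creation_date"] >= saved_cursor]
    print(next_batch)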

  • Kevin Phan
    10/24/2022, 9:11 PM
    Hey All 👋 I feel like I've asked this question in multiple increments, so maybe it's best if I just summarize it in one post. What I am trying to do is create S3 as a destination. The method now is to create an IRSA role that inherits the permissions to access a specific bucket. This is linked to the OIDC provider that already exists. The next step was creating a PR in our forked Airbyte repo to link the pod and pod service account (airbyte-admin) to OIDC. Effectively, if Airbyte supports IRSA this should work. My questions are:
    • I am running on 0.35.28-alpha, which asks for non-empty values for the AWS access key ID and secret key. Is this version capable of supporting IRSA for S3 as a destination?
    • Which deployment is making the call to S3? I would think it is airbyte-worker.
    • Our prod instance is failing to connect to the S3 logging bucket. Would that be configured the same way, i.e. allow the K8s service account access to the logging bucket? Thanks in advance!
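    Assuming the service-account annotation and IAM role are in place, one quick way to see whether a pod (e.g. airbyte-worker) is actually picking up IRSA credentials is to ask STS who it is and try a minimal S3 call from inside that pod. A sketch with a placeholder bucket name:

    # If IRSA is wired up, STS should report the assumed web-identity role and
    # the S3 call should succeed without static keys. Bucket name is a placeholder.
    import boto3

    sts = boto3.client("sts")
    print(sts.get_caller_identity()["Arn"])   # expect ...:assumed-role/<irsa-role>/...

    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket="my-airbyte-bucket", MaxKeys=1)
    print(resp["KeyCount"])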

  • Siddhant Singh
    10/24/2022, 9:26 PM
    Hi, I have a question regarding basic normalization. I have a jsonb column which I need to flatten. The Basic Normalization documentation mentions that it uses dbt under the hood. So my question is: do we have to define a schema to make this work, or can it do it automatically without a schema?

  • Jeff Skoldberg
    10/25/2022, 2:06 AM
    I was surprised that Airbyte does not have dbt packages listed here: https://hub.getdbt.com/. Is Airbyte looking to get listed on the hub? Is there another place I should look?

  • Trevor Wuerfel
    10/25/2022, 4:13 AM
    OK, so I can't get a custom dbt transformation to work for replicating from Postgres to BigQuery... if anyone sees this, I could use some help. The transformation just keeps failing and the errors are just not helpful. I'm really new to dbt and Airbyte, so any help would be much appreciated.

  • Kristina Ushakova
    10/25/2022, 4:37 AM
    Hello, team! TL;DR: our replication slot is constantly growing.
    Our setup: Airbyte Open Source deployed on an EC2 cluster. We have a main data source - Postgres in RDS - and a main destination - Redshift. We have already set this pipeline up in Fivetran, and now we are trying to connect Airbyte as well. The Fivetran connector was set up using a replication slot with the 'test-decoding' plugin. For Airbyte, we have created a new replication slot using the pgoutput plugin, connected to a publication registered for all the tables that are being synced to Fivetran.
    It's recommended on Airbyte's GitHub to set up frequent syncs so that the replication slot is cleared regularly, but however frequent the syncs are, the replication slot just keeps growing and is not being cleared. From our investigation, the replication slot seems to have been stuck on the same LSN pointer (see screenshot) for the last several days, and as a workaround we have to clear the slot manually by recreating it 🙃
    How can we avoid this workaround? Maybe there is something we have to change in the connector setup? One of the suggestions we have found is setting up the heartbeat ping from Debezium to Postgres, so that WAL files are marked as processed in the database. How can we check/manipulate the Debezium parameters? Which file contains that setup? Please tag me for more info - happy to provide more details if necessary. Version info: Airbyte (4.0.17), Postgres connector (1.0.19). Tagging my colleague @Alexander Fligin
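    One way to confirm whether the slot's confirmed LSN is actually advancing between syncs is to compare it with the current WAL position (PostgreSQL 10+). A small sketch; the connection string is a placeholder:

    # Show how far each replication slot lags behind the current WAL position.
    import psycopg2

    QUERY = """
    SELECT slot_name,
           confirmed_flush_lsn,
           pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag
    FROM pg_replication_slots;
    """

    with psycopg2.connect("postgresql://airbyte:****@rds-host:5432/mydb") as conn:
        with conn.cursor() as cur:
            cur.execute(QUERY)
            for slot_name, lsn, lag in cur.fetchall():
                print(slot_name, lsn, lag)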

  • Vishwajeet Dabholkar
    10/25/2022, 6:06 AM
    Hey guys, I want to create the source and destination using the APIs, but I couldn't find any documentation for it. I have this link, which I got from the API documentation: https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/create. But there is nothing specific about creating a particular source, for example an API request to create S3 or any other source. Could someone please help me out here?

  • Vishwajeet Dabholkar
    10/25/2022, 6:07 AM
    The sample request which I tried was something like this:
    {
      "sourceDefinitionId": "69589781-7828-43c5-9f63-8925b1c1ccc2",
      "connectionConfiguration": {
        "s3_endpoint": "",
        "access_key_id": "xxxx",
        "s3_bucket_name": "abcd",
        "s3_bucket_path": "raw",
        "s3_bucket_region": "region",
        "secret_access_key": "xxxx"
      },
      "workspaceId": "7a2f3b32-a061-4e9d-a42e-75c879d1c7db",
      "name": "S3-API-TEST"
    }
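    For reference, a sketch of sending that body to the endpoint from the linked docs. The host/port assume a default local OSS deployment, and the basic-auth credentials are placeholders (adjust or drop them for your deployment):

    # Sketch: POST the request body above to /api/v1/sources/create.
    # Host, port and credentials are assumptions about a local OSS deployment.
    import requests

    payload = {
        "sourceDefinitionId": "69589781-7828-43c5-9f63-8925b1c1ccc2",
        "workspaceId": "7a2f3b32-a061-4e9d-a42e-75c879d1c7db",
        "name": "S3-API-TEST",
        "connectionConfiguration": {
            "s3_endpoint": "",
            "access_key_id": "xxxx",
            "secret_access_key": "xxxx",
            "s3_bucket_name": "abcd",
            "s3_bucket_path": "raw",
            "s3_bucket_region": "region",
        },
    }

    resp = requests.post(
        "http://localhost:8000/api/v1/sources/create",
        json=payload,
        auth=("airbyte", "password"),
    )
    resp.raise_for_status()
    print(resp.json())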

  • Vishwajeet Dabholkar
    10/25/2022, 6:08 AM
    But it gave me the error above.

  • Caleb Ejakait
    10/25/2022, 8:11 AM
    Hey! I am using Prefect 1.0 as my orchestration tool at the moment, and I am keen to understand how I can integrate an Airbyte instance on EC2 with a Prefect Cloud deployment that uses ECS Fargate for execution. In line with the security suggestions, the instance doesn't allow inbound traffic from the internet. My only other idea was to create an ALB for Fargate so I have an IP I can configure in the EC2 instance's inbound rules. Would love to hear if there is a solution that doesn't involve adding infra elements, or if there is just a better way to handle this.

  • gunu
    10/25/2022, 10:54 AM
    Throwing a discussion out there - anyone doing anything interesting on the monitoring of airbyte jobs front?

  • Rahul Borse
    10/25/2022, 12:00 PM
    Hi Team, how can I create a new workspace and switch between workspaces? According to the Airbyte documentation (https://docs.airbyte.com/cloud/managing-airbyte-cloud/) I should see an Access Management setting in the Settings tab, but I cannot see any option like that. Below is a screenshot of the settings on my locally running Airbyte.
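    The Access Management screen described in that page appears to be an Airbyte Cloud feature. On a self-hosted instance, workspaces can at least be listed and created through the configuration API, though the OSS UI may only surface one of them. A sketch against a local deployment (host and credentials are assumptions):

    # Sketch: list and create workspaces via the configuration API.
    # Host/port and credentials are placeholders for a local OSS deployment.
    import requests

    BASE = "http://localhost:8000/api/v1"
    AUTH = ("airbyte", "password")

    print(requests.post(f"{BASE}/workspaces/list", json={}, auth=AUTH).json())

    created = requests.post(
        f"{BASE}/workspaces/create",
        json={"name": "analytics-team", "email": "admin@example.com"},
        auth=AUTH,
    )
    created.raise_for_status()
    print(created.json()["workspaceId"])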

  • Svatopluk Chalupa
    10/25/2022, 12:13 PM
    Hi, I'm building a postgres2postgres sync. If I synced some of our very large tables (tens of GB), even in incremental mode the initialization would take several days, and every time we changed the underlying structure, the sync refresh would take several days again. If I already have existing data in the destination Postgres, is it possible to use it to "pre-fill" the target structures and avoid a complete refresh of the sync? In other words, can I initialize the incremental sync manually and let Airbyte do just the future increments? Thanks!

  • Anton Peniaziev
    10/25/2022, 12:17 PM
    Hi Team, I'm trying to understand how Airbyte leverages dbt to transfer data from one warehouse to another, and whether there are changes made to dbt-core or it is used as-is. Maybe someone could point me to the right place in the code? Thanks 🙂

  • ni
    10/25/2022, 1:56 PM
    Hi, I have a couple of connections configured on my Airbyte instance; however, they are not being triggered to run on their scheduled interval (1 hour). The Docker logs do not show any errors or any other indication that something is wrong. Any suggestions?
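    One way to narrow this down is to trigger a connection manually through the API: if the manual job runs fine, the problem is likely in the scheduler rather than the sync itself. A sketch; the host, credentials, and connectionId are placeholders:

    # Sketch: trigger one connection manually to separate "scheduler not firing"
    # from "sync failing". All identifiers/credentials here are placeholders.
    import requests

    resp = requests.post(
        "http://localhost:8000/api/v1/connections/sync",
        json={"connectionId": "00000000-0000-0000-0000-000000000000"},
        auth=("airbyte", "password"),
    )
    resp.raise_for_status()
    print(resp.json()["job"]["status"])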

  • Dwayne Rudy
    10/25/2022, 2:11 PM
    Based on the descriptions in the MySQL connector I thought this might have been fixed, but it hasn't: Airbyte has trouble processing invalid date formats. Airbyte version: 0.40.15, Snowflake connector: 0.4.38, MySQL connector: 1.0.6. The relevant parts of the log are attached - Airbyte throws an error when attempting to process a date with an invalid month, for example '2021-00-03'. Is there a fix for this? For any other invalid date format it appears to null the value (which is expected).
    a4aa250e_dd2f_4970_b578_82ae3935d63a_logs_3612_txt.txt

  • Dusty Shapiro
    10/25/2022, 2:25 PM
    I am curious: is there value in using the Octavia CLI to export workspace JSONs to check into version control if we're deploying Airbyte with an external app DB?

  • Brandon
    10/25/2022, 2:40 PM
    Am I missing an option to start pulling data only after a certain point in time, or is it always a full data sync? I would like to limit the data to after 2016, or I can get the lowest ID within that range. In this particular case I am using a MySQL connection. If there is currently no option, do you know if this is planned functionality?
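    There is no built-in "start from this date" option for the initial sync that I know of; a common workaround is to expose a filtered view in the source database and add that view to the connection instead of the raw table. A sketch with made-up table, column, and connection details:

    # Sketch of the view-based workaround (not an Airbyte feature). The table,
    # column, and connection details are placeholders.
    import mysql.connector

    DDL = """
    CREATE OR REPLACE VIEW orders_since_2016 AS
    SELECT *
    FROM orders
    WHERE created_at >= '2016-01-01';
    """

    conn = mysql.connector.connect(host="mysql-host", user="root", password="****", database="shop")
    cur = conn.cursor()
    cur.execute(DDL)
    conn.close()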

  • maxim
    10/25/2022, 2:47 PM
    Heeey! Can anyone tell me why Airbyte defines the Postgres integer type as number? Does any issue exist for this?

  • Stratos Giouldasis
    10/25/2022, 2:55 PM
    Hello, I'm trying to connect a DigitalOcean Managed MongoDB which has a DNS seed list connection string. I have tried:
    • connecting as standalone by passing the hostname of the first server in the replica set
    • connecting as a replica set by passing the hostnames (I only have 1 server in the replica set, so only one hostname)
    • connecting as MongoDB Atlas and just providing hostname/dbname?tls=true&authSource=admin&replicaSet=db-mongodb-shared

  • god830
    10/25/2022, 3:40 PM
    https://discuss.airbyte.io/t/hubspot-api-key-incoming-deprecation/3018 Please help 👋

  • Marissa Pagador
    10/25/2022, 3:58 PM
    Hi folks, I'm running into an issue I've never had before when adding a custom connector in the Airbyte UI under Settings > Sources > New connector. For reference, I am on version 0.40.17 running on an AWS EC2 instance. I am filling in all the fields, including the correct (public) Docker repository name and Docker image tag, but getting an "Internal Server Error: get spec job failed" message. I included a screenshot and the server logs where you can see this error message at the bottom of the text file, but I am unsure what went wrong based on the message. Any clarification or insight into this would be appreciated!
    server-logs (1).txt
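    A "get spec job failed" message usually points to the platform not being able to pull the image or get a valid spec out of it, so a quick sanity check is to run the connector's spec command directly on the EC2 host. A sketch; the image name and tag are placeholders:

    # Sketch: run the connector image's "spec" command locally; a healthy
    # connector prints an Airbyte protocol SPEC message. Image name is a placeholder.
    import subprocess

    image = "myorg/source-custom:0.1.0"
    result = subprocess.run(
        ["docker", "run", "--rm", image, "spec"],
        capture_output=True, text=True,
    )
    print(result.returncode)
    print(result.stdout[:500])   # expect a line containing {"type": "SPEC", ...}
    print(result.stderr[:500])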