# ask-community-for-troubleshooting

    laila ribke

    11/07/2022, 2:26 PM
Hi all! Is it possible to transform data when using normalization and incremental sync on a specific connection? If so, can someone help me?

    laila ribke

    11/07/2022, 2:29 PM
Another question: where are normalization and custom transformations executed, source or destination? Because my destination is Redshift, and it's too pricey.

    AJ

    11/07/2022, 3:02 PM
Hi, can someone please tell me which endpoints Airbyte uses to download the connector images? I need to submit a firewall request, and it seems like I cannot download them in my closed environment. Also, is there a manual way to download the images and hand them to Airbyte to avoid such blockers? I was able to download the airbyte/connector image successfully and restarted Airbyte, but it still cannot recognize it.
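Airbyte pulls connector images from Docker Hub, so the hosts to allow-list are typically Docker Hub's registry endpoints (e.g. registry-1.docker.io and auth.docker.io). For the manual route, here is a minimal sketch of an air-gapped transfer using the docker Python SDK; the connector name and tag are just examples:
```python
# Sketch: move a connector image into a closed environment.
# Assumes `pip install docker` and a hypothetical connector tag.
import docker

client = docker.from_env()

# On a machine with internet access: pull the connector image...
image = client.images.pull("airbyte/source-postgres", tag="1.0.22")

# ...and dump it to a tarball you can carry across the firewall.
with open("source-postgres.tar", "wb") as f:
    for chunk in image.save(named=True):
        f.write(chunk)

# Inside the closed environment: load the tarball into the Docker
# daemon that Airbyte uses to launch connectors.
with open("source-postgres.tar", "rb") as f:
    client.images.load(f.read())
```
One common gotcha: the loaded image's name and tag must exactly match the docker repository and version in the connector definition, otherwise Airbyte won't recognize it.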

    Alexandre Voyer

    11/07/2022, 3:47 PM
Hi there, just an odd behaviour I've noticed: the original sync worked well, but the next syncs are showing “0 bytes | no records | no records”. This is odd, since the database has definitely changed since then. Is there any documentation for troubleshooting this? The syncs show as successful. Source: PostgreSQL. Destination: Snowflake. All tables have a PK.

    Faris

    11/07/2022, 3:54 PM
I am having an issue connecting my Postgres destination. All my infrastructure (Airbyte, the Postgres DWH, and the production read replica) is in the same VPC. Creating the source connection with the read replica was straightforward, but my destination, which is also Postgres, is failing with this "non-json response" error. I don't have SSH or JDBC parameters required.

    Ameer Hamza

    11/07/2022, 5:07 PM
Hi there, I am running into an error while running the ./generate.sh command.

    Andrés Muro

    11/07/2022, 5:10 PM
Hi everyone, quick question: we're using Airbyte Cloud, but we have realized that we need to sync data every 15 minutes. Right now, it seems like Cloud doesn't allow schedules that sync more than once per hour. Any insights on a possible workaround for this? Using the REST API or integrating with Airflow might be possible solutions, but those are not supported for Cloud either. Thanks for your help!

    laila ribke

    11/07/2022, 5:21 PM
Hi all, I have set up a Google Ads -> S3 connection, which works OK (I see the file in the bucket). Then I set up another connection with an S3 source and a Redshift destination, in which I intend to take the S3 file, normalize it, and send it to Redshift. It succeeds but syncs no records. Which output format should I use?

    Brian Castelli

    11/07/2022, 7:26 PM
Hey, team! I have Airbyte deployed on Kubernetes. How do I upgrade it with the latest Airbyte code? I tried:
• git pull to get the latest Airbyte code
• kubectl apply -k kube/overlays/stable to push the code out to the pods
But I don't see the new sources and destinations in the GUI. What step am I missing?

    Andrew Exlet

    11/07/2022, 8:36 PM
I'm getting a timeout error trying to bring back the schema from an Amazon Aurora MySQL source when setting up a new connection:
airbyte-proxy | 2022/11/07 20:24:52 [error] 13#13: *1 upstream timed out (110: Connection timed out) while reading response header from upstream, client: ..., server: , request: “POST /api/v1/sources/discover_schema HTTP/1.1”, upstream: “http://*...:80/api/v1/sources/discover_schema”, host: “...:8000”, referrer: “http://...*:8000/workspaces/*****-7216-4203-abbc-****/connections/new-connection”
Is there any way I can increase the timeout value on the connection?
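The timeout in that log comes from the nginx-based airbyte-proxy container, not from the source itself. One possible stopgap, shown as a sketch that assumes the default docker-compose layout where airbyte-server also listens on port 8001 (the source ID is a placeholder), is to call the server directly with a generous client-side timeout:
```python
# Sketch: bypass the proxy's upstream timeout for a slow schema discovery.
import requests

resp = requests.post(
    "http://localhost:8001/api/v1/sources/discover_schema",  # server port, not the proxy's 8000
    json={"sourceId": "00000000-0000-0000-0000-000000000000"},  # placeholder source ID
    timeout=600,  # generous client-side timeout for a large schema
)
resp.raise_for_status()
print(resp.json().get("catalog"))
```
The more permanent fix would be raising the proxy's own nginx limits (proxy_read_timeout and friends) in the airbyte-proxy configuration.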

    Avery Yip

    11/07/2022, 8:40 PM
Any ideas on how I can debug this issue? If there are any Airbyte contractors here, we're open to contracting folks to help us solve this problem in the short term. https://github.com/airbytehq/airbyte/issues/18253

    Paul Rus

    11/07/2022, 9:00 PM
Hello, I'm trying to sync a large table of about 6 GB from BigQuery to Postgres; however, I've run into an error right near the end of the sync.

    João Larrosa

    11/07/2022, 9:01 PM
Hi, guys! How can I run 'dbt deps' and 'dbt run' in an Airbyte custom transformation? It always tells me that I have not installed the necessary packages listed in packages.yml. Thank you.

    Dan Cook

    11/07/2022, 9:07 PM
The Google Search Console data in our warehouse, for the Airbyte custom report whose JSON appears below, has never amounted to more than 50,484 records in a single day (see attached screenshot). But we know from Google's API that our daily total for this set of dimensions has gone as high as 155K records. So I suspect there is a limitation in the Airbyte GSC connector that stops new requests soon after a daily limit of 50K rows has been hit. I've checked the source code for the connector and nothing obvious sticks out.
    {
      "name": "KEYWORD_PAGE_REPORT",
      "dimensions": [
        "date",
        "country",
        "device",
        "query",
        "page"
      ]
    }
Software: Airbyte OSS (0.40.15) | Connector: Google Search Console (0.1.18) | Destination: Snowflake (0.4.38)
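For what it's worth, Google's Search Analytics API caps each response at 25,000 rows and expects the client to page with startRow, so a connector that stops after two pages would plateau right around the 50K you're seeing. A quick way to check what the API itself returns for one day, as a sketch assuming google-api-python-client with OAuth credentials already set up (the site URL is a placeholder):
```python
# Sketch: count the rows the Search Analytics API returns for a single day.
from googleapiclient.discovery import build

SITE_URL = "https://example.com/"  # placeholder property
service = build("searchconsole", "v1", credentials=creds)  # creds: your OAuth2 credentials

total, start_row = 0, 0
while True:
    response = service.searchanalytics().query(
        siteUrl=SITE_URL,
        body={
            "startDate": "2022-11-01",
            "endDate": "2022-11-01",
            "dimensions": ["date", "country", "device", "query", "page"],
            "rowLimit": 25000,      # the API's hard per-request cap
            "startRow": start_row,  # pagination offset
        },
    ).execute()
    rows = response.get("rows", [])
    total += len(rows)
    if len(rows) < 25000:           # a short page means we've reached the end
        break
    start_row += 25000

print(f"API reports {total} rows for the day")
```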

    xiang chen

    11/08/2022, 12:44 AM
Hi all, I am a beginner with Airbyte. I would like to ask a question about deployment: if I have 3 VMs, is there any way to deploy a highly available Airbyte service? I currently only see local-Docker mode and k8s mode in the documentation.

    Alexandre Voyer

    11/08/2022, 1:32 AM
Hi there, I'm trying to see if I can set the cursor for my Postgres source as described in https://docs.airbyte.com/understanding-airbyte/connections/incremental-append/#inclusive-cursors, but I'm not seeing a dropdown like in the docs. Does this mean the cursor is automatically the PK (which would be fine)?

    Matt Webster

    11/08/2022, 2:06 AM
Hi all, I'm a little confused about how the S3 source works under different sync modes. I have a connection set to “Full Refresh | Overwrite” that runs every hour and matches .csv files in a certain bucket. It seems to load data from those CSV files regardless of whether or not they've been updated: it says it has emitted 139K records every time. The files CAN and SHOULD be overwritten when the data needs to change, and I'd like the data in my target to be overwritten every time a file is overwritten. How can I do this? If I switch to “Incremental | Deduped” it's also going to read all the file contents, so I'm worried that this process won't scale as the files pile up.

    Agung Pratama

    11/08/2022, 4:37 AM
Hi, I am currently exploring Airbyte (the open-source version) to sync various data into our data warehouse. I understand that it has manual and cron scheduling for sync jobs. I have a question though: is there any API that I can use to trigger a sync job?
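For the open-source deployment there is a REST endpoint for exactly this. A minimal sketch, assuming a local instance behind the default airbyte-proxy basic-auth credentials and using a placeholder connection ID:
```python
# Sketch: trigger a connection sync through the OSS REST API.
import requests

AIRBYTE_API = "http://localhost:8000/api/v1"            # default local deployment
CONNECTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder connection ID

resp = requests.post(
    f"{AIRBYTE_API}/connections/sync",
    json={"connectionId": CONNECTION_ID},
    auth=("airbyte", "password"),  # airbyte-proxy's default basic-auth pair
    timeout=30,
)
resp.raise_for_status()
print("queued job:", resp.json()["job"]["id"])
```
This is essentially the same call that Airflow's AirbyteTriggerSyncOperator makes under the hood.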

    Liyin Qiu

    11/08/2022, 6:36 AM
Hi team, do you have an idea of how many connections the community Airbyte version can support? The deployment is on AWS EKS, with Kafka-to-S3 syncs every 10 minutes. A rough number is fine: hundreds, or thousands? And if we want to scale to support more, any parameter tuning suggestions?

    Home Kralych

    11/08/2022, 7:02 AM
Hi everyone! I keep getting “JSON Schema validation failed” when syncing a MySQL table to BigQuery using default settings:
• sync mode: Incremental | Append
• normalization and transformation: Raw data (no normalization)

    Vrushank Kenkre

    11/08/2022, 7:36 AM
Hello, we are using the GitLab source and running into ModuleNotFoundError:
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -   File "/airbyte/integration_code/main.py", line 9, in <module>
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -     from source_gitlab import SourceGitlab
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -   File "/airbyte/integration_code/source_gitlab/__init__.py", line 25, in <module>
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -     from .source import SourceGitlab
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -   File "/airbyte/integration_code/source_gitlab/source.py", line 40, in <module>
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -     from .streams import (
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -   File "/airbyte/integration_code/source_gitlab/streams.py", line 13, in <module>
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 -     from airbyte_cdk.sources.utils.sentry import AirbyteSentry
    2022-11-08 07:22:07 *ERROR* i.a.c.i.LineGobbler(voidCall):114 - ModuleNotFoundError: No module named 'airbyte_cdk.sources.utils.sentry'
Looks like a missing dependency issue here; can you please take a look?

    laila ribke

    11/08/2022, 8:52 AM
Hi all, I've set up an S3 -> Redshift connection. The bucket definitely contains a CSV file, and the connection detects a schema and the sync succeeds, but I get 0 records. Any ideas?

    Gopinath

    11/08/2022, 9:14 AM
Hi team, I am currently doing a POC on Airbyte and have deployed the latest version (0.40.17) on Docker version 20.10.20, build 9fdeb9c. When I try to set up a MySQL source, I get "The connection tests failed." in the frontend UI. I can see the error below in the airbyte-proxy container logs:
airbyte-proxy | 2022/11/08 08:51:13 [error] 13#13: *1 upstream timed out (110: Connection timed out) while reading response header from upstream, client: *.*..*, server: , request: "POST /api/v1/scheduler/sources/check_connection HTTP/1.1", upstream: "http://***.*..*:80/api/v1/scheduler/sources/check_connection", host: "localhost:8000", referrer: "http://localhost:8000/workspaces/68ef7d91-b53e-48a9-a40e-64cd63aa3cf2/onboarding"
Can someone help me out here?

    Thomas Pedot

    11/08/2022, 10:08 AM
Hello, I tried to sync two PostgreSQL databases (Django ORM). The destination is empty and on version 13; the source is on version 12, though I don't know if that is relevant. When logging into the new one I get this error:
column site_sitesettings.automatically_fulfill_non_shippable_gift_card does not exist LINE 1: ...", "site_sitesettings"."gift_card_expiry_period", "site_site... ^ HINT: Perhaps you meant to reference the column "site_sitesettings.automatically_fulfil__n_shippable_gift_card".
When I look, I do have the column automatically_fulfill_non_shippable_gift_card. If you have good eyes 🧐 you can see that "fulfill_non" has been replaced by "fulfil__n". Do you think it is a bug? Is it related to Postgres or something? Plain replication works fine.

    Yewon Kim

    11/08/2022, 11:40 AM
Hello there, this is Kim. We would like to offer our customers a flexible integration service between our service and third-party services, and I wonder how much of this Airbyte's connectors feature covers. For example, I would like to know which of the following options is possible if our service and a CRM tool are linked via Airbyte's connectors:
1. Is it possible for only the admin of our service to use the integration with the CRM tool?
2. Can our customers also log in themselves and integrate our service account with their CRM account?
Currently I am using the open-source version, and I wonder if the above functionality is provided only in the SaaS offering, or if there is any difference.

    Aazam Thakur

    11/08/2022, 12:40 PM
Hi team, I have a question about understanding CursorPagination. I tried implementing it for the TaskRouter API, but it fails to paginate and defaults to returning only the first 50 records. What exactly does the cursor field point to in the tutorial when we pass header[]? I've attached the JSON output that the TaskRouter API returns; it indicates the next page URI, and I tried to access it with my YAML file below. I can't seem to figure out how to access the next page URL and pass it along so that pagination works. Any tips are much appreciated :)
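For reference, here is the loop the paginator has to express, written as a plain-Python sketch outside the CDK. It assumes TaskRouter's usual response shape, where the full URL of the next page sits at meta.next_page_url and is null on the last page (credentials and workspace SID are placeholders); in the low-code YAML, the cursor value would point at that response field rather than at a header:
```python
# Sketch: follow TaskRouter's meta.next_page_url until it runs out.
import requests

AUTH = ("ACCOUNT_SID", "AUTH_TOKEN")  # placeholder Twilio credentials
url = "https://taskrouter.twilio.com/v1/Workspaces/WSXXXX/Tasks?PageSize=50"

tasks = []
while url:
    page = requests.get(url, auth=AUTH, timeout=30).json()
    tasks.extend(page.get("tasks", []))
    # The API points at the next page here; None/absent means we're done.
    url = page.get("meta", {}).get("next_page_url")

print(f"fetched {len(tasks)} records")
```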

    Muhammad Usman

    11/08/2022, 1:00 PM
Hello, I am trying to connect a Facebook page and I am facing this error: HTTPError('400 Client Error: Bad Request for url ...'). Has anyone else faced this? Please guide.

    kavi arasu

    11/08/2022, 3:03 PM
Hello team, today I started to work with Airbyte. I'm facing an issue while connecting a Postgres database as a source, which is already running on my local machine; please find the screenshot below FYR. At the same time I am able to connect to that DB using other tools such as DBeaver. Could anyone help me out here?

    Gerard Clos

    11/08/2022, 3:20 PM
Aloja 👋 I'm getting a constant stream of this error:
"query directly through matching on sticky timed out, attempting to query on non-sticky
when running Airbyte locally (Docker). It makes reading logs basically impossible, and it does not seem to affect functionality. Any clues? It seems to be related to Temporal.