# ask-community-for-troubleshooting
  • m

    Marcel Prothmann

    09/24/2021, 12:49 AM
    Hi, we are currently setting up our ETL stack (Airflow, Airbyte, dbt). We already have it up and working on VMs. Now we are moving everything to GKE, and this comes with one question: should we set up ONE GKE cluster (containing Airbyte, dbt & Airflow) or 3 clusters (1 Airbyte cluster, 1 dbt cluster, 1 Airflow cluster)? Happy to get your ideas and best-practice advice on this... btw, this is an awesome community! Just reading here has already helped us a lot
    ✅ 1
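A common middle ground for the question above is a single GKE cluster with one namespace per tool, which keeps operational overhead low while still isolating the workloads. A minimal sketch (namespace names are illustrative; the kustomize path is the one Airbyte's kube docs used at the time):

```bash
# One cluster, one namespace per tool (names are illustrative).
kubectl create namespace airbyte
kubectl create namespace airflow
kubectl create namespace dbt

# e.g. deploy Airbyte into its namespace from a checkout of the airbyte repo:
kubectl apply -k kube/overlays/stable -n airbyte
```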
  • m

    Martin Carlsson

    09/24/2021, 7:53 AM
    Hi, I’m considering using Airbyte with a client, where I need to create a custom connector (to some old API) and install Airbyte on my client’s Azure. The documentation advises that I shouldn’t expose Airbyte to the internet:
    For security reasons, we strongly recommend to not expose Airbyte on Internet available ports. Future versions will add support for SSL & Authentication.
    The advice is to SSH into the Azure VM and expose Airbyte only on localhost. However, I think this is too technical for the client. Are there any other approaches that can securely expose Airbyte’s frontend to the client?
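For reference, the tunnel approach from the docs is a single command; a minimal sketch, with the key path, user, and host as placeholders:

```bash
# Forward the Airbyte UI (port 8000 on the VM) to the local machine
# without exposing it to the internet; key/user/host are placeholders.
ssh -i ~/.ssh/airbyte-key.pem -N -f -L 8000:localhost:8000 azureuser@<vm-public-ip>
# Then browse to http://localhost:8000
```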
  • t

    Tomas Peluritis

    09/24/2021, 3:34 PM
    Hey! Sorry if it's too noob a question, but I'm probably missing something: I want to use Airbyte to load NYC taxi data (an open dataset) into my local Postgres DB. I have set up the destination. I'm setting up: dataset: nyc-yellow-data, path pattern: yellow_tripdata_*.csv, bucket (I'm using an ARN, as I see it's an option and I can't find more information about buckets etc. online): arn:aws:s3:::nyc-tlc. Getting:
    The connection tests failed.
    ParamValidationError('Parameter validation failed:\nInvalid bucket name "arn:aws:s3:::nyc-tlc": Bucket name must match the regex "^[a-zA-Z0-9.\\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\\-]{1,63}$"')
    Just don't start shooting that it's a regexp, figure it out 😄 it's a bit too complicated for me to understand and there is not much information about the dataset
    📝 1
    ✅ 1
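The error above is the S3 client rejecting `arn:aws:s3:::nyc-tlc`: the regex accepts either a plain bucket name or an access-point/outpost ARN, and a bucket ARN is neither. Entering the plain name `nyc-tlc` should pass validation; a quick sanity check with the AWS CLI (the bucket is public, so anonymous access works):

```bash
# List the public bucket to confirm the name and see what the
# path pattern (e.g. a glob like yellow_tripdata_*.csv) must match.
aws s3 ls s3://nyc-tlc --no-sign-request
```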
  • b

    Boggdan Barrientos

    09/24/2021, 7:48 PM
    Hi! I'm using the Oracle connector with incremental sync. Is it possible to set a start date for the cursor selected for incremental sync? My DB source has historical data, but when I tried the initial sync it fails, always at the same point, after about 146MM records read. So now I want to extract only a part of it, from a start date. Is that possible? 😬
    logs-70-0.txt
    ✅ 1
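As far as I know the cursor can't be seeded with a start date, so one hedged workaround is to sync a date-filtered view instead of the base table (Airbyte's JDBC sources discover views alongside tables). A sketch, where the connect string, table, column, and cutoff date are all hypothetical:

```bash
# Hypothetical workaround: create a date-filtered view in the source and
# point the connection at the view instead of the 146MM-row base table.
sqlplus airbyte_user/secret@//db-host:1521/ORCLPDB1 <<'SQL'
CREATE VIEW orders_recent AS
  SELECT * FROM orders WHERE updated_at >= DATE '2021-01-01';
SQL
```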
  • m

    Marcel Prothmann

    09/25/2021, 11:04 AM
    Hello everybody: would you rather install Apache Airflow on the GKE cluster, or would you just use GCP Cloud Composer?
  • p

    Peder Sviggum

    09/25/2021, 5:18 PM
    Hi everyone, I was wondering about sync frequency. Let's say I want the sync frequency to be once every 24h, but I want it to start at 23:00. Is there any way to accomplish this apart from running a sync manually at 23:00?
    ✅ 1
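The UI only offers fixed frequencies, so one common workaround is to set the connection's frequency to manual and fire the sync from cron at 23:00 through the API; a sketch where the host and connectionId are placeholders:

```bash
# crontab entry: trigger the connection sync every day at 23:00.
0 23 * * * curl -s -X POST http://localhost:8000/api/v1/connections/sync \
  -H "Content-Type: application/json" \
  -d '{"connectionId": "<your-connection-uuid>"}'
```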
  • o

    Oshini nishnali

    09/27/2021, 12:11 PM
    Hi, I configured Airbyte on GKE. With just an external public IP it works as expected and is able to connect sources and destinations. Then I configured the ingress load balancer, set up an SSL certificate, exposed it through public DNS, and used Identity-Aware Proxy. Now I am unable to create sources and destinations and am getting this error. Can anyone help with this? I am still able to create sources and destinations with the external public IP.
  • s

    Sawyer Waugh

    09/27/2021, 2:38 PM
    With Meltano’s recent announcement of their formalized Singer Spec, the introduction of their Singer SDK, and their recent fundraising, it seems like most of Airbyte’s critiques of Singer either no longer apply or are much less applicable (save for the Python-only limitations). I would love to hear from the Airbyte team what they think of these recent efforts by Meltano with regards to the future of Singer and Airbyte-Singer compatibility.
    👍 1
  • d

    David Beck

    09/27/2021, 2:41 PM
    Hello! I'm new and looking for a way to send Salesforce data to BigQuery. Airbyte seems like a solid option. However, I'm wondering if "Salesforce" also includes Salesforce Marketing Cloud data in the connector?
    ✅ 1
  • s

    Sam Werbalowsky

    09/27/2021, 4:24 PM
    Hello! We are looking at deploying Airbyte in the next 6 months or so as an alternative to Fivetran. I am not sure where the best spot to post this is, so feel free to guide me elsewhere. We need an additional stream in the HubSpot connector (
    contact_form_submissions
    ). I don’t see this in the default Airbyte HubSpot connector. Is this something that is simple to add, and are there plans to add the full suite of connectors to Airbyte? The other thing I am worried about is rate-limiting on the HubSpot API. We frequently get millions of records updated via Fivetran, and I am a bit concerned about Airbyte’s handling of this, given that the docs say:
    Hubspot's API will rate limit the amount of records you can sync daily, so make sure that you are on the appropriate plan if you are planning on syncing more than 250,000 records per day.
  • d

    David Beck

    09/28/2021, 11:23 AM
    Hello (again)! If running Airbyte and planning to move data into BigQuery: would you recommend running on Google Cloud? Is there a benefit compared to running on AWS?
    ✅ 1
  • j

    Jonas Bolin

    09/28/2021, 12:19 PM
    A data source reset fixed it. Not sure if that will be a permanent solution going forward, though.
    ✅ 1
  • h

    Hicham Rahj

    09/28/2021, 3:29 PM
    hello, I have a question: what is the temporary storage here? Is it a container, the memory of the source, the memory of the destination, or none of them? It's not clear to me from the documentation
    ✅ 1
  • д

    Дмитрий Ансимов

    09/28/2021, 3:30 PM
    Hi all. Just started Airbyte v0.29.22 in minikube and am looking forward to deploying it to GKE, but I have a concern regarding logging. I'm using Cloud Logging (based on google-fluentd on GKE), which consumes ndjson (jsonl) lines from the container and pushes them into Cloud Logging. Is there a way to make the cluster's working components print stdout/stderr as ndjson? Thanks in advance.
    👍 1
    ✅ 1
  • m

    Mark

    09/28/2021, 9:50 PM
    Hi, just starting with Airbyte, taking it out for a spin. Setup was straightforward and without issue. I'm trying a simple transfer of data from MySQL to MySQL (both same and different servers), however in all scenarios tried I get the same issue: Airbyte not only reads from the source database, it creates a load of what I assume are tmp tables (prefixed with _airbyte), and nothing arrives in the destination. If I give a dedicated user SELECT only on the source and all privileges on the destination, the result is simply 3 failed attempts. For such a simple task I have obviously missed something??
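The `_airbyte_...` tables described above are the destination connector's staging tables, so the destination user must be allowed to create, fill, and swap them. A sketch of grants that are usually sufficient (host, database, and user names are hypothetical):

```bash
# Staging (_airbyte_tmp_*) and raw (_airbyte_raw_*) tables are created in
# the destination, so the destination user needs DDL + DML rights there.
mysql -h dest-host -u root -p <<'SQL'
GRANT CREATE, DROP, ALTER, INSERT, SELECT ON dest_db.* TO 'airbyte'@'%';
FLUSH PRIVILEGES;
SQL
```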
  • d

    David Beck

    09/29/2021, 1:09 PM
    Hi! I'm trying out Airbyte. I run it on my local computer just to get a feeling of how, and if, it would work. So my job is running and creating data in BigQuery from Salesforce. However, some errors occur. It almost always needs 3 attempts, and I have a feeling that not all data is transferred. Could this be a hardware issue?
    ✅ 1
  • d

    David Beck

    09/29/2021, 1:21 PM
    Is this your first time deploying Airbyte: Yes
    OS Version / Instance: MacOS
    Memory / Disk: 8GB / 64GB
    Deployment: Docker
    Airbyte Version: 0.29.21-alpha
    Source name/version: Salesforce 0.1.1, File 0.24
    Destination name/version: BigQuery (denormalized typed struct) 0.1.5
    Step: On sync
    Description: I'm trying to sync for the first time and the process doesn't finish. I’m very much a beginner so I expect everything to be my fault. I’ll provide the log of the first attempt in a thread.
    ✅ 1
    👀 1
  • m

    Mani Prakash

    09/29/2021, 9:44 PM
    Hi, I'm getting started with Airbyte and wondering about performance compared to other data integration tools like Informatica, Talend, Pentaho, etc. I would like to get an idea of handling 100 GBs of data on a daily basis. Thanks.
    👍 1
  • n

    Naveen Sai Patnana

    09/30/2021, 7:55 AM
    Hi, 1. Does S3 as a source support syncing parquet files? Because if I select that in the UI, after refreshing it is replaced with csv. 2. While using basic normalisation with S3 as source and Snowflake as destination, I'm getting the error Could not find image: airbyte/normalization:0.1.46. Is there any solution for this?
    👀 3
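For the missing normalization image in point 2, pulling the image manually on the Docker host usually clears the "Could not find image" error:

```bash
# Pull the normalization image the worker is looking for.
docker pull airbyte/normalization:0.1.46
# Confirm it is now available locally:
docker images | grep airbyte/normalization
```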
  • j

    Jonas Bolin

    09/30/2021, 10:33 AM
    Can someone clarify a point regarding the user-defined primary key for Incremental Deduped History? In our case, we're pulling Google Analytics data with these columns: Dimensions: Date, Campaign, Source & Medium, Event Category; Metrics: Total Events, Event Value. How do I select the user-defined primary key? My guess is that GA won't return duplicated records for the combination of Date, Campaign, Source & Medium, Event Category, so using all of them as the composite primary key should work. Am I correct in approaching it this way?
    👀 1
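One way to validate the composite-key guess above before relying on it: after a full sync, count duplicates over the candidate key in the destination. A sketch assuming a Postgres destination; the table and column names are hypothetical:

```bash
# Any rows returned here mean the candidate composite key is NOT unique.
psql "$DEST_DSN" <<'SQL'
SELECT date, campaign, source_medium, event_category, COUNT(*) AS n
FROM ga_events
GROUP BY 1, 2, 3, 4
HAVING COUNT(*) > 1
LIMIT 10;
SQL
```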
  • m

    Matheus de Freitas Andrade

    10/01/2021, 4:45 AM
    Guys, I’ve installed Airbyte inside an EC2 instance. How can I access the files to modify the .env file? Is it possible? I can’t find the folders.
    ✅ 1
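Assuming the standard docker-compose install on EC2 (where the docs have you clone the repo into the SSH user's home directory), the file sits in the checkout root; a sketch with placeholder key and host:

```bash
# SSH into the instance (placeholder key/host), then edit .env in the
# directory the Airbyte repo was cloned into (often ~/airbyte).
ssh -i ~/.ssh/airbyte-key.pem ec2-user@<instance-public-dns>
cd ~/airbyte && ls -a            # .env lives next to docker-compose.yaml
nano .env                        # edit settings
docker-compose down && docker-compose up -d   # restart to apply
```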
  • r

    Ruben Jansen

    10/01/2021, 1:55 PM
    Hi all, I'm trying to get Airbyte working on the Heroku platform but I'm running into some problems. Maybe one of you has already got Airbyte deployed on Heroku, so I'm curious whether you would share your
    heroku.yml
    file or the other steps needed to get it working on Heroku. My current
    heroku.yml
    file looks like this:
    build:
      docker:
        web: airbyte-webapp/Dockerfile
        airbyte-temporal: airbyte-temporal/Dockerfile
        server: airbyte-server/Dockerfile
        worker: airbyte-workers/Dockerfile
        scheduler: airbyte-scheduler/app/Dockerfile
        init: airbyte-config/init/Dockerfile    
    run:
      web: /docker-entrypoint.sh nginx -g 'daemon off;'
      airbyte-temporal: /entrypoint.sh /bin/bash -c '/start.sh autosetup'
      server: /bin/bash -c bin/${APPLICATION}
      worker: /bin/bash -c bin/${APPLICATION}
      scheduler: /bin/bash -c bin/${APPLICATION}
      init: /bin/sh -c "./scripts/create_mount_directories.sh /local_parent ${HACK_LOCAL_ROOT_PARENT} ${LOCAL_ROOT}"
    Heroku App Information:
    1. Stack: container
    2. Framework: No framework detected
    3. Slug size: No slug detected
    Build failed
    Build Log.txt
  • c

    Cédric Malet

    10/01/2021, 5:57 PM
    7th-street@7th-Street ~ % ssh -i /Users/7th-street/.ssh/airbyte-key.pem -L 8000:localhost:8000 -N -f ec2-user@c2-18-170-227-242.eu-west-2.compute.amazonaws.com
    ssh: Could not resolve hostname c2-18-170-227-242.eu-west-2.compute.amazonaws.com: nodename nor servname provided, or not known
    ✅ 1
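The hostname in the failing command starts with `c2-`, while EC2 public DNS names start with `ec2-`, so the leading "e" was most likely lost in a copy/paste; a corrected sketch:

```bash
# Same command with the leading "e" of the EC2 hostname restored.
ssh -i /Users/7th-street/.ssh/airbyte-key.pem \
    -L 8000:localhost:8000 -N -f \
    ec2-user@ec2-18-170-227-242.eu-west-2.compute.amazonaws.com
```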
  • c

    Cédric Malet

    10/01/2021, 9:36 PM
    Is a connector for Google Analytics 3 still available?
    ✅ 1
  • m

    Michel Yeung

    10/04/2021, 7:33 AM
    Hi all, I’m currently testing Airbyte in my team. What is the recommended way to “test” a connector? I do a full sync of all tables (by default) from a connector and get some errors (datatype mismatches between data and destination) for a few tables, resulting in a failed sync. I remove the troublesome tables from the sync list and re-run the sync. The remaining tables allow the sync to go further, and then I encounter new errors. I’m losing a lot of time because of the sync duration between sync failures. Should I instead test tables one by one? More generally, how does Airbyte deal with these kinds of errors, e.g. when only a single table sync fails? Do we have to re-run the sync for all the tables? Or am I missing something? Thanks in advance 🙏
    ✅ 1
    👀 1
  • p

    Prateek Gupta

    10/04/2021, 8:32 AM
    hey, I am using a Postgres-to-Postgres connection, and one of the fields that I am trying to transfer is a "timestamp without time zone" type which is getting converted to a string by Airbyte. Is there any way I can retain the specific type for this field?
    👀 1
  • p

    Padraig O'Leary

    10/04/2021, 12:39 PM
    Hi. Newbie question here…. I am looking to integrate with a large number (100s?) of cloud providers (Intercom, HubSpot, Kustomer, etc.). We want to perform specific operations on them, such as deleting or accessing a specific customer record. From what I can tell, Airbyte would not be a good solution for this use case. For instance, if I want to delete an Intercom contact then I cannot use Airbyte for that (https://docs.airbyte.io/integrations/sources/intercom); rather, I would have to use the native API (https://developers.intercom.com/intercom-api-reference/reference#delete-contact). Am I correct? And if I am, should I assume I should just start building native integrations with each of these cloud sources?
    ✅ 1
    👍 1
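That reading matches the docs: Airbyte sources only extract data, so per-record operations like the contact deletion linked above go directly through the vendor's API. A sketch of that Intercom call, with the token and contact id as placeholders:

```bash
# Delete a single Intercom contact via the native API (Airbyte sources
# only read data). Token and contact id are placeholders.
curl -s -X DELETE "https://api.intercom.io/contacts/<contact_id>" \
  -H "Authorization: Bearer $INTERCOM_TOKEN" \
  -H "Accept: application/json"
```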
  • c

    Cédric Malet

    10/04/2021, 2:18 PM
    image.png
    ✅ 1
  • t

    Tim Chan

    10/05/2021, 1:10 AM
    Is there a Kafka Source connector?
  • c

    Cédric Malet

    10/05/2021, 4:08 PM
    With the MongoDB connector, I got an error:
    Failed to fetch schema. Please try again
    Any idea how to resolve the issue?
    logs-fe138931-60a4-4ab9-910a-e696ce5ed496-.txt
    👍 1
    ✅ 1