# feedback-and-requests
  • Anil Kulkarni

    03/02/2022, 11:56 PM
    Hi team, I tried Airbyte by testing the Strava and Posthog sources. How do I understand what's happening as part of basic normalization? Is there a dbt repo configured for it?
  • Valerio Di Tinno

    03/03/2022, 9:42 AM
    Hello team, do you plan to add CampaignsInsights to the Facebook Marketing source's list of available tables?
  • John Maguire

    03/03/2022, 4:02 PM
    Hi all, I have a few general questions and I am unsure if this is the appropriate place to ask. We are trying to build a data warehouse by syncing our customers' Shopify data to our hosted DB. The problem is that I am unsure how to scale this on AWS: we have hundreds of customers, so we would have hundreds of connections and sources with a single destination, and this will only grow. Using the cloud version of Airbyte would be ideal if we didn't have to worry about this scaling (cost dependent, I guess), but API access is a must, as we cannot manually set up and maintain these connections/sources. We typically onboard a number of new customers every day and need to automate the setup of these connections/sources as part of that process. Ideally we would be able to run workers on their own instances. Does anyone have any suggestions or examples of how this could be achieved? Or maybe it's just not possible right now and I need to find another solution.
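    For the per-customer provisioning piece, the open-source Config API can do everything the UI does. A minimal sketch, assuming a self-hosted deployment with the API reachable at localhost:8000; the ids are placeholders and the Shopify connectionConfiguration keys are illustrative (check the connector spec for the real ones):
    ```python
    import requests

    API = "http://localhost:8000/api/v1"    # adjust to your deployment
    WORKSPACE_ID = "..."                    # placeholder
    SHOPIFY_SOURCE_DEFINITION_ID = "..."    # placeholder: from /source_definitions/list
    DESTINATION_ID = "..."                  # placeholder: the shared warehouse destination

    def provision_customer(shop: str, token: str) -> str:
        """Create a Shopify source and a connection for one customer."""
        source = requests.post(f"{API}/sources/create", json={
            "workspaceId": WORKSPACE_ID,
            "sourceDefinitionId": SHOPIFY_SOURCE_DEFINITION_ID,
            "name": f"shopify-{shop}",
            # Illustrative config keys; consult the Shopify connector spec.
            "connectionConfiguration": {"shop": shop, "api_password": token,
                                        "start_date": "2022-01-01"},
        }).json()

        # Discover the schema so the connection has a sync catalog to work with.
        catalog = requests.post(f"{API}/sources/discover_schema",
                                json={"sourceId": source["sourceId"]}).json()["catalog"]

        connection = requests.post(f"{API}/connections/create", json={
            "sourceId": source["sourceId"],
            "destinationId": DESTINATION_ID,
            "syncCatalog": catalog,
            "status": "active",
        }).json()
        return connection["connectionId"]
    ```
    Running this once per new customer from an onboarding job removes the manual UI work; the scaling of the workers themselves is a separate deployment question.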
  • Philippe Boyd

    03/03/2022, 4:07 PM
    Hey guys, I’m relatively new to the dbt custom transformation world and I was wondering: why can’t I just specify a Docker image which contains my dbt project? In our dbt project we would have a Dockerfile containing the dbt runtime CLI. Is there something I’m not understanding, or did Airbyte not think about this use case? My dbt custom transformation project would then reside INSIDE the Docker image, so that I wouldn’t need to specify a git project and branch.
  • Zach

    03/03/2022, 5:13 PM
    Wanted to get thoughts around this: https://github.com/airbytehq/airbyte/issues/8503 It seems like most of the focus was on the performance characteristics, but another thing to consider is that there's quite a bit more data you can retrieve from the Shopify GraphQL endpoint that doesn't exist in the REST API. It also seems like the direction Shopify is heading is for the GraphQL endpoint to be the preferred way to interface with Shopify; I wouldn't be surprised if the APIs continue to diverge over time. One example is the customer journey data: this only exists in the GraphQL API and is quite important for understanding the data from a marketing perspective. https://shopify.dev/api/admin-graphql/2022-01/objects/CustomerJourneySummary#fields
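    For context on what that GraphQL-only data looks like, here is a minimal sketch of querying the customer journey via the Admin GraphQL API; the shop name and token are placeholders, and the exact field selection is illustrative (see the CustomerJourneySummary docs linked above):
    ```python
    import requests

    SHOP = "your-shop"     # placeholder shop subdomain
    TOKEN = "shpat_..."    # placeholder Admin API access token

    # Pull the journey summary for a handful of orders; none of these
    # fields are exposed by the REST Admin API.
    query = """
    {
      orders(first: 5) {
        edges {
          node {
            id
            customerJourneySummary {
              daysToConversion
              momentsCount
              firstVisit { source referrerUrl }
            }
          }
        }
      }
    }
    """

    resp = requests.post(
        f"https://{SHOP}.myshopify.com/admin/api/2022-01/graphql.json",
        json={"query": query},
        headers={"X-Shopify-Access-Token": TOKEN},
    )
    print(resp.json())
    ```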
  • Gary K

    03/04/2022, 12:02 AM
    Hi. I want to suggest having the ability to label/name connections. It was suggested to use multiple connections with the same source & destination as a workaround for an issue I'm currently having, where a single connection is not capturing all the configured streams but is able to capture everything with a reduced set of streams. After thinking a bit more about it, using multiple connections with the same source & destination might also be desired when there are separate teams managing different groups of streams, or when you want to put different source streams into different destination schemas. This leads to multiple connections on the connection page that look exactly the same; you then have to drill down to see what the differences are in the settings. I think being able to add a label or name to connections would prevent me from creating multiple sources with the exact same settings (apart from the source name). I'd rather have multiple connections than multiple duplicate sources when I only have one real source.
  • Jeremy Owens

    03/04/2022, 8:22 PM
    Hey folks, I'm trying to field a request about Log4j vulnerabilities and Airbyte. Is there any documentation I can share?
  • Judah Rand

    03/06/2022, 3:32 PM
    Hi there, I have been looking at using Airbyte to replicate our production Postgres database into BigQuery using CDC. However, I’ve encountered three blocking problems:
    1. There is no way to either anonymize or skip syncing sensitive columns of certain tables. The recommendation of using views does not help, as we need to capture deletes.
    2. There is no way to add/remove streams from an existing connector without resyncing all streams (even if most do not need changing). This means we are unable to sync our database, as the initial sync of all tables takes too long if done in one go (the WAL accumulates). Our Postgres database is very large; ideally, we’d add the tables bit by bit to stop the WAL accumulating to unreasonable levels.
    3. We have too many tables to realistically manage in the UI. We really need to be able to declare this ‘in code’ and manage it in version control (see the sketch below).
    Are these planned to be improved/added/fixed in the near future?
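    On point 3, until first-class config-as-code lands, the Config API can be driven from a version-controlled file. A minimal sketch, assuming a local deployment and a hypothetical streams.yaml format (the file layout, ids, and field names are all placeholders):
    ```python
    import requests
    import yaml

    API = "http://localhost:8000/api/v1"   # adjust to your deployment

    # streams.yaml (hypothetical format):
    #   connection_id: "..."
    #   streams:
    #     - name: users
    #       sync_mode: incremental
    #       destination_sync_mode: append_dedup
    spec = yaml.safe_load(open("streams.yaml"))

    conn = requests.post(f"{API}/connections/get",
                         json={"connectionId": spec["connection_id"]}).json()

    # Enable only the streams declared in the file, with their declared modes.
    wanted = {s["name"]: s for s in spec["streams"]}
    for stream in conn["syncCatalog"]["streams"]:
        cfg = wanted.get(stream["stream"]["name"])
        stream["config"]["selected"] = cfg is not None
        if cfg:
            stream["config"]["syncMode"] = cfg["sync_mode"]
            stream["config"]["destinationSyncMode"] = cfg["destination_sync_mode"]

    requests.post(f"{API}/connections/update", json={
        "connectionId": spec["connection_id"],
        "syncCatalog": conn["syncCatalog"],
        "status": conn["status"],
    })
    ```
    This doesn't solve point 2 — changing the catalog can still trigger a full resync — but it at least keeps the declared state reviewable in git.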
  • David Beck

    03/07/2022, 2:31 PM
    Hi! On the docs here: https://docs.airbyte.com/deploying-airbyte/on-gcp-compute-engine#connect-to-airbyte I was struggling with "WHY DOESN'T MY SERVER START!" and, somewhat shamefully, realized that my instance wasn't called airbyte ... 🤦 To prevent this from happening, maybe you could edit:
    ```
    gcloud --project=$PROJECT_ID beta compute ssh airbyte -- -L 8000:localhost:8000 -N -f
    ```
    into
    ```
    gcloud --project=$PROJECT_ID beta compute ssh $INSTANCE_NAME -- -L 8000:localhost:8000 -N -f
    ```
  • kshitij chaurasiya

    03/08/2022, 10:21 AM
    Hi team, I was going through the code base and I have some questions regarding scheduling. Currently there is a list of specific time options for the data sync scheduling job, like [1, 2, 3, 6, ...] hours. I guess the current solution doesn't use Temporal's cron scheduling option for executing workflows; it has its own scheduling mechanism, right? Is there any specific reason behind that?
  • Sophie Lohezic

    03/08/2022, 1:34 PM
    Hi all! I am trying to set up a connection between Salesforce and BigQuery with the GCS staging method (because we have errors with the insert method when we have many tables, and it seems to be OK with 'only' 10 tables). Could you please detail how to check the connection between the bucket and the machine running Airbyte? In the docs: "Make sure your GCS bucket is accessible from the machine running Airbyte. This depends on your networking setup. The easiest way to verify if Airbyte is able to connect to your GCS bucket is via the check connection tool in the UI." We have deployed Airbyte on a VM in GCP, we have created a bucket in the same project, and we have given Object Creator / Object Viewer rights to the Airbyte service account, but I wonder where/if I have other permissions to give, as I am running into errors when testing the connection. Many thanks for your help 🙏
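    One way to test the bucket access outside Airbyte, from the same VM, is a quick write/read/delete round trip with the service account's credentials. A minimal sketch (the bucket name is a placeholder); note the delete step exercises storage.objects.delete, which staging cleanup needs and which Object Creator/Viewer alone don't grant:
    ```python
    from google.cloud import storage

    # Uses GOOGLE_APPLICATION_CREDENTIALS, or the VM's attached service account.
    client = storage.Client()
    bucket = client.bucket("my-airbyte-staging")   # placeholder bucket name

    blob = bucket.blob("airbyte-connectivity-check.txt")
    blob.upload_from_string("ok")                  # needs storage.objects.create
    print(blob.download_as_text())                 # needs storage.objects.get
    blob.delete()                                  # needs storage.objects.delete
    ```
    Also worth checking: the BigQuery destination's GCS staging mode authenticates to GCS with an HMAC key, so the service account the HMAC key belongs to is the one that needs these permissions.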
  • K rodgers

    03/08/2022, 3:17 PM
    Is there any tutorial on how to create Airbyte connections dynamically using the API? In detail: I want to deploy Airbyte on AWS and set up connections, sources, and destinations dynamically from my backend service (also deployed on AWS). By dynamically I mean without using the GUI. I am finding it hard to understand which API calls I need to make to set up something like Google Sheets <> MySQL DB.
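    The usual call sequence against the Config API is: look up the definition ids, create the source and destination, discover the source schema, then create the connection. A minimal sketch for the Google Sheets -> MySQL case — the host, ids, and connectionConfiguration keys are placeholders; consult each connector's spec for the exact configuration:
    ```python
    import requests

    API = "http://your-airbyte-host:8000/api/v1"   # adjust to your deployment
    WORKSPACE_ID = "..."                           # placeholder

    def call(path: str, body: dict) -> dict:
        return requests.post(f"{API}/{path}", json=body).json()

    # 1. Find the definition id for the source connector.
    defs = call("source_definitions/list", {})["sourceDefinitions"]
    sheets_def = next(d for d in defs if d["name"] == "Google Sheets")

    # 2. Create the source and destination (config keys are illustrative).
    source = call("sources/create", {
        "workspaceId": WORKSPACE_ID,
        "sourceDefinitionId": sheets_def["sourceDefinitionId"],
        "name": "my-sheet",
        "connectionConfiguration": {"spreadsheet_id": "...", "credentials": {}},
    })
    destination = call("destinations/create", {
        "workspaceId": WORKSPACE_ID,
        "destinationDefinitionId": "...",          # placeholder: MySQL definition id
        "name": "my-mysql",
        "connectionConfiguration": {"host": "...", "port": 3306,
                                    "database": "...", "username": "...",
                                    "password": "..."},
    })

    # 3. Discover the schema so you can build the sync catalog.
    catalog = call("sources/discover_schema",
                   {"sourceId": source["sourceId"]})["catalog"]

    # 4. Create the connection.
    connection = call("connections/create", {
        "sourceId": source["sourceId"],
        "destinationId": destination["destinationId"],
        "syncCatalog": catalog,
        "status": "active",
    })
    print(connection["connectionId"])
    ```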
  • Alex Bondar

    03/08/2022, 5:22 PM
    Hey guys, I just started to check out Airbyte. From a first look it looks very cool and will help a huge number of data engineers to make ELT easier. Small question: is there any way to run syncs in incremental mode (per day) with sleep/scheduling between runs, to prevent running into rate limits and to be able to run a backfill for historical data?
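    As far as I know there is no per-day windowing knob that works across connectors, but since incremental syncs advance their cursor on every run, one workaround is to trigger the syncs yourself and pace them. A minimal sketch against the Config API (the host, connection id, and pacing are placeholders; polling jobs/get for completion would be more robust than a fixed sleep):
    ```python
    import time
    import requests

    API = "http://localhost:8000/api/v1"   # adjust to your deployment
    CONNECTION_ID = "..."                  # placeholder

    # Trigger a bounded number of incremental runs, sleeping between them
    # to stay under the source's rate limits while the backfill catches up.
    for _ in range(30):
        job = requests.post(f"{API}/connections/sync",
                            json={"connectionId": CONNECTION_ID}).json()
        print("started job", job["job"]["id"])
        time.sleep(60 * 60)   # crude pacing between runs
    ```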
  • kannan

    03/08/2022, 8:32 PM
    Hi, is there documentation on how to set up Airbyte on Azure AKS? I only see EKS and GCP.
  • Maxime Lemaitre

    03/09/2022, 9:02 AM
    Hey, is there any EU VAT source connector, like the one for exchange rates? I need historical data for all EU VAT rates; this is needed for all related finance reporting in Europe. If this is not available yet, which data source do you use? Thank you. Sorry if this is a duplicate; I was not able to find any related message.
  • Nandan Hegde

    03/09/2022, 5:22 PM
    I have been looking at using Airbyte to replicate our Postgres database into BigQuery using custom dbt. However, I’ve encountered blocking problems:
    1. There is no way to remove the _airbyte_raw data, which is not useful when replicating from one DB to another DB.
    2. If I select normalized tabular data, it still creates an _airbyte_raw table, and every time a sync happens, data keeps appending to that table.
    Is there any way to avoid the _airbyte_raw table?
  • Thomas

    03/10/2022, 10:23 AM
    Hey all, quick question: how do I change the sync mode for all data streams?
  • nor

    03/10/2022, 4:22 PM
    Hello everyone, is there a way to get all currently running jobs in Airbyte? I am aware that it is possible to get the list of jobs for a certain connection, but I am wondering if there is a workaround to get all running jobs.
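    One workaround along those lines: list every connection in each workspace, pull each one's job list, and filter on status. A minimal sketch against the Config API (the host is a placeholder):
    ```python
    import requests

    API = "http://localhost:8000/api/v1"   # adjust to your deployment

    workspaces = requests.post(f"{API}/workspaces/list", json={}).json()["workspaces"]
    running = []
    for ws in workspaces:
        conns = requests.post(f"{API}/connections/list",
                              json={"workspaceId": ws["workspaceId"]}).json()["connections"]
        for c in conns:
            jobs = requests.post(f"{API}/jobs/list",
                                 json={"configId": c["connectionId"],
                                       "configTypes": ["sync"]}).json()["jobs"]
            # Keep only jobs still in flight.
            running += [j["job"] for j in jobs if j["job"]["status"] == "running"]

    for job in running:
        print(job["id"], job["configId"], job["status"])
    ```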
  • Afonso Rodrigues

    03/11/2022, 2:56 PM
    Hello 😄, about this request: "Add user management and login screen" — is there a plan to add it, or is it already on the roadmap? Thanks!
  • Zach Brak

    03/11/2022, 6:03 PM
    I’m seeing work on the BigQuery denormalized connector recently that uses a value of `big_query_array` to account for nulled arrays. This seems to then be adding this value of `big_query_array` into table schemas. Was this necessary to handle nulled arrays? A large value proposition of the Airbyte BigQuery-denormalized connector was that it was able to match schemas directly to JSON coming out of the source API. I’m concerned interjecting this hard-coded value into the created schemas causes many more problems downstream than it solves upstream. (edited for clarity/tone)
  • avilares

    03/11/2022, 10:22 PM
    Hello! Is there a configuration file where Airbyte stores the connections? I have a connection with more than 1000 tables, and some of them need a specific sync_mode (and I currently have to go one by one changing the sync mode). Having the configuration file where Airbyte stores that connection would make it much easier to change the sync mode according to the table name. Thanks 🙂
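    The connection lives in Airbyte's internal config database rather than a file you are meant to edit, but the same bulk change can be scripted through the Config API: fetch the connection, rewrite the catalog entries whose names match, and push it back. A minimal sketch (the host, connection id, name filter, and chosen modes are placeholders):
    ```python
    import requests

    API = "http://localhost:8000/api/v1"   # adjust to your deployment
    CONNECTION_ID = "..."                  # placeholder

    conn = requests.post(f"{API}/connections/get",
                         json={"connectionId": CONNECTION_ID}).json()

    # Flip the sync mode for every stream whose table name matches.
    for stream in conn["syncCatalog"]["streams"]:
        if stream["stream"]["name"].startswith("audit_"):   # placeholder filter
            stream["config"]["syncMode"] = "incremental"
            stream["config"]["destinationSyncMode"] = "append"

    requests.post(f"{API}/connections/update", json={
        "connectionId": CONNECTION_ID,
        "syncCatalog": conn["syncCatalog"],
        "status": conn["status"],
    })
    ```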
  • Emil Sedgh

    03/12/2022, 4:28 AM
    And here is some testing I’ve done: if I sync a single project, it works very reliably. But if I don’t define a project and only define a group, or if I define multiple projects, then this starts happening.
  • Gilbert Vrancken

    03/13/2022, 8:47 PM
    There is an error in the source documentation's Reader Options. It says `(e.g. {}, {'sep': ' '})`, but that doesn't work; it should say `(e.g. {}, {"sep": " "})` (single quotes are not valid JSON).
  • Mouli S

    03/14/2022, 7:30 AM
    Hey Airbyte team — as per the API documentation, Airbyte Cloud has not exposed the APIs for creating sources and destinations. Any ETA on when those features will be enabled for Airbyte Cloud?
  • Ben Mizrahi

    03/14/2022, 9:44 AM
    Hi team, I use Airbyte with a Kafka source going to GCS. I cannot understand how I can parse the value message from Kafka into a JSON schema. I set up the JSON schema via the API, but JSON validation fails:
    ```
    2022-03-14 09:31:22 INFO i.a.v.j.JsonSchemaValidator(test):56 - JSON schema validation failed. 
    errors: $: null found, object expected
    2022-03-14 09:31:22 ERROR i.a.w.p.a.DefaultAirbyteStreamFactory(lambda$create$1):70 - Validation failed: null
    2022-03-14 09:31:22 destination > 2022-03-14 09:31:22 INFO i.a.i.b.IntegrationRunner(runInternal):154 - Completed integration: io.airbyte.integrations.destination.gcs.GcsDestination
    2022-03-14 09:31:22 INFO i.a.w.DefaultReplicationWorker(run):165 - Source and destination threads complete.
    ```
  • Ramon Vermeulen

    03/14/2022, 1:53 PM
    Are the web pages/documentation in the source repo as well? I think the "ConnectorSpecification" link on this page https://docs.airbyte.com/connector-development/tutorials/cdk-tutorial-python-http/3-define-inputs is pointing to the wrong line number. If it is, I'll make a quick PR.
  • beer

    03/15/2022, 4:49 AM
    Can you add a feature to select the staging dataset for Airbyte? I mean, don’t create the Airbyte staging tables in the production dataset.
  • Andrés O. Arredondo

    03/15/2022, 7:19 PM
    Hi, is there a way to do a daily full-refresh sync without manually modifying the source config every day?
  • Ameya Bapat

    03/16/2022, 7:12 AM
    Hi all, currently Airbyte just sends plain-text data to the Settings -> Notifications -> Webhook on success/failure. It is not programmatically consumable, and users cannot effectively benefit from it. Instead, can we pass the response of `/api/v1/jobs/get` for the respective job id? I have added a comment in https://github.com/airbytehq/airbyte/issues/10598. Detailed notifications would help a lot.
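    As a stopgap until richer notifications exist, a webhook receiver can enrich the plain-text ping itself by calling the endpoint mentioned above. A minimal sketch (the host is a placeholder, and recovering the job id from the connection's job list is an assumption about how you would wire it up):
    ```python
    import requests

    API = "http://localhost:8000/api/v1"   # adjust to your deployment

    def latest_job(connection_id: str) -> dict:
        """Find the most recent sync job for a connection."""
        jobs = requests.post(f"{API}/jobs/list",
                             json={"configId": connection_id,
                                   "configTypes": ["sync"]}).json()["jobs"]
        return jobs[0]["job"]   # assumes newest-first ordering

    def job_details(job_id: int) -> dict:
        """Fetch the full job record that the plain-text webhook omits."""
        return requests.post(f"{API}/jobs/get", json={"id": job_id}).json()
    ```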