# feedback-and-requests

    Finn Frotscher

    09/23/2021, 12:30 PM
    Howdy, where do I post issues like this CSS issue on the connections page?
    • 1
    • 2

    Anna Mariscotti

    09/23/2021, 3:43 PM
    Hi Airbyte team and all šŸ‘‹ For the Google Ads source, is there already a request to support multiple customer IDs? I found this issue and got my hopes up, but then noticed that it is about the Google AdWords source, not Google Ads.
    • 1
    • 2

    Nick Akincilar

    09/24/2021, 3:19 PM
    Just wanted to see if you made any updates to the Snowflake connector. If I remember correctly, it was accepting customer-managed external S3 buckets vs. Snowflake internal stages, which are easier and more secure. Also, data should be sent in 100-250MB chunks in order for Snowflake to use the full MPP power of larger warehouses to ingest data in parallel much faster.
    • 1
    • 5
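    A minimal sketch of the chunked-loading idea above (not the connector's current implementation): splitting one large extract into roughly 100-250MB files on line boundaries is what lets a Snowflake COPY INTO fan the load out across a larger warehouse's parallel threads. All paths and sizes here are illustrative assumptions.
    ```python
    import os

    CHUNK_BYTES = 150 * 1024 * 1024  # target per-part size, inside the 100-250MB band

    def split_csv(src_path: str, out_dir: str) -> list:
        """Split src_path into ~CHUNK_BYTES parts, rotating only on line boundaries."""
        parts, part, written, index = [], None, 0, 0
        with open(src_path, "rb") as src:
            for line in src:
                if part is None:
                    path = os.path.join(out_dir, "part_%04d.csv" % index)
                    part = open(path, "wb")
                    parts.append(path)
                    index += 1
                part.write(line)
                written += len(line)
                if written >= CHUNK_BYTES:  # this part is full; start a new one
                    part.close()
                    part, written = None, 0
        if part is not None:
            part.close()
        return parts

    # Each part can then be staged and loaded in parallel on the Snowflake side:
    #   PUT file:///tmp/parts/part_*.csv @my_stage
    #   COPY INTO my_table FROM @my_stage
    ```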

    Tomas Peluritis

    09/25/2021, 6:03 PM
    Hey, on https://airbyte.io/connector-categories/marketplace, clicking on "Check their health status" takes me to a "Page not found" error. Is that expected?
    • 1
    • 3

    Preetam Balijepalli

    09/27/2021, 1:25 PM
    Has anyone used Amazon Managed Airflow and Airbyte together?

    Ameya Bapat

    09/28/2021, 9:50 AM
    Hi,
    ```
    Currently, each data sync will only create one file per stream. In the future, the output file can be partitioned by size. Each partition is identifiable by the partition ID, which is always 0 for now.
    ```
    Ref: https://docs.airbyte.io/integrations/destinations/s3. When will we get this feature? Parsing/reading S3 files that are tens of GBs in size would be complicated.
    • 1
    • 1
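    Until partitioned output exists, one way to cope with very large per-stream files is to stream them from S3 instead of downloading and parsing them whole. A minimal sketch with boto3; the bucket and key are hypothetical placeholders.
    ```python
    import boto3

    # Stream a large S3 object line by line rather than holding tens of GBs in memory.
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="my-airbyte-bucket", Key="my_stream/2021_10_01_0.csv")

    for line in obj["Body"].iter_lines():  # StreamingBody yields raw bytes
        record = line.decode("utf-8")
        # ... parse and process one record at a time ...
    ```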

    Ameya Bapat

    09/28/2021, 10:01 AM
    Hi, is there any plan to handle schema-level changes in Incremental - Append mode?
    ```
    The current behavior of Incremental is not able to handle source schema changes yet, for example, when a column is added, renamed or deleted from an existing table etc. It is recommended to trigger a Full refresh - Overwrite to correctly replicate the data to the destination with the new schema changes.
    ```
    Ref: https://docs.airbyte.io/understanding-airbyte/connections/incremental-append. What workaround do you propose for identifying schema changes? Do we need to handle schema change detection outside of Airbyte and then trigger a full refresh?
    • 1
    • 1
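    A minimal sketch of that workaround, assuming you run the schema comparison outside Airbyte against its Configuration API; the server URL, source id, and connection id are hypothetical placeholders, and the drift reaction is left as a comment.
    ```python
    import json
    import requests

    API = "http://localhost:8000/api/v1"
    SOURCE_ID = "..."       # hypothetical
    CONNECTION_ID = "..."   # hypothetical

    # Ask Airbyte to re-discover the source catalog.
    catalog = requests.post(f"{API}/sources/discover_schema",
                            json={"sourceId": SOURCE_ID}).json()["catalog"]

    # Compare against the catalog snapshot from the previous run.
    try:
        with open("catalog_snapshot.json") as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = None

    if previous is not None and previous != catalog:
        # Schema drifted: update the connection (e.g. to Full refresh - Overwrite
        # via /connections/update) and/or trigger a sync.
        requests.post(f"{API}/connections/sync", json={"connectionId": CONNECTION_ID})

    with open("catalog_snapshot.json", "w") as f:
        json.dump(catalog, f)
    ```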

    Jeff Crooks

    09/28/2021, 12:43 PM
    Any update on when this might get back into a sprint? https://github.com/airbytehq/airbyte/issues/5765 I'm trying to plan workarounds; this and the Mongo connector are still big blockers for me šŸ˜ž
    • 2
    • 3

    Kriti (Postman)

    09/29/2021, 9:17 AM
    Hi, I am trying to configure the Redshift destination. The way our Redshift is set up, it requires ssl=true. How can I do this in Airbyte?
    • 2
    • 14
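    Redshift speaks the Postgres wire protocol, so one way to confirm the SSL requirement outside Airbyte is a direct connection test with sslmode=require (the psycopg2 analogue of ssl=true). All connection details below are hypothetical placeholders.
    ```python
    import psycopg2

    # Quick connectivity check against a Redshift cluster that enforces SSL.
    conn = psycopg2.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
        port=5439,
        dbname="dev",
        user="airbyte_user",
        password="...",
        sslmode="require",  # refuse to connect without SSL
    )
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())
    conn.close()
    ```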

    Zach Brak

    09/30/2021, 1:22 AM
    Is there a policy against soliciting users directly through this Slack channel? How do I report someone if I’ve been approached by someone directly trying to sell their SaaS?

    Stefan Otte

    10/01/2021, 4:18 PM
    Hey there! I wanted to version control my sources, destinations, and connections. But then I saw that Airbyte switched away from files and stores the configuration in PG (https://github.com/airbytehq/airbyte/blob/master/docker-compose.yaml#L22), which makes it harder to version control. I'm wondering if Airbyte a) offers a utility like dump-airbyte-configuration-to-file, and b) an option to load that file when running Airbyte via docker-compose up. Or maybe you folks have another nice workaround.
    • 1
    • 18
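    On (a): a sketch of dumping the configuration to a version-controllable file, assuming your Airbyte version exposes the deployment export endpoint that backs the UI's "Export" button (the endpoint name and response format may differ across versions).
    ```python
    import requests

    # Export the whole Airbyte configuration as an archive that can be committed
    # to git and later restored through the matching import endpoint.
    resp = requests.post("http://localhost:8000/api/v1/deployment/export")
    resp.raise_for_status()

    with open("airbyte_config.tar.gz", "wb") as f:
        f.write(resp.content)
    ```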

    Milad saidi

    10/03/2021, 1:49 PM
    Hi, I have this error on the airbyte-scheduler pod on k8s and the web UI is not loading:
    ```
    2021-10-03 13:47:15 ERROR i.a.s.a.JobScheduler(run):86 - {workspace_app_root=/workspace/scheduler/logs} - Job Scheduler Error
    java.lang.RuntimeException: io.airbyte.validation.json.JsonValidationException: json schema validation failed.
    errors: $.workspaceId: is missing but it is required
    ```
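    The validation error says a persisted job config is missing its workspaceId, so a first diagnostic step is to confirm the workspace records are still intact, for example via the API; the server URL below is a placeholder.
    ```python
    import requests

    # List workspaces: if the default workspace is gone or has no id, the
    # scheduler's schema validation failure above would be explained.
    resp = requests.post("http://localhost:8001/api/v1/workspaces/list", json={})
    resp.raise_for_status()
    print(resp.json())  # expect at least one workspace with a workspaceId
    ```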

    Kriti (Postman)

    10/04/2021, 8:06 AM
    Is this your first time deploying Airbyte: Yes
    OS Version / Instance: t3.medium
    Memory / Disk: -
    Deployment: Kubernetes
    Airbyte Version: 0.29.21-alpha
    Source name/version: -
    Destination name/version: Redshift 0.3.14
    Step: Entered input (database host, port, username, ...) on the setup destination page and clicked the Setup destination button; the Testing connection process is running.
    Description: I am trying to set up the Redshift destination. The cluster IP is whitelisted for Redshift access. The Testing connection process keeps running for 10 minutes (maybe more) and then stops with this message.
    • 1
    • 15
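    A Testing connection step that hangs for minutes usually points at network reachability rather than credentials. A minimal check from inside the cluster is a plain TCP connect to the Redshift port; the hostname is a hypothetical placeholder.
    ```python
    import socket

    HOST = "my-cluster.abc123.us-east-1.redshift.amazonaws.com"  # placeholder
    PORT = 5439

    # A timeout here means a security group / routing problem, not an auth one.
    with socket.create_connection((HOST, PORT), timeout=10) as sock:
        print("reachable:", sock.getpeername())
    ```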

    Namer Medina

    10/04/2021, 2:26 PM
    Hello -- I am getting an error when trying to configure the BigQuery destination. Has anyone seen this error?
    • 1
    • 4

    Kriti (Postman)

    10/04/2021, 5:00 PM
    Is this your first time deploying Airbyte: Yes
    OS Version / Instance: t3.medium
    Memory / Disk: -
    Deployment: Kubernetes
    Airbyte Version: 0.29.21-alpha
    Source name/version: -
    Destination name/version: Redshift 0.3.14
    Step: Entered input (database host, port, username, ...) on the setup destination page and clicked the Setup destination button.
    Description: Destination setup fails because I get permission denied on the database. The credentials I am using do have permission to access the database.
    • 1
    • 2
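    If the user can read the database but setup still fails, the destination typically also needs rights to create schemas and tables. A hedged sketch of grants that commonly unblock this on Redshift; database, schema, and user names are hypothetical placeholders.
    ```python
    import psycopg2

    conn = psycopg2.connect(host="...", port=5439, dbname="dev",
                            user="admin", password="...", sslmode="require")
    conn.autocommit = True
    with conn.cursor() as cur:
        # Let the Airbyte user create its staging/output objects.
        cur.execute("GRANT CREATE ON DATABASE dev TO airbyte_user")
        cur.execute("GRANT USAGE ON SCHEMA public TO airbyte_user")
        cur.execute("GRANT CREATE ON SCHEMA public TO airbyte_user")
    conn.close()
    ```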

    Jeff Crooks

    10/05/2021, 1:05 PM
    I was notified that the lookback window is now present in the Stripe source, but it doesn't appear I can apply the change?
    • 2
    • 19

    Philippe Boyd

    10/05/2021, 7:39 PM
    Is there a way to increase the batch size of records read (currently 1000)? Increasing the batch size would reduce the number of network calls.
    ```
    2021-10-05 19:38:22 INFO () DefaultReplicationWorker(lambda$getReplicationRunnable$2):203 - Records read: 67000
    2021-10-05 19:38:22 INFO () DefaultReplicationWorker(lambda$getReplicationRunnable$2):203 - Records read: 68000
    2021-10-05 19:38:23 INFO () DefaultReplicationWorker(lambda$getReplicationRunnable$2):203 - Records read: 69000
    2021-10-05 19:38:23 INFO () DefaultReplicationWorker(lambda$getReplicationRunnable$2):203 - Records read: 70000
    2021-10-05 19:38:24 INFO () DefaultReplicationWorker(lambda$getReplicationRunnable$2):203 - Records read: 71000
    ```
    When dealing with tables that have millions of records, increasing the batch size would have a noticeable performance effect.
    • 1
    • 4

    Jorge Castillo

    10/05/2021, 8:56 PM
    Hi! Are there any plans to support objects of type balance (GET /v1/balance) in the Stripe connector? I see it currently supports balance transactions, but I can't find info regarding retrieving the Stripe balance.
    • 1
    • 6
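    Until the connector adds a balance stream, the balance object is a single API call and easy to pull outside Airbyte. A sketch with the stripe Python library; the API key is a placeholder.
    ```python
    import stripe

    stripe.api_key = "sk_test_..."  # placeholder

    # Equivalent of GET /v1/balance.
    balance = stripe.Balance.retrieve()
    for entry in balance["available"]:
        print(entry["currency"], entry["amount"])
    ```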

    Boopathy Raja

    10/06/2021, 1:59 AM
    VERY URGENT CRITICAL āš ļø Hi, I was testing a MySQL to BigQuery sync using CDC in Airbyte. I put a table on CDC and configured it in Incremental mode. When I ran the pipeline for the first time, it took the snapshot of the table fine. But on subsequent runs, the new records were not captured. I used both the BigQuery and BigQuery denormalized destinations, and the behaviour is the same for both. Earlier, when I was using an older version of Airbyte, it was fine, but now this issue is not solvable for me. Here is the log, in which the pipeline ran successfully but failed to capture the new records.
    • 2
    • 10
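    For MySQL CDC, silently missed incremental records are often a binlog configuration issue, so checking the prerequisites is a cheap first step before digging into the connector. A sketch with pymysql; connection details are placeholders.
    ```python
    import pymysql

    conn = pymysql.connect(host="...", user="root", password="...", database="mysql")
    with conn.cursor() as cur:
        # CDC needs log_bin = ON and binlog_format = ROW (full row images help too).
        for var in ("log_bin", "binlog_format", "binlog_row_image"):
            cur.execute("SHOW VARIABLES LIKE %s", (var,))
            print(cur.fetchone())
    conn.close()
    ```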

    Zach Brak

    10/06/2021, 2:04 AM
    Documentation clarification: I’m seeing the note about Kubernetes external DB configuration here, which depends on an issue that is now closed. Am I correct in assuming that the documentation about connecting to an external Postgres is applicable to Kubernetes (specifically GKE) environments?

    Siavoush Mohammadi

    10/07/2021, 10:28 AM
    Hi! One thing I miss is flattening of the JSON outputs from Airbyte, automatically, without being forced to write the transformations yourself. This would create tons of value and simplify integration into tabular environments šŸ™‚
    • 1
    • 2
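    Until flattening is automatic, the raw tables can be flattened by hand with one view per stream. A sketch against a Postgres destination, assuming the usual _airbyte_raw_<stream> layout with an _airbyte_data JSON column; the stream and field names are hypothetical.
    ```python
    import psycopg2

    conn = psycopg2.connect(host="...", dbname="warehouse", user="...", password="...")
    with conn.cursor() as cur:
        # Expose the JSON payload as ordinary columns via Postgres JSON operators.
        cur.execute("""
            CREATE OR REPLACE VIEW users_flat AS
            SELECT
                _airbyte_data->>'id'    AS id,
                _airbyte_data->>'email' AS email,
                _airbyte_emitted_at     AS emitted_at
            FROM _airbyte_raw_users
        """)
    conn.commit()
    conn.close()
    ```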

    Tuhin Banerjee

    10/08/2021, 4:53 AM
    Hi! I am trying out Airbyte and have a few questions regarding data compression. I am using Airbyte to load data into S3 as Parquet files. Just wondering if Airbyte has native support to generate Arrow files?
    • 1
    • 1

    VlĆ”Äa Macek

    10/09/2021, 6:53 PM
    Is it possible that the Cloud (hosted Airbyte instance) uses Workspaces to isolate customers?
    • 1
    • 3

    Boopathy Raja

    10/10/2021, 2:54 AM
    Hi, there was this one error and I could not find its cause. It shows the following:
    ```
    ERROR () LineGobbler(voidCall):65 - Exception in thread "main" java.lang.RuntimeException: com.google.cloud.bigquery.BigQueryException: Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.
    ```
    • 1
    • 3

    Bruno Quinart

    10/10/2021, 8:03 PM
    Hello! I am wondering if something like "remote workers" is being considered. Hosting Airbyte in the cloud is great. However, in large companies you will often also have a bunch of sources on premises behind a firewall, and opening the firewall towards the cloud is not always going to be possible (even through private connections). In many solutions you see a pattern where some "gateway" or "agent" is deployed on premises, which polls the main environment (in the cloud) for work and executes any work it gets. It seems this would also be an interesting model for Airbyte: manage your EL pipelines all in one place, but allow some workers to execute, for example, on an on-premises Kubernetes (and only need to open HTTPS from inside to outside, depending on targets). I did a quick search on the Airbyte docs and GitHub but didn't find anything related to something like this.
    • 3
    • 15
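    For a feel of the pattern, the on-premises agent can be as small as a poll loop that only ever makes outbound HTTPS calls. Everything below (endpoints, payloads, agent id) is a hypothetical sketch, not an existing Airbyte API.
    ```python
    import time
    import requests

    CONTROL_PLANE = "https://airbyte.example.com/agent/v1"  # hypothetical

    def run(job: dict) -> dict:
        # ... execute the sync locally, e.g. launch a connector container ...
        return {"jobId": job["id"], "status": "succeeded"}

    while True:
        # Only outbound HTTPS: poll for work, run it, report the result back.
        resp = requests.post(f"{CONTROL_PLANE}/poll", json={"agentId": "onprem-1"})
        job = resp.json().get("job")
        if job:
            requests.post(f"{CONTROL_PLANE}/report", json=run(job))
        time.sleep(30)
    ```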

    Julien Bovet

    10/11/2021, 12:43 PM
    Hello guys! Small feedback regarding the GCP Compute Engine deployment tutorial in the Airbyte docs. It's mentioned that an e2.medium is sufficient for testing, but the default 10GB of disk space turned out to be an issue even with only one connector activated... I ended up on https://docs.airbyte.io/operator-guides/scaling-airbyte#disk-space after a quick search and increased the disk space to 100GB for comfort. Maybe adding that tip to the deployment tutorial would help others avoid a quick bottleneck when testing šŸ˜‰. Thanks!
    • 1
    • 2

    Matt Arderne

    10/11/2021, 4:29 PM
    Is it possible to store the config for Airbyte instances as code? I'd hope to be able to track changes to sources/connections etc. as code, similar to a dbt project.

    Patrick McKinley

    10/11/2021, 6:59 PM
    Is it possible to control Airbyte via an API? I’m looking at this, and we handle 99% of our infra using Terraform; it would be super handy to be able to do the same with Airbyte, so we’re not locked into manual, unversioned configuration.
    • 2
    • 11
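    Airbyte does ship a Configuration API (the same one the web UI calls), so scripted management is feasible even without an official Terraform provider. A sketch of creating a source through it; the ids and connector configuration are hypothetical placeholders.
    ```python
    import requests

    API = "http://localhost:8000/api/v1"

    payload = {
        "workspaceId": "11111111-1111-1111-1111-111111111111",        # placeholder
        "sourceDefinitionId": "22222222-2222-2222-2222-222222222222", # placeholder
        "name": "my-postgres",
        "connectionConfiguration": {"host": "...", "port": 5432, "database": "..."},
    }
    resp = requests.post(f"{API}/sources/create", json=payload)
    resp.raise_for_status()
    print(resp.json()["sourceId"])
    ```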

    Tuhin Banerjee

    10/12/2021, 8:09 AM
    Hello team! Is it possible to have a data preview during the discover schema phase, so we load the data -> preview data (say 50 rows) -> discover schema -> select schema / apply transformation -> destination?

    Sathiya Sarathi Gunasekaran

    10/12/2021, 10:52 AM
    Team, the documentation for Shopify in the connector catalog seems to be missing: https://docs.airbyte.io/integrations/sources/shopify