Airbyte #advice-data-ingestion

Charles VERLEYEN

05/24/2022, 4:57 PM

Hello, We have set up a connection to a postgresql database on AWS with sink to BigQuery. All is working well except that Airbyte is creating one dataset per schema we have in postgreSQL. Is it possible to force somehow Airbyte to write all the data in one unique BigQuery dataset and not create multiple datasets ?

Eugene Krall

05/25/2022, 10:11 AM

If I update to a newer version of airbyte, will I have to reset my data storage?

Damian Crisafulli

05/25/2022, 1:08 PM

Hey everyone, I’ve got a Freshdesk -> Redshift connection failing on the normalization step. Airbyte Version: 0.39.1-alpha Source: Freshdesk (0.2.11) Destination: Redshift (0.3.35) Environment: k8s Here is the error from the logs:

Copy code

2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31  Finished running 12 incremental models in 205.05s.
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31  Completed with 1 error and 0 warnings:
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31  Database Error in model freshdesk_tickets (models/generated/airbyte_incremental/airbyte/freshdesk_tickets.sql)
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31    Invalid input
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31    DETAIL:
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      -----------------------------------------------
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      error:  Invalid input
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      code:      8001
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      context:   CONCAT() result too long for type varchar(65535)
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      query:     10971779
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      location:  string_ops.cpp:110
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      process:   query2_114_10971779 [pid=31540]
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31      -----------------------------------------------
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31
2022-05-25 12:06:31 ^[[42mnormalization^[[0m > 12:06:31  Done. PASS=11 WARN=0 ERROR=1 SKIP=0 TOTAL=12

Is there a way to configure normalization such that it truncates values that are too long?

Ben Jordan

05/25/2022, 2:25 PM

Hi, we're using the Google Ads connector which works well, but the token expired and it does not seem Airbyte is refreshing it. Does anyone know if this is a bug or not yet implemented? Why provide the refresh token if it won't be used?

Simon Thelin

05/25/2022, 4:24 PM

Hello. I have a quick question (hope it is the right place). Is it possible to set partitioning when loading data as parquet to S3? In my case I have a connection of

Postgres -> S3

. It feels a bit off this functionality is not there since this is a quite natural thing you might want to do? I can’t find any setting for it currently. Cheers

Coşkan Selçuk

05/25/2022, 5:55 PM

Hi all. I am a Data Solutions Architecture Manager from Turkey. I needed to consume data from an API. I am aware that the HTTP source is graveyarded. However it is really high maintenance to develop my own connector using the SDK when I only need to send a simple http request and receive a json file as a result. I would appreciate a "Walking Dead" episode regarding on HTTP source.

🙏 1

Yifan Sun

05/26/2022, 12:50 AM

Hey guys! I'm building a source-workday connector using Singer. When I run the 'pip install -r requirement.txt', it always gave me this ResolutionTooDeep(max_rounds) error. Any one know why? I didn't modify setup.py file, but shall I modify it?

Sergi Gómez

05/26/2022, 10:24 AM

Hey Product folks working at Airbyte! When will it be possible to add a new table in a source without having to refresh the whole source schema and therefore having to go cherrypick them one by one over and over again. I am a hardcore Airbyte user (I’ve implemented it across many teams and companies), and I beg you for this feature, the cherrypicking is REALLY annoying. Else please let me know where I should address to any other channel or platform. Thanks for making my voice be heard!! 🙏

❤️ 1

🙏 2

Hawkar Mahmod

05/26/2022, 3:02 PM

hey folks, I’m not sure if this is the best channel, it feels like an ingestion concern but please let me know if you think it belongs elsewhere. I setup an Airbyte server over a year a go to get data from a production Aurora (MySQL) database. We had some big tables (10M+) and Airbyte really struggled with them. I was told Airbyte wasn’t very suitable for such loads at that time. I wonder if the state of things has changed and we can confidently use Airbyte to replicate data between this database and Snowflake/Redshift on a regular basis (every hour). Most tables don’t grow that fast it’s just handful of tables that might have 100k records per day.

Yifan Sun

05/26/2022, 10:19 PM

Hey folks, I'm building a source-workday connector with airbyte cdk(python). In the README, the step "build via Gradle" needs a java JDK to build. Is this a requisite for or a equivalent to later build docker-image step? Thanks!

Pascal Cohen

05/27/2022, 12:29 PM

Hi I would like to set up a Custom connection toward a GRPC service. I did not find any relevant existing source so I tried to start my own Custom source - as this could be in any case a good exercise I started with the generate.sh to scaffold a Python project I wonder how to deal with the state passed in parameter to the read method in order to deal with incremental behavior Typically the signature is:

Copy code

def read(
    self, logger: AirbyteLogger, config: json, catalog: ConfiguredAirbyteCatalog, state: Dict[str, any]
) -> Generator[AirbyteMessage, None, None]:

And when I return the AirbyteMessage there are several places where I can return the state:

Copy code

yield AirbyteMessage(
    type=Type.RECORD,
    record=AirbyteRecordMessage(stream=stream_name, data=data, emitted_at=int(datetime.now().timestamp()) * 1000,),
    state=AirbyteStateMessage(data= XXX,
                              global_=YYY,
                              streams=[ZZZ])
)

I am not sure how to deal with that. Furthermore the documentation states that I should deal with state on my own but what is the point to pass a state in that case ? I think I missed something Any advice on best practice to persist and retrieve the state ? In my test case, I simply want to use an incremental id to ask for all the ids after this one and store this as a cursor for next read Thanks for any help

Yudian

05/27/2022, 8:21 PM

Hi airbyte team, not sure if this is the right channel. Recently I am working on a sync job between two Snowflakes (both src and target are Snowflake). However, if the source table is big (e.g., > 200GB), then the job often failed. I just pasted the part that the job triggered some errors, looks like you stored big chunk of temporary data in some temporary place and later cannot fetch it?

Copy code

2022-05-23 03:24:13 [32mINFO[m i.a.w.DefaultReplicationWorker(lambda$getReplicationRunnable$5):301 - Records read: 76853000 (223 GB)
2022-05-23 03:24:13 [32mINFO[m i.a.w.DefaultReplicationWorker(lambda$getReplicationRunnable$5):301 - Records read: 76854000 (223 GB)
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.RestRequest execute
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error response: HTTP Response code: 403, request: GET <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****> HTTP/1.1
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.DefaultResultStreamProvider getInputStream
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error fetching chunk from: <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****>
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.SnowflakeUtil logResponseDetails
2022-05-23 03:24:14 [44msource[0m > SEVERE: Response status line reason: Forbidden
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.SnowflakeUtil logResponseDetails
2022-05-23 03:24:14 [44msource[0m > SEVERE: Response content: <?xml version="1.0" encoding="UTF-8"?>
2022-05-23 03:24:14 [44msource[0m > <Error><Code>AccessDenied</Code><Message>Request has expired</Message><Expires>2022-05-23T03:24:00Z</Expires><ServerTime>2022-05-23T03:24:15Z</ServerTime><RequestId>KVT0V6FQ3SBDN3VR</RequestId><HostId>5Oztd4n8a6vWsAIHnaNKLMkXfmyYdQS9zwGpS1ebyb1E8JWxqZT8FFCwJWltzEm6hHOUsHnvGMg=</HostId></Error>
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.RestRequest execute
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error response: HTTP Response code: 403, request: GET <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****> HTTP/1.1
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.DefaultResultStreamProvider getInputStream
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error fetching chunk from: <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****>
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.SnowflakeUtil logResponseDetails
2022-05-23 03:24:14 [44msource[0m > SEVERE: Response status line reason: Forbidden
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.SnowflakeUtil logResponseDetails
2022-05-23 03:24:14 [44msource[0m > SEVERE: Response content: <?xml version="1.0" encoding="UTF-8"?>
2022-05-23 03:24:14 [44msource[0m > <Error><Code>AccessDenied</Code><Message>Request has expired</Message><Expires>2022-05-23T03:24:00Z</Expires><ServerTime>2022-05-23T03:24:15Z</ServerTime><RequestId>KVT87B5B2XRVG33J</RequestId><HostId>K7nziICuSHtr4I40+W08RwiAcd2seylrpGlT5gs36PX0DX7tIhZDsFgWcV1MplB+xDtZ93fADns=</HostId></Error>
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.RestRequest execute
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error response: HTTP Response code: 403, request: GET <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****> HTTP/1.1
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.DefaultResultStreamProvider getInputStream
2022-05-23 03:24:14 [44msource[0m > SEVERE: Error fetching chunk from: <https://sfc-ds1-customer-stage.s3.us-west-2.amazonaws.com/nwmj-s-HIDDEN/results/01a47188-0604-2855-0004-1504c0d4959b_0/main/data_0_6_54?x-amz-server-side-encryption-customer-algorithm=AES256&response-content-encoding=gzip&AWSAccessKeyId=****&Expires=1653276240&Signature=****>
2022-05-23 03:24:14 [44msource[0m > May 23, 2022 3:24:14 AM net.snowflake.client.jdbc.SnowflakeUtil logResponseDetails
2022-05-23 03:24:14 [44msource[0m > SEVERE: Response status line reason: Forbidden

If I reduced the source table size to be within 100GB, then there is no problem. Would like to get some feedbacks / suggestions based on this. Thank you!

Siddharth Putuvely

05/28/2022, 9:31 AM

Hi Airbyte Team, I am trying to set an append+dedup history connection on a table. I see that all records are present in the _airbyte_raw_table but they are not seen in the scd table and the final table. The cursor field I am using is the "*updated_at*" column.

Copy code

select max(updated_at) from "PROD_DB"."SOURCE_SCHEMA"."CYCLES";
-> 2022-05-27T05:56:21.565000
select max(updated_at) from "PROD_DB"."SOURCE_SCHEMA"."CYCLES_SCD";
->2022-05-27T05:56:21.565000;
select max(_AIRBYTE_DATA:updated_at) from "PROD_DB"."SOURCE_SCHEMA"."_AIRBYTE_RAW_CYCLES";
-> 2022-05-28T08:04:19.061000;

Can anybody explain what I am missing? Airbyte version : 0.32.5

Ben Nicole

05/29/2022, 6:03 PM

Hi, if there is a new column added to the source table, would it be reflected automatically in the target side without manual changing Airbyte setting? If it's not, what is the best way to cater this DDL update? kindly advice . Thank you 🙏

Magnus Berg Sletfjerding

05/30/2022, 11:24 AM

Hey Airbyte teams 😄 We’re looking for a solution for ingesting data from our CockroachDB. In the Airbyte CockroachDB docs, I can’t tell whether the CockroachDB cluster needs to use the CRDB Enterprise license or not to work with the Airbyte connector. Do you know where I could find this information? 🙏

Ramon Vermeulen

05/30/2022, 12:05 PM

I am having an API with for each model 3 different endpoints: •

/data.xml

Giving back all records until now •

/updates.xml

Giving back all updates since a certain point in time •

/deletes.xml

Giving back all deletes since a certain point in time What are the best practices setting up an incremental sync in Python knowing the concept of these 3 endpoints. With only updates it was easy, and I could use the incremental sync - deduped history concept I suppose after reading about the sync modes. But how can I implement this if I want to also manage incremental deletes? Or is the only way with this set-up to use a full refresh every time, and incremental isn't possible? Or is the idea that I should add another field to the model in the data warehouse, for instance deleted true/false, and handle the "deletes" as an actual update to the records where deleted is set to true? The upside to this is that you still have those records in your data warehouse. Does anyone know any connectors with similar behavior in airbyte (python), would be nice to take a look at an example implementation.

Apostol Tegko

05/31/2022, 8:55 AM

Hey All 👋, We’ve updated yesterday from 0.35.29 -> 0.39.5 Since, we’re not seeing any sources/destinations under

settings -> sources

Looking at requests, it seems that this request is not returning any items:

<http://localhost:8000/api/v1/source_definitions/list_for_workspace>

Same for destination endpoint as well. Can’t see any errors in the server logs either. Do you have any advice?

⚠️ 1

Vytautas Bartkevičius

06/01/2022, 5:31 AM

Hello Airbyte community. I have question, why Airbyte deletes all the data after editing connection streams? For example if I add new stream to connection I’m getting such message:

WARNING! Updating the schema will delete all the data for this connection in your destination and start syncing from scratch

. Why is that? Why the data is deleted from all streams, not only from new one, but also from currently existing. So after this I need to collect all data from from scratch? Or how I could prevent from this?

Pranav Hegde

06/02/2022, 5:33 AM

Hey all, we are using Airbyte to ingest data from mixpanel to bigquery. It was working fine till now, however for the past few days we are getting Schema Validation error

Copy code

2022-06-02 05:27:08 [32mINFO[m i.a.v.j.JsonSchemaValidator(test):56 - JSON schema validation failed. 
errors: $: null found, object expected
2022-06-02 05:27:08 [1;31mERROR[m i.a.w.p.a.DefaultAirbyteStreamFactory(lambda$create$1):70 - Validation failed: null
2022-06-02 05:27:08 [43mdestination[0m > 2022-06-02 05:27:08 [32mINFO[m i.a.i.d.b.u.AbstractBigQueryUploader(uploadData):99 - Final state message is accepted.
2022-06-02 05:27:08 [43mdestination[0m > 2022-06-02 05:27:08 [32mINFO[m i.a.i.d.b.u.AbstractBigQueryUploader(dropTmpTable):111 - Removing tmp tables...
2022-06-02 05:27:08 [43mdestination[0m > 2022-06-02 05:27:08 [32mINFO[m i.a.i.d.b.u.AbstractBigQueryUploader(dropTmpTable):113 - Finishing destination process...completed
2022-06-02 05:27:08 [43mdestination[0m > 2022-06-02 05:27:08 [32mINFO[m i.a.i.d.b.u.AbstractBigQueryUploader(close):85 - Closed connector: AbstractBigQueryUploader{table=_airbyte_raw_indodana_mixpanel_export, tmpTable=_airbyte_tmp_mbo_indodana_mixpanel_export, syncMode=WRITE_APPEND, writer=class io.airbyte.integrations.destination.bigquery.writer.BigQueryTableWriter, recordFormatter=class io.airbyte.integrations.destination.bigquery.formatter.DefaultBigQueryRecordFormatter}
2022-06-02 05:27:08 [43mdestination[0m > 2022-06-02 05:27:08 [32mINFO[m i.a.i.b.IntegrationRunner(runInternal):171 - Completed integration: io.airbyte.integrations.destination.bigquery.BigQueryDestination
2022-06-02 05:27:08 [1;31mERROR[m i.a.w.DefaultReplicationWorker(run):141 - Sync worker failed.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Source process exited with non-zero exit code 137
	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396) ~[?:?]
	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073) ~[?:?]
	at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:134) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
	at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:49) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:174) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
	Suppressed: io.airbyte.workers.WorkerException: Source process exit with code 137. This warning is normal if the job was cancelled.
		at io.airbyte.workers.protocols.airbyte.DefaultAirbyteSource.close(DefaultAirbyteSource.java:136) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
		at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:118) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
		at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:49) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
		at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:174) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
		at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Source process exited with non-zero exit code 137
	at io.airbyte.workers.DefaultReplicationWorker.lambda$getReplicationRunnable$2(DefaultReplicationWorker.java:230) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	... 1 more
Caused by: java.lang.RuntimeException: Source process exited with non-zero exit code 137
	at io.airbyte.workers.DefaultReplicationWorker.lambda$getReplicationRunnable$2(DefaultReplicationWorker.java:222) ~[io.airbyte-airbyte-workers-0.35.2-alpha.jar:?]
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	... 1 more

Would appreciate any help regarding this issue. We are using the latest version of the mixpanel and bigquery connectors

raphaelauv

06/02/2022, 6:51 PM

I'm trying to setup salesforce CDC -> bigquery the salesforce airbyte connector is not compatible with CDC for salesforce is that correct ? thanks

06/02/2022, 7:37 PM

Hi, I'm attempting to use airbyte to pull data from a smartsheet table into a postgresql db. Connectivity appears to work fine, however when I check the data imported into Postgresql, the values are placed under the wrong columns. I searched around on the internet but was unable to find any relevant answers. Can anyone in here help me sort this out? it also appears that I'm not the only person who has experienced this, but i'm failing to find any resolutions: https://github.com/airbytehq/airbyte/issues/5520

HKR

06/03/2022, 11:13 AM

Hey, sry if this has been asked before, but I can't find an answer to that: Is Airbyte able to ingest an unstructured file somehow? My use case would be to download the file from some URL and save it either in the file system or mongodb

Adam Bloom

06/03/2022, 4:32 PM

Hi folks, Not sure if this is the right channel, so feel free to redirect me elsewhere. We've been working with airbyte for a few weeks now quite heavily and are preparing to deploy in k8s in our production environment. Due to the type of data we handle, we have very strict security requirements. Even though this lives in our own k8s cluster, we require all traffic between pods to be encrypted - this of course will include the airbyte source and destination pods. We attempted to do this with a proxy/service mesh, but discovered that the airbyte protocol is not very proxy friendly (specifically, airbyte-worker -> destination containers when no records are found in the source). We're willing to implement TLS with a self-signed CA (we've used https://cert-manager.io/ elsewhere) between airbyte-worker and the remote-stdin and relay-stdout/relay-stderr containers. Before doing so, we wanted to make sure that we weren't missing a roadmap item to address this (we noticed that the blog post describing the current architecture refers to a future v2...). We also wanted to ensure that this change is one that airbyte would be willing to accept (presumably needs to be configurable and not on by default).

ijac wei

06/06/2022, 2:53 AM

Hi, does anyone ingest data from Notion? Does anyone know what is the id of “_airbyte_data” for? I though it is “_airbyte_ab_id” but not.

Bastien Gandouet

06/06/2022, 3:40 PM

Hi everyone! My Amplitude source is stuck in an infinite loop as described here, is anyone else experiencing that? Any workaround?

João Pedro Smielevski Gomes

06/06/2022, 6:45 PM

Hi everyone.I'm trying to ingest a custom model from Zoho CRM, but it is always raising the IncompleteMetaDataException. The weird thing is that some custom models are being loaded, but the one we need only loads 2 old test records and the table name comes with the "x" prefix, which I believe has something to do with incremental sync. We are using a supper admin account with all the permissions listed on the documentation. We've also tried to add the stream on the configured_catalog.json and the test_stream_factory.py (even if from what I understand it should build the schemas automatically) and rebuild the image, but it did not work. Am I missing some config or, since the connector is still on alpha, do I have to write new code to pull the custom modules?

Prashant Golash

06/07/2022, 3:03 AM

Hi, I would like to know if there is a programmatic way to fetch sync history for a particular connection (Context - I am planning to add some monitoring/notifications on periodic basis). If there are any other suggestions, please let me know them as well

gunu

06/07/2022, 3:57 AM

hey team, can we please sort out MySQL source CDC issues for large tables. Im not sure any airbyte user is able to successfully perform DB replication on large table using CDC. and the only real benefit of CDC is when migrating large datasets. is it on the roadmap anywhere? wasn’t able to see it here and it feels like a key feature to airbyte (at least when comparing against competitors). I keep stumbling upon this article Airbyte Commoditizes Database Replication by Open-Sourcing Log-Based Change Data Capture which ideally I’d like to champion to other users interested in implementing airbyte

Kishore Sahoo

06/07/2022, 6:04 AM

Anyone used Airbyte to send data to API i.e. API as a destinations?

Abhiruchi Shinde

06/08/2022, 3:09 AM

hi Team I have created a connector In Airbyte and trying to pull data from zendesk to SQL but the incremental load is failing with normalization error can someone please help me with the error