# advice-data-ingestion

    Boggdan Barrientos

    04/21/2022, 8:53 PM
    Hello everyone! Does anyone know the best way to pass a password containing ##? I am trying to use the File source connector but I am getting an authentication error. I wrote about this in the thread below, but I don't think it has a solution yet. https://airbytehq.slack.com/archives/C01MFR03D5W/p1643925276635039
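    One possible cause (an assumption, not something confirmed in the thread) is that "#" is treated as a URL fragment delimiter when the credentials are embedded in the file URL, so the password gets cut off. A minimal Python sketch of percent-encoding the password before building the URL:
    # Percent-encode special characters such as "#" before embedding the
    # password in a connection URL; otherwise "#" starts the URL fragment.
    from urllib.parse import quote

    password = "my##secret"             # placeholder value, not a real credential
    encoded = quote(password, safe="")  # "#" becomes "%23"

    url = f"https://user:{encoded}@example.com/data/report.csv"
    print(url)  # https://user:my%23%23secret@example.com/data/report.csv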

    Michael Gao

    04/21/2022, 8:58 PM
    Hi, I’m having trouble setting up the Bigcommerce source connector.
    The connection tests failed.
    HTTPError('403 Client Error: Forbidden for url:
    Connecting to BigCommerce via their Python API works fine using the same credentials. I also noticed that the Python API asks for client_id as a parameter, whereas I don't see it as an option in Airbyte.
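    For comparison, a minimal sketch (assuming the standard BigCommerce V3 REST API; store hash and token are placeholders) of a direct request that can help isolate whether the 403 comes from the token's scopes or the store hash rather than from Airbyte itself:
    import requests

    store_hash = "abc123"   # placeholder
    access_token = "xxxx"   # placeholder

    resp = requests.get(
        f"https://api.bigcommerce.com/stores/{store_hash}/v3/catalog/products",
        headers={"X-Auth-Token": access_token, "Accept": "application/json"},
    )
    print(resp.status_code)  # 403 here would point at scopes or the store hash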

    Akilesh V

    04/22/2022, 1:10 PM
    Hi, while updating the source schema in Salesforce I am getting this error:
    Internal Server Error: Cannot invoke "io.airbyte.api.model.AirbyteCatalog.getStreams()" because "discovered" is null

    Bijan Reza Soltani Hosseini

    04/25/2022, 10:17 AM
    Topic: Google Ads - Failing at step 5 of the Setup Guide, at creation of an OAuth consent screen
    Hey all, I'm having trouble creating the OAuth client credentials. I was hoping you may be able to help me out:
    • Google's instructions ask me to create a client ID and client secret
    • In Step 2a of that creation flow, it asks me to configure an OAuth consent screen
    • I cannot do this:
      ◦ I cannot choose "internal" as project type because we don't use Google Workspace, and apparently this doesn't work without it
      ◦ If I choose "External", it doesn't save but it also doesn't give me a proper error message why (it only says it failed and the error message suggests there should be another error message elsewhere on the page, but there isn't)
    • Hence, I cannot create my Client ID and Client Secret 😕
    Has someone had this issue before? Any idea how to solve it?

    Akhtar Bhat

    04/26/2022, 12:13 PM
    Hello everyone! We have started exploring Airbyte and set up a connection as SQS --> Airbyte --> S3. It works fine, but the output files Airbyte writes are unreadable. We downloaded these files from S3 and appended a ".gz" extension, which let us extract them and view the contents. I couldn't find any configuration that can be changed to fix this. Can someone please tell me whether this is the default behavior or whether I am doing something wrong here? Thanks in advance.
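    For what it's worth, a minimal Python sketch (bucket and key names are placeholders) of reading such a GZIP-compressed object straight from S3 without renaming it:
    import gzip

    import boto3

    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="my-airbyte-bucket", Key="some/output/object")
    text = gzip.decompress(obj["Body"].read()).decode("utf-8")
    print(text[:500])  # first few records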

    Yudian

    04/26/2022, 10:43 PM
    Hello everyone! Not sure whether this is the right place to ask: what is the behavior if we schedule an incremental sync every 15 minutes, but the first initial sync takes more than 15 minutes? Will Airbyte wait for the initial sync to finish before running the incremental syncs?

    Yudian

    04/26/2022, 11:12 PM
    BTW, for incremental sync, Airbyte requires defining a cursor for the destination’s target table. Suppose my destination is a PostgreSQL table (called “abc”) with the column “start_time” as the cursor. Could someone point me to the code / query that Airbyte uses to get the highest value of “start_time” on each incremental sync? Is it running a query like
    select max(start_time) from abc
    ?

    Hans Sucerquia

    04/27/2022, 5:20 PM
    Hello everyone, we are in the early steps of setting up an Airbyte open-source connection from HubSpot to Snowflake (GCP server), but we are getting the following error:
    Copy code
    2022-04-26 21:01:07 INFO i.a.v.j.JsonSchemaValidator(test):56 - JSON schema validation failed. 
    ....	
    at sun.nio.ch.NioSocketImpl.beginWrite(NioSocketImpl.java:366) ~[?:?]
    		at sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:411) ~[?:?]
    		at sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:440) ~[?:?]
    		at sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:826) ~[?:?]
    		at java.net.Socket$SocketOutputStream.write(Socket.java:1035) ~[?:?]
    		at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:234) ~[?:?]
    		at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:313) ~[?:?]
    		at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:318) ~[?:?]
    		at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:160) ~[?:?]
    		at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:248) ~[?:?]
    		at java.io.BufferedWriter.flush(BufferedWriter.java:257) ~[?:?]
    		at io.airbyte.workers.protocols.airbyte.DefaultAirbyteDestination.notifyEndOfStream(DefaultAirbyteDestination.java:98) ~[io.airbyte-airbyte-workers-0.35.65-alpha.jar:?]
    		at io.airbyte.workers.protocols.airbyte.DefaultAirbyteDestination.close(DefaultAirbyteDestination.java:111) ~[io.airbyte-airbyte-workers-0.35.65-alpha.jar:?]
    		at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:126) ~[io.airbyte-airbyte-workers-0.35.65-alpha.jar:?]
    		at io.airbyte.workers.DefaultReplicationWorker.run(DefaultReplicationWorker.java:57) ~[io.airbyte-airbyte-workers-0.35.65-alpha.jar:?]
    		at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.35.65-alpha.jar:?]
    		at java.lang.Thread.run(Thread.java:833) [?:?]
    2022-04-26 21:01:23 INFO i.a.w.DefaultReplicationWorker(run):228 - sync summary: io.airbyte.config.ReplicationAttemptSummary@4b54d8af[status=cancelled,recordsSynced=215387,bytesSynced=4005240457,startTime=1651003433015,endTime=1651006883193,totalStats=io.airbyte.config.SyncStats@49e93c6a[recordsEmitted=215387,bytesEmitted=4005240457,stateMessagesEmitted=0,recordsCommitted=0],streamStats=[io.airbyte.config.StreamSyncStats@12e1e49a[streamName=companies,stats=io.airbyte.config.SyncStats@474eb705[recordsEmitted=132857,bytesEmitted=1408670912,stateMessagesEmitted=<null>,recordsCommitted=<null>]], io.airbyte.config.StreamSyncStats@2a328ee3[streamName=campaigns,stats=io.airbyte.config.SyncStats@2ab8fda3[recordsEmitted=1257,bytesEmitted=621001,stateMessagesEmitted=<null>,recordsCommitted=<null>]], io.airbyte.config.StreamSyncStats@37e0095a[streamName=contact_lists,stats=io.airbyte.config.SyncStats@524763c[recordsEmitted=1574,bytesEmitted=12505286,stateMessagesEmitted=<null>,recordsCommitted=<null>]], io.airbyte.config.StreamSyncStats@3b5fb5ed[streamName=contacts,stats=io.airbyte.config.SyncStats@56e712c0[recordsEmitted=79699,bytesEmitted=2583443258,stateMessagesEmitted=<null>,recordsCommitted=<null>]]]]
    2022-04-26 21:01:23 INFO i.a.w.DefaultReplicationWorker(run):250 - Source did not output any state messages
    2022-04-26 21:01:23 WARN i.a.w.DefaultReplicationWorker(run):258 - State capture: No new state, falling back on input state: io.airbyte.config.State@5df40ade[state={}]
    We've already tried recreating the connection twice and doing a fresh start of the Kubernetes configuration. Any clues on what the issue could be and how to solve it? I've attached the log for the whole session; this source has an estimated size of 25 GB.
    logs-20.txt

    Jan Cienciala

    04/28/2022, 3:40 PM
    Hey guys, I’m trying to find out whether the MongoDB source integration in Airbyte supports MongoDB version 5.0. Does anyone know?

    Adrian Chan

    04/28/2022, 4:07 PM
    I’m new to Airbyte! I just read in the docs that most connectors are currently in Alpha. Is that list up to date? I love the simplicity and ease of use of Airbyte and would like to introduce it into my organization, which stores application data in Postgres. Would you still suggest adopting Airbyte at this point? Are there any alternatives you would suggest? Thanks loads!

    Latesh Subramanyam

    04/28/2022, 5:16 PM
    hi guys

    Chasen Sherman

    04/28/2022, 5:51 PM
    Hi team, I was curious about the S3 destination connector. It specifies:
    Please note that the stream name may contain a prefix, if it is configured on the connection. A data sync may create multiple files as the output files can be partitioned by size (targeting a size of 200MB compressed or lower) .
    Is there a way (either through the UI or code changes) to change the 200 MB to a different chunking size?

    Douglas Martins

    04/29/2022, 3:34 AM
    Hey guys! I built a connector from a Singer tap (tap-skubana). When I run my syncs, though, no records are being replicated. I used the connector generator, only created the spec.json file, and implemented a basic
    check_config
    method. Any ideas what might be happening here?
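    One way to narrow this down (a sketch, assuming the tap follows the usual Singer CLI convention of --config and --catalog flags; file paths are placeholders) is to run the tap on its own and count the RECORD messages it emits, which separates a tap problem from a connector-wrapper problem:
    import json
    import subprocess

    # Run the Singer tap directly and count RECORD messages on stdout.
    proc = subprocess.Popen(
        ["tap-skubana", "--config", "config.json", "--catalog", "catalog.json"],
        stdout=subprocess.PIPE,
        text=True,
    )
    records = 0
    for line in proc.stdout:
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON log lines
        if msg.get("type") == "RECORD":
            records += 1
    print(f"tap emitted {records} RECORD messages")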

    Tommy

    04/29/2022, 4:58 AM
    Hi Airbyte team! Is anyone facing the same issue as me? I'm trying to set up a destination for my PostgreSQL source, but no schemas/tables are detected.
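    A common culprit (just a guess) is that the database user Airbyte connects with lacks USAGE/SELECT privileges, so schema discovery sees nothing. A quick Python check, with placeholder connection details, of which tables that user can actually see:
    import psycopg2

    conn = psycopg2.connect(
        host="localhost", port=5432, dbname="mydb", user="airbyte", password="..."
    )
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT table_schema, table_name
            FROM information_schema.tables
            WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
            ORDER BY 1, 2
            """
        )
        for schema, table in cur.fetchall():
            print(f"{schema}.{table}")  # only tables this user has access to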

    Srini Ranganathan

    04/29/2022, 5:01 AM
    Hi everyone, I'm new to Airbyte. Is there a way to ingest data from the Zoho suite of tools? Zoho has the option to trigger a webhook on create, update, and delete events, if it's possible to have a webhook as a source.

    Shanhui Bono

    04/29/2022, 7:07 PM
    Is it still true that a schema change in the source requires a full refresh to propagate to the destination? And, say there is a new table in my Postgres source, would I have to click "refresh source schema" to see it and reset all data in order to include it in the destination?

    Kavin Rajagopal

    04/30/2022, 8:43 AM
    Hello, I have been trying to move data from Google Analytics to a Postgres instance. Google Analytics currently has a UA property only. I also tried an S3 bucket, but it's still not working. Can someone help me? Caused by: io.airbyte.workers.DefaultReplicationWorker$SourceException: Source process exited with non-zero exit code 1 at io.airbyte.workers.DefaultReplicationWorker.lambda$getReplicationRunnable$5(DefaultReplicationWorker.java:312) at java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ... 3 more

    Yasha

    05/01/2022, 1:19 PM
    Hello, I am trying to configure an incremental sync mode for a regular Python connector (not the streams one). I have defined the cursor field and value. It shows in the UI on the replication tab, but it does not work: I get duplicates of the data. I am using a date timestamp as the cursor. For some streams, I cannot control the sorting.
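    For reference, a minimal sketch of how a cursor is usually wired in an airbyte_cdk HTTP stream (the API, stream, and field names below are hypothetical): the saved state advances via get_updated_state, which keeps the maximum cursor value seen, so it also works when the API does not return records sorted by the cursor.
    from typing import Any, Iterable, Mapping, MutableMapping, Optional

    import requests
    from airbyte_cdk.sources.streams.http import HttpStream


    class Events(HttpStream):
        url_base = "https://api.example.com/"  # hypothetical API
        primary_key = "id"
        cursor_field = "updated_at"            # the date-timestamp cursor
        state_checkpoint_interval = 1000       # emit state every N records

        def path(self, **kwargs) -> str:
            return "events"

        def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
            return None  # single page, to keep the sketch short

        def request_params(self, stream_state: Mapping[str, Any], **kwargs) -> MutableMapping[str, Any]:
            # Ask the API only for records newer than the saved cursor, if any.
            cursor = (stream_state or {}).get(self.cursor_field)
            return {"updated_since": cursor} if cursor else {}

        def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
            yield from response.json()

        def get_updated_state(self, current_stream_state: MutableMapping[str, Any], latest_record: Mapping[str, Any]) -> Mapping[str, Any]:
            # Keep the max cursor value seen so far across all emitted records.
            latest = latest_record.get(self.cursor_field, "")
            current = (current_stream_state or {}).get(self.cursor_field, "")
            return {self.cursor_field: max(latest, current)}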

    Ankur Chauhan

    05/03/2022, 4:36 AM
    Hi folks, can someone point us to a good resource for achieving Oracle CDC integration to S3 with Airbyte? Or can we integrate any other tool with Airbyte for this?

    Connor Lough

    05/03/2022, 6:56 PM
    ❄️ Snowflake question ❄️ Hey there, I'm moving data into S3 and then making external tables in Snowflake. I've got 'Full Refresh | Overwrite' mode switched on; is there a way to only reference the newest file in an external stage? Am I doomed to
    create or replace
    this table with the newest MM_DD_YYYY path?
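    One hedged way to automate the "create or replace with the newest path" approach mentioned above (bucket, stage, and table names are made up; this is a sketch, not a tested recipe): list the date prefixes in S3, pick the newest, and repoint the external table at it.
    from datetime import datetime

    import boto3
    import snowflake.connector

    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket="my-bucket", Prefix="exports/", Delimiter="/")
    prefixes = [p["Prefix"] for p in resp.get("CommonPrefixes", [])]
    # Prefixes look like exports/05_03_2022/ per the message above, so parse the
    # MM_DD_YYYY part instead of sorting the strings lexicographically.
    latest = max(prefixes, key=lambda p: datetime.strptime(p.rstrip("/").split("/")[-1], "%m_%d_%Y"))

    conn = snowflake.connector.connect(
        account="my_account", user="me", password="...", warehouse="wh",
        database="db", schema="public",
    )
    conn.cursor().execute(
        f"""
        CREATE OR REPLACE EXTERNAL TABLE raw_events
        WITH LOCATION = @my_s3_stage/{latest}
        FILE_FORMAT = (TYPE = JSON)
        """
    )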

    Jan-Henrik Funke

    05/05/2022, 12:48 PM
    Hi all, I am facing a problem with the MSSQL connector. I am trying to load a view with more than 160M rows. The time the view needs to run before Airbyte can start loading the data seems to be too long: the attempt is always canceled after 5 minutes with the error message “SQL Server did not return a response. The connection has been closed.” I thought this would be a timeout on the SQL Server side, but the server's timeout is set to 2 hours and the problem still occurs. Is there something like a JDBC timeout here, and can I adjust it?

    David Effiong

    05/05/2022, 2:16 PM
    Hello everyone, I am trying to deploy Airbyte to Google Compute Engine using the documentation here. When I get to this step, so that I can access the UI:
    david@data:~/airbyte$ gcloud --project=$PROJECT_ID beta compute ssh $INSTANCE_NAME -- -L 8001:localhost:8001 -N -f
    I get this error:
    ERROR: (gcloud.beta.compute.ssh) argument [USER@]INSTANCE: Must be specified.
    I really don't know how to resolve this or how to specify the required USER argument. Please assist. Thank you.

    Dimitris L.

    05/05/2022, 6:52 PM
    Hey folks! Question about best practice when it comes to ingesting raw data. Here's some background: I'm ingesting raw csv files into a postgres db. I then want to start doing transformations and all that jazz on that data. My question is, how should I go about transforming the data?
    1. Should I keep the raw data in one db and use another one where I create my data models?
    2. Should I just create a new table in the same db and then apply my transformations there?
    Hope this is clear! Thanks in advance! Dimitri

    Shobhit Sharma

    05/06/2022, 5:47 AM
    Hi everyone, I am trying to connect a Google Analytics (GA4) account, where I am an admin, via Airbyte Cloud. In the mandatory field called "View ID" I am entering the "Property ID", because GA4 doesn't have the concept of views afaik. I get the following error: "Please check the permissions for the requested view_id: 30XXXXX59. {'code': 403, 'message': 'User does not have sufficient permissions for this profile.', 'status': 'PERMISSION_DENIED'}" Am I missing something? Do I need to do anything in the GA admin console? Please advise. Thanks.

    Aditi Bhalawat

    05/06/2022, 6:58 AM
    Hey everyone, I am trying to get data from a particular API. My use case is to send multiple keywords together to the API and have it give me the output for each keyword separately, with all of this happening in parallel. I was able to do this with one input, but I'm confused about how to do the same with multiple inputs. Could anyone guide me here? Thank you in advance.
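    A minimal sketch (the endpoint and parameter name are placeholders for whatever API is being used) of sending one request per keyword in parallel and collecting each keyword's output separately:
    from concurrent.futures import ThreadPoolExecutor

    import requests

    KEYWORDS = ["shoes", "hats", "socks"]  # example inputs


    def fetch(keyword):
        resp = requests.get("https://api.example.com/search", params={"q": keyword})
        resp.raise_for_status()
        return keyword, resp.json()


    with ThreadPoolExecutor(max_workers=5) as pool:
        for keyword, payload in pool.map(fetch, KEYWORDS):
            print(keyword, payload)  # one result set per keyword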

    Chakrit ThongEK

    05/06/2022, 10:08 AM
    Hello, our company is trying to use the Facebook Marketing source with BigQuery; however, we're receiving this error:
    Copy code
    "HTTPError('400 Client Error: Bad Request for url:"
    If I'm not wrong, Facebook is using v13.0 for their API, while the current version in Airbyte is v12.0. Could you advise me? Thank you in advance.

    Subramony M

    05/06/2022, 12:06 PM
    Hi team, I have hosted Airbyte on a Kubernetes cluster and everything is up, but when I try to add a Postgres source it just throws an error. Looking at the Kubernetes pods, one of the pods gets stuck in init and eventually fails:
    Copy code
    Failed Pod : pod/urce-postgres-sync-a1bbe16a-00b1-41d7-be9e-6904f435a348-0-mtfks   0/4     Init:Error   0          2m53s
    Worker pod logs
    Copy code
    2022-05-06 11:58:24 INFO i.a.w.p.KubePodProcess(<init>):512 - Creating pod urce-postgres-sync-a1bbe16a-00b1-41d7-be9e-6904f435a348-0-mtfks...
    Log4j2Appender says: Creating pod urce-postgres-sync-a1bbe16a-00b1-41d7-be9e-6904f435a348-0-mtfks...
    2022-05-06 11:58:26 INFO i.a.w.p.KubePodProcess(waitForInitPodToRun):305 - Waiting for init container to be ready before copying files...
    Log4j2Appender says: Waiting for init container to be ready before copying files...
    2022-05-06 11:58:26 INFO i.a.w.p.KubePodProcess(waitForInitPodToRun):309 - Init container present..
    Log4j2Appender says: Init container present..
    2022-05-06 11:58:29 INFO i.a.w.t.TemporalAttemptExecution(lambda$getWorkerThread$2):161 - Completing future exceptionally...
    io.airbyte.workers.WorkerException: Error while getting checking connection.
    	at io.airbyte.workers.DefaultCheckConnectionWorker.run(DefaultCheckConnectionWorker.java:84) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	at io.airbyte.workers.DefaultCheckConnectionWorker.run(DefaultCheckConnectionWorker.java:27) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:158) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	at java.lang.Thread.run(Thread.java:833) [?:?]
    Caused by: io.airbyte.workers.WorkerException: An error has occurred.
    	at io.airbyte.workers.process.KubeProcessFactory.create(KubeProcessFactory.java:148) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	at io.airbyte.workers.process.AirbyteIntegrationLauncher.check(AirbyteIntegrationLauncher.java:58) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	at io.airbyte.workers.DefaultCheckConnectionWorker.run(DefaultCheckConnectionWorker.java:53) ~[io.airbyte-airbyte-workers-0.36.9-alpha.jar:?]
    	... 3 more
    Caused by: io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
    	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:103) ~[kubernetes-client-5.12.2.jar:?]
    	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:97) ~[kubernetes-client-5.12.2.jar:?]
    	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.lambda$run$2(WatchConnectionManager.java:133) ~[kubernetes-client-5.12.2.jar:?]
    	at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934) ~[?:?]
    	at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:911) ~[?:?]
    	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
    	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
    	at io.fabric8.kubernetes.client.okhttp.OkHttpWebSocketImpl$BuilderImpl$1.onFailure(OkHttpWebSocketImpl.java:72) ~[kubernetes-client-5.12.2.jar:?]
    	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571) ~[okhttp-3.12.12.jar:?]
    	at okhttp3.internal.ws.RealWebSocket$2.onFailure(RealWebSocket.java:221) ~[okhttp-3.12.12.jar:?]
    	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:211) ~[okhttp-3.12.12.jar:?]
    	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[okhttp-3.12.12.jar:?]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    	... 1 more

    Gabriel Souza

    05/06/2022, 1:41 PM
    Hello everyone! How are you? I'm running a Postgres -> BigQuery pipeline over the CDC connector, and it's working really well. But I added some new columns in the Postgres database and they are not reflected in BigQuery. What is the best strategy to add these columns in BigQuery without losing history?

    Athanasios Spyriou

    05/09/2022, 10:23 AM
    Hello everyone, I am new to Airbyte. Could anyone please let me know how to deal with the following error?
    Copy code
    Cannot publish to S3: Storage backend has reached its minimum free disk threshold. Please delete a few objects to proceed. (Service: Amazon S3; Status Code: 507; Error Code: XMinioStorageFull;
    How could I add my own bucket? Thanks!

    Cody K.

    05/09/2022, 11:12 PM
    Looking for help with the MSSQL connector. With CDC enabled on the source connector I can pull records during the first sync using the “Full Refresh” sync mode; however, I get no records and 3 timeouts when trying to use the “Incremental” sync mode (both append and deduped). I've confirmed CDC is working properly on the MSSQL server by updating a record and checking for changes in the table, but that should be secondary, as it's not even able to do the initial table pull for the first records.
    Copy code
    2022-05-09 23:20:13 INFO () DefaultAirbyteStreamFactory(lambda$create$0):73 - 2022-05-09 23:20:13 [32mINFO[m i.a.i.s.r.AbstractDbSource(lambda$read$2):126 - Closed database connection pool.
    2022-05-09 23:20:13 ERROR () LineGobbler(voidCall):85 - Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: Couldn't obtain database name
    2022-05-09 23:20:13 ERROR () LineGobbler(voidCall):85 - 	at io.airbyte.integrations.debezium.internals.DebeziumRecordIterator.requestClose(DebeziumRecordIterator.java:132)