Andrzej Lewandowski
04/24/2024, 6:46 AM
https://assets-global.website-files.com/6064b31ff49a2d31e0493af1/64f77d7c6fd48fcde2d86c25_gJ[…]kfNyXG2AGrRDutV_lOtuOQ9Cp4J7VSCc8zwgiXqMbAtrn99tE.png
Shubham Jain
04/24/2024, 7:46 AM
Christian Schmid
04/24/2024, 8:32 AM
federalState object, locality string, municipality object, name string, postalCode string) USING csv LOCATION 'abfss:path_to_storage/d4a1697e-7676-4ca6-b960-8c683ceaced7/landing/_airbyte_tmp_vsl_test/' options ("header" = "true", "multiLine" = "true")
--------------------------------------------------------------------------------------------------------------^^^
Ananth Kumar
04/24/2024, 8:46 AM
…`type`, `cursor_field`, `end_datetime`, `datetime_format`, `cursor_granularity`, `start_datetime`, and `step`. I want to introduce a new field in a custom DatetimeBasedCursor; for that I need to modify the underlying code of the DatetimeBasedCursor to support this new field. How can I achieve this? Please share code samples.
user
04/24/2024, 9:39 AM
konrad schlatte
04/24/2024, 2:09 PM
Daniel Zuluaga
04/24/2024, 4:14 PM
2024-04-24 16:04:19 replication-orchestrator > writeToDestination: exception caught
java.net.SocketException: Broken pipe
at sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:413) ~[?:?]
at sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:433) ~[?:?]
at sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:812) ~[?:?]
at java.net.Socket$SocketOutputStream.write(Socket.java:1120) ~[?:?]
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:313) ~[?:?]
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:409) ~[?:?]
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:414) ~[?:?]
at sun.nio.cs.StreamEncoder.lockedFlush(StreamEncoder.java:218) ~[?:?]
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:205) ~[?:?]
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:263) ~[?:?]
at java.io.BufferedWriter.implFlush(BufferedWriter.java:372) ~[?:?]
at java.io.BufferedWriter.flush(BufferedWriter.java:359) ~[?:?]
at io.airbyte.workers.internal.DefaultAirbyteMessageBufferedWriter.flush(DefaultAirbyteMessageBufferedWriter.java:31) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.internal.DefaultAirbyteDestination.notifyEndOfInputWithNoTimeoutMonitor(DefaultAirbyteDestination.java:140) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.internal.DefaultAirbyteDestination.notifyEndOfInput(DefaultAirbyteDestination.java:133) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.writeToDestination(BufferedReplicationWorker.java:442) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsyncWithTimeout$5(BufferedReplicationWorker.java:256) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2024-04-24 16:04:19 replication-orchestrator > writeToDestination: done. (forDest.isDone:false, isDestRunning:true)
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report
2024-04-24 16:04:19 replication-orchestrator > readFromDestination: exception caught
java.lang.IllegalStateException: Destination process is still alive, cannot retrieve exit value.
at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[guava-31.1-jre.jar:?]
at io.airbyte.workers.internal.DefaultAirbyteDestination.getExitValue(DefaultAirbyteDestination.java:191) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.readFromDestination(BufferedReplicationWorker.java:476) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsync$2(BufferedReplicationWorker.java:228) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2024-04-24 16:04:19 replication-orchestrator > readFromDestination: done. (writeToDestFailed:true, dest.isFinished:false)
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report
2024-04-24 16:04:19 replication-orchestrator > processMessage: done. (fromSource.isDone:false, forDest.isClosed:true)
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report_storage
2024-04-24 16:04:19 replication-orchestrator > readFromSource: exception caught
java.lang.IllegalStateException: Source process is still alive, cannot retrieve exit value.
at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[guava-31.1-jre.jar:?]
at io.airbyte.workers.internal.DefaultAirbyteSource.getExitValue(DefaultAirbyteSource.java:127) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.readFromSource(BufferedReplicationWorker.java:362) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at io.airbyte.workers.general.BufferedReplicationWorker.lambda$runAsyncWithHeartbeatCheck$3(BufferedReplicationWorker.java:235) ~[io.airbyte-airbyte-commons-worker-0.50.35.jar:?]
at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1589) ~[?:?]
2024-04-24 16:04:19 replication-orchestrator > readFromSource: done. (source.isFinished:false, fromSource.isClosed:true)
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report_storage
2024-04-24 16:04:19 replication-orchestrator > Attempt 0 to update stream status incomplete materialbank:sdg_leads_report_storage
2024-04-24 16:04:22 replication-orchestrator > (pod: airbyte / destination-redshift-write-961-1-dbgeq) - Closed all resources for pod
2024-04-24 16:05:22 replication-orchestrator > (pod: airbyte / source-mysql-read-961-1-syorg) - Destroying Kube process.
2024-04-24 16:05:22 replication-orchestrator > (pod: airbyte / source-mysql-read-961-1-syorg) - Closed all resources for pod
2024-04-24 16:05:22 replication-orchestrator > airbyte-source gobbler IOException: Socket closed. Typically happens when cancelling a job.
2024-04-24 16:05:22 replication-orchestrator > (pod: airbyte / source-mysql-read-961-1-syorg) - Destroyed Kube process.
nadia nizam
04/24/2024, 4:14 PM
Gabriel Ceolato Muller
04/24/2024, 5:54 PM
…`./run-ab-platform.sh`
`[emerg] 11#11: host not found in upstream "airbyte-webapp" in /etc/nginx/nginx.conf:17`
Gabriel Ceolato Muller
04/24/2024, 7:51 PM
Justin Lemmon
04/24/2024, 10:02 PM
…`airbyte-ci connectors --name=source-quickbooks build` to test some connector updates, and I'm getting the error:
failed to start daemon: Error initializing network controller: error obtaining controller instance: unable to add return rule in DOCKER-ISOLATION-STAGE-1 chain: (iptables failed: iptables --wait -A DOCKER-ISOLATION-STAGE-1 -j RETURN: iptables v1.8.10 (nf_tables): RULE_APPEND failed (No such file or directory): rule in chain DOCKER-ISOLATION-STAGE-1
Any thoughts? I've already tried the `sudo update-alternatives --set iptables /usr/sbin/iptables-legacy` suggestion that I've seen in a few threads. Running Airbyte on Docker via WSL2 + Ubuntu.
Daniel Friedman
04/24/2024, 10:09 PM
…`jsonb` field into columns, like your Segments/Rudderstacks do?
Full context:
We have a Postgres -> Postgres connection (latest Airbyte & connector versions) with a fairly typical event table (a view that mirrors the underlying table) with a `jsonb` `event.data` column with an object schema that changes depending on the event type.
We're attempting to 'transpose' the object keys inside the `event.data` field into individual columns in the destination. The destination is being consumed by Tableau/Power BI-type tools, which don't work with string JSON representations very well, or at all.
We have several dozen event types emitted (and growing), so manually defining each object key -> column mapping in the source view isn't very scalable.
I've read this page a few times, and I'm not sure whether the information is relevant to our use case. I think the now-deprecated basic normalisation might be? But I don't know how to work with it, if it is.
We've started pursuing a dbt transformer route, but having asked a few LLMs, they assure me that that is a dead end, saying that dbt requires a predefined model structure and doesn't support dynamic schema evolution based on incoming data.
Any guidance anybody can give us would be massively appreciated.
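(One scalable workaround, sketched below with heavy assumptions rather than as an official Airbyte feature: let Postgres enumerate the keys that actually occur in the `jsonb` column and regenerate a flattening view from that list, so a new event type only requires re-running the script. The table, columns, view name, and DSN are placeholders.)

```python
# Hedged sketch: regenerate a Postgres view that flattens jsonb keys into
# columns. Assumes a table "event" with columns (id, event_type, data jsonb);
# all names are placeholders. ->> yields text, so add per-key casts if your
# BI tool needs real types.
import psycopg2

conn = psycopg2.connect("dbname=app user=etl")  # placeholder DSN
with conn, conn.cursor() as cur:
    # Every key that appears in any event's data object.
    cur.execute("SELECT DISTINCT jsonb_object_keys(data) FROM event;")
    keys = sorted(row[0] for row in cur.fetchall())

    # One ->> projection per key; quote identifiers to survive odd key names.
    cols = ",\n  ".join(
        "data->>'{k}' AS \"{q}\"".format(k=k.replace("'", "''"), q=k)
        for k in keys
    )
    cur.execute("DROP VIEW IF EXISTS event_flat;")
    cur.execute(f"CREATE VIEW event_flat AS\nSELECT id, event_type,\n  {cols}\nFROM event;")

print("event_flat recreated; refresh the source schema in Airbyte to pick up new columns.")
```

Pointing the connection at the regenerated view sidesteps dbt's static-model limitation, at the cost of a schema refresh (and possibly a stream reset) whenever new keys appear.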
Jose Martinez
04/24/2024, 11:01 PM
Ritika Naidu
04/25/2024, 6:22 AM
…`Incremental | Append + Deduped`. The job had been running fine for a few days, but now it has started failing with this error:
ERROR debezium-sqlserverconnector-ZoomSTG-change-event-source-coordinator i.d.p.ErrorHandler(setProducerThrowable):52 Producer failure com.microsoft.sqlserver.jdbc.SQLServerException: An insufficient number of arguments were supplied for the procedure or function cdc.fn_cdc_get_all_changes_ ... .
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:265) ~[mssql-jdbc-10.2.1.jre8.jar:?]
When I run the same cdc.fn_cdc_get_all_changes command in the source database, it runs fine and returns all changed records. Any pointers on what could have gone wrong?
邢亜豪
04/25/2024, 9:00 AM
Ahmed Hamid
04/25/2024, 11:16 AM
java.lang.RuntimeException: java.lang.RuntimeException: SQL compilation error: error line 159 at position 103 invalid identifier 'STARTDATE' in SYSTEM$MULTISTMT at 'throw Execution of multiple statements failed on statement {0} (at line {1}, position {2})..replace('{1}', LINES[i])' position 4
Despite thorough checks, I couldn't find any column named 'STARTDATE' in either my Snowflake table or the stream replication. I've even tried dropping the 'airbyteinternal' table and resyncing, but the issue persists.
Versions I'm using:
• Airbyte open source version: 0.54.0
• Google Analytics 4 connector version: 2.4.2
• Snowflake connector version: 3.6.2
I have attached the logs file. Any insights on how to troubleshoot and resolve this would be greatly appreciated. Thanks in advance!
Matthew Martin
04/25/2024, 12:18 PM
…`http://localhost:8001/api/public/v1/<endpoint>`; however, a couple of days ago it said to use `http://localhost:8006` (the Wayback Machine confirms this as of last month; however, I was looking at this doc yesterday and it still said `:8006`).
I've cleared out everything locally, reinstalled using `./run-ab-platform.sh`, and confirmed that the API still appears to be at 8006:
❯ curl -v -u airbyte:password 127.0.0.1:8006/health
* Trying 127.0.0.1:8006...
* Connected to 127.0.0.1 (127.0.0.1) port 8006
* Server auth using Basic with user 'airbyte'
> GET /health HTTP/1.1
> Host: 127.0.0.1:8006
> Authorization: Basic YWlyYnl0ZTpwYXNzd29yZA==
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.25.3
< Date: Thu, 25 Apr 2024 12:10:01 GMT
< Content-Type: application/json
< Content-Length: 20
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
Successful operation%
❯ curl -v -u airbyte:password 127.0.0.1:8001/api/public/health
* Trying 127.0.0.1:8001...
* Connected to 127.0.0.1 (127.0.0.1) port 8001
* Server auth using Basic with user 'airbyte'
> GET /api/public/health HTTP/1.1
> Host: 127.0.0.1:8001
> Authorization: Basic YWlyYnl0ZTpwYXNzd29yZA==
> User-Agent: curl/8.4.0
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Server: nginx/1.25.3
< Date: Thu, 25 Apr 2024 12:10:14 GMT
< Content-Type: application/json
< Content-Length: 17
< Connection: keep-alive
<
* Connection #0 to host 127.0.0.1 left intact
Object not found.%
I just want to confirm whether we need to change API endpoints/configuration for our EC2 deployment if we upgrade (currently on `0.57.3`). We are newly trialing Airbyte, so we don't want to configure something that has already changed.
TIA 🙏
Todd Matthews
04/25/2024, 1:48 PM
Justin Beasley
04/25/2024, 1:50 PM
…`0.57.3` and try to use it as a source, I get a "Discovering schema failed: Failed to run schema discovery." error in the UI. The logs, both in Cloud Storage and in GKE, are no help. This is a Helm-deployed Kubernetes instance, for what it's worth.
I'm seeing others in here and #ask-ai having similar errors... has anyone found a workaround for this?
(I'm not sure if it's related, but when I try to use Builder, I keep getting `Error handling request: The manifest version 0.83.0 is greater than the airbyte-cdk package version (0.79.1). Your manifest may contain features that are not in the current CDK version.` Manually changing the version in the YAML view works fine, but I'm not entirely clear on why a brand-new deployment would be out of sync on versions.)
Tymek Motylewski
04/25/2024, 3:24 PM
Ravil Khalilov (he/him)
04/25/2024, 3:43 PM
HTTPSConnectionPool(host='api.dev.<our_domain_name.com>', port=443): Max retries exceeded with url: /v1/analytics/proposal (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)')))
As I understand the error, I need to install the root CA in the airbyte-connector-builder-server pod.
Below are the steps I took to try to fix it.
I tried to create a custom Docker image based on the official airbyte/connector-builder-server image, then copy in my root CA file, install the ca-certificates OS package, and run the update-ca-trust OS command. After `helm upgrade` with my custom image, the pod just won't start, with error 127 in the Kubernetes logs and this error in the pod's own logs: /bin/bash: airbyte-app/bin/airbyte-connector-builder-server: No such file or directory.
I'd appreciate any help.
Thanks
user
04/25/2024, 4:23 PM
Stepan Chatalyan
04/25/2024, 4:57 PM
Joe Harvey
04/25/2024, 10:14 PM
…`Only top-level primary keys are supported` issues with some destinations.
Sar Joshi
04/26/2024, 1:12 AM
…`cursor` to a certain point in time to start the sync from a recent date rather than syncing the whole huge table? I couldn't find any proper solution in the docs or in the forums.
I'm not sure whether this feature is still being worked on, but I want to know if anyone has come across this situation before? Thanks 🙂
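(For anyone searching later: one possible approach, sketched here under heavy assumptions, is to seed the connection's per-stream state through the OSS config API before the first sync so the incremental cursor starts at a chosen date. The endpoint path, payload shape, stream name, and cursor value below are all assumptions to verify against your instance's API documentation.)

```python
# Hedged sketch: pre-set a stream's cursor state so the first incremental sync
# starts from a chosen date instead of the beginning of a huge table.
# Endpoint and payload shape are assumptions -- verify before relying on this.
import requests

AIRBYTE_URL = "http://localhost:8001/api/v1"            # placeholder
CONNECTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

payload = {
    "connectionId": CONNECTION_ID,
    "connectionState": {
        "stateType": "stream",
        "connectionId": CONNECTION_ID,
        "streamState": [
            {
                "streamDescriptor": {"name": "huge_table", "namespace": "public"},
                "streamState": {"updated_at": "2024-04-01T00:00:00Z"},  # cursor
            }
        ],
    },
}
resp = requests.post(f"{AIRBYTE_URL}/state/create_or_update", json=payload)
resp.raise_for_status()
```

A simpler alternative with no state surgery is filtering in a source-side view and syncing that instead.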
Priyam Gupta
04/26/2024, 6:14 AM
DATABASE_USER=****
DATABASE_PASSWORD=******
DATABASE_HOST=*****
DATABASE_PORT=5432
DATABASE_DB=airbyte
# translate manually DATABASE_URL=jdbc:postgresql://${DATABASE_HOST}:${DATABASE_PORT}/${DATABASE_DB} (do not include the username or password here)
DATABASE_URL=jdbc:postgresql://****:5432/airbyte
I started getting the exception below:
Caused by: java.lang.RuntimeException: java.net.UnknownHostException: airbyte-temporal
at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:223) ~[grpc-core-1.61.0.jar:1.61.0]
at io.grpc.internal.DnsNameResolver.doResolve(DnsNameResolver.java:282) ~[grpc-core-1.61.0.jar:1.61.0]
at io.grpc.grpclb.GrpclbNameResolver.doResolve(GrpclbNameResolver.java:63) ~[grpc-grpclb-1.61.0.jar:1.61.0]
at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:318) ~[grpc-core-1.61.0.jar:1.61.0]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.base/java.lang.Thread.run(Thread.java:1583) ~[?:?]
Caused by: java.net.UnknownHostException: airbyte-temporal
at java.base/java.net.InetAddress$CachedLookup.get(InetAddress.java:988) ~[?:?]
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1818) ~[?:?]
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1688) ~[?:?]
at io.grpc.internal.DnsNameResolver$JdkAddressResolver.resolveAddress(DnsNameResolver.java:632) ~[grpc-core-1.61.0.jar:1.61.0]
at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:219) ~[grpc-core-1.61.0.jar:1.61.0]
at io.grpc.internal.DnsNameResolver.doResolve(DnsNameResolver.java:282) ~[grpc-core-1.61.0.jar:1.61.0]
at io.grpc.grpclb.GrpclbNameResolver.doResolve(GrpclbNameResolver.java:63) ~[grpc-grpclb-1.61.0.jar:1.61.0]
at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:318) ~[grpc-core-1.61.0.jar:1.61.0]
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.base/java.lang.Thread.run(Thread.java:1583) ~[?:?]
BigLeka
04/26/2024, 1:10 PM
Joseph Goose Aranez
04/26/2024, 2:48 PM
"stacktrace" : "Traceback (most recent call last):\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 136, in read\n yield from self._read_stream(\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py\", line 236, in _read_stream\n for record_data_or_message in record_iterator:\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/core.py\", line 145, in read\n for record_data_or_message in records:\n File \"/airbyte/integration_code/source_zendesk_support/streams.py\", line 147, in read_records\n yield from super().read_records(\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py\", line 482, in read_records\n yield from self._read_pages(\n File \"/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py\", line 499, in _read_pages\n yield from records_generator_fn(request, response, stream_state, stream_slice)\n File \"/airbyte/integration_code/source_zendesk_support/streams.py\", line 611, in parse_response\n updated = data[self.cursor_field]\nKeyError: 'updated_at'\n",
Any advice?
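(The traceback shows source_zendesk_support's `parse_response` doing `updated = data[self.cursor_field]` on a record that has no `updated_at` key. Below is a hedged illustration of a local stopgap, not the connector's actual code: the class shape, stream name, and response layout are simplified assumptions, and the durable fix may simply be a connector upgrade.)

```python
# Hedged sketch: tolerate records lacking the cursor field instead of letting
# a KeyError abort the sync, while investigating why 'updated_at' is missing.
from typing import Any, Iterable, Mapping

import requests


class TolerantCursorMixin:
    """Mix into the affected stream class; all names here are assumptions."""

    cursor_field = "updated_at"
    name = "tickets"  # placeholder stream name

    def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping[str, Any]]:
        for record in response.json().get(self.name, []):
            if self.cursor_field not in record:
                continue  # skip (or log) the record instead of crashing
            yield record
```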
Ted Glomski
04/26/2024, 10:17 PM
…`0.64.205`. We are using an annotated Service Account to grant AWS role credentials per the AWS documentation for IAM roles for service accounts, and we have validated that non-Airbyte resources can access a given S3 bucket. All of our resources are in the AWS region `us-east-1`: the S3 bucket, the IAM role, etc. All pods have `us-east-1` set in the environment variables `AWS_REGION` and `AWS_DEFAULT_REGION`, and we also have `global.storage.s3.bucketRegion` set to `us-east-1`. Whenever we try to access that S3 bucket, or any other S3 bucket, from the Airbyte server, the Airbyte worker, or any source-check or destination-check job that gets spun up, we get the same error:
software.amazon.awssdk.services.s3.model.S3Exception: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'eu-west-1'
This error comes up regardless of whether we specify the AWS region in the source or destination config, specify an AWS Access Key ID and Secret Access Key, specify an endpoint, etc.
Any assistance would be very welcome.