# ask-ai
  • Túlio Lima

    08/25/2024, 4:21 AM
    how to import DEFAULT_ERROR_MAPPING in a Python connector?
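One way this import typically looks with a recent airbyte-cdk release; the module path has moved between CDK versions, so treat this as a sketch to verify against the installed version:

```python
# Sketch only: this path matches recent airbyte-cdk releases, but the module
# has moved between versions, so check it against your installed CDK.
from airbyte_cdk.sources.streams.http.error_handlers.default_error_mapping import (
    DEFAULT_ERROR_MAPPING,
)

# Copy the defaults, then override or extend individual status-code entries.
my_error_mapping = dict(DEFAULT_ERROR_MAPPING)
print(len(my_error_mapping))
```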
  • KRISHIV GUBBA

    08/25/2024, 7:27 AM
    @kapa.ai how can I set up incremental sync based on a given range?
  • Rabea Yousef

    08/25/2024, 9:36 AM
    @kapa.ai I upgraded my Airbyte to VERSION=0.63.13 and I'm using the Airbyte UI. I configured a connection with HubSpot as source and Redshift as target, with Incremental Append + Dedupe for the deals, tickets and contacts objects, but the sync keeps running and never stops, even though the full load completed in the target tables. I also tried Full Refresh, but the sync started again automatically once the full load completed in the airbyte_internal schema, while no data was stored in the target tables. Meanwhile, I'm using Intercom and Jira as sources with incremental load and everything is going well. Can you help please?
  • Hassan Razzaq

    08/25/2024, 2:27 PM
    @kapa.ai how to create a gdrive source using airbyte-api-python-sdk
  • Hassan Razzaq

    08/25/2024, 2:28 PM
    @kapa.ai using client id, refresh token and secret
  • Hassan Razzaq

    08/25/2024, 2:29 PM
    @kapa.ai how can I use this in airbyte-api-python-sdk
  • Hassan Razzaq

    08/25/2024, 2:54 PM
    @kapa.ai I am getting this error:
    {"status":422,"type":"https://reference.airbyte.com/reference/errors#unprocessable-entity","title":"unprocessable-entity","detail":"The body of the request was not understood","documentationUrl":null,"data":{"message":"json schema validation failed when comparing the data to the json schema. \nErrors: $: required property 'api_key' not found, $: required property 'url' not found "}}
    This is the url I am using: url = "http:localhost:8000/api/public/v1/sources"
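Two things stand out in the report above: the base URL is missing the "//" after "http:", and the configuration object is validated against the spec of the chosen sourceType, so it must carry that connector's required fields. A hedged sketch of the request using requests; the Google Drive field names are illustrative assumptions, not the verified spec:

```python
# Hedged sketch, not a verified recipe.
# 1) The base URL should be http://localhost:8000 (note the "//").
# 2) "configuration" is validated against the chosen sourceType's spec; the
#    Google Drive keys below (folder_url, credentials, ...) are assumptions --
#    check the connector spec for the exact field names.
import requests

payload = {
    "name": "my-gdrive-source",
    "workspaceId": "<workspace-uuid>",
    "configuration": {
        "sourceType": "google-drive",
        "folder_url": "https://drive.google.com/drive/folders/<folder-id>",
        "credentials": {
            "auth_type": "Client",
            "client_id": "<client-id>",
            "client_secret": "<client-secret>",
            "refresh_token": "<refresh-token>",
        },
    },
}

resp = requests.post(
    "http://localhost:8000/api/public/v1/sources",   # "//" after "http:"
    json=payload,
    auth=("airbyte", "password"),  # default basic auth on a local OSS install; adjust as needed
    timeout=30,
)
print(resp.status_code, resp.json())
```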
  • Túlio Lima

    08/25/2024, 3:51 PM
    how to implement an incremental stream with the Airbyte CDK in Python?
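A minimal sketch of an incremental stream using the older-style Python CDK (HttpStream with a cursor field and get_updated_state). Newer CDK versions also offer declarative/low-code approaches, so treat this as one illustrative option; the API and field names here are hypothetical:

```python
# Minimal sketch of an incremental HttpStream (older-style Python CDK).
from typing import Any, Iterable, Mapping, MutableMapping, Optional

import requests
from airbyte_cdk.sources.streams.http import HttpStream


class Orders(HttpStream):
    url_base = "https://api.example.com/"   # hypothetical API
    primary_key = "id"
    cursor_field = "updated_at"             # field used to track progress

    def path(self, **kwargs) -> str:
        return "orders"

    def request_params(self, stream_state: Mapping[str, Any], **kwargs) -> MutableMapping[str, Any]:
        # Only request records newer than the saved cursor.
        return {"updated_since": (stream_state or {}).get(self.cursor_field, "1970-01-01")}

    def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
        yield from response.json().get("data", [])

    def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
        return None  # no pagination in this sketch

    def get_updated_state(
        self, current_stream_state: MutableMapping[str, Any], latest_record: Mapping[str, Any]
    ) -> Mapping[str, Any]:
        # Keep the highest cursor value seen so far.
        latest = latest_record.get(self.cursor_field, "")
        current = (current_stream_state or {}).get(self.cursor_field, "")
        return {self.cursor_field: max(latest, current)}
```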
  • Túlio Lima

    08/25/2024, 5:34 PM
    how to define the primary key of a stream so it shows in the connection settings in the Airbyte web UI
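A short sketch of how the primary key is usually declared in a Python CDK stream: the primary_key attribute ends up as the source-defined primary key in the discovered catalog, which is what the connection settings page displays. The API here is hypothetical:

```python
# Sketch: primary_key on a CDK stream becomes the source-defined primary key
# in the discovered catalog, shown on the connection settings page.
from airbyte_cdk.sources.streams.http import HttpStream


class Customers(HttpStream):
    url_base = "https://api.example.com/"   # hypothetical API
    primary_key = "id"                      # or ["store_id", "order_id"] for a composite key

    def path(self, **kwargs) -> str:
        return "customers"

    def parse_response(self, response, **kwargs):
        yield from response.json()

    def next_page_token(self, response):
        return None
```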
  • Charles Bockelmann

    08/25/2024, 5:38 PM
    I have Airbyte up and running in a Docker container. Also, I have Airflow up and running in another Docker container. Both containers are in the same Docker network, but after running my DAG it throws the following error:
    urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /api/public/v1/connections/sync (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xffff8581b770>: Failed to establish a new connection: [Errno 111] Connection refused'))
    After accessing the console inside the Airflow container and doing a curl on ports 80 and 8000 to localhost, host.docker.internal and the Airbyte container IP address, I always receive the same response:
    <html>
    <head><title>404 Not Found</title></head>
    <body>
    <center><h1>404 Not Found</h1></center>
    <hr><center>nginx</center>
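A hedged debugging sketch for the situation above: inside the Airflow container, localhost resolves to the Airflow container itself, so the Airbyte API has to be addressed by the Airbyte container's name on the shared Docker network. The container name and port below are assumptions from the classic docker-compose layout:

```python
# Hedged sketch: reach the Airbyte API via the container/service name, not localhost.
# "airbyte-server" and port 8001 are assumptions; check `docker ps` /
# `docker network inspect` for the actual names on your network.
import requests

base = "http://airbyte-server:8001"  # container name instead of localhost
resp = requests.get(f"{base}/api/v1/health", timeout=10)
print(resp.status_code, resp.text)

# Once this health check succeeds, point the URL used by the DAG (or the Airflow
# Airbyte connection) at the same host:port rather than localhost:8000.
```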
  • mike Trienis

    08/25/2024, 5:55 PM
    @kapa.ai which test fixtures should I use for a new java destination?
  • Quang Nguyen

    08/25/2024, 6:30 PM
    I have a sync from PostgreSQL to BigQuery which has started failing, the offending error:
    ERROR sync-operations-4 i.a.i.b.d.t.TyperDeduperUtil(executeTypeAndDedupe):223 Encountered Exception on unsafe SQL for stream raw_postgres shipments with suffix , attempting with error handling com.google.cloud.bigquery.BigQueryException: Query error: Invalid datetime string "+20212-01-10T07:07:00.000000" at [2:1]
            at com.google.cloud.bigquery.Job.reload(Job.java:424) ~[google-cloud-bigquery-2.37.0.jar:2.37.0]
            at io.airbyte.integrations.destination.bigquery.typing_deduping.BigQueryDestinationHandler.execute(BigQueryDestinationHandler.kt:146) ~[io.airbyte.airbyte-integrations.connectors-destination-bigquery.jar:?]
            at io.airbyte.integrations.base.destination.typing_deduping.TyperDeduperUtil.executeTypeAndDedupe(TyperDeduperUtil.kt:219) ~[airbyte-cdk-typing-deduping-0.41.4.jar:?]
            at io.airbyte.integrations.destination.bigquery.operation.BigQueryStorageOperation.typeAndDedupe(BigQueryStorageOperation.kt:158) ~[io.airbyte.airbyte-integrations.connectors-destination-bigquery.jar:?]
            at io.airbyte.integrations.base.destination.operation.AbstractStreamOperation.finalizeTable(AbstractStreamOperation.kt:315) ~[airbyte-cdk-typing-deduping-0.41.4.jar:?]
            at io.airbyte.integrations.base.destination.operation.DefaultSyncOperation.finalizeStreams$lambda$9$lambda$8(DefaultSyncOperation.kt:138) ~[airbyte-cdk-typing-deduping-0.41.4.jar:?]
            at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768) ~[?:?]
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
            at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
    I'm using CDC WAL mode. Can someone help me with this error?
  • Quang Nguyen

    08/25/2024, 6:42 PM
    can we use only one Postgres publication for many connections in Airbyte?
  • Jhonatas Kleinkauff

    08/25/2024, 8:39 PM
    @kapa.ai can Airbyte create Hive-style partitioning when extracting data to S3 in Parquet format?
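A hedged sketch of how the S3 destination's path format option is commonly used to get a Hive-style directory layout (partitioned by sync date, not by record values); the exact field name and supported template variables should be checked against the connector docs for your version:

```python
# Hedged sketch of an S3 destination configuration fragment. The path format
# variables (${STREAM_NAME}, ${YEAR}, ...) follow the S3 destination docs;
# verify them against your connector version.
s3_destination_configuration = {
    "s3_bucket_name": "my-bucket",
    "s3_bucket_path": "airbyte",
    "format": {"format_type": "Parquet"},
    # e.g. airbyte/my_stream/year=2024/month=08/day=26/<epoch>_... .parquet
    "s3_path_format": "${STREAM_NAME}/year=${YEAR}/month=${MONTH}/day=${DAY}/${EPOCH}_",
}
print(s3_destination_configuration["s3_path_format"])
```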
  • Túlio Lima

    08/26/2024, 12:04 AM
    how to set airbyte-ci to build a connector image with Python 3.11?
  • Hoang Trung Hieu

    08/26/2024, 12:30 AM
    @kapa.ai
  • Slackbot

    08/26/2024, 4:36 AM
    This message was deleted.
  • DR

    08/26/2024, 5:07 AM
    @kapa.ai The Airbyte connector encountered an error while inferring the schema from the files. It attempted to process a CSV file from Google Cloud Storage but failed due to the underlying stream not being seekable.
    2024-08-26 04:36:41 INFO i.a.c.i.LineGobbler(voidCall):166 - ----- START DISCOVER -----
    2024-08-26 04:36:41 INFO i.a.c.i.LineGobbler(voidCall):166 - 
    2024-08-26 04:36:58 INFO i.a.c.ConnectorWatcher(run):87 - Connector exited, processing output
    2024-08-26 04:36:58 INFO i.a.c.ConnectorWatcher(run):90 - Output file jobOutput.json found
    2024-08-26 04:36:58 INFO i.a.c.ConnectorWatcher(run):96 - Connector exited with 1
    2024-08-26 04:36:58 INFO i.a.w.i.VersionedAirbyteStreamFactory(create):189 - Reading messages from protocol version 0.2.0
    2024-08-26 04:36:58 WARN i.a.m.l.MetricClientFactory(getMetricClient):43 - MetricClient has not been initialized. Must call MetricClientFactory.CreateMetricClient before using MetricClient. Using a dummy client for now. Ignore this if Airbyte is configured to not publish any metrics.
    2024-08-26 04:36:58 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):305 - Refusing to infer schema for 4975 files; using 10 files.
    2024-08-26 04:36:58 WARN i.a.w.i.VersionedAirbyteStreamFactory(internalLog):305 - Refusing to infer schema for 3949 files; using 10 files.
    2024-08-26 04:36:58 ERROR i.a.w.i.VersionedAirbyteStreamFactory(internalLog):304 - An error occurred inferring the schema. 
     Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/stream/default_file_based_stream.py", line 289, in _infer_file_schema
        return await self.get_parser().infer_schema(self.config, file, self.stream_reader, self.logger)
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/file_types/csv_parser.py", line 168, in infer_schema
        for row in data_generator:
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/file_types/csv_parser.py", line 55, in read_data
        headers = self._get_headers(fp, config_format, dialect_name)
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/file_types/csv_parser.py", line 110, in _get_headers
        fp.seek(0)
    io.UnsupportedOperation: underlying stream is not seekable
    The above exception was the direct cause of the following exception:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/stream/default_file_based_stream.py", line 281, in _infer_schema
        base_schema = merge_schemas(base_schema, task.result())
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/stream/default_file_based_stream.py", line 291, in _infer_file_schema
        raise SchemaInferenceError(
    airbyte_cdk.sources.file_based.exceptions.SchemaInferenceError: Error inferring schema from files. Are the files valid? Contact Support if you need assistance.
    file=<https://storage.googleapis.com/gp-install-stats/installs_com.test_202408_overview.csv>? format=filetype='csv' delimiter=',' quote_char='"' escape_char=None encoding='UTF16' double_quote=True null_values=set() strings_can_be_null=True skip_rows_before_header=0 skip_rows_after_header=0 header_definition=CsvHeaderFromCsv(header_definition_type='From CSV') true_values={'t', '1', 'on', 'yes', 'y', 'true'} false_values={'no', '0', 'f', 'false', 'n', 'off'} inference_type=<inferencetype.none:> ignore_errors_on_fields_mismatch=False stream=install_report
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/site-packages/airbyte_cdk/sources/file_based/stream/default_file_based_stream.py", line 289, in _infer_file_schema
        return await self.get_parser().infer_schema(self.config, file, self.stream_reader, self.logger)
  • Jayant Kumar

    08/26/2024, 6:42 AM
    @kapa.ai I am using the GA4 Airbyte source to ingest data into the BQ warehouse. GA4 supports a new model for events and properties. On the Airbyte source setup page, I could only see a list of reports and their sync modes. Is it possible to ingest GA4 events using the Airbyte GA4 source?
  • Julie Choong

    08/26/2024, 7:06 AM
    I set up Airbyte on Kubernetes via Helm in kind locally. I have my own Postgres creds in my values.yaml, and the Postgres contains data from my old Airbyte instance. However, when I point my Kubernetes instance to the Postgres creds, it ends up with a 502 HTTP error.
  • Slackbot

    08/26/2024, 7:14 AM
    This message was deleted.
  • Shubham

    08/26/2024, 7:23 AM
    I have a REST API source which requires start_date and end_date as input. For one stream, where I get a reliable cursor timestamp field, I am implementing an incremental sync, but for another stream in the same source I don't have a reliable field (I do have a timestamp column, which may or may not be present in all the records). How do I implement an incremental sync in the second case? Even if I go for a full table append mode, how do I provide a changing end_date? (I can't use current_date because the source doesn't allow end_date - start_date to be greater than 7.)
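A minimal sketch of one way to handle this: generate consecutive start_date/end_date windows of at most 7 days (for example, from stream_slices in the Python CDK) and track the last completed end_date as state. Everything here is generic Python; the CDK wiring is left as an assumption:

```python
# Generate bounded start_date/end_date windows so a range-limited API can be
# pulled slice by slice even without a reliable per-record cursor.
from datetime import date, timedelta
from typing import Any, Iterable, Mapping


def seven_day_slices(start: date, end: date) -> Iterable[Mapping[str, Any]]:
    """Yield {start_date, end_date} windows of at most 7 days covering [start, end]."""
    window_start = start
    while window_start <= end:
        window_end = min(window_start + timedelta(days=6), end)
        yield {
            "start_date": window_start.isoformat(),
            "end_date": window_end.isoformat(),
        }
        window_start = window_end + timedelta(days=1)


# In a CDK stream these would typically be returned from stream_slices() and
# injected into request_params(); here we just print them.
for s in seven_day_slices(date(2024, 8, 1), date(2024, 8, 26)):
    print(s)
```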
  • Tom Montgomery

    08/26/2024, 7:50 AM
    I am using the Intercom connection to sync data about the contacts I have on Intercom. The tags field returns an object. Within this object there is a data key whose value is an array of tag objects. This array is limited to 10 objects. Should the contact have more tags, there is another key, has_more, which would be set to true in the case of the contact having more than 10 tags. Finally, there is a URL provided within the tags field that is used to get more resources for the contact (i.e., more tags). Would it be possible to automate the fetching of this additional data within the Airbyte sync? At the moment we are only receiving 10 tags and this is obscuring our view.
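A hedged sketch of a post-sync backfill, rather than something the Intercom connector is known to do out of the box: for contacts whose tags object reports has_more, follow the URL Intercom provides to fetch the full tag list. Header and pagination details follow Intercom's public API docs and should be verified:

```python
# Hedged post-sync backfill sketch: follow the tags URL Intercom returns on
# contacts with has_more=true. Verify auth header and response shape against
# your Intercom API version.
import requests

INTERCOM_TOKEN = "<access-token>"
HEADERS = {"Authorization": f"Bearer {INTERCOM_TOKEN}", "Accept": "application/json"}


def fetch_all_tags(contact: dict) -> list:
    """Return the full tag list for a synced contact record."""
    tags_obj = contact.get("tags") or {}
    tags = list(tags_obj.get("data") or [])
    if tags_obj.get("has_more") and tags_obj.get("url"):
        resp = requests.get(
            f"https://api.intercom.io{tags_obj['url']}", headers=HEADERS, timeout=30
        )
        resp.raise_for_status()
        tags = resp.json().get("data", tags)
    return tags
```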
  • Thomas

    08/26/2024, 9:00 AM
    @kapa.ai I have migrated a docker-compose install via abctl local install --migrate, which seemingly worked, but when trying to open the web interface I get a 404 error from nginx
  • L Theisen

    08/26/2024, 9:15 AM
    I am trying to run airbyte-oss in an air-gapped Kubernetes environment. Unfortunately I got the following error in the logs when starting Airbyte: 2024-08-26 09:08:13 INFO i.a.c.s.RemoteDefinitionsProvider(<init>):75 - Creating remote definitions provider for URL 'https://connectors.airbyte.com/files/' and registry 'OSS'... Is it possible to deliver the connectors via an Artifactory and use another link?
  • Ishan Anilbhai Koradiya

    08/26/2024, 9:45 AM
    @kapa.ai I keep getting an error - "Failed to retrieve ConnectionManagerWorkflow for connection 24170425-6e7f-4d1c-ba69-185ea9886a35. Repairing state by creating new workflow and starting with the signal". What does this mean? This happens while resetting the streams via the API.
  • Syed Hamza Raza Kazmi

    08/26/2024, 9:53 AM
    Airbyte took 27 min for 1446 record(s). I deployed Airbyte using Kubernetes.
  • Quang Nguyen

    08/26/2024, 9:55 AM
    I have a sync from PostgreSQL to BigQuery which has started failing, the offending error:
    Stack Trace: com.google.cloud.bigquery.BigQueryException: Query error: Invalid NUMERIC value: 17976931348623157000000000000000... at [2:1]
            at com.google.cloud.bigquery.Job.reload(Job.java:424)
            at io.airbyte.integrations.destination.bigquery.typing_deduping.BigQueryDestinationHandler.execute(BigQueryDestinationHandler.kt:146)
            at io.airbyte.integrations.base.destination.typing_deduping.TyperDeduperUtil.executeTypeAndDedupe(TyperDeduperUtil.kt:219)
            at io.airbyte.integrations.destination.bigquery.operation.BigQueryStorageOperation.typeAndDedupe(BigQueryStorageOperation.kt:158)
            at io.airbyte.integrations.base.destination.operation.AbstractStreamOperation.finalizeTable(AbstractStreamOperation.kt:315)
            at io.airbyte.integrations.base.destination.operation.DefaultSyncOperation.finalizeStreams$lambda$9$lambda$8(DefaultSyncOperation.kt:138)
            at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
            at java.base/java.lang.Thread.run(Thread.java:1583)
  • Aditya Gupta

    08/26/2024, 10:34 AM
    @kapa.ai How does the sync scheduling mechanism work, and how does it start again and again? How does Airbyte initiate syncs and store data, and what data does it store for that?
  • Saurabh Agrawal

    08/26/2024, 11:21 AM
    is it possible to set SYNC_JOB_MAX_ATTEMPTS in the Airbyte Cloud account?