# ask-community-for-troubleshooting
  • b

    Bram

    11/28/2022, 12:25 PM
    Hi, I am researching Airbyte and its possibilities just for fun, and I was wondering what is and is not possible regarding the Airbyte API license (Elastic License). Is it allowed to integrate the self-hosted Airbyte API into a custom microservices gateway and custom front-end, and run it in a customer's private cloud? Is it allowed to host this setup on a website, for free and without any restrictions? Would love to get some clarification. Thanks in advance!
    • 1
    • 1
  • r

    Ramon de la Cruz Ariza

    11/28/2022, 1:27 PM
    Hello everyone!! We are deploying Airbyte in our k8s cluster for the first time; right now it's already working in an EC2 Docker instance and we want to migrate our current setup. For that, we deployed the following Helm chart using Kustomize:
    Copy code
    ---
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    
    helmCharts:
    - name: airbyte
      version: 0.42.0
      repo: https://airbytehq.github.io/helm-charts
      namespace: airbyte
      releaseName: airbyte
    But we want to add/modify variables in the existing configuration (like enabling the local basic HTTP auth), and we have some questions:
    • Can I overwrite this file somehow? I saw it's possible if we have the configMapGenerator enabled and also the .env file locally.
    • What happens if I have this local file and I want to use an external DB? (Using the values for the Helm chart it's possible, but will this overwrite another configuration?)
    Many thanks!!!
    m
    k
    +4
    • 7
    • 10
  • r

    Rahul Borse

    11/28/2022, 2:08 PM
    Hi Team, I want to modify the Postgres source connector according to our application's needs. While creating a connection we select schemas and send them in the create-connection request. Our requirement is that the user selects a few columns that should be encrypted, so that encrypted data is stored in the S3 destination for those columns. We can modify the UI, but I am not able to pass an extra field along with each column that tells whether it needs to be encrypted or not. Can someone suggest how I can pass a custom field for the given columns in the schemas?
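    One way to illustrate the idea (the property name "x-encrypt", the stream name, and the catalog layout below are assumptions for the sketch, not Airbyte conventions): attach a custom key to each selected column's entry in the stream's JSON schema inside the configured catalog, and teach the forked connector to read it.
    Copy code
    # Hypothetical sketch: tag selected columns with a custom "x-encrypt" flag in the
    # configured catalog's JSON schema. A forked source/destination would have to be
    # taught to look for this key; stock Airbyte ignores unknown schema properties.
    import json
    from typing import Dict, List

    configured_stream: Dict = {
        "stream": {
            "name": "customers",  # hypothetical table
            "json_schema": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "email": {"type": "string"},
                },
            },
        },
        "sync_mode": "full_refresh",
        "destination_sync_mode": "overwrite",
    }

    def mark_encrypted(stream: Dict, columns: List[str]) -> Dict:
        """Add a custom per-column annotation that a modified connector could honor."""
        props = stream["stream"]["json_schema"]["properties"]
        for col in columns:
            props[col]["x-encrypt"] = True
        return stream

    print(json.dumps(mark_encrypted(configured_stream, ["email"]), indent=2))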
    • 1
    • 1
  • l

    laila ribke

    11/28/2022, 2:08 PM
    Hi all! For those working with the Google Ads / Bing Ads source connectors and incremental + deduped history sync: I just realized that since the cursor is set by default to segments.date in Google Ads (for example for the campaigns table), or to timeperiod in Bing Ads (for example in ad_group_performance_report_daily), it will never update records, only add new ones!! Which is a disaster for me. Is what I'm saying correct? If so, how do you work around it??
    u
    b
    u
    • 4
    • 6
  • j

    Juan Chaves

    11/28/2022, 5:03 PM
    Is this your first time deploying Airbyte: No
    OS Version / Instance: macOS Ventura (M1 Apple chip)
    Memory / Disk: 16 GB / 1 TB SSD
    Deployment: Docker
    Airbyte Version: 0.40.22
    Step: Deploy on local (docker compose up)
    Description: I'm trying to deploy Airbyte Open Source on my laptop using Docker, but at some point docker compose up started to show this error: airbyte-webapp | 2022/11/23 23:46:44 [error] 35#35: *16 connect() failed (111: Connection refused) while connecting to upstream, client: 172.21.0.9, server: localhost, request: "POST /api/v1/workspaces/list HTTP/1.0", upstream: "http://172.21.0.7:8001/api/v1/workspaces/list", host: "localhost", referrer: "http://localhost:8000/". I tried all the options mentioned in the related comments, including the ones for the M1 Apple chip. The bootloader container exited with: 2022-11-23 23:45:51 WARN c.z.h.p.HikariPool(shutdown):218 - Timed-out waiting for add connection executor to shutdown
    n
    s
    a
    • 4
    • 15
  • c

    Carolina Buckler

    11/28/2022, 5:48 PM
    Hey team - Just set up Airbyte, and I am trying to configure my first connection. I'm trying to connect to a Salesforce sandbox environment and I am getting this error:
    Copy code
    The connection tests failed.
    'An error occurred: {"error":"invalid_grant","error_description":"expired access/refresh token"}'
    Any guidance on how to reset the refresh token?
    d
    n
    • 3
    • 8
  • s

    Semyon Komissarov

    11/28/2022, 6:43 PM
    Hello! I am running Airbyte as a VM instance on GCP (region us-east4), and I am configuring a sync from PostgreSQL to BigQuery using Incremental | Deduped + history mode. When I set the dataset location to us-east4, everything works well, with no errors, and all data is in place. When I set the dataset location to US, the sync fails on the normalization step. But if I use another mode, for example Incremental | Append, everything works as intended. Logs in thread.
    ✅ 1
    n
    • 2
    • 7
  • r

    Royzac

    11/29/2022, 12:59 AM
    Hello All, would it be possible to deploy a containerized Airbyte pipeline in an AWS Lambda?
    k
    • 2
    • 2
  • a

    Adam

    11/29/2022, 1:29 AM
    Hi All, we are looking to sync a 1B+ row table from MySQL using CDC Incremental Append. We have successfully sync'd 100M row datasets from MySQL, but the loading time is 1-2 days as far as I remember. This won't allow us enough time to catch up with the binlog. I don't need the entire table, and would be happy with the last year or two. I'm wondering if anyone has strategies for syncing such large tables? I can see that the MSSQL connector allows CDC to be configured to 'New Changes Only' - skipping the initial snapshot. Is there a way to achieve this with the MySQL connector? I'm happy to capture new changes only, and then I can complete a full load of a view which restricts the data to the last year or two and combine the two in Snowflake. Would love to hear any experiences people have had in this space.
    j
    n
    • 3
    • 10
  • k

    Krishna Elangovan

    11/29/2022, 2:10 AM
    I have Airbyte running on a K8s setup using Helm charts; creating a new MySQL connection fails with this error:
    Copy code
    Could not connect with provided configuration. Error: The variable "log_bin" should be set to "ON", but it is "OFF"
    even though it's set to "ON" on my MySQL host. This happens only when trying to connect with CDC enabled; with Standard it just works fine. What could be wrong here?
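    Not a diagnosis, but one common cause worth ruling out: with CDC enabled the connector checks log_bin on whatever server it actually reaches, which from inside the cluster can be a proxy or a read replica rather than the host you verified. A throwaway check from the same network (host, user, and password below are placeholders; PyMySQL is just one client that can run it):
    Copy code
    # Run this from a pod in the same cluster/network the connector uses, pointed at
    # the exact host/port from the Airbyte source config, to see which server answers
    # and what its binlog settings are.
    import pymysql

    conn = pymysql.connect(host="mysql.example.internal", port=3306,
                           user="airbyte", password="***")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SHOW GLOBAL VARIABLES WHERE Variable_name IN "
                "('hostname', 'log_bin', 'binlog_format', 'binlog_row_image')"
            )
            for name, value in cur.fetchall():
                print(f"{name} = {value}")
    finally:
        conn.close()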
    m
    • 2
    • 3
  • a

    ANISH R

    11/29/2022, 8:43 AM
    Hi Team, external schemas/external views are not discoverable in Airbyte for the Redshift source connector 0.3.15, and I am getting this error:
    Copy code
    2022-11-29 08:39:36 INFO i.a.w.t.TemporalAttemptExecution(get):108 - Docker volume job log path: /tmp/workspace/b88f61d3-0dac-450d-be8e-890756668ee2/0/logs.log
    2022-11-29 08:39:36 INFO i.a.w.t.TemporalAttemptExecution(get):113 - Executing worker wrapper. Airbyte version: 0.40.15
    2022-11-29 08:39:36 INFO i.a.c.i.LineGobbler(voidCall):114 - Checking if airbyte/source-redshift:0.3.15 exists...
    2022-11-29 08:39:36 INFO i.a.c.i.LineGobbler(voidCall):114 - airbyte/source-redshift:0.3.15 was found locally.
    2022-11-29 08:39:36 INFO i.a.w.p.DockerProcessFactory(create):119 - Creating docker container = source-redshift-discover-b88f61d3-0dac-450d-be8e-890756668ee2-0-meimj with resources io.airbyte.config.ResourceRequirements@66d3a24c[cpuRequest=,cpuLimit=,memoryRequest=,memoryLimit=]
    2022-11-29 08:39:36 INFO i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/b88f61d3-0dac-450d-be8e-890756668ee2/0 --log-driver none --name source-redshift-discover-b88f61d3-0dac-450d-be8e-890756668ee2-0-meimj --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e DEPLOYMENT_MODE=OSS -e USE_STREAM_CAPABLE_STATE=true -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e WORKER_JOB_ATTEMPT=0 -e WORKER_CONNECTOR_IMAGE=airbyte/source-redshift:0.3.15 -e AIRBYTE_VERSION=0.40.15 -e WORKER_JOB_ID=b88f61d3-0dac-450d-be8e-890756668ee2 airbyte/source-redshift:0.3.15 discover --config source_config.json
    2022-11-29 08:39:37 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - starting source: class io.airbyte.integrations.source.redshift.RedshiftSource
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - integration args: {discover=null, config=source_config.json}
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - Running integration: io.airbyte.integrations.source.redshift.RedshiftSource
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - Command: DISCOVER
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - Integration config: IntegrationConfig{command=DISCOVER, configPath='source_config.json', catalogPath='null', statePath='null'}
    2022-11-29 08:39:38 WARN i.a.w.i.DefaultAirbyteStreamFactory(internalLog):106 - Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
    2022-11-29 08:39:38 WARN i.a.w.i.DefaultAirbyteStreamFactory(internalLog):106 - Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - HikariPool-1 - Starting...
    2022-11-29 08:39:38 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - HikariPool-1 - Start completed.
    2022-11-29 08:39:39 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - HikariPool-1 - Driver does not support get/set network timeout for connections. ([Amazon][JDBC](10220) Driver does not support this optional feature.)
    2022-11-29 08:39:39 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - Internal schemas to exclude: [catalog_history, information_schema, pg_catalog, pg_internal]
    2022-11-29 08:39:40 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - HikariPool-1 - Shutdown initiated...
    2022-11-29 08:39:40 INFO i.a.w.i.DefaultAirbyteStreamFactory(internalLog):109 - HikariPool-1 - Shutdown completed.
    2022-11-29 08:39:40 ERROR i.a.w.i.DefaultAirbyteStreamFactory(internalLog):105 - Something went wrong in the connector. See the logs for more details.
    Stack Trace: java.lang.NullPointerException: null value in entry: isNullable=null
    	at com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:33)
    	at com.google.common.collect.ImmutableMapEntry.<init>(ImmutableMapEntry.java:54)
    	at com.google.common.collect.ImmutableMap.entryOf(ImmutableMap.java:339)
    	at com.google.common.collect.ImmutableMap$Builder.put(ImmutableMap.java:449)
    	at io.airbyte.integrations.source.jdbc.AbstractJdbcSource.getColumnMetadata(AbstractJdbcSource.java:267)
    	at io.airbyte.db.jdbc.JdbcDatabase$1.tryAdvance(JdbcDatabase.java:81)
    	at java.base/java.util.Spliterator.forEachRemaining(Spliterator.java:332)
    	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
    	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
    	at io.airbyte.db.jdbc.DefaultJdbcDatabase.bufferedResultSetQuery(DefaultJdbcDatabase.java:56)
    	at io.airbyte.integrations.source.jdbc.AbstractJdbcSource.discoverInternal(AbstractJdbcSource.java:190)
    	at io.airbyte.integrations.source.redshift.RedshiftSource.discoverInternal(RedshiftSource.java:90)
    	at io.airbyte.integrations.source.redshift.RedshiftSource.discoverInternal(RedshiftSource.java:30)
    	at io.airbyte.integrations.source.relationaldb.AbstractDbSource.discoverWithoutSystemTables(AbstractDbSource.java:241)
    	at io.airbyte.integrations.source.relationaldb.AbstractDbSource.getTables(AbstractDbSource.java:498)
    	at io.airbyte.integrations.source.relationaldb.AbstractDbSource.discover(AbstractDbSource.java:110)
    	at io.airbyte.integrations.base.IntegrationRunner.runInternal(IntegrationRunner.java:127)
    	at io.airbyte.integrations.base.IntegrationRunner.run(IntegrationRunner.java:97)
    	at io.airbyte.integrations.source.redshift.RedshiftSource.main(RedshiftSource.java:136)
    2022-11-29 08:39:40 INFO i.a.w.t.TemporalAttemptExecution(get):134 - Stopping cancellation check scheduling...
    m
    • 2
    • 4
  • u

    404 Roy

    11/29/2022, 8:47 AM
    Hi guys, no code in the airbyte-workers module has been modified. VERSION=0.40.5 ./gradlew airbyte-workers:build rebuilds a local image version for use, but adding any destination causes an error. No error is reported when using the remote 0.40.6 image.
    error.log
    • 1
    • 1
  • n

    Nikolai Nergård

    11/29/2022, 8:47 AM
    Has anyone successfully installed Airbyte v0.42.0 in k8s with Helm, and with an external DB? The bootloader seems to get stuck and time out, and the deployment fails. Installing the default chart locally seems to work fine.
    e
    e
    • 3
    • 4
  • t

    thomas trividic

    11/29/2022, 9:02 AM
    Tip for the "geography" field bug:
  • t

    thomas trividic

    11/29/2022, 9:02 AM
    if I do an octavia import, it generates those 3 fields
    • 1
    • 1
  • t

    thomas trividic

    11/29/2022, 9:02 AM
    Copy code
    geography: auto
    schemaChange: no_change
    notifySchemaChanges: true
    nonBreakingChangesPreference: ignore
    • 1
    • 1
  • t

    thomas trividic

    11/29/2022, 9:02 AM
    Removing them from the YAML lets you do octavia apply without the bug (in my case).
    ✔️ 2
    • 1
    • 1
  • b

    Benedikt Buchert

    11/29/2022, 9:58 AM
    Just stumbled over a really annoying UX issue with the LinkedIn connector. When adding the account ID you need to press Enter to persist the ID in the source. I was just searching for the save button in the connector. But I got it working after a lot of trial and error.
    • 1
    • 3
  • b

    Bhavya Verma

    11/29/2022, 10:47 AM
    Hey guys, hope you all are doing well. I wanted to ask, since I'm not an expert on Kubernetes: is it possible to deploy Airbyte Open Source on Kubernetes on GCP? If yes, how should I go about doing it? Should I be using the GKE engine which GCP offers, and then finally have a redirect URL to it with a Google Sign-On authentication layer over it?
    t
    • 2
    • 10
  • a

    Andreas

    11/29/2022, 11:20 AM
    Hi! I'm looking for some help with this problem. Basically my syncs fail randomly because of "com.google.cloud.bigquery.BigQueryException: Already Exists: Dataset my-project:source_airbyte_check_stage_tmp". I didn't find anything regarding this error. If I manually sync the failing sources, they work flawlessly... https://discuss.airbyte.io/t/com-google-cloud-bigquery-bigqueryexception-already-exists-dataset-my-project-source-airbyte-check-stage-tmp/3302
    • 1
    • 1
  • n

    noobolte bawa

    11/29/2022, 11:20 AM
    Hi Team, is there a way I can pass a JSON body in a POST request in the Airbyte CDK with Python?
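    A minimal sketch of how this is commonly done with the Python CDK's HttpStream (the stream name, endpoint, and body fields below are made up): set http_method to "POST" and return the body from request_body_json.
    Copy code
    from typing import Any, Iterable, Mapping, Optional

    import requests
    from airbyte_cdk.sources.streams.http import HttpStream


    class ExampleStream(HttpStream):
        url_base = "https://api.example.com/"  # hypothetical API
        primary_key = "id"
        http_method = "POST"  # default is GET

        def path(self, **kwargs) -> str:
            return "search"

        def request_body_json(
            self,
            stream_state: Mapping[str, Any],
            stream_slice: Optional[Mapping[str, Any]] = None,
            next_page_token: Optional[Mapping[str, Any]] = None,
        ) -> Optional[Mapping[str, Any]]:
            # Whatever this returns is serialized as the JSON body of the POST request.
            return {"query": "*", "page": (next_page_token or {}).get("page", 1)}

        def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
            yield from response.json().get("results", [])

        def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
            return None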
    m
    • 2
    • 4
  • s

    Santosh

    11/29/2022, 12:09 PM
    Hey team, I want to know whether I can send multiple CSVs from an S3 bucket to different collections in Mongo using a single connection in Airbyte.
    • 1
    • 1
  • u

    데이브

    11/29/2022, 3:02 PM
    Hi, Team. I am working on ETL from AWS Aurora MySQL to AWS Aurora PostgreSQL. However, it takes 6 hours to store 10 million records. It's too slow. Is this a normal speed?
    h
    s
    • 3
    • 5
  • s

    Stuart Horgan

    11/29/2022, 3:29 PM
    I have deleted my whole local Airbyte folder and everything on Docker to start again from fresh. I just cloned the repo again. I checked the guide here https://airbyte.gitbook.io/airbyte/deploying-airbyte/local-deployment and see there is now an additional section about Mac M1 machines, which is what I have. When I follow those instructions, export the variables, and run the command
    VERSION=dev docker-compose up
    I get this error:
    Copy code
    Error response from daemon: manifest for airbyte/worker:dev not found: manifest unknown: manifest unknown
    Anyone know what is happening? Didn't think I would hit an error so instantly when trying to start a clean re-install
    s
    • 2
    • 4
  • b

    Brad Nemetski

    11/29/2022, 3:39 PM
    I'm trying to copy data from Excel/CSV to a Snowflake database. Within my files are datetime columns that are being stored in Snowflake as varchars. Is there some way for Airbyte to recognize that these are timestamps and save them as such?
    1. I can't fix this with dbt - we're planning on using the API to allow clients to create new tables, and if we have to insert a manual step in the middle the flow breaks down.
    2. I tried adding {"parse_dates": true} to the reader options to see if I could get pandas to recognize the dates. I've also tried to manually set the columns to datetimes, and that doesn't work either: {"dtype": {"trans_date": "datetime64"}}.
       a. This does work when I just try to read the file using pandas.read_csv(), i.e. in a Python interpreter and not through Airbyte.
    3. I tried dumping to Postgres instead of Snowflake to see if it was a destination issue, and it was a varchar there as well.
    Any ideas? Thanks
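    For what it's worth, a quick plain-pandas check (outside Airbyte) of the reader options above: parse_dates: true on its own only applies to the index, so the column has to be listed explicitly. Even then, whether the destination gets a timestamp still depends on the type the connector puts in the discovered schema, so this only isolates the pandas side. The column name trans_date is taken from the message; the inline CSV is made up.
    Copy code
    # Local reproduction of the reader options, independent of Airbyte.
    import io

    import pandas as pd

    csv = io.StringIO("id,trans_date\n1,2022-11-29\n2,2022-11-30\n")

    # parse_dates must name the column; {"parse_dates": true} alone only parses the index.
    df = pd.read_csv(csv, parse_dates=["trans_date"])
    print(df.dtypes)  # trans_date -> datetime64[ns]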
    m
    • 2
    • 9
  • c

    Coleman Kelleghan

    11/29/2022, 4:38 PM
    Hi Airbyte, we are looking to connect an Azure Blob Storage source to a Postgres destination. I understand that this connector is in development. Is it available in any way, in a beta stage? Thanks
    s
    • 2
    • 3
  • r

    Rahul Borse

    11/29/2022, 5:21 PM
    Hi All, while creating a connection between a Postgres source and an S3 destination, we pass a JSON schema for the selected tables. I need to pass an extra type or piece of information for a column, so that encryption is performed on the column data for columns carrying this extra information. Which class or file in the Airbyte Postgres source connector code do I need to change to achieve this? Can someone please help?
    • 1
    • 3
  • a

    Alexandre Chouraki

    11/29/2022, 6:00 PM
    Hi Airbyte! I'm trying to build a custom connector with the Python CDK, and I'm having some issues at the "check_connection" step. I've tried to emulate the OneSignal source connector implementation, as specified in the tutorial:
    Copy code
    def check_connection(self, logger, config) -> Tuple[bool, any]:
        try:
            args = self.convert_config2stream_args(config)
            stream = ExportPatients(**args)
            print(stream.auth.token)
            records = stream.read_records(sync_mode=SyncMode.full_refresh)
            next(records)
            return True, None
        except Exception as e:
            return False, e
    But when testing it, I get these logs:
    Copy code
    {"type": "LOG", "log": {"level": "ERROR", "message": "[{\"errorCode\":\"TOKEN_NOT_FOUND\",\"componentType\":\"AUTH\",\"message\":\"JWT Token is not present in request headers\",\"details\":{},\"date\":\"2022-11-29T17:34:52.907Z\"}]"}}
    {"type": "LOG", "log": {"level": "ERROR", "message": "Check failed"}}
    {"type": "CONNECTION_STATUS", "connectionStatus": {"status": "FAILED", "message": "HTTPError('401 Client Error:  for url: <https://api.live.welkincloud.io/{redacted}/{redacted}/export/PATIENT>')"}}
    Even though I do have a token set up in stream.auth, it's printing well, and doing
    Copy code
    tok = "the_token_that_was_printed"
    h = {
            "Authorization": "Bearer {}".format(tok)
        }
    r = requests.get(f"<https://api.live.welkincloud.io/{redacted}/{redacted}/export/PATIENT>", headers=h)
    in Python yields a 200 instead of a 401... I can't fathom why that won't work, if the token can be found... Is there a way to check the headers of a request from the stream object? I could definitely use some help!
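    Just a guess at one common cause (assuming ExportPatients is built on the CDK's HttpStream; the constructor and attribute names below are assumptions, not the connector's actual code): if the token only lives in stream.auth and is never passed to HttpStream as its authenticator, the CDK never attaches the Authorization header to the requests that read_records makes, even though the token itself prints fine. A sketch of the wiring, with the tenant/instance path segments left out:
    Copy code
    from airbyte_cdk.sources.streams.http import HttpStream
    from airbyte_cdk.sources.streams.http.requests_native_auth import TokenAuthenticator


    class ExportPatients(HttpStream):
        url_base = "https://api.live.welkincloud.io/"  # tenant/instance segments omitted
        primary_key = None

        def __init__(self, token: str, **kwargs):
            # Handing the authenticator to HttpStream is what makes the CDK add
            # "Authorization: Bearer <token>" to every request; keeping the token
            # only in self.auth does not affect outgoing headers.
            super().__init__(authenticator=TokenAuthenticator(token=token), **kwargs)

        def path(self, **kwargs) -> str:
            return "export/PATIENT"

        def parse_response(self, response, **kwargs):
            yield from response.json()  # assuming the endpoint returns a JSON array

        def next_page_token(self, response):
            return None
    If the token is wired in like this, printing TokenAuthenticator(token).get_auth_header() while debugging shows the exact header that will be added.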
    m
    n
    • 3
    • 11
  • g

    Grember Yohan

    11/29/2022, 6:21 PM
    My Stripe 'Payment Intents' stream is failing, and I don't understand why 🤔 Here are the logs (error of the first attempt before line 510). The state is never saved because the Stripe sync takes very long and I never manage to reach a successful state, so I always start over from the beginning, failure after failure. How can I fix my issue, or at least do incremental loads day by day that save the state regularly, without waiting for the whole replication to succeed? Thank you for your help! 🙏
    4d4a12d1_4adc_4e96_b384_90eb8238fd18_logs_2343_txt.txt
    m
    • 2
    • 6
  • k

    KalaSai

    11/29/2022, 6:33 PM
    Hello, is it possible to not use kube overlays and instead containerize everything in one YAML, which could then be kicked off as a service from a compute cluster? That way the end customer cannot SSH into our domain, but instead just builds, pushes, and executes Airbyte as a service. I am a beginner to both containerization and Airbyte and would appreciate help.
    • 1
    • 1