# ask-community-for-troubleshooting

    Łukasz Aszyk

    02/02/2023, 6:08 PM
    Hi, Airbyte community! I’m pretty new here, but I’m already trying the Airbyte API to make calls against a locally hosted instance. I was able to connect to the instance successfully, but I’m struggling with the
    <http://localhost:8000/api/v1/sources/create>
    endpoint. I know I need to pass credentials to the source (Google Sheets), but I can’t figure out which fields the request body requires. Does anyone know where to find the formatted JSON so I can easily pass the variables in? Appreciate all help 🙂 I got this response:
    Copy code
    "exceptionStack": [
            "io.airbyte.server.errors.BadObjectSchemaKnownException: The provided configuration does not fulfill the specification. Errors: json schema validation failed when comparing the data to the json schema. ",
            "Errors: $.spreadsheet_id: is missing but it is required, $.credentials: is missing but it is required ",
            "Schema: ",
            "{",
            "  \"type\" : \"object\",",
            "  \"title\" : \"Stripe Source Spec\",",
            "  \"$schema\" : \"<http://json-schema.org/draft-07/schema#>\",",
            "  \"required\" : [ \"spreadsheet_id\", \"credentials\" ],",
            "  \"properties\" : {",
            "    \"credentials\" : {",
            "      \"type\" : \"object\",",
            "      \"oneOf\" : [ {",
            "        \"type\" : \"object\",",
            "        \"title\" : \"Authenticate via Google (OAuth)\",",
            "        \"required\" : [ \"auth_type\", \"client_id\", \"client_secret\", \"refresh_token\" ],",
            "        \"properties\" : {",
            "          \"auth_type\" : {",
            "            \"type\" : \"string\",",
            "            \"const\" : \"Client\"",
            "          },",
            "          \"client_id\" : {",
            "            \"type\" : \"string\",",
            "            \"title\" : \"Client ID\",",
            "            \"description\" : \"Enter your Google application's Client ID\",",
            "            \"airbyte_secret\" : true",
            "          },",
            "          \"client_secret\" : {",
            "            \"type\" : \"string\",",
            "            \"title\" : \"Client Secret\",",
            "            \"description\" : \"Enter your Google application's Client Secret\",",
            "            \"airbyte_secret\" : true",
            "          },",
            "          \"refresh_token\" : {",
            "            \"type\" : \"string\",",
            "            \"title\" : \"Refresh Token\",",
            "            \"description\" : \"Enter your Google application's refresh token\",",
            "            \"airbyte_secret\" : true",
            "          }",
            "        }",
            "      }, {",
            "        \"type\" : \"object\",",
            "        \"title\" : \"Service Account Key Authentication\",",
            "        \"required\" : [ \"auth_type\", \"service_account_info\" ],",
            "        \"properties\" : {",
            "          \"auth_type\" : {",
            "            \"type\" : \"string\",",
            "            \"const\" : \"Service\"",
            "          },",
            "          \"service_account_info\" : {",
            "            \"type\" : \"string\",",
            "            \"title\" : \"Service Account Information.\",",
            "            \"examples\" : [ \"{ \\\"type\\\": \\\"service_account\\\", \\\"project_id\\\": YOUR_PROJECT_ID, \\\"private_key_id\\\": YOUR_PRIVATE_KEY, ... }\" ],",
            "            \"description\" : \"Enter your Google Cloud <a href=\\\"<https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys>\\\">service account key</a> in JSON format\",",
            "            \"airbyte_secret\" : true",
            "          }",
            "        }",
            "      } ],",
            "      \"title\" : \"Authentication\",",
            "      \"description\" : \"Credentials for connecting to the Google Sheets API\"",
            "    },",
            "    \"row_batch_size\" : {",
            "      \"type\" : \"integer\",",
            "      \"title\" : \"Row Batch Size\",",
            "      \"default\" : 200,",
            "      \"description\" : \"Number of rows fetched when making a Google Sheet API call. Defaults to 200.\"",
            "    },",
            "    \"spreadsheet_id\" : {",
            "      \"type\" : \"string\",",
            "      \"title\" : \"Spreadsheet Link\",",
            "      \"examples\" : [ \"<https://docs.google.com/spreadsheets/d/1hLd9Qqti3UyLXZB2aFfUWDT7BG-arw2xy4HR3D-dwUb/edit>\" ],",
            "      \"description\" : \"Enter the link to the Google spreadsheet you want to sync\"",
            "    }",
            "  },",
            "  \"additionalProperties\" : true",
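    For reference, a minimal sketch of one way to get that schema and build the create call, assuming the local configuration API also exposes source_definitions/list and source_definition_specifications/get (the IDs, auth, and config values below are placeholders):
    Copy code
    import requests

    API = "http://localhost:8000/api/v1"
    AUTH = ("airbyte", "password")  # default OSS basic auth; adjust or drop for your setup

    workspace_id = "YOUR_WORKSPACE_ID"              # placeholder
    source_definition_id = "GSHEETS_DEFINITION_ID"  # placeholder, e.g. from source_definitions/list

    # 1. Fetch the connector spec; connectionSpecification is the JSON schema the
    #    sources/create body has to satisfy (the same schema shown in the error above).
    spec = requests.post(
        f"{API}/source_definition_specifications/get",
        json={"sourceDefinitionId": source_definition_id, "workspaceId": workspace_id},
        auth=AUTH,
    ).json()
    print(spec["connectionSpecification"])

    # 2. Build connectionConfiguration to match that schema, then create the source.
    payload = {
        "name": "My Google Sheet",
        "workspaceId": workspace_id,
        "sourceDefinitionId": source_definition_id,
        "connectionConfiguration": {
            "spreadsheet_id": "https://docs.google.com/spreadsheets/d/.../edit",
            "credentials": {
                "auth_type": "Service",
                "service_account_info": "{ ...service account JSON... }",
            },
        },
    }
    print(requests.post(f"{API}/sources/create", json=payload, auth=AUTH).json())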

    Adrian Bakula

    02/02/2023, 6:09 PM
    👋 Hi there! I wanted to raise a concern with one of the more recent updates to the Snowflake Destination Connector, specifically version
    0.4.46
    (these changes), in particular this part:
    Copy code
    FULL_REFRESH will see partial data if fails part way through, this is being addressed in phase 2 with a table in destination that tracks when a sync starts and ends
    1. How can we tell on our side whether a full refresh was successful or not? We can have downstream dependencies that read from the raw data tables, but without checking via the Airbyte API whether the sync was successful for the connection that carries that table, there's no way for us to be sure. In cases where you have an orchestration tool this is fine, but there are cases where you read table contents without orchestration. 2. Relatedly, if we have tables of the form
    Full Refresh | Append
    , are we just going to have bad data in our raw tables forever? I suppose this is fine, but it puts the deduplication burden on our end to make sure we aren't ingesting the same data twice because we had a one-time failure. 3. I found the communication on this to be poor. The changelog on the connector website here just says
    Copy code
    0.4.46	2023-01-26	#20631	Added support for destination checkpointing with staging
    even though potential breaking changes were identified in the pull request and explicitly mentioned. This can have huge user impact and should be communicated explicitly. I happened to find it since I was digging through your blog post about release notes, but that shouldn't be where I find out about this kind of thing. Thanks all 😄
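    A rough sketch of the kind of pre-read check this implies today, assuming the configuration API's jobs/list endpoint can be filtered to a single connection (connection ID and auth are placeholders):
    Copy code
    import requests

    API = "http://localhost:8000/api/v1"
    AUTH = ("airbyte", "password")        # adjust for your deployment
    connection_id = "YOUR_CONNECTION_ID"  # placeholder

    # List sync jobs for the connection; the first entry is the most recent run.
    jobs = requests.post(
        f"{API}/jobs/list",
        json={"configTypes": ["sync"], "configId": connection_id},
        auth=AUTH,
    ).json()["jobs"]

    latest = jobs[0]["job"] if jobs else None
    if latest is None or latest["status"] != "succeeded":
        raise RuntimeError(f"Refusing to read the raw tables, last sync was: {latest}")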

    Sharath Chandra

    02/02/2023, 6:11 PM
    I am doing a Mixpanel to Redshift migration. Based on the logs, I see that records are being moved in batches of 1,000 (which is slow). How do I increase this number to make the data load faster?

    Łukasz Aszyk

    02/02/2023, 6:22 PM
    Let me reiterate 🙂 How do I get the JSON schema for the connection configuration to pass in the body when calling the
    <http://localhost:8000/api/v1/sources/create>
    endpoint?

    Lenin Mishra

    02/02/2023, 6:53 PM
    Hi folks, I keep getting this error while working with my custom connector.
    Copy code
    2023-02-02 18:48:43 ERROR c.n.s.JsonSchemaFactory(getSchema):395 - Failed to load json schema!

    Lenin Mishra

    02/02/2023, 6:54 PM
    Can anyone suggest how to fix this?

    Yeshwanth LN

    02/02/2023, 7:06 PM
    Hey folks, I'm facing a Spec job error while creating a custom connector on an Airbyte cluster deployed on k8s.
    The logs from the k8s cluster:
    java.lang.IllegalStateException: Get Spec job failed.
        at com.google.common.base.Preconditions.checkState(Preconditions.java:502) ~[guava-31.0.1-jre.jar:?]
        at io.airbyte.server.converters.SpecFetcher.getSpecFromJob(SpecFetcher.java:14) ~[io.airbyte-airbyte-server-0.39.42-alpha.jar:?]
        at io.airbyte.server.handlers.SourceDefinitionsHandler.getSpecForImage(SourceDefinitionsHandler.java:292) ~[io.airbyte-airbyte-server-0.39.42-alpha.jar:?]
        at ..................
    The complete issue is written up here: https://github.com/airbytehq/airbyte/issues/22336. Can anyone suggest how to fix this?

    Domenic

    02/02/2023, 8:18 PM
    Hi, I am using the OracleDB connector with
    Incremental - Append
    sync for a table. I am doing some testing and it is not picking up the newly inserted row in the table. Under
    Cursor field
    I selected the unique field from the table, thinking that it would use this field to register changes. Unfortunately, this didn't work and I am not sure how to make this happen. I also noticed that I cannot select the
    Primary Key
    field - this is disabled for
    Incremental - Append
    sync mode. What am I doing wrong? How can I get the connector to acknowledge the added row? Ultimately, I am trying to use this for CDC.
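    For context, cursor-based incremental syncs generally remember the highest cursor value seen and, on the next run, only pull rows whose cursor is greater, so the cursor column needs to grow as rows are added or changed (e.g. an increasing ID or an updated-at timestamp). A toy sketch of that idea, with made-up data:
    Copy code
    # Toy illustration of cursor-based incremental reads (not Airbyte's actual code).
    rows = [
        {"id": 1, "updated_at": "2023-02-01T10:00:00"},
        {"id": 2, "updated_at": "2023-02-02T09:30:00"},
        {"id": 3, "updated_at": "2023-02-02T18:00:00"},  # row inserted after the last sync
    ]

    def incremental_read(rows, cursor_field, state):
        last = state.get("cursor")
        new_rows = [r for r in rows if last is None or r[cursor_field] > last]
        if new_rows:
            state["cursor"] = max(r[cursor_field] for r in new_rows)  # persisted for the next run
        return new_rows, state

    state = {"cursor": "2023-02-02T09:30:00"}  # saved after the previous sync
    new_rows, state = incremental_read(rows, "updated_at", state)
    print(new_rows)  # only rows whose cursor is greater than the saved value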

    Michael Yee

    02/02/2023, 8:57 PM
    Hello, I'm syncing a MySQL database and all but one table is successful. Here is a snippet of the error:
    Copy code
    2023-02-02 06:51:40 [32mINFO[m i.a.w.g.DefaultReplicationWorker(getReplicationOutput):495 - failures: [ {
      "failureOrigin" : "destination",
      "failureType" : "system_error",
      "internalMessage" : "tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro: Could not evaluate union, field p_count is expected to be one of these: NULL, INT. If this is a complex type, check if offending field (path: p_count) adheres to schema: 2840014740",
      "externalMessage" : "Something went wrong in the connector. See the logs for more details.",
      "metadata" : {
        "attemptNumber" : 2,
        "jobId" : 65,
        "from_trace_message" : true,
        "connector_command" : "write"
      },
    The p_count data type is int(10) unsigned, so 2840014740 is less than its max value of 4294967295... I am thinking that the connector is converting it to a signed int, whose max value is 2147483647.
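    A quick check of the ranges supports that theory (plain arithmetic, nothing Airbyte-specific):
    Copy code
    value = 2_840_014_740

    signed_32_max = 2**31 - 1    # 2_147_483_647: Avro "int" is a signed 32-bit value
    unsigned_32_max = 2**32 - 1  # 4_294_967_295: MySQL int(10) unsigned upper bound

    print(value <= unsigned_32_max)  # True  -> fits the MySQL column
    print(value <= signed_32_max)    # False -> overflows a signed 32-bit int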

    Annika Maybin

    02/02/2023, 9:22 PM
    Hello, I am using incremental sync + deduped history for a couple of tables syncing from MySQL to Redshift. Syncing for the first time results in the right number of entries, but syncing a second time OR syncing any other table results in almost all entries being duplicated. I can overwrite the data by doing a Full Refresh | Overwrite, but changing it back to incremental sync and syncing a couple of times results in ALL my incremental final tables having duplicate entries. The SCD table contains the two different records from syncing twice, correctly marking one of them as __airbyte__active_row = 1, but the final table has the record with __airbyte__active_row = 1 twice. The _airbyte_unique_key column is NULL for one of the two records in the final table. Any ideas? Thanks. Airbyte version 0.40.17, all connectors updated to the latest version.

    Uria Franko

    02/02/2023, 10:25 PM
    Hey guys, has anyone used Airbyte Cloud for an embedded solution?

    Ryan Watts

    02/02/2023, 11:14 PM
    Hello everyone, I would like to know if there is a configuration in Airbyte that allows sending exported source data to the destination sooner during a sync. For example, I notice in the logs that the job has collected, let's say, 1,000,000 records. However, when I check my destination, it still shows 0 records inserted until the final step of that job. I'd like to ship the source records in batches of 10,000 or any arbitrary number. Is there a way to force Airbyte to do this?

    Slackbot

    02/03/2023, 3:44 AM
    This message was deleted.

    Dominik Mall

    02/03/2023, 3:49 AM
    Hello! I’m trying to update my Helm chart version (to get a newer version of Airbyte with the connector builder UI). I’m currently using chart 0.40.40; when I try to update the version to 0.43.24 (which appears to be the latest), the temporal deployment goes into a CrashLoopBackOff because it can’t connect to my external Postgres database. I tried a few versions below that and get the same error. I guess somewhere between those versions something changed, but I have no idea how to find that out short of trying every single Helm chart version to narrow it down. _Logs from the temporal container_:
    Copy code
    + temporal-sql-tool --plugin postgres --ep redacted-ip -u redacted-username -p 5432 create --db temporal
    2023-02-02T13:20:51.392Z	ERROR	Unable to create SQL database.	{"error": "unable to connect to DB, tried default DB names: postgres,defaultdb, errors: [pq: password authentication failed for user \"redacted-username\" pq: password authentication failed for user \"redacted-username\"]", "logging-call-at": "handler.go:97"}
    Semi-related: is there somewhere I can see which helm chart version is associated with which application version?

    Aditya Raval

    02/03/2023, 5:56 AM
    👋 Hello, team! I want to sync the data from a complex JOIN query from PostgreSQL to Redshift on a specific interval (i.e. every 15 mins). Does Airbyte support such behaviour (same as a Logstash ETL pipeline)?

    Gowrav Tata

    02/03/2023, 6:40 AM
    I'm trying to make a connection with MongoDB as the source and I get the following error:
    State code: -3; Message: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=10.46.56.235:27000, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketWriteException: Exception sending message}, caused by {javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}, caused by {sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target}}]
    Can someone please let me know how to resolve this.
    mongo db error log.txt

    김건희

    02/03/2023, 7:25 AM
    Hi! I'm using Airbyte open-source on EKS. I think I need to understand Airbyte's internal logic. Is there any document that describes Airbyte's internals? Thank you for your effort in making Airbyte.

    Oliver Meyer

    02/03/2023, 9:26 AM
    Hi, I'm creating a new source using the low-code CDK and having trouble getting the tests to pass. PR is here. Would appreciate any help 🙂

    Ouss

    02/03/2023, 10:17 AM
    Hello everyone, I am trying to create a Shopify source using the password method and Airbyte 0.40.32 deployed on Kubernetes, but I am encountering a
    non-json response
    error while establishing the connection with Shopify. It seems the problem should have been fixed in version 0.40.26: I am seeing the same error on the console as reported by @Myroslav Tkachenko here https://airbytehq.slack.com/archives/C021JANJ6TY/p1671053701798289

    Joey Taleño

    02/03/2023, 10:26 AM
    Hi Team, Anyone tried using Toggl? I tried today and I'm only getting one record in TIME_ENTRIES 😅

    Joviano Cicero Costa Junior

    02/03/2023, 1:11 PM
    Hey people, I am looking at a situation where a CDC connection has no new data to load. It looks like, when there is no new data to fetch, the process waits a few minutes for something to arrive, and ends up taking more time than a run that actually has a few records to load. Is it possible to configure this wait? I need to build a process that replicates data in near real time. A few records can be replicated in a little over 1 minute, but if there are no records it takes at least 5 minutes. I am thinking of scheduling it to run every minute, and I expected that when there is no data the process would finish faster than when there are records.

    Mauro Veneziano

    02/03/2023, 1:13 PM
    Hello everyone! I'm making my first contribution (to Airbyte and to open source in general; never done it before 😄). I opened the pull request, but I'm not sure if I need to do something else 🤔

    Uria Franko

    02/03/2023, 2:27 PM
    Hey guys, has anyone used Airbyte for ETL and another tool to push data to CRMs?

    Christo Olivier

    02/03/2023, 2:56 PM
    Hey everyone. I am currently trying to do a PoC where we pull data from Facebook's Pages API using Airbyte. We keep getting the following error from the API
    Copy code
    {
      "error": {
        "message": "Unsupported request - method type: get",
        "type": "GraphMethodException",
        "code": 100,
        "fbtrace_id": "AwHuSPxbkEDa9B4cpSU1wAR"
      }
    }
    Has anyone run into this before? I see people online mentioning that a potential fix is getting Business Approval on Facebook for your business and app. We are using the latest version of Airbyte Open Source.

    Leon Graf

    02/03/2023, 3:22 PM
    Hi everyone, we are encountering a problem when trying to normalize the data we pulled from our MSSQL 2019 DB in order to load it into our MSSQL 2019 DB. We get the error:
    Copy code
    Encountered an error:
    Database Error
      ('HYT00', '[HYT00] [unixODBC][Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)'),retryable=<null>,timestamp=1675434737625], io.airbyte.config.FailureReason@64441cbe[failureOrigin=normalization,failureType=system_error,internalMessage=('HYT00', '[HYT00] [unixODBC][Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)'),externalMessage=Normalization failed during the dbt run. This may indicate a problem with the data itself.,metadata=io.airbyte.config.Metadata@1ebd324a[additionalProperties={attemptNumber=2, jobId=923, from_trace_message=true}],stacktrace=AirbyteDbtError: 
    Encountered an error:
    Database Error
      ('HYT00', '[HYT00] [unixODBC][Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)'),retryable=<null>,timestamp=1675434737625]]]
    We are running Airbyte version 0.40.18 on our own Ubuntu 22.04.1 server, with version 0.4.28 of the MSSQL source connector and 0.1.22 of the MSSQL destination. Any ideas what could be the problem and how to fix it?

    Akash Ghadge

    02/03/2023, 3:25 PM
    Hi Team, I was able to create a custom connector for an HTTP API using the connector development kit for Python. I have executed the commands below for the connector: • python main.py spec • python main.py check --config secrets/config.json • python main.py discover --config secrets/config.json

    Akash Ghadge

    02/03/2023, 3:30 PM
    Hi Team, I was able to create a custom connector for an HTTP API using the connector development kit for Python. I have executed the commands below for the connector: • python main.py spec • python main.py check --config secrets/config.json • python main.py discover --config secrets/config.json These work properly and give me the desired output, but when I execute the read command
    Copy code
    python main.py read --config secrets/config.json --catalog sample_files/configured_catalog.json
    I am getting data back from the API, but at the end of the response I get
    failure_type: system_error
    . I have attached a screenshot of the same. If anyone has any idea about this, please let me know; I am happy to try anything. Thank you.
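    For context, the read command goes through the same entrypoint as spec/check/discover; a typical Python CDK main.py looks roughly like the sketch below (the connector class name is a placeholder). A trailing failure_type: system_error trace usually means an uncaught exception was raised in the source while reading, so the stack trace above it in the output is the place to look.
    Copy code
    # Typical Python CDK entrypoint (main.py); SourceMyApi is a placeholder name.
    import sys

    from airbyte_cdk.entrypoint import launch
    from source_my_api import SourceMyApi

    if __name__ == "__main__":
        # launch() handles spec/check/discover/read and writes Airbyte messages
        # (records, state, and trace messages such as failure_type: system_error) to stdout.
        launch(SourceMyApi(), sys.argv[1:])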

    Rocky Appiah

    02/03/2023, 4:11 PM
    When trying to do an initial sync from a Postgres source, I get
    Failed to fetch schema. Please try again .. Error: non-json response
    in the UI. Error in the logs:
    Copy code
    airbyte-server                      | Feb 03, 2023 4:10:14 PM org.glassfish.jersey.server.ServerRuntime$Responder writeResponse
    airbyte-server                      | SEVERE: An I/O error has occurred while writing a response message entity to the container output stream.
    airbyte-server                      | org.glassfish.jersey.server.internal.process.MappableException: org.eclipse.jetty.io.EofException
    airbyte-server                      |   at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:67)
    airbyte-server                      |   at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
    airbyte-server                      |   at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1116)
    airbyte-server                      |   at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:638)
    airbyte-server                      |   at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:371)
    airbyte-server                      |   at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:361)
    airbyte-server                      |   at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256)
    Running • Airbyte 0.40.26 • Postgres 1.0.42

    Chris

    02/03/2023, 4:19 PM
    I am running Airbyte Open Source on GCP (Google Compute Engine). I am using Bing Ads as a source and BigQuery as a destination. When I try to sync I get this error:
    Failure Origin: destination, Message: The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: null; S3 Extended Request ID: null; Proxy: null)
    I am using Google, so why is the error talking about Amazon S3?
    error_log.txt

    Kevin Noguera

    02/03/2023, 4:25 PM
    I'm really enjoying Airbyte, but is memory consumption a common issue? I tried to sync 3-4 streams that total around 4 GB of data, and it makes my AWS instance crash/hang every time. I'm running a t2.large instance as recommended in the documentation; is this normal?