# ask-community-for-troubleshooting
  • d

    dandpz

    11/15/2022, 10:34 AM
    Hi everyone, I am using the Amazon Ads connector and I was wondering how to get the query field from the sponsored_product_keywords report. According to the Amazon docs it should be a field added by default when using the param "segment": "query", but I cannot find it in the downloaded data. Should a new report stream be added with this kind of parameter? Thanks in advance 🙂
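    For reference, a minimal sketch of requesting the segmented keyword report directly, assuming the v2 Sponsored Products reporting endpoint; the host, header values, and metric list below are placeholders/assumptions to verify against the Amazon Ads docs:

    import requests

    # Hypothetical placeholders - substitute real credentials and profile ID.
    HEADERS = {
        "Amazon-Advertising-API-ClientId": "<client_id>",
        "Authorization": "Bearer <access_token>",
        "Amazon-Advertising-API-Scope": "<profile_id>",
        "Content-Type": "application/json",
    }

    # Requesting the keywords report with segment=query should add a "query"
    # column to each report row, per the Amazon Ads v2 reporting docs.
    body = {
        "reportDate": "20221114",
        "segment": "query",
        "metrics": "impressions,clicks,cost",
    }

    resp = requests.post(
        "https://advertising-api.amazon.com/v2/sp/keywords/report",
        headers=HEADERS,
        json=body,
    )
    print(resp.status_code, resp.json())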
  • k

    komal azram

    11/15/2022, 10:38 AM
    I am trying to make a connection between GCS and Snowflake. Both connections are working, but when I transfer data it fails and gives an error. I have attached the error log file.
    logs-6580.txt
  • n

    navod perera

    11/15/2022, 12:21 PM
    Hello Team Airbyte, we integrated WooCommerce with Airbyte. The problem is that when we add a correct WooCommerce shop URL but wrong secret keys (consumer key and consumer secret), the connection check still succeeds. I need the Airbyte and WooCommerce integration to succeed only if the user's shop URL and secret keys (consumer key and consumer secret) are valid. Thanks in advance.
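    A minimal sketch of the kind of stricter connection check that would reject bad keys, assuming the Airbyte Python CDK's check_connection hook and the standard WooCommerce REST API; the endpoint path and config field names are assumptions:

    from typing import Any, Mapping, Tuple

    import requests


    def check_connection(logger, config: Mapping[str, Any]) -> Tuple[bool, Any]:
        # Hit an authenticated endpoint so wrong keys fail instead of silently passing.
        url = f"{config['shop_url'].rstrip('/')}/wp-json/wc/v3/system_status"
        resp = requests.get(
            url,
            auth=(config["consumer_key"], config["consumer_secret"]),
            timeout=30,
        )
        if resp.status_code == 200:
            return True, None
        return False, f"WooCommerce rejected the credentials: {resp.status_code} {resp.text[:200]}"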
  • b

    Berzan Yildiz

    11/15/2022, 12:28 PM
    I keep getting
    AssertionError: Mismatched number of tables 190 vs 6 being resolved
    for my custom source connector syncing to Postgres. This occurs during normalization. I am sure my schema is fine. What does this error mean?
  • t

    thomas trividic

    11/15/2022, 1:15 PM
    hello, we have a problem with our Airbyte deployment on Kubernetes
  • t

    thomas trividic

    11/15/2022, 1:15 PM
    we are having difficulty exposing the webapp
  • t

    thomas trividic

    11/15/2022, 1:15 PM
    in our internal AWS network
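    One possible direction, sketched under assumptions: a Kubernetes Service of type LoadBalancer in front of the webapp, annotated so AWS provisions an internal (not internet-facing) load balancer. The service name, selector labels, and ports are assumptions and would need to match the actual Airbyte deployment:

    apiVersion: v1
    kind: Service
    metadata:
      name: airbyte-webapp-internal
      annotations:
        # Ask the AWS cloud provider for an internal load balancer only.
        service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    spec:
      type: LoadBalancer
      selector:
        app.kubernetes.io/name: webapp   # assumption: label used by the Airbyte chart
      ports:
        - port: 80
          targetPort: 80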
  • d

    Dave Tomkinson

    11/15/2022, 1:20 PM
    Hi all. I'm trying to understand the airbyte UI (0.40.18) and troubleshoot a sync. When my job completes I get a '`Sync Succeeded 10,000,000 emitted records | 10,000,000 committed records`' message. But when I do a count in the raw table I only have 9,986,326 rows. I found a line in the log which says
    A total of 13674 record(s) of data from stream AirbyteStreamNameNamespacePair{name='events_196', namespace='analytics_raw'} were invalid and were ignored.
    My sync is a raw sync with no normalisation, from Postgres (RDS) to Redshift Serverless (using destination-redshift 0.3.51), going direct (not via S3). How do I figure out why those rows are invalid, as all rows are required? (This was a test copy of 10M rows from a 1.7B-row db.) Why does the UI say it's committed 10,000,000 records when it hasn't?
  • s

    Savio Lucena

    11/15/2022, 2:24 PM
    Does Airbyte on K8s provide an entrypoint that lets us define additional environment variables for the JOB_MAIN_ container in a worker?
  • r

    Rytis Zolubas

    11/15/2022, 2:41 PM
    Hello! Is it possible to use the API without the webapp running? How could that be done? Maybe I could expose a certain port from the server?
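    A minimal sketch of calling the server's configuration API directly, assuming the airbyte-server container is reachable and still listens on its default internal port (8001 in the docker-compose setup); the host name is a placeholder:

    import requests

    # Talk to airbyte-server directly; the webapp is only a UI in front of this API.
    API = "http://<airbyte-server-host>:8001/api/v1"

    resp = requests.post(f"{API}/workspaces/list", json={})
    resp.raise_for_status()
    print(resp.json())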
  • p

    Paulo Singaretti

    11/15/2022, 4:02 PM
    Hello! I'm starting my journey with Airbyte, developing a pipeline that collects data from Postgres (CDC), sends it to Kafka, and then sends that data to S3 or Kinesis, but I'm facing errors with whichever of these destinations I choose. S3:
    ERROR i.a.i.b.AirbyteExceptionHandler(uncaughtException):26 - Something went wrong in the connector. See the logs for more details.
    2022-11-15 15:59:03 INFO i.a.w.i.DefaultAirbyteStreamFactory(parseJson):78 - java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
    Kinesis:
    ERROR i.a.i.b.AirbyteExceptionHandler(uncaughtException):26 - Something went wrong in the connector. See the logs for more details.
    2022-11-15 15:59:41 INFO i.a.w.i.DefaultAirbyteStreamFactory(parseJson):78 - java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
    Do you guys have any idea what I'm doing wrong? I guess it's something on the Kafka side, since the error is the same for both.
  • g

    Gergely Lendvai

    11/15/2022, 4:04 PM
    Hi all, we have a Hubspot -> S3 connector with the following configs and we'd like to understand why it takes so long to run a sync and whether it can be sped up in any way. For the deployment we are using the Helm chart with the following resource settings for jobs (this is not reflected in the destination definition, which is weird; however, the source-* and destination-* pods are using these limits):
    global:
      jobs:
        resources:
          requests:
            cpu: "200m"
            memory: "4Gi"
          limits:
            cpu: "200m"
            memory: "4Gi"
    Airbyte version: 0.40.17
    Source definition:
    {
      "sourceDefinitionId": "36c891d9-4bd9-43ac-bad2-10e12756272c",
      "name": "HubSpot",
      "dockerRepository": "airbyte/source-hubspot",
      "dockerImageTag": "0.2.3",
      "documentationUrl": "<https://docs.airbyte.io/integrations/sources/hubspot>",
      "protocolVersion": "0.2.0",
      "releaseStage": "generally_available"
    }
    Destination definition:
    {
      "destinationDefinitionId": "4816b78f-1489-44c1-9060-4b19d5fa9362",
      "name": "S3",
      "dockerRepository": "airbyte/destination-s3",
      "dockerImageTag": "0.3.17",
      "documentationUrl": "<https://docs.airbyte.com/integrations/destinations/s3>",
      "protocolVersion": "0.2.0",
      "releaseStage": "generally_available",
      "resourceRequirements": {
        "jobSpecific": [
          {
            "jobType": "sync",
            "resourceRequirements": {
              "memory_request": "1Gi",
              "memory_limit": "1Gi"
            }
          }
        ]
      }
    }
    Source:
    {
      "sourceDefinitionId": "36c891d9-4bd9-43ac-bad2-10e12756272c",
      "sourceId": "457f3db8-6ce1-41be-9ecb-7ef9a724c88b",
      "workspaceId": "2b94a777-1e5e-4381-af9f-21582ecce5c7",
      "connectionConfiguration": {
        "start_date": "2022-11-15T12:00:00Z",
        "credentials": {
          "access_token": "**********",
          "credentials_title": "Private App Credentials"
        }
      },
      "name": "hubspot_test",
      "sourceName": "HubSpot"
    }
    Destination:
    {
      "destinationDefinitionId": "4816b78f-1489-44c1-9060-4b19d5fa9362",
      "destinationId": "033a010b-ff8e-4eb3-9ee6-6505a6c42d00",
      "workspaceId": "2b94a777-1e5e-4381-af9f-21582ecce5c7",
      "connectionConfiguration": {
        "format": {
          "compression": {
            "compression_type": "No Compression"
          },
          "format_type": "JSONL"
        },
        "s3_endpoint": "",
        "access_key_id": "**********",
        "s3_bucket_name": "****",
        "s3_bucket_path": "****",
        "s3_bucket_region": "****",
        "secret_access_key": "**********"
      },
      "name": "hubspot_s3",
      "destinationName": "S3"
    }
    Do you know what can cause pulling only ~3 MB of data to take ~3 hours? Also, do you have any recommendations on how to handle this? Many thanks 🙏
  • k

    Kaan Murzoğlu

    11/15/2022, 4:43 PM
    Hello everyone! I'm trying to insert data into ClickHouse from a nested Mongo JSON document. My question is: is it possible to add a column to the base table as "documents.drivingLicence" instead of creating a separate "accounts_documents" table?
    accounts
    
    {
        "_id" : ObjectId("xxxx"),
        "clientId" : "xxxxx",
        "areaCode" : "xx",
        "gsm" : "xx",
        "status" : "approved",
        "createdAt" : ISODate("2022-05-26T15:35:44.113+0000"),
        "updatedAt" : ISODate("2022-06-27T10:22:07.959+0000"),
        "document" : {
            "drivingLicence" : "approved",
            "video" : "approved"
        }
    }
  • f

    Felipe Cosse

    11/15/2022, 6:13 PM
    Hello everyone! I'm having a problem passing data from MySQL (AWS Aurora) to S3 (AWS). When a table has a field with the TIME type, an error occurs when reading the PARQUET file. Here's the error:
    Unable to create Parquet converter for data type "timestamp" whose Parquet type is optional int64 member0 (TIME(MICROS,true))
    The field is mapped as Struct and a dictionary is created with the timestamp and the string that would be the timezone.
    {
      "expire_timeofday": {
        "member0": "timestamp",
        "member1": "string"
      }
    }
    I tried to convert the field to String, but there is an error in the conversion. Wouldn't it be possible to select the type of the field to be saved in the destination?
  • a

    Alexander Govgel

    11/15/2022, 6:32 PM
    Hi guys! The integer data type from a Postgres source gets defined as Number. Is there a way to get the integer data type through as Integer?
  • j

    Jeff De Los Reyes

    11/15/2022, 6:36 PM
    Hi all, just a question on DB replication, i.e. one Postgres source to another Postgres destination. Suppose I have a connection set up to run nightly: if I change the source schema by adding new columns to a table or adding a new table to the database, will I have to refresh the source, or can the connection detect this change?
  • m

    Manish Tomar

    11/15/2022, 7:36 PM
    How can we create Airbyte sources/destinations in bulk using the Airbyte API to automate things?
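    A minimal sketch of bulk source creation against the configuration API, assuming the server is reachable on its API port and that each entry supplies a valid sourceDefinitionId and connectionConfiguration; the host, workspace ID, and definition IDs below are placeholders:

    import requests

    API = "http://<airbyte-server-host>:8001/api/v1"
    WORKSPACE_ID = "<workspace-id>"

    # Each entry: a name, the connector's sourceDefinitionId, and its config.
    sources_to_create = [
        {"name": "postgres_prod", "sourceDefinitionId": "<postgres-def-id>", "connectionConfiguration": {}},  # fill connector-specific config
        {"name": "hubspot_main", "sourceDefinitionId": "<hubspot-def-id>", "connectionConfiguration": {}},    # fill connector-specific config
    ]

    for src in sources_to_create:
        resp = requests.post(f"{API}/sources/create", json={**src, "workspaceId": WORKSPACE_ID})
        resp.raise_for_status()
        print("created", resp.json()["sourceId"])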
  • j

    Jonathan Cachat PhD (JC)

    11/15/2022, 8:55 PM
    Is anyone here aware of how to properly set up a Facebook Marketing API insights call for a manager account? My group is a marketing manager and we oversee social ads for a few hundred businesses. So business_id=#####-me-##### and that manager account has access to 250 account_ids. I want to pull the insights reports for all accounts, so rather than use account_id for a single customer, I'd like to use my group's ID. However, whenever I try to use the business_id or act_#### IDs, it comes back with FacebookAPIException('Error: 100, (#100) Missing permissions')
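    For what it's worth, the Marketing API generally wants insights requested per ad account rather than at the business level; a minimal sketch looping over the child ad accounts with the facebook_business SDK, where the token, account IDs, and chosen fields are placeholders/assumptions:

    from facebook_business.api import FacebookAdsApi
    from facebook_business.adobjects.adaccount import AdAccount
    from facebook_business.adobjects.adsinsights import AdsInsights

    FacebookAdsApi.init(access_token="<system-user-or-user-token>")

    # The ad account IDs the manager/business has access to (placeholders).
    account_ids = ["act_111", "act_222"]

    for account_id in account_ids:
        insights = AdAccount(account_id).get_insights(
            fields=[AdsInsights.Field.account_id, AdsInsights.Field.spend, AdsInsights.Field.impressions],
            params={"date_preset": "yesterday", "level": "account"},
        )
        for row in insights:
            print(row)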
  • j

    Jonathan Cachat PhD (JC)

    11/15/2022, 9:18 PM
    If you can only call one ACCOUNT_ID via the Facebook API, is there a way to call a custom report? I made the custom report I'm hoping to download in the report builder, and I have a report_id for it. Is there a Graph URL that I can drop the report_id into that will give me back a flat data file?
  • r

    Rahul Borse

    11/15/2022, 10:56 PM
    Hi all, is there any way I can fork only the base-java-s3 module? When I try the fork option for base-java-s3, it forks the entire airbyte repository. Can someone please help?
  • a

    Abdi Darmawan

    11/16/2022, 2:06 AM
    hi all, how do I make the orchestrator-norm-job-xxx pods run on a specific node pool? I have already set JOB_KUBE_NODE_SELECTORS: pool-env=production-airbyte in the Kubernetes ConfigMap, but the orchestrator-norm-job-xxx pods are still running on a random node pool.
  • b

    Benen Cahill

    11/16/2022, 3:16 AM
    Hi folks, I'm running a Mixpanel connector on an open source instance but am unable to successfully sync even a day's worth of data. It seems the Mixpanel connector only retrieves 1,000 rows per request, which combined with Mixpanel's API request limits means we can never reach the throughput needed to catch up with the number of events flowing through our Mixpanel project. Is there any way to increase that row size up from 1,000 rows at a time?
  • m

    Mukul Gopinath

    11/16/2022, 7:26 AM
    Hey Team, I'm running Airbyte on EKS and 3 pods keep running into the Pending state with the same event:
    Warning  FailedScheduling  35s   default-scheduler  0/1 nodes are available: 1 node(s) had volume node affinity conflict.
    It gets fixed when I resize the
    airbyte-volume-configs
    persistent volume. Initially from 500Mi to 2Gi and then later pulled it to 20Gi too. Still facing this issue. Is there a suggestive volume that needs to be configured? Or is there a way to reclaim the volume if this is temporary? https://discuss.airbyte.io/t/eks-pods-running-into-pending-state-due-to-pv/3211
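    A "volume node affinity conflict" usually means the persistent volume is pinned to one availability zone while the pod gets scheduled in another, so resizing alone may not help. One common mitigation, sketched under the assumption that the cluster uses the EBS CSI driver, is a storage class that delays volume binding until the pod is scheduled; the class name is a placeholder:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: gp3-wait-for-consumer    # assumption: the name referenced by the Airbyte PVCs
    provisioner: ebs.csi.aws.com
    volumeBindingMode: WaitForFirstConsumer   # bind the volume in the zone the pod lands in
    parameters:
      type: gp3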
  • b

    Berzan Yildiz

    11/16/2022, 7:36 AM
    Is there a way for the raw and tmp directories to be cleaned up after sync?
  • g

    Gergely Imreh

    11/16/2022, 8:55 AM
    Hi! I was doing some connector development work, and added the custom connector both as a “dev” version (so I can iterate on it) and as a “regular” one (so my live pipelines would use that connector). Now I’d like to remove the “dev” version from the available connectors. Is that possible in any way? I didn’t see anything in the UI. Thanks!
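    In case it helps, a minimal sketch of removing a connector definition through the configuration API rather than the UI, assuming the instance exposes the source_definitions list/delete endpoints; the host and connector name below are placeholders:

    import requests

    API = "http://<airbyte-server-host>:8001/api/v1"

    # Find the dev definition's ID first, then delete that definition.
    defs = requests.post(f"{API}/source_definitions/list", json={}).json()["sourceDefinitions"]
    dev_def = next(d for d in defs if d["name"] == "<my-dev-connector-name>")

    requests.post(
        f"{API}/source_definitions/delete",
        json={"sourceDefinitionId": dev_def["sourceDefinitionId"]},
    ).raise_for_status()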
  • r

    Rahul Borse

    11/16/2022, 9:09 AM
    Hi all, I am using Java 17.0.5 and Gradle 6.9.3, and when I try to run a Gradle build of Airbyte I get an "Unsupported class file major version 61" error. Can someone help me with this? Do I need to change the Java version to a specific version?
  • v

    Vikas Goswami

    11/16/2022, 10:17 AM
    Hi all, I am trying to set up Airbyte on an AWS EKS cluster and I want to use S3 as the log location. I have configured AWS IAM user credentials, the log bucket name, and the region, and I removed MinIO because I am going to use S3 for storing logs. But because of this the airbyte-worker does not come up, whereas when I go with the default MinIO deployment everything works fine. I followed the Airbyte documentation but it doesn't work for me. I don't know what I am missing; if anyone can help me that would be great.
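    For comparison, a sketch of the worker/server environment variables the docs describe for S3 logging, with placeholder values; whether the chart renders exactly these depends on the chart version, so treat the names as something to verify:

    # Assumed env for airbyte-worker / airbyte-server when logging to S3
    S3_LOG_BUCKET: "my-airbyte-logs"
    S3_LOG_BUCKET_REGION: "eu-west-1"
    AWS_ACCESS_KEY_ID: "<iam-access-key>"
    AWS_SECRET_ACCESS_KEY: "<iam-secret-key>"
    S3_MINIO_ENDPOINT: ""        # must stay empty when using real S3
    S3_PATH_STYLE_ACCESS: ""     # must stay empty when using real S3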
  • m

    Monika Bednarz

    11/16/2022, 10:19 AM
    Hi Team! I was trying to set up the Adjust connector (alpha) and the below error pops up no matter the setup of the source. The equivalent API calls succeed 🤔 The issue occurs when fetching the schema of the source. Screenshot below 🔽 I'd be grateful for any insight! We've got the newest version of Airbyte.
  • k

    Karan

    11/16/2022, 1:20 PM
    Hello Team, I have a GCP managed instance where I have installed Airbyte. But when I try connecting it to a managed GCP Postgres instance, it doesn't connect; it simply gives a non-JSON response. However, the same connection works when I try connecting via a local Docker instance. Why is this happening and what is the workaround?
  • l

    Leonardo de Almeida

    11/16/2022, 2:09 PM
    Hi guys, I'm having an issue with Airbyte v0.40.14 on Kubernetes: when I try to test a Postgres source, sometimes the rce-postgres-check pod is not created and sometimes it is. Anyone else having this issue?