# advice-data-ingestion
  • r

    Ramon Vermeulen

    08/26/2022, 12:49 PM
    I'm getting a lot of
    Terminating due to java.lang.OutOfMemoryError: Java heap space
on connections (including custom connectors) lately. I'm running Airbyte within a GKE cluster, and I was wondering if there is anything I can do to solve this. I'm using the Helm chart deployment. Update: eventually, editing the limits/requests of the "job" deployment in the Helm chart solved the issue for me.
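For context on the fix described above, this is a sketch of the kind of Helm values override it refers to. The exact key names are an assumption and vary between Airbyte chart versions, so check your chart's values.yaml before copying:

```yaml
# Assumed values.yaml override for the Airbyte Helm chart: raise the
# memory requests/limits of the pods that run sync jobs. Key names
# differ between chart versions; verify against your chart.
global:
  jobs:
    resources:
      requests:
        memory: 1Gi
      limits:
        memory: 2Gi
```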
  • r

    rea peleg

    08/26/2022, 6:38 PM
Does Airbyte support direct ingestion from an on-prem SQL Server to Snowflake (without an explicit intermediate step of files staged on some lake storage)? Also, this is not a one-time migration but an hourly or so incremental load.
  • r

    Rocky Appiah

    08/29/2022, 1:00 PM
My Airbyte instance became unstable; I looked into it and we had run out of disk space. airbyte_workspace is taking up almost 50 GB of disk.
  • r

    Rocky Appiah

    08/29/2022, 1:00 PM
Can I clear this volume, or is it needed?
  • r

    Rocky Appiah

    08/29/2022, 7:57 PM
How do I clear old logs? When I run
docker exec -it "temporal_container_id" bash
, I don’t see anything in the
/tmp
dir?
  • j

    Jordyn

    08/29/2022, 8:50 PM
    Hello. I'm working on a PR for source-mongodb-v2 that separates out the encryption settings from the instance type and adds the ability to input a certificate during the source configuration. Is backwards compatibility a concern I should be testing for in the integration tests here? Because the encryption settings are separated out, they're now represented differently and are in a different section of the spec.json. Is backward compatibility with old configs an expectation?
  • r

    Rocky Appiah

    08/30/2022, 11:55 AM
    Wonder if this is related to my question about logs above?
  • d

    Denis

    08/30/2022, 3:24 PM
Hello! Can anyone suggest an efficient way to get a list of Mailchimp subscribers (email + subscription date) into a BQ table? The idea is to use the email_activity table and get a list of distinct email addresses, but I am not sure if I am importing too much data just to get a list. (Besides, I usually prefer to use the BigQuery DTS destination connection to avoid having too many tables, but performance-wise, which would be better?)
  • p

    Pavan Charan Dharmavaram Hari Rao

    08/30/2022, 8:02 PM
In Airbyte, is there a way to select multiple fields as the cursor field for performing an incremental load?
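Airbyte's UI takes a single cursor field, so a common workaround in a custom connector is to emulate a composite cursor by comparing tuples of fields. This is only a sketch with illustrative names (record keys and state shape are assumptions, not an Airbyte API):

```python
# Sketch: emulating a multi-field cursor by comparing tuples.
# (updated_at, id) ordering breaks ties between records that share
# the same timestamp. All names here are illustrative.

def is_newer(record, state):
    """True if the record lies past the saved composite cursor."""
    return (record["updated_at"], record["id"]) > (state["updated_at"], state["id"])

state = {"updated_at": "2022-08-01T00:00:00Z", "id": 100}
records = [
    {"updated_at": "2022-08-01T00:00:00Z", "id": 99},   # already synced
    {"updated_at": "2022-08-01T00:00:00Z", "id": 101},  # same ts, higher id
    {"updated_at": "2022-09-01T00:00:00Z", "id": 1},    # newer timestamp
]
new_records = [r for r in records if is_newer(r, state)]
# keeps the last two records
```

Tuple comparison on fixed-width ISO-8601 strings is lexicographic, which matches chronological order, so no date parsing is needed for the sketch.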
  • m

    Miłosz Szymczak

    08/31/2022, 11:11 AM
    Hi, I encountered an issue with Salesforce source connector. It cannot sync the Contact objects, this is the only message I get:
    2022-08-31 08:28:33 source > Syncing stream: Contact 
    2022-08-31 08:30:38 source > [{"message":"Your query request was running for too long.","errorCode":"QUERY_TIMEOUT"}]
    2022-08-31 08:30:38 source > Cannot receive data for stream 'Contact', error message: 'Your query request was running for too long.'
As you can see it's just 2 minutes, which I understand is the standard timeout for the Salesforce API. Unfortunately there's no way to modify the query from the UI to somehow prevent the long execution. Any ideas how to approach it? It's an initial load, so I believe this could work with increments, but manually setting the checkpoint in the Status entity in the database didn't help.
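One hedged approach, assuming you can control the queries (e.g. in a fork or custom connector): split the initial load into smaller time windows so each query finishes before the timeout. The window size and field names below are illustrative, not the connector's actual chunking logic:

```python
# Sketch: covering a large initial load with smaller time windows so
# each backing query stays under the Salesforce query timeout.
from datetime import datetime, timedelta

def date_windows(start, end, days=7):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    cur = start
    while cur < end:
        nxt = min(cur + timedelta(days=days), end)
        yield cur, nxt
        cur = nxt

windows = list(date_windows(datetime(2022, 8, 1), datetime(2022, 8, 31), days=7))
# Each window could back a query like (field name assumed):
#   SELECT Id FROM Contact
#   WHERE SystemModstamp >= {window_start} AND SystemModstamp < {window_end}
```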
  • p

    Pablo Castelletto

    08/31/2022, 4:33 PM
Hello! This is my first post here! I'm currently testing the tool, it's really nice. Great job, guys! 😄 I have a problem I cannot fix, maybe you could help me :3 I would like to use the PostgreSQL source with logical replication. Following the instructions, I managed to replicate a test table successfully by executing the following commands on my Postgres DB:
    SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');
    ALTER TABLE test.test REPLICA IDENTITY DEFAULT;
    CREATE PUBLICATION airbyte_publication FOR TABLE test.test;
Sadly, it started taking up all of the disk space on my source DB, so I deleted the slot. It seems the slot was producing events for every table in my DB instead of just the test table indicated in the publication, and nothing was consuming them 😞. What can I do to fix this issue? I couldn't find a solution. Thanks!!
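For anyone hitting the same growth: a logical replication slot retains WAL for the whole cluster until some consumer advances it, regardless of which tables the publication covers, so an unconsumed slot will fill the disk. A sketch for inspecting and removing slots, assuming PostgreSQL 10+:

```sql
-- How much WAL each replication slot is holding back (PostgreSQL 10+):
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

-- Drop a slot that nothing is consuming:
SELECT pg_drop_replication_slot('airbyte_slot');
```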
  • a

    Amanda Murphy

    08/31/2022, 8:40 PM
    Any update on this bug? https://discuss.airbyte.io/t/missing-mailchimp-email-activity-data/1830/6
  • m

    Maxime Naulleau

    09/02/2022, 9:56 AM
Hello, I was wondering if it's a good thing to add regex validation in the schema files? I would say yes, but I'd like confirmation. Thanks!
  • d

    Dragan

    09/02/2022, 3:47 PM
Hi, a question about data ingestion to Snowflake: the source has date fields in the classic format
2022-01-01
but they end up in Snowflake as
VARCHAR
and look like
2022-01-01T00:00:00Z
. We can normalise this in dbt, but is there a plan to sort this out, or is there an option to fix it while ingesting the data?
  • v

    Vu Le Hoang

    09/05/2022, 7:22 AM
Hi, the MongoDB source is currently at alpha despite being pretty popular. Is there any roadmap to develop this connector to GA?
  • a

    Abiodun Adenle

    09/05/2022, 10:46 PM
Hi all, I have an issue: when ingesting data from Oracle to Snowflake, columns with null values in Oracle are being omitted by the time they get into Snowflake. For example, if I have a record as:
    Name: Jedi
    Age: 30
    Salary: null
    In snowflake I only get
    Name: Jedi
    Age: 30
The salary field is missing. What can I do to ensure all null fields in Oracle arrive in Snowflake correctly as null?
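One way to approach missing-null columns, as a sketch rather than connector code: backfill every expected column so absent keys become explicit nulls in a small post-extraction transform (the column list and record shape below are illustrative):

```python
# Sketch: making omitted null columns explicit before/after loading.
# COLUMNS would come from your table schema; names are illustrative.

COLUMNS = ["name", "age", "salary"]

def with_explicit_nulls(record, columns=COLUMNS):
    """Return a copy of record with every expected column present (None if absent)."""
    return {col: record.get(col) for col in columns}

row = with_explicit_nulls({"name": "Jedi", "age": 30})
# row now carries salary=None instead of dropping the key
```

In practice the same effect is often achieved in a dbt model downstream, since the destination table's columns are fixed there anyway.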
  • g

    Giovani Freitas

    09/06/2022, 9:32 PM
Hi! I would like to ask a quick question: has anyone here created a custom source to read data from HubSpot custom objects? I'd appreciate it if you can share!
  • g

    gunu

    09/08/2022, 1:36 AM
@Andy Yeo (Airbyte) @Subodh (Airbyte) will there be a formal announcement on the release of this feature? https://github.com/airbytehq/airbyte/issues/15725 What does this look like in the UI?
  • r

    robinspilner

    09/08/2022, 2:25 PM
@Marcos Marx (Airbyte) I'm trying to find documentation on the connect string created by the Oracle connector. Currently the GUI asks for the SID, but with RAC, and specifically Exadata, we need to use a Service Name. The container SID can't be found by the connector.
  • p

    Pierre Kerschgens

    09/08/2022, 2:56 PM
Hey everybody, I’m planning on replacing a custom SQS (to S3) ingestion script with Airbyte, and I’m not sure if I understand Airbyte’s approach correctly. I’m testing with an SQS queue that I’m publishing 1 message/second to. When I start Airbyte’s SQS sync job, it seems to run until I quit my SQS-publishing script, so that Airbyte won’t find any more events. Our prod stage won’t (hopefully) be empty for even a second. Now the question: is there any message or time limit until an SQS message batch gets uploaded? I didn’t find any config or documentation for that, but I may have missed something. Thanks in advance! 🙏
  • r

    robinspilner

    09/08/2022, 5:39 PM
@Marcos Marx (Airbyte) I found the connect string information in OracleSource.java, and it looks like Service Name vs. SID was coded for but never implemented in the GUI?
    connectionString = switch (connectionType) {
            case SERVICE_NAME -> buildConnectionString(config, protocol.toString(), SERVICE_NAME.toUpperCase(),
                config.get(CONNECTION_DATA).get(SERVICE_NAME).asText());
            case SID -> buildConnectionString(config, protocol.toString(), SID.toUpperCase(), config.get(CONNECTION_DATA).get(SID).asText());
            default -> throw new IllegalArgumentException("Unrecognized connection type: " + connectionType);
          };
        } else {
          // To keep backward compatibility with existing connectors which doesn't have connection_data
          // and use only sid.
          connectionString = buildConnectionString(config, protocol.toString(), SID.toUpperCase(), config.get(SID).asText());
        }
  • p

    Pierre Kerschgens

    09/09/2022, 8:21 AM
Noticed this as well. To ensure compression was disabled, I had to save the connector with compression enabled and then save it again with compression disabled.
  • a

    addu all

    09/09/2022, 3:51 PM
2022-09-09 15:47:05 ERROR i.a.c.i.LineGobbler(voidCall):82 - SLF4J: Found binding in [jar:file:/airbyte/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-09-09 15:47:05 ERROR i.a.c.i.LineGobbler(voidCall):82 - SLF4J: Found binding in [jar:file:/airbyte/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-09-09 15:47:05 ERROR i.a.c.i.LineGobbler(voidCall):82 - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2022-09-09 15:47:05 ERROR i.a.c.i.LineGobbler(voidCall):82 - SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
  • a

    addu all

    09/09/2022, 3:53 PM
2022-09-09 15:48:05 ERROR i.a.w.i.DefaultAirbyteStreamFactory(lambda$create$1):70 - Validation failed: null
  • a

    addu all

    09/09/2022, 4:07 PM
    this is a bug
  • p

    Philip Johnson

    09/10/2022, 12:47 AM
    anyone ever encounter errors like this when trying to load data into a postgres destination?
    cross-database references are not implemented...
  • p

    Philip Johnson

    09/10/2022, 8:57 PM
I'm currently developing a custom source and was wondering what the proper way is to "update" the source in the UI after making a change. Currently I'm doing the following, but my connection doesn't seem to use the updated source after I update it:
1. Make code changes on my local machine
2. Build the Docker image and add a tag
3. Push to the Docker Hub repo
4. Go into Airbyte, change the tag, and click "Change"
5. After successfully updating the tag, run the connection again
  • s

    Slackbot

    09/11/2022, 2:51 PM
    This message was deleted.
  • t

    Toan Doan

    09/11/2022, 11:53 PM
Hi everyone, I am using the MySQL connector for syncing from a MariaDB. I've encountered an issue where, if a date column has an invalid date (e.g. '0000-11-30'), the sync fails with the error message "java.sql.SQLException: YEAR". Please check the GitHub issue below for more details. What's strange is that if the column has some valid values and the invalid value is all zeros ('0000-00-00'), the sync works fine. GitHub issue I created for this: https://github.com/airbytehq/airbyte/issues/16574 I really appreciate all your help; I have been trying a lot of ways to deal with this but still fail. This is really important for my customer.
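To illustrate why these values are awkward: MariaDB "zero" dates like '0000-11-30' are not valid calendar dates at all, so any standards-conforming parser rejects them. A sketch of detecting them so they can be NULLed out at the source (e.g. via an UPDATE) before syncing — pure illustration, not connector code:

```python
# Sketch: classifying MySQL/MariaDB date strings. Year 0 is outside the
# proleptic Gregorian calendar, so '0000-11-30' and '0000-00-00' both
# fail to parse and can be mapped to None (i.e. SQL NULL).
from datetime import date

def parse_or_none(value):
    """Return a date, or None for zero/invalid dates like '0000-11-30'."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        # covers '0000-00-00', '0000-11-30', and other out-of-range values
        return None

cleaned = [parse_or_none(v) for v in ["2022-09-11", "0000-00-00", "0000-11-30"]]
```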
  • e

    Eli Sigal

    09/12/2022, 3:43 PM
Hi. It is hard to say where the issue lies. Let's try to debug:
1. Does Slack to another destination work, such as a local file?
2. Does the destination work? Probably yes.
3. Did it work before?
4. Let's look here for additional info:
Docker volume job log path: /tmp/workspace/1091/0/logs.log
It says
Source did not output any state messages
but it's hard to say exactly.