# ask-ai

    Júlia Lemes

    06/30/2025, 5:45 PM
@kapa.ai I have a PostgreSQL source that connects to my Redshift. There were two occasions when there was a discrepancy between the source data and the data in Redshift, but there were no errors in the logs. On one of those occasions, a lot of records in the destination table were null.

    Aliaksei Ivushkin

    06/30/2025, 6:54 PM
#C01AHCD885S how to clean up the logs on Airbyte running in abctl

    kapa.ai

    06/30/2025, 6:54 PM
To clean up logs on Airbyte running with abctl, you have a few options:

1. Automatic log cleanup (retention policy):
   - Airbyte schedules periodic jobs to delete sync logs. By default, logs are deleted every month or when their size exceeds 5GB.
   - You can adjust the retention policy using environment variables:
     - `MINIMUM_WORKSPACE_RETENTION_DAYS`: minimum age (in days) before files are eligible for sweeping (default: 1 day).
     - `MAXIMUM_WORKSPACE_RETENTION_DAYS`: maximum age (in days) before files are definitely swept (default: 60 days).
     - `MAXIMUM_WORKSPACE_SIZE_MB`: maximum workspace size before sweeping occurs (default: 5GB).
   - Set these variables in your environment or configuration to control log retention. Note that some older variables may no longer be effective in recent Airbyte versions, and the current solution may involve using `TEMPORAL_HISTORY_RETENTION_IN_DAYS` for certain cleanup tasks. Check your Airbyte version for compatibility with these settings: [Airbyte Rotation Logs](https://discuss.airbyte.io/t/332), [Airbyte log sweeping](https://discuss.airbyte.io/t/2714).
2. Manual cleanup (Kubernetes/MinIO):
   - If you are running Airbyte with abctl (which uses Kubernetes and MinIO by default), you can manually delete log files by accessing the MinIO pod:
     ```sh
     kubectl exec -it -n airbyte-abctl airbyte-minio-0 -- /bin/bash
     ```
   - Then delete files inside the log directory, typically:
     ```
     /storage/airbyte-dev-logs/job-logging/workspace/
     ```
   - This will free up space used by old logs: [How to clean up airbyte-minio](https://discuss.airbyte.io/t/3101). Note: always ensure you are not deleting logs that are still needed for troubleshooting or compliance.

If you want to completely uninstall Airbyte and remove all data (including logs), you can use:

```sh
abctl local uninstall --persisted
rm -rf ~/.airbyte/abctl
```

This will stop all containers and delete all persisted data, including logs: [OSS Quickstart Uninstall](https://docs.airbyte.com/platform/1.6/using-airbyte/getting-started/oss-quickstart#uninstall-airbyte). If you need more granular or automated log cleanup, consider scripting the deletion process or adjusting the retention environment variables as described above.
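If you prefer to set the retention variables declaratively, a rough sketch of a Helm values override passed via `abctl local install --values values.yaml` could look like the fragment below. The exact keys and which variables your Airbyte version actually honors are assumptions; verify against your chart version before relying on them.

```yaml
# Hypothetical values.yaml fragment for `abctl local install --values values.yaml`.
# Variable names and placement are assumptions; check your Airbyte/chart version.
server:
  extraEnv:
    - name: MINIMUM_WORKSPACE_RETENTION_DAYS
      value: "1"
    - name: MAXIMUM_WORKSPACE_RETENTION_DAYS
      value: "30"
temporal:
  extraEnv:
    - name: TEMPORAL_HISTORY_RETENTION_IN_DAYS
      value: "7"
```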

    Aliaksei Ivushkin

    06/30/2025, 6:57 PM
#C01AHCD885S I see a lot of logs from the replication job

    Disha

    06/30/2025, 7:25 PM
For an Airbyte GitHub PAT, what scopes should the PAT have?

    Annika Maybin

    06/30/2025, 7:58 PM
@kapa.ai are there plans for the Microsoft SharePoint connector to be able to read Excel data?

    Aliaksei Ivushkin

    06/30/2025, 8:10 PM
@kapa.ai the application button disappears after restart

    Diego Ramirez

    06/30/2025, 8:29 PM
@kapa.ai Using HubSpot source version 5.0.0 returns a 504 error with the log "cannot create /pipes/stderr: Permission denied"

    Max

    06/30/2025, 9:30 PM
@kapa.ai I'm trying to install Airbyte on an EC2 instance using Docker and abctl. I have an existing pgdata directory in the .airbyte directory that I want to leverage for the database, but I get the following error:
```
WARNING  Encountered an issue deploying Airbyte:
            Pod: airbyte-db-0.184deee0dc9d8b38
            Reason: BackOff
            Message: Back-off restarting failed container airbyte-db-container in pod airbyte-db-0_airbyte-abctl(ae9d9299-ac76-4a01-8c32-8a0065d43337)
            Count: 15
            Logs: chown: /var/lib/postgresql/data/pgdata: Operation not permitted
          chmod: /var/lib/postgresql/data/pgdata: Operation not permitted
          The files belonging to this database system will be owned by user "postgres".
          This user must also own the server process.

          The database cluster will be initialized with locale "en_US.utf8".
          The default database encoding has accordingly been set to "UTF8".
          The default text search configuration will be set to "english".

          Data page checksums are disabled.

          initdb: error: directory "/var/lib/postgresql/data/pgdata" exists but is not empty
          If you want to create a new database system, either remove or empty
          the directory "/var/lib/postgresql/data/pgdata" or run initdb
          with an argument other than "/var/lib/postgresql/data/pgdata".
Installing 'airbyte/airbyte' (version: 1.4.0) Helm Chart (this may take several minutes) (9m8s)
```
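The log above shows two distinct failures: the container cannot `chown`/`chmod` the mounted pgdata directory, and `initdb` refuses to initialize into a non-empty directory. A hedged sketch of the usual checks, assuming a default abctl layout (the host path backing the Postgres volume below is an assumption — verify it on your instance):

```sh
# 1) Permission errors: the host directory must be owned by the UID the
#    postgres container runs as (often 70 for alpine images, 999 for debian).
#    Check the UID inside the pod, then match ownership on the host path:
kubectl exec -n airbyte-abctl airbyte-db-0 -- id postgres
sudo chown -R 70:70 ~/.airbyte/abctl/data/airbyte-volume-db   # path is an assumption

# 2) initdb error: a pre-existing pgdata is only reusable if it comes from a
#    compatible Postgres major version and layout; inspect what is there first:
ls -la ~/.airbyte/abctl/data/airbyte-volume-db/pgdata
```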

    Jerry Bao

    06/30/2025, 11:43 PM
    @kapa.ai if a source connector is running OOM when it tries to sync a specific stream, is it possible that the parent stream data is being entirely loaded in memory before the child stream starts iterating through the parent data to retrieve data?

    dilan silva

    07/01/2025, 3:48 AM
@kapa.ai I wrote a low-code connector source to connect to a JSONL endpoint and it works fine. Now I want that source to also support the Parquet format. What approach should I follow?

    Ishan Anilbhai Koradiya

    07/01/2025, 3:52 AM
@kapa.ai why is Airbyte transferring so much data from the Temporal pods in a Helm setup?

    badri

    07/01/2025, 5:04 AM
    @kapa.ai configuration check failed something went wrong

    Jase Pramod

    07/01/2025, 6:51 AM
@kapa.ai Replication jobs do not start unless I restart the worker pod. Why? I'm using Airbyte version 1.3.1.

    Durim Gashi

    07/01/2025, 6:55 AM
@kapa.ai what does this error mean:
```
error:  Target table and source table attributes don't match.
  code:      8001
  context:   The source table and the target table have different sort keys. Both tables must use the same sort keys and sort style.
```

    Stefano Messina

    07/01/2025, 7:13 AM
    @kapa.ai one connection in Airbyte is configured with a mapper configuration to filter for a specific field value. For this connection the sync mode is incremental append on modified_timestamp. I was wondering whether Airbyte takes into account the field in the mapper configuration when querying the max(modified_timestamp) for the incremental, or if that's taken from the whole table, hence ignoring the mapper configuration

    Hash Data

    07/01/2025, 9:05 AM
what is the difference between the worker and the workload launcher when I spawn Airbyte?

    roane-flaunt

    07/01/2025, 10:50 AM
I'm curious about best practice for migrating an existing, live connection's stream prefix: 1. In my case I'm using BigQuery as the destination. 2. I'm pausing my Airbyte sync during this migration. 3. Since the stream prefix will change, I am manually updating the dataset in the destination to align with the new prefix. 4. How do the Airbyte raw streams work? After I update the stream prefix, will Airbyte still be able to track which records have been previously synced?

    Konathala Chaitanya

    07/01/2025, 11:15 AM
@kapa.ai I want to use the Iceberg destination and insert data into Snowflake. Which access is required? Do I also need Iceberg access for this?

    Konathala Chaitanya

    07/01/2025, 11:19 AM
@kapa.ai which type of access is required for Iceberg?

    Max

    07/01/2025, 11:26 AM
    @kapa.ai how could I migrate all the connections from a production abctl local install to a dev environment
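One approach is to export configuration over the Airbyte API and recreate it in the dev instance. A rough sketch, assuming the API is reachable on port 8000 and you have credentials for both instances; the hostnames, auth mechanism, and exact endpoint paths are assumptions to verify against your version's API docs, and secrets are not included in exports, so they must be re-entered:

```sh
# List connections (and similarly sources/destinations) from the prod instance:
curl -s -u "$PROD_USER:$PROD_PASS" \
  "http://prod-host:8000/api/public/v1/connections" > prod_connections.json

# Then recreate each definition on the dev instance's matching endpoint, e.g.:
# curl -X POST -u "$DEV_USER:$DEV_PASS" \
#   -H "Content-Type: application/json" \
#   "http://dev-host:8000/api/public/v1/connections" -d @one_connection.json
```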

    Max

    07/01/2025, 11:47 AM
@kapa.ai for local abctl installs, what psql username does Airbyte use? Running `psql` results in
```
psql: error: FATAL:  role "postgres" does not exist
```
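A sketch of connecting to the bundled database, assuming a default abctl install (namespace `airbyte-abctl`, pod `airbyte-db-0`); the role and database names below are the commonly documented defaults, but verify them on your install:

```sh
# The bundled Postgres uses the role "airbyte", not "postgres":
kubectl exec -it -n airbyte-abctl airbyte-db-0 -- psql -U airbyte -d db-airbyte
```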

    Jase Pramod

    07/01/2025, 12:01 PM
    @kapa.ai message='Workflow has been deprecated. Retry to automatically transition to the current workflow.', type='Deprecated', nonRetryable=true

    Isaac Yimgaing Kuissu

    07/01/2025, 12:23 PM
Does Airbyte support a SharePoint connector?

    Pradyumna Kulkarni

    07/01/2025, 1:37 PM
@kapa.ai I have a file source for a JSON file in an S3 bucket and a Snowflake destination. The file is created in the S3 bucket daily and may be empty. The sync mode is full refresh + append. How should I configure the connection so that there is no error?

    Affan Zafar

    07/01/2025, 1:42 PM
@kapa.ai My output response looks like this in the Airbyte UI: [ { "metrics": [ { "ad_id": "...", "clicks": 248, ... } ], "metrics_updated_at": "2025-07-01T133937.068000+00:00" } ] How can I use the record selector to select both metrics and metrics_updated_at?
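In the low-code CDK, the record selector's extractor decides what counts as one record. A sketch of a manifest fragment that treats each element of the top-level response array as a record, so both `metrics` and `metrics_updated_at` stay on every record (the field names come from the message above; the rest is standard low-code CDK syntax, to be checked against your CDK version):

```yaml
record_selector:
  type: RecordSelector
  extractor:
    type: DpathExtractor
    field_path: []   # empty path: each element of the response array is a record
```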

    Michael Gallivan

    07/01/2025, 1:51 PM
    @kapa.ai I am having trouble getting the greenhouse source to load, specifically the application_interview table. Do you have any guidance?

    Justin Yeh

    07/01/2025, 2:12 PM
How do I build Similarweb as a source?

    Euan Blackledge

    07/01/2025, 2:42 PM
Hey @kapa.ai, why does Airbyte eat the storage in its RDS? We've already got a process to clean up data from the attempts table older than 3 months, but the DB has used over 100GB of its 200GB. Cleaning the pg_toast table had no effect, apart from causing massive data loss: it triggered Airbyte to discover schema changes and drop the ingestion layer of our data lake.

    Leandro Machado

    07/01/2025, 3:05 PM
hey @kapa.ai! I'm having an issue where the Stripe connector gets stuck after a few minutes of running. It started happening about a week ago, and it's happening on incremental updates. Were there any updates to the connector or to the Stripe API that might be causing this? Did anyone else have the same problem? Airbyte version 1.5.