Bryan Meyerovich
10/31/2025, 12:00 PM

kapa.ai
10/31/2025, 12:01 PM

IcebergTableCleaner.kt file you referenced). Custom Components in Airbyte are designed for the Connector Builder and low-code CDK connectors: they let you override or extend Python-based components (such as pagination strategies, requesters, or record transformations) by providing your own Python classes. This feature is limited to connectors built with the low-code CDK. It does not extend to compiled connectors written in Java or Kotlin, nor does it allow you to override arbitrary lines of code in those connectors. The mechanism works by specifying Python class names in your YAML configuration, not by patching JVM code or compiled binaries. For example, you can override a pagination strategy in a source connector, but not a method in a Kotlin class used by a destination connector like S3 Data Lake ([Custom Components (Config-based)](https://docs.airbyte.com/platform/connector-development/config-based/advanced-topics/custom-components)).

If you need to change behavior in the S3 Data Lake destination (such as the logic in IcebergTableCleaner.kt), you would need to fork the connector, make your changes, and build a custom Docker image. The custom components feature does not provide a shortcut for this use case. If building a custom Docker image is proving difficult, you might want to seek help from the Airbyte community or review the connector development documentation for guidance on building and deploying custom connectors.

Summary:

- Custom Components only work for Python-based, low-code CDK connectors.
- They cannot override or patch Java/Kotlin code in destination connectors.
- For your use case, a custom build of the connector is required ([Custom Components (Config-based)](https://docs.airbyte.com/platform/connector-development/config-based/advanced-topics/custom-components)).

If you need more detailed help with building a custom connector image, let me know!

Bryan Meyerovich
10/31/2025, 12:04 PM

kapa.ai
10/31/2025, 12:04 PM

Here are the steps for making your change (e.g., to IcebergTableCleaner.kt), building a custom Docker image for the S3 Data Lake destination, uploading it to a Docker registry, and then using it as a custom connector in another Airbyte instance.

---

## 1. Fork and Clone the Airbyte Repository

- Fork the [airbyte](https://github.com/airbytehq/airbyte) repository on GitHub.
- Clone your fork locally:

```bash
git clone [email protected]:{YOUR_USERNAME}/airbyte.git
cd airbyte
```

## 2. Make Your Code Change

- Navigate to the file you want to change, e.g.:
  `airbyte-cdk/bulk/toolkits/load-iceberg-parquet/src/main/kotlin/io/airbyte/cdk/load/toolkits/iceberg/parquet/io/IcebergTableCleaner.kt`
- Edit the line as needed.

## 3. Build the Connector Docker Image

- Ensure you have the airbyte-ci tool installed (`airbyte-ci --help`).
- Build the connector image. For the S3 Data Lake destination, the connector name is likely `destination-s3`:

```bash
airbyte-ci connectors --name destination-s3 build
```

- After building, your image will be available locally as `airbyte/destination-s3:dev` (or similar).

## 4. Tag and Push the Image to a Docker Registry

- Tag your image for your registry (example for Docker Hub):

```bash
docker tag airbyte/destination-s3:dev your-dockerhub-username/destination-s3:custom-tag
```

- Push the image:

```bash
docker push your-dockerhub-username/destination-s3:custom-tag
```

- You can use any private registry (Docker Hub, AWS ECR, GCP Artifact Registry, etc.); see [Uploading Docker-based custom connectors](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors).

## 5. Use the Custom Docker Image in Another Airbyte Instance

- In your target Airbyte instance, ensure it can pull from your Docker registry (authenticate if private).
- In the Airbyte UI:
  1. Go to Workspace settings > Destinations.
  2. Click New connector > Add a new Docker connector.
  3. Enter a display name.
  4. For Docker full image name, enter your image (e.g., `your-dockerhub-username/destination-s3`).
  5. For Docker image tag, enter your tag (e.g., `custom-tag`).
  6. (Optional) Add a documentation URL.
  7. Click Add to save the connector.
- You can now use this custom connector for new connections ([Use your custom Docker connector in Airbyte](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors#4-use-your-custom-docker-connector-in-airbyte)).

---

Note:

- The above steps are based on the official Airbyte documentation for developing locally and uploading custom Docker-based connectors.
- If you encounter issues with the build or registry steps, refer to the [developing locally](https://docs.airbyte.com/platform/contributing-to-airbyte/developing-locally#connector-contributions) and [custom connector upload](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors) guides for troubleshooting.

If you need more details on any specific step, let me know!

Bryan Meyerovich
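For reference, the build / tag / push sequence from steps 3 and 4 can be collected into one small shell sketch. This is not from the Airbyte docs: the connector name, registry namespace, and tag are placeholders, and the script only prints each command (a dry run) so you can review them before executing anything for real.

```shell
#!/bin/sh
# Dry-run sketch of the build / tag / push sequence (placeholders throughout).
# Assumes airbyte-ci and docker are installed and you are logged in to your
# registry. Each command is printed, not executed; remove the echo to run it.
set -eu

CONNECTOR="destination-s3"                  # connector name passed to airbyte-ci
LOCAL_IMAGE="airbyte/${CONNECTOR}:dev"      # image the local build produces
REGISTRY_USER="your-dockerhub-username"     # placeholder registry namespace
TAG="custom-tag"                            # placeholder tag
REMOTE_IMAGE="${REGISTRY_USER}/${CONNECTOR}:${TAG}"

echo "airbyte-ci connectors --name ${CONNECTOR} build"
echo "docker tag ${LOCAL_IMAGE} ${REMOTE_IMAGE}"
echo "docker push ${REMOTE_IMAGE}"
```

The `REMOTE_IMAGE` value printed by the last two lines is exactly what you would later enter in the Airbyte UI as the Docker full image name and tag.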
10/31/2025, 2:47 PM

Bryan Meyerovich

10/31/2025, 2:49 PM

kapa.ai
10/31/2025, 2:50 PM

The prerequisites (for airbyte-ci) are:

- Java 21
- Node 20
- Python 3.10
- Docker
- jq

Here are the commands you need to run on a fresh Amazon Linux 2023 instance to install these prerequisites. Note: the docs list the required tools but do not provide explicit package manager commands for Amazon Linux 2023, so you may need to adapt as needed.

1. Install Docker:

   ```bash
   sudo yum install -y docker
   sudo usermod -a -G docker ec2-user
   sudo systemctl start docker
   sudo systemctl enable docker
   ```

2. Install jq:

   ```bash
   sudo yum install -y jq
   ```

3. Install Python 3.10. The documentation recommends using a version manager like pyenv for Python:

   ```bash
   # Install build dependencies for pyenv
   sudo yum install -y git gcc zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl-devel libffi-devel

   # Install pyenv
   curl https://pyenv.run | bash

   # Add pyenv to your shell (add these lines to ~/.bashrc or ~/.bash_profile)
   export PATH="$HOME/.pyenv/bin:$PATH"
   eval "$(pyenv init --path)"
   eval "$(pyenv virtualenv-init -)"

   # Restart your shell, then:
   pyenv install 3.10.0
   pyenv global 3.10.0
   ```

4. Install Java 21. The documentation links to [Java 21 downloads](https://jdk.java.net/archive/). You may need to manually download and install the JDK, or use Amazon Corretto if available:

   ```bash
   sudo yum install -y java-21-amazon-corretto
   ```

5. Install Node 20. The documentation does not specify the method, but you can use nvm (Node Version Manager):

   ```bash
   # Install nvm
   curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash

   # Add nvm to your shell (add these lines to ~/.bashrc or ~/.bash_profile)
   export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
   [ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"

   # Restart your shell, then:
   nvm install 20
   nvm use 20
   ```

Note: The documentation does not provide explicit, step-by-step commands for Amazon Linux 2023; these steps are derived from the prerequisites listed in the official docs ([Developing Locally - Prerequisites](https://docs.airbyte.com/platform/1.8/contributing-to-airbyte/developing-locally#prerequisites)). If you encounter issues with any package, refer to the Amazon Linux 2023 documentation or package repositories for the most up-to-date installation methods.

If you need to install airbyte-ci itself, follow the [airbyte-ci installation instructions](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md) as referenced in the Airbyte docs.

Bryan Meyerovich
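After running the installation steps above, a small check script can confirm which of the listed tools are actually on the PATH. This is an unofficial convenience sketch, not from the Airbyte docs; it only reports whether each tool is present and does not verify exact versions (on some systems Python may be installed as `python3` rather than `python`).

```shell
#!/bin/sh
# Unofficial sanity check for the prerequisites listed above.
# Reports OK/MISSING per tool; it does not verify exact version numbers.
check() {
  tool="$1"; want="$2"
  if command -v "$tool" >/dev/null 2>&1; then
    echo "OK: $tool found ($want required)"
  else
    echo "MISSING: $tool ($want required)"
  fi
}

check java   "21"
check node   "20"
check python "3.10"
check docker "any recent"
check jq     "any"
```

If anything shows MISSING, re-run the corresponding installation step (and remember that pyenv and nvm require a shell restart before the shims appear on the PATH).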
10/31/2025, 3:13 PM

Bryan Meyerovich

10/31/2025, 3:13 PM

kapa.ai
10/31/2025, 3:14 PM

The prerequisites for airbyte-ci are:

- Java 21
- Node 20
- Python 3.10
- Docker
- jq

The documentation recommends using a version manager like pyenv for Python, but it is not strictly required; you can use your system package manager to install Python 3.10 directly if it is available there. However, the documentation does not provide explicit commands for installing Python 3.10 (or the other prerequisites) on Amazon Linux 2023 without pyenv. It only lists the required tools and suggests using a version manager to simplify switching between versions ([Developing Locally - Prerequisites](https://docs.airbyte.com/platform/1.8/contributing-to-airbyte/developing-locally#prerequisites)).

If you do not want to use pyenv, you will need to ensure that the correct versions of Java, Node, and Python are installed by other means. The documentation does not provide alternative installation commands for Amazon Linux 2023 without pyenv, so I cannot provide a fully detailed, officially supported command sequence for your scenario.

Summary:

- You need Java 21, Node 20, Python 3.10, Docker, and jq.
- The docs recommend pyenv for Python, but it is not required.
- The knowledge sources do not provide explicit instructions for installing these prerequisites on Amazon Linux 2023 without pyenv.

If you need to install these tools without pyenv, consult the Amazon Linux 2023 or upstream package documentation for the appropriate installation commands.
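As a non-authoritative starting point (not from the Airbyte docs), Amazon Linux 2023's own dnf repositories may cover most of these tools without pyenv. The package names below are assumptions to verify with `dnf search` first, and note that AL2023 may only package Python 3.11, which might not satisfy a strict 3.10 requirement. The sketch only prints the candidate commands rather than running them.

```shell
#!/bin/sh
# Candidate pyenv-free install commands for Amazon Linux 2023.
# NOT from the Airbyte docs: every package name here is an assumption;
# verify each with `dnf search <name>` before running. Printed, not executed.
print_candidate_cmds() {
  cat <<'EOF'
sudo dnf install -y docker jq
sudo dnf install -y java-21-amazon-corretto   # Corretto build of Java 21
sudo dnf install -y nodejs20                  # Node 20, if packaged
sudo dnf install -y python3.11                # AL2023 may not package 3.10
EOF
}
print_candidate_cmds
```

If the exact versions matter (particularly Python 3.10), falling back to the version-manager route the docs recommend is the safer option.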