# ask-community-for-troubleshooting
b
@[DEPRECATED] Marcos Marx i'm hitting a wall trying to get airbyte working with aws glue backed by an s3 table bucket. i made a pull request, but obviously it'll be a while before it's merged (if ever). can you advise: is it possible to use this feature to change this line of code so i can use this connector? i'd rather not build an entire docker image just to override this connector, and honestly i tried that and can't quite figure out how to do it anyway. hoping this method is easier.
k
I'm sorry, but based on the information in the provided knowledge sources, it is not possible to use Airbyte's "Custom Components" feature to override or patch code in the Java/Kotlin backend of a destination connector like S3 Data Lake (which includes the `IcebergTableCleaner.kt` file you referenced).

Custom Components in Airbyte are designed for the Connector Builder and low-code CDK connectors, allowing you to override or extend Python-based components (such as pagination strategies, requesters, or record transformations) by providing your own Python classes. This feature is limited to connectors built with the low-code CDK and does not extend to compiled connectors written in Java or Kotlin, nor does it allow you to override arbitrary lines of code in those connectors. The mechanism works by specifying Python class names in your YAML configuration, not by patching JVM code or compiled binaries. For example, you can override a pagination strategy in a source connector, but not a method in a Kotlin class used by a destination connector like S3 Data Lake [Custom Components (Config-based)](https://docs.airbyte.com/platform/connector-development/config-based/advanced-topics/custom-components).

If you need to change behavior in the S3 Data Lake destination (such as the logic in `IcebergTableCleaner.kt`), you would need to fork the connector, make your changes, and build a custom Docker image. The custom components feature does not provide a shortcut for this use case. If building a custom Docker image is proving difficult, you might want to seek help from the Airbyte community or review the connector development documentation for guidance on building and deploying custom connectors.

Summary:
- Custom Components only work for Python-based, low-code CDK connectors.
- They cannot override or patch Java/Kotlin code in destination connectors.
- For your use case, a custom build of the connector is required [Custom Components (Config-based)](https://docs.airbyte.com/platform/connector-development/config-based/advanced-topics/custom-components).

If you need more detailed help with building a custom connector image, let me know!
b
@kapa.ai yes give me detailed instructions on how to change that line in that kotlin file, create and upload a docker image from it, then use that docker image as a connector in a different airbyte instance
k
A message from kapa.ai
Certainly! Here's a step-by-step guide, based strictly on the official Airbyte documentation, for modifying a line in a Kotlin file (such as `IcebergTableCleaner.kt`), building a custom Docker image for the S3 Data Lake destination, uploading it to a Docker registry, and then using it as a custom connector in another Airbyte instance.

---

## 1. Fork and Clone the Airbyte Repository

- Fork the [airbyte](https://github.com/airbytehq/airbyte) repository on GitHub.
- Clone your fork locally:
```bash
git clone git@github.com:{YOUR_USERNAME}/airbyte.git
cd airbyte
```
## 2. Make Your Code Change

- Navigate to the file you want to change, e.g.:

```
airbyte-cdk/bulk/toolkits/load-iceberg-parquet/src/main/kotlin/io/airbyte/cdk/load/toolkits/iceberg/parquet/io/IcebergTableCleaner.kt
```

- Edit the line as needed (see the sketch below if you need to locate it first).
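If you're unsure which method contains the line you want to change, a quick search of the file narrows it down. This is just a convenience sketch, nothing Airbyte-specific:

```bash
# List every Kotlin function declaration in the file, with line numbers.
# (The path comes from step 2; adjust it if the file has moved.)
grep -n "fun " airbyte-cdk/bulk/toolkits/load-iceberg-parquet/src/main/kotlin/io/airbyte/cdk/load/toolkits/iceberg/parquet/io/IcebergTableCleaner.kt
```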
## 3. Build the Connector Docker Image

- Ensure you have the `airbyte-ci` tool installed (`airbyte-ci --help`).
- Build the connector image. For the S3 Data Lake destination, the connector name is likely `destination-s3-data-lake` (note that `destination-s3` is the plain S3 destination):

```bash
airbyte-ci connectors --name destination-s3-data-lake build
```

- After building, your image will be available locally as `airbyte/destination-s3-data-lake:dev` (or similar).
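To confirm the build actually produced a local image (the exact tag can vary):

```bash
# Show any locally built images for this connector.
docker images "airbyte/destination-s3-data-lake"
```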
## 4. Tag and Push the Image to a Docker Registry

- Tag your image for your registry (example for Docker Hub):
```bash
docker tag airbyte/destination-s3-data-lake:dev your-dockerhub-username/destination-s3-data-lake:custom-tag
```
- Push the image:

```bash
docker push your-dockerhub-username/destination-s3-data-lake:custom-tag
```
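Since you're on AWS, ECR may be the more natural registry. A sketch, assuming an ECR repository named `destination-s3-data-lake` already exists; the account ID and region below are placeholders:

```bash
# Authenticate Docker against your private ECR registry.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag and push the locally built image.
docker tag airbyte/destination-s3-data-lake:dev \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/destination-s3-data-lake:custom-tag
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/destination-s3-data-lake:custom-tag
```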
- You can use any private registry (Docker Hub, AWS ECR, GCP Artifact Registry, etc.) [Uploading Docker-based custom connectors](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors).

## 5. Use the Custom Docker Image in Another Airbyte Instance

- In your target Airbyte instance, ensure it can pull from your Docker registry (authenticate if private).
- In the Airbyte UI:
  1. Go to Workspace settings > Destinations.
  2. Click New connector > Add a new Docker connector.
  3. Enter a display name.
  4. For Docker full image name, enter your image (e.g., `your-dockerhub-username/destination-s3-data-lake`).
  5. For Docker image tag, enter your tag (e.g., `custom-tag`).
  6. (Optional) Add a documentation URL.
  7. Click Add to save the connector.
- You can now use this custom connector for new connections [Use your custom Docker connector in Airbyte](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors#4-use-your-custom-docker-connector-in-airbyte).

---

Note:
- The above steps are based on the official Airbyte documentation for developing locally and uploading custom Docker-based connectors.
- If you encounter issues with the build or registry steps, refer to the [developing locally](https://docs.airbyte.com/platform/contributing-to-airbyte/developing-locally#connector-contributions) and [custom connector upload](https://docs.airbyte.com/platform/operator-guides/using-custom-connectors) guides for troubleshooting.

If you need more details on any specific step, let me know!
b
@kapa.ai i have a fresh amazon linux 2023, what are the commands i need to run to install the pre-requisites for airbyte-ci?
k
A message from kapa.ai
Based on the official Airbyte documentation, the prerequisites to build Airbyte locally (and thus to use `airbyte-ci`) are:

- Java 21
- Node 20
- Python 3.10
- Docker
- jq

Here are the commands to run on a fresh Amazon Linux 2023 instance to install these prerequisites (note: the docs list the required tools but do not provide explicit package manager commands for Amazon Linux 2023, so you may need to adapt as needed):

1. Install Docker:
```bash
sudo yum install -y docker
sudo usermod -a -G docker ec2-user
sudo systemctl start docker
sudo systemctl enable docker
# Log out and back in (or run `newgrp docker`) for the group change to take effect.
```
2. Install jq:

```bash
sudo yum install -y jq
```
3. Install Python 3.10: The documentation recommends using a version manager like `pyenv` for Python:

```bash
# Install dependencies for pyenv
sudo yum install -y git gcc zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl-devel libffi-devel

# Install pyenv
curl https://pyenv.run | bash

# Add pyenv to your shell (add these lines to ~/.bashrc or ~/.bash_profile)
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
eval "$(pyenv virtualenv-init -)"

# Restart your shell, then:
pyenv install 3.10.0
pyenv global 3.10.0
```
4. Install Java 21: The documentation links to [Java 21 downloads](https://jdk.java.net/archive/). You may need to manually download and install the JDK, or use Amazon Corretto if available:

```bash
sudo yum install -y java-21-amazon-corretto
```
5. Install Node 20: The documentation does not specify the method, but you can use `nvm` (Node Version Manager):

```bash
# Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash

# Add nvm to your shell (add these lines to ~/.bashrc or ~/.bash_profile)
export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"

# Restart your shell, then:
nvm install 20
nvm use 20
```
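Once everything is installed, a quick sanity check in a fresh shell (so the PATH changes apply) confirms the versions line up with the prerequisites above:

```bash
# Each command should report the major version listed in the docs.
java -version       # expect 21.x
node --version      # expect v20.x
python --version    # expect Python 3.10.x (via pyenv global)
docker --version
jq --version
```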
Note: The documentation does not provide explicit, step-by-step commands for Amazon Linux 2023, but these steps are derived from the prerequisites listed in the official docs [Developing Locally - Prerequisites](https://docs.airbyte.com/platform/1.8/contributing-to-airbyte/developing-locally#prerequisites). If you encounter issues with any package, please refer to the Amazon Linux 2023 documentation or package repositories for the most up-to-date installation methods.

If you need to install `airbyte-ci` itself, follow the [airbyte-ci installation instructions](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md) as referenced in the Airbyte docs.
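At the time of writing, that README describes installing `airbyte-ci` via a Makefile target in the repository root; a sketch, assuming the target name hasn't changed:

```bash
# Run from the root of your airbyte clone; check the README above if this target has moved.
make tools.airbyte-ci.install
airbyte-ci --help   # confirm it is installed and on PATH
```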
b
@kapa.ai i don't want to use pyenv
k
A message from kapa.ai
Based on the official Airbyte documentation, the prerequisites for building Airbyte locally and using `airbyte-ci` are:

- Java 21
- Node 20
- Python 3.10
- Docker
- jq

The documentation recommends using a version manager like `pyenv` for Python, but it is not strictly required; you can use your system package manager to install Python 3.10 directly if available. However, the documentation does not provide explicit commands for installing Python 3.10 (or the other prerequisites) on Amazon Linux 2023 without `pyenv`. It only lists the required tools and suggests using a version manager to simplify switching between versions [Developing Locally - Prerequisites](https://docs.airbyte.com/platform/1.8/contributing-to-airbyte/developing-locally#prerequisites).

If you do not want to use `pyenv`, you will need to ensure that the correct versions of Java, Node, and Python are installed on your system by other means. The documentation does not provide further details or alternative installation commands for Amazon Linux 2023 without `pyenv`, so I cannot provide a fully detailed, officially supported command sequence for your scenario.

Summary:
- You need Java 21, Node 20, Python 3.10, Docker, and jq.
- The docs recommend `pyenv` for Python, but it is not required.
- The knowledge sources do not provide explicit instructions for installing these prerequisites on Amazon Linux 2023 without `pyenv`.

If you need to install these tools without `pyenv`, you will need to consult Amazon Linux 2023 or upstream package documentation for the appropriate installation commands.
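Not from the docs, but for completeness: Amazon Linux 2023 does not appear to ship a Python 3.10 package, so one `pyenv`-free route is building CPython from source. A sketch; the package set and the exact 3.10.x release below are assumptions, so adjust as needed:

```bash
# Build dependencies (dnf is the native package manager on AL2023).
sudo dnf install -y gcc make zlib-devel bzip2-devel readline-devel \
  sqlite-devel openssl-devel libffi-devel xz-devel tar

# Download, build, and install; `altinstall` avoids clobbering the system python3.
curl -LO https://www.python.org/ftp/python/3.10.14/Python-3.10.14.tgz
tar xzf Python-3.10.14.tgz
cd Python-3.10.14
./configure --enable-optimizations
make -j"$(nproc)"
sudo make altinstall     # installs /usr/local/bin/python3.10
python3.10 --version
```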