# infra-deployment
  • v

    visch

    10/18/2022, 7:30 PM
Stumbled on https://docs.docker.com/engine/reference/commandline/run/#:~:text=Set%20environment%20variables-,%2D%2Denv%2Dfile,-Read%20in%20a today:
`docker run --env-file .env`
Chef's kiss.
  • s

    Stéphane Burwash

    10/24/2022, 2:30 PM
Happy Monday everyone! I had a question regarding deployment with Airflow. I'm trying to set up a way to include flags in my production pipeline (basically to run a full refresh) if I need to reset state. How do you manage `full-refresh` calls to a specific stream in production?
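A minimal sketch of one way to do this in Airflow: wrap `meltano run` in a BashOperator and expose a DAG-level param that a manual trigger can flip to add `--full-refresh`. The DAG, tap, and target names below are hypothetical, and this assumes Airflow 2.2+ where trigger-time params are supported:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="meltano_tap_foo_to_target_bar",   # hypothetical job name
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    params={"full_refresh": False},           # flip via "Trigger DAG w/ config"
) as dag:
    BashOperator(
        task_id="meltano_run",
        # The flag is only rendered when the DAG is triggered with
        # {"full_refresh": true} in the run config; scheduled runs leave it off.
        bash_command=(
            "cd /project && meltano run "
            "{% if params.full_refresh %}--full-refresh {% endif %}"
            "tap-foo target-bar"
        ),
    )
```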
  • m

    Matt Menzenski

    12/02/2022, 2:34 AM
    Is anyone running Meltano in kubernetes via Argo Workflows (as opposed to Airflow)?
  • m

    monika_rajput

    12/14/2022, 8:59 AM
Hello everyone, I am trying to use MWAA (Amazon Managed Workflows for Apache Airflow) with Meltano but am stuck at the integration part. Has anyone integrated Meltano with MWAA? And how can we run Meltano DAGs with MWAA, given that MWAA requires DAGs to be added to S3? Is there any way to integrate the Meltano DAGs (meltano.py) with MWAA seamlessly? Or is MWAA even the right choice for scheduling the tasks? @edgar_ramirez_mondragon @douwe_maan
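For context, a commonly cited pattern (echoed later in this channel) is to keep only a thin DAG file in S3 and have it launch a containerized Meltano job on ECS, so the Meltano project itself never has to live inside MWAA. A hedged sketch, assuming the Amazon provider's `EcsRunTaskOperator` (named `ECSOperator` in older provider versions); the cluster, task definition, subnet, and job names are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.ecs import EcsRunTaskOperator

with DAG(
    dag_id="meltano_daily_sync",              # hypothetical
    start_date=datetime(2022, 12, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    EcsRunTaskOperator(
        task_id="meltano_run",
        cluster="data-pipelines",             # hypothetical ECS cluster
        task_definition="meltano-project",    # task definition pointing at your Meltano image
        launch_type="FARGATE",
        overrides={
            "containerOverrides": [
                # Override the container command to run the desired Meltano job.
                {"name": "meltano", "command": ["run", "tap-foo", "target-bar"]},
            ],
        },
        network_configuration={
            "awsvpcConfiguration": {"subnets": ["subnet-0123456789abcdef0"]},  # hypothetical
        },
    )
```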
  • m

    michel_ebner

    12/21/2022, 3:44 PM
Hey everyone, <!subteam^S02BCD9FFEF> nice job on the Dagster utility. Locally it works nicely, but I tried setting it up in prod with something close to your Cubed deployment, meaning 2 DB pods (Dagster and Meltano), 1 pod for Dagster and 1 for my Meltano custom code. Unfortunately, I don't get how to connect the dagster extension from Meltano to Dagster, as the two things are running in different K8s pods. There is also no way to share the volumes in K8s. Does anyone have an idea how to achieve this? Otherwise I will have to go back to Airflow 😕 @ken_payne?
  • f

    fred_reimer

    04/14/2023, 3:15 PM
We are looking to move our current production deployment, which uses the integrated Airflow orchestration under Kubernetes, to running the jobs under airplane.dev as Tasks. Airplane Tasks can run as a Docker image, or as a simplified Python task, which also builds a Docker image but is designed to run a Python program as the task workload. We'd prefer to use the Python task, as we will need to set up some things for the Meltano ELT job run, such as env vars with sensitive information pulled dynamically from our security solution. Has anyone worked out, or are there examples of, starting a Meltano run from a Python program? Can we import the meltano module and call the starting point somehow? I'd rather not even consider shelling out and calling another instance of the Python interpreter if at all possible. If it is better or required that we create a "full" Docker image with a bash script to call various utilities to set up for the run, we can do that also. Just wondering if anyone is calling Meltano ELT job runs from within a Python program without shelling out...
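As far as I know, Meltano doesn't document a public Python API for starting runs, so the usual compromise is a thin Python wrapper that prepares the environment (e.g. secrets fetched at runtime) and then launches the CLI directly. A minimal sketch under that assumption, with hypothetical plugin names and a placeholder secrets call:

```python
import os
import subprocess
import sys


def fetch_secrets() -> dict[str, str]:
    # Placeholder for your security solution's client call (hypothetical key name).
    return {"TAP_FOO_API_KEY": "..."}


def main() -> int:
    env = {**os.environ, **fetch_secrets()}
    # subprocess.run launches the meltano executable directly (no shell involved);
    # os.execvpe("meltano", [...], env) would instead replace this process entirely.
    return subprocess.run(
        ["meltano", "run", "tap-foo", "target-bar"],  # hypothetical job
        env=env,
    ).returncode


if __name__ == "__main__":
    sys.exit(main())
```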
  • a

    Andy Carter

    04/19/2023, 6:45 AM
I think my org wants to move to more containerised deployments in future, so I have a lot to learn about Docker. The docs have already been a great help in actually building a container, so thank you for that! Interested to know how you are running containerised jobs; I can see a number of pathways:
• a single `meltano run sync_all` job on a schedule, running all tasks sequentially
• kicking off a specific task from another orchestrator, ADF or AWS CloudEvents etc.
• using Airflow or Dagster as Meltano utilities in a container running 24/7
  • l

    lanre_nathaniel-ayodele

    06/01/2023, 9:10 AM
Hey all, I am trying to deploy my first custom tap to production using Airflow as an orchestrator. Everything works as expected locally, but once I switch from my SQLite database to a Postgres database for Airflow's metadata, I am unable to start the Airflow webserver; I just get a WORKER TIMEOUT message. In the Airflow docs you can set a keepalive setting, and I have done this, but I still get the same error. Anyone have any ideas on what the issue could be?
  • a

    Andy Carter

    06/16/2023, 10:15 AM
A couple of things I have learnt from trying to deploy Meltano on Azure Container Instances (ACI) via Bicep; hopefully they can be of use to someone else.
• When creating the ACI, you need to define your `meltano` command, and it cannot be altered. There is no way (I've found) to run an image with a custom command; it has to be fixed along with the container at creation time. So you end up having one container for each tap & target combo, which leads to...
• Very quickly hitting the default Azure quota for assigned container vCPU, which is 10 for your whole region. So if you have 10 tap-target combo ACIs created and you assign them 1 vCPU each, you can't create any more ACIs. Even if those images are not running, they count as 'assigned' for vCPU quota purposes. So I have scaled down to fractional vCPU for many of them to allow me to create some more. I would make a request to increase the vCPU limit immediately, as it requires Azure support to do that.
I have kind of got to a reasonable working solution, but feel like this would be much easier on AWS 😞
  • h

    haleemur_ali

    06/17/2023, 1:43 PM
Hi folks, I'm facing this challenge for the first time and am not sure how others have tackled it. We dockerize Meltano and run the Meltano ETL processes through AWS Batch. Secrets are passed as environment variables from AWS Secrets Manager to the container. We have to ingest some information from Google BigQuery, so we're looking at the tap variants. All require a _client_secrets.json_ file to be available on the container (e.g. tap-bigquery - Meltano Hub). How do other folks inject sensitive files into the Docker container at runtime?
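One pattern that fits the AWS Batch setup described above is a small Python entrypoint that writes the secret (already injected as an env var from Secrets Manager) to a file before handing off to Meltano. This is only a sketch; the env var name, file path, setting env var, and target name are hypothetical:

```python
import os
import subprocess
import sys
from pathlib import Path


def main() -> int:
    # Secret injected by AWS Batch / Secrets Manager as an env var (hypothetical name).
    secret = os.environ.pop("BIGQUERY_CLIENT_SECRETS_JSON")

    # Materialize it as a file only inside the running container.
    path = Path("/tmp/client_secrets.json")
    path.write_text(secret)
    path.chmod(0o600)  # readable only by the container user

    # Point the tap at the file via its settings env var (exact name depends on the variant).
    os.environ["TAP_BIGQUERY_CREDENTIALS_PATH"] = str(path)

    return subprocess.run(["meltano", "run", "tap-bigquery", "target-foo"]).returncode


if __name__ == "__main__":
    sys.exit(main())
```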
  • s

    Stéphane Burwash

    08/14/2023, 1:47 PM
Goooooood morning and happy Monday! Does anyone here self-host Airflow? I'm having issues with deploying changes to the scheduler / jobs without losing tracking for all of my current tasks.
  • a

    Andy Carter

    02/14/2024, 12:26 PM
Does anyone have experience building their Meltano image in Azure Pipelines? I am struggling to get my private taps on Azure DevOps installed in my `meltano install` step due to git permissions. I've tried adding a git SSH key, including the repos as resource definitions, and using an Azure Repos service connection; nothing got me very far. Any pointers welcome.
  • j

    Jairo Souza

    03/13/2024, 3:03 PM
Has anyone already deployed Meltano on Kubernetes using a Helm chart? Could you send some documentation about this? Can this repository be used as a reference? Unfortunately, I didn't see any guide in the official documentation. https://gitlab.com/meltano/infra/helm-meltano/-/tree/master?ref_type=heads
  • j

    Josh Bielick

    03/14/2024, 4:50 PM
I'm not sure if this will be useful to others, but I've put together an orchestrator utility to generate Kubernetes CronJob manifests from the meltano schedule of jobs. The assumptions about your meltano project are as follows:
• Your meltano config defines jobs and those jobs appear in a schedule configuration
• You'd like to run meltano jobs on a recurring basis using a container image you build
• Your meltano project code is built into a container image which can be run in a kubernetes cluster
• You want to manage the execution and tracking of jobs using kubernetes, not airflow
• You are comfortable using kustomize overlays to customize your kubernetes pod spec
The developer experience is something like this:
• You commit changes to your meltano project repository
• CI/CD runs, builds your container image based on this revision
• CI/CD invokes this utility, kustomize base layer kubernetes manifests are generated
• CI/CD has kubectl set up with a valid kubeconfig for your cluster
• You `kubectl apply -k orchestrate/kubernetes/production/kustomize.yml` (a file you create ahead of time; see the README for more details) and apply the manifests to your cluster
My team found this to be the most lightweight way of provisioning kubernetes CronJobs for meltano jobs in our cluster, in which the engineer writing new meltano jobs and configurations did not need to modify any infrastructure definitions in order to have their jobs run regularly. The kubernetes CronJob/Job tracking and logs were sufficient and worked very well for our needs (primarily Extract/Load), but this utility could be extended quite a bit to support a lot of use cases. Feedback is welcome, but not all use-cases are guaranteed to be addressed/supported. I hope this helps someone! https://github.com/AdWerx/meltano-kubernetes-ext
    👌 2
    💪 1
    melty bouncy 2
  • v

    visch

    03/21/2024, 9:25 PM
uv is sounding more and more interesting as I'm staring at a 50-minute current build time. Could get this down to ~15 minutes I think just by upgrading Meltano, but getting to 5-15 minutes with 15 taps/targets without uv seems tough without deduplication across venvs.
    👀 1
    ➕ 2
  • m

    Matt Menzenski

    04/25/2024, 10:25 PM
Is anyone programmatically generating Meltano YAML project files? I've been doing this for a long time in an ad hoc way (scaffold tap definitions for 130 databases, or 80 Google Analytics projects, etc.), but we're scaling out our Meltano infra, we'll have a lot more tap and target definitions to manage, and I'm interested in any prior art in this area. For example: does the `meltano` Python package expose project file schemas? Could I generate a YAML file and then validate it against that schema?
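Not sure about the Python package exposing schemas, but Meltano does publish a JSON schema for meltano.yml (at the time of writing, https://meltano.com/meltano.schema.json, the same one editor integrations reference), so a generate-then-validate loop can be sketched like this. Plugin names and config keys are hypothetical, and it requires pyyaml, jsonschema, and requests:

```python
import requests
import yaml
from jsonschema import validate


def build_project(databases: list[str]) -> dict:
    # Scaffold one inherited tap-postgres definition per database (hypothetical layout).
    return {
        "version": 1,
        "default_environment": "dev",
        "environments": [{"name": "dev"}],
        "plugins": {
            "extractors": [
                {
                    "name": f"tap-postgres--{db}",
                    "inherit_from": "tap-postgres",
                    "config": {"database": db},  # config keys depend on the variant
                }
                for db in databases
            ],
        },
    }


project = build_project(["orders", "billing"])

# Validate against the published meltano.yml schema before writing the file out.
schema = requests.get("https://meltano.com/meltano.schema.json", timeout=30).json()
validate(instance=project, schema=schema)  # raises jsonschema.ValidationError on problems

with open("meltano.yml", "w") as f:
    yaml.safe_dump(project, f, sort_keys=False)
```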
  • g

    Gaurav Arora

    05/14/2024, 4:16 PM
Hi everyone, we are testing Meltano for our data pipeline, and everything seems to be working fine. We just have one issue: once deployed, we want our non-tech team (who can't use the CLI) to be able to schedule and monitor tasks, plus we might also need some sort of auth layer. Since Meltano doesn't provide any API layer out of the box, what would be the ideal solution in this case?
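A hedged sketch of what a thin, self-built API layer could look like: a small web service (FastAPI here, sitting behind whatever auth you already use) that triggers named Meltano jobs via subprocess. The endpoint shape and job handling are hypothetical, and a real deployment would queue runs rather than block the request:

```python
import subprocess

from fastapi import FastAPI, HTTPException

app = FastAPI()


@app.post("/jobs/{job_name}/run")
def run_job(job_name: str):
    # Trigger a job defined in meltano.yml; auth/authz middleware is assumed
    # to sit in front of this service.
    result = subprocess.run(
        ["meltano", "run", job_name],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Surface the tail of the logs to the caller on failure.
        raise HTTPException(status_code=500, detail=result.stderr[-2000:])
    return {"job": job_name, "status": "success"}
```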
  • a

    ashish

    05/15/2024, 11:29 AM
Hi team, I am exploring my options to install and deploy Meltano taps/targets on AWS MWAA. I found some old threads earlier (most of them are inaccessible now); in a nutshell, containers were the way, using MWAA + ECS. Is there any way to set up Meltano pipelines without using containers on MWAA? CC: @Edgar Ramírez (Arch.dev)
  • s

    Shubham Kawade

    06/27/2024, 11:30 AM
Hello folks, I am trying to orchestrate a custom tap we are using on Mage. I am using Python subprocesses to execute Meltano commands. I have created a task in Mage for each stream I want to sync, and I want to somehow track whether, at the end of the task run, that stream was synced or not. How do I programmatically check whether, after a sync, that stream was successfully attempted or not? E.g. a log file which I could check in Python. P.S. I checked the logging section and the log file is created in a run folder which is randomly generated, so it doesn't seem reliable for this purpose.
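A hedged sketch of one way to do this from the Mage task itself: treat the subprocess exit code as the primary success signal, then confirm the stream's bookmark actually landed in the persisted state via `meltano state get`. The state-ID format, plugin names, and the state JSON layout (SDK-style `singer_state.bookmarks`) are assumptions to verify against your own setup:

```python
import json
import subprocess


def stream_synced(stream: str) -> bool:
    # Run the job; a non-zero exit code already means the sync failed.
    run = subprocess.run(["meltano", "run", "tap-custom", "target-foo"])  # hypothetical plugins
    if run.returncode != 0:
        return False

    # Then confirm a bookmark exists for the stream in the persisted state.
    # Default state IDs follow <environment>:<tap>-to-<target>.
    out = subprocess.run(
        ["meltano", "state", "get", "dev:tap-custom-to-target-foo"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    bookmarks = json.loads(out).get("singer_state", {}).get("bookmarks", {})
    return stream in bookmarks
```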
  • c

    Conner Panarella

    07/01/2024, 7:20 PM
    Is anyone deploying Meltano on Azure? I'd love to hear about your experiences. I am currently working with Azure Container Apps, but I am wondering how others have been deploying in the Azure ecosystem.
  • a

    aaron_phethean

    07/01/2024, 7:50 PM
In Azure we mostly orchestrate in Kubernetes with Spring Cloud Tasks managed by our platform, but we do have container instances in some tests, and clients with container apps too. This example shows the `az` command deploy, but I prefer the YAML way, as setting the env for select is easier. https://www.matatika.com/your-guide-to-loading-quality-test-data/
    🙌 1
  • c

    Conner Panarella

    07/01/2024, 8:11 PM
@aaron_phethean Thank you! That is very similar to what I have set up at the moment. Do you have any clients using Airbyte taps? To my knowledge container apps do not support Docker-in-Docker.
  • a

    aaron_phethean

    07/01/2024, 8:16 PM
    No. We don’t actually. We have a few Matatika community edition installs with docker-compose where we do Docker in Docker. If we ran an Airbyte tap in that setup it would be docker in docker in docker 😂
    👀 1
  • i

    Ian OLeary

    07/24/2024, 3:43 PM
When deployed, does Meltano need anything other than a state backend to start running scheduled jobs with Dagster? I haven't set my state backend yet, so any time I materialize my assets it starts from the start date, but my scheduled job for last night didn't seem to run.
    👀 1
  • j

    James Stratford

    07/25/2024, 7:53 AM
I have a custom tap which runs fine on my local machine (dockerized container), but when running it on a Kubernetes pod through Cloud Composer (GCP) the pod gets killed after 15-16 minutes with no error; the last log is a successful HTTP call... Any ideas what is happening? Something I have noticed is that the state file is really large, because it has several keys in the context and there are many entities, which makes the log very large.
    ✅ 1
  • j

    James Stratford

    07/25/2024, 7:59 AM
Another thing I have noticed is that for the Airflow k8s worker, all logs are sent to stderr? They are info logs but are labelled as ERROR.
  • i

    Ian Hsu

    09/10/2024, 8:41 AM
Hi folks, `meltano install` gets very slow every time I use a Dockerfile to build the image, since it installs everything from scratch. Is there any way I can use the Docker cache or the previously built packages? My Dockerfile:
# registry.gitlab.com/meltano/meltano:latest is also available in GitLab Registry
    ARG MELTANO_IMAGE=meltano/meltano:latest
    FROM $MELTANO_IMAGE
    
    WORKDIR /project
    
    # Install any additional requirements
    COPY ./requirements.txt .
    RUN pip install -r requirements.txt
    
    # Copy over Meltano project directory
    COPY . .
    # RUN meltano install
    
    # Don't allow changes to containerized project files
    ENV MELTANO_PROJECT_READONLY 1
    
    # Expose default port used by `meltano ui`
    EXPOSE 5000
    
    ENTRYPOINT ["meltano"]
  • e

    Emre Üstündağ

    01/21/2025, 1:10 PM
Hi. I initialized a Meltano project via the dockerized Meltano image and added extractors and targets. I also added Airflow as the orchestrator. Everything seems fine locally, but I wonder how to deploy it into a server environment so that I won't need to run it on my machine. I searched for some info; I may use AWS ECS or EC2 to run Meltano on a server. In my local environment there are two running containers now, the Airflow scheduler and the Airflow UI. But I did not understand some concepts. In https://docs.meltano.com/guide/containerization there is some info on running Meltano in containers. If I build a new Docker image, register it in AWS ECR, then create a task definition from my Meltano project's image and run two tasks (one for the Airflow UI and one for the Airflow scheduler) from this task definition, will everything be fine? You may say "just try it", but I am not sure this approach is best practice 🙂 So I need to know how to set up this infrastructure with a proper deployment process. I also don't know how to create a CI/CD workflow; I will be working on it to automate the dev-deploy workflow. Lastly, I am using target-clickhouse as a loader now. My ClickHouse DB is in an AWS Fargate service. Can I use dbt with ClickHouse? If so, can I install dbt to make transformations in the Meltano project, or should I try to use dbt Cloud to transform data outside the project?
    ✅ 1
  • a

    Andy Carter

    03/06/2025, 1:24 PM
@Matt Menzenski I saw your post over on Dagster re dagster-meltano and the deprecation of `dagster-shell`, and to use `PipesSubprocessClient` instead. https://dagster.slack.com/archives/C05RDKSEFFT/p1740419325050919 Do you think that would be a drop-in replacement for `execute_shell_command`? In-built graceful termination would be great 🙂 https://dagster.slack.com/archives/C066HKS7EG1/p1741268013311159 Worth a shot! Sorry, I don't think that link works, but I asked their AI bot and it came up with something passable; probably one for Jules to assist with if he can, as the ext owner.
    👀 1
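For context, a rough sketch of what the Pipes-based pattern looks like in plain Dagster (not dagster-meltano), assuming a recent Dagster version where `PipesSubprocessClient` is available. Asset and job names are hypothetical, and since Meltano itself doesn't emit Pipes messages, the result handling is worth verifying against your Dagster version:

```python
from dagster import AssetExecutionContext, Definitions, PipesSubprocessClient, asset


@asset
def tap_foo_to_target_bar(
    context: AssetExecutionContext,
    pipes_subprocess_client: PipesSubprocessClient,
):
    # Runs the command as a subprocess, streams its logs into Dagster,
    # and fails the asset if the command exits non-zero.
    return pipes_subprocess_client.run(
        command=["meltano", "run", "tap-foo", "target-bar"],  # hypothetical job
        context=context,
    ).get_materialize_result()


defs = Definitions(
    assets=[tap_foo_to_target_bar],
    resources={"pipes_subprocess_client": PipesSubprocessClient()},
)
```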
  • a

    Andy Carter

    04/08/2025, 7:43 AM
Hi @Matt Menzenski, just wondered if you had any luck looking at the `dagster-meltano` change? Ended up in a bit of a jam last week where our Dagster alerts stopped working, but I couldn't update to dg 1.10.x because of the Meltano issue :S