# advice-data-orchestration
  • a

    Amit Gupta

    07/07/2022, 1:28 PM
    Can someone help me with this, have been stuck since last night
    a
    • 2
    • 1
  • s

    Samuel Garvis

    07/07/2022, 3:14 PM
    It is very difficult to tell where my tables are in Airbyte since you can't name connections. Say there are many tables going from Postgres to one BigQuery destination. Is it bad practice to create multiple destinations for different tables, even though they all point to the same place, just so I can name them differently? I realize this could mean more connections open at a time, but it would be really nice to name them differently.
    a
    • 2
    • 2
  • k

    konrad schlatte

    07/08/2022, 3:37 PM
    Hey, I am trying to use Dagster with Airbyte. It works fine when Dagster and Airbyte run on the same server, but I can't make it work across two separate EC2 instances. What I'm trying to do is run Dagster on one instance, access Airbyte on another EC2 instance via SSH, and run the Airbyte jobs. Has anyone implemented something similar, or can anyone point me in the right direction?
    a
    m
    • 3
    • 14
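A note on the Dagster-on-one-instance, Airbyte-on-another setup above: the usual pattern is not to SSH between the boxes but to expose Airbyte's HTTP API to the Dagster host (security groups permitting) and point the `dagster-airbyte` integration's `airbyte_resource` at that host/port instead of localhost. Below is a minimal stdlib sketch of the underlying call the integration makes; the host, port, and connection id are placeholder assumptions:

```python
# Sketch: triggering a sync on a remote Airbyte instance over its HTTP API.
# This is essentially what the dagster-airbyte resource does once configured
# with a remote host/port. All identifiers below are placeholders.
import json
import urllib.request

def sync_request(host: str, port: int, connection_id: str):
    """Build the POST request that triggers a manual sync for a connection."""
    url = f"http://{host}:{port}/api/v1/connections/sync"
    body = json.dumps({"connectionId": connection_id}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )

if __name__ == "__main__":
    req = sync_request("10.0.1.23", 8000, "your-connection-id")
    with urllib.request.urlopen(req) as resp:  # requires network access to the Airbyte host
        print(json.load(resp))
```

If this request works from the Dagster instance, the `dagster-airbyte` resource configured with the same host/port should work too.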
  • m

    Marcos Marx (Airbyte)

    07/28/2022, 12:33 AM
    Hello 👋 I’m sending this message to help you identify whether this channel is the best place to post your question. Airbyte has a few channels for open discussion about data topics (architecture, ingestion, quality, etc.). In these channels you may ask general questions related to the particular topic. If you’re having problems deploying or running a connection in Airbyte, this is not the place. We recommend you open a Discourse topic, where our support team will help you troubleshoot your issue.
  • h

    Hans Lellelid

    08/01/2022, 9:23 PM
    Question about viewing logs in the webapp UI. I'm trying to map docker-compose.yml to a Nomad deployment. The path the UI shows is correct inside the workers. These are Docker volumes in the workers; however, the logs appear empty in the UI. (They are not empty on the workers; I can exec into the container and tail them.) Is there a volume that needs to be shared with the webapp (or with the server?) for these logs? Or how exactly do they get from the worker containers to the web UI? Thanks in advance!
    • 1
    • 1
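On the question above about how logs reach the web UI: in Airbyte's stock docker-compose.yml the webapp does not read log files itself; it proxies requests to the server, and the server and workers mount the same named `workspace` volume. A fragment showing the shape (mount paths come from the default `.env`; verify against your Airbyte version before copying into a Nomad job):

```yaml
# Both server and worker mount the shared "workspace" volume; the webapp
# only talks to the server, so it needs no log volume of its own.
services:
  worker:
    volumes:
      - workspace:${WORKSPACE_ROOT}
  server:
    volumes:
      - workspace:${WORKSPACE_ROOT}
volumes:
  workspace:
```

In a Nomad port, the equivalent is giving the server task the same log storage the workers write to, or switching log storage to S3/MinIO so nothing needs to be shared at all.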
  • a

    Arun

    08/02/2022, 2:40 AM
    Hi, please advise on this issue. I am new to Airbyte. I am trying to pull data from PostgreSQL and load it into Snowflake. It's a 10M-row table, but after 600k rows the sync got stuck and nothing is processing.
  • a

    Arun

    08/02/2022, 2:40 AM
    Are there any settings we need to take a look at?
  • l

    Lior Chen

    08/03/2022, 11:01 PM
    Hi, my k8s cluster stops working after a few days: it stops scheduling any new worker pods, so syncs don’t run and new connectors can’t be created. It’s only fixed if I delete all the AB deployments and re-create them. I’ve noticed this error in the log (truncated):
    {"level":"warn","ts":"2022-07-28T17:28:31.501Z","msg":"Processor unable to retrieve tasks","service":"history","shard-id":4,"address":"192.168.53.224:7234","shard-item":"0xc000a68000","component":"visibility-queue-processor","error":"GetVisibilityTasks operation failed. Select failed. Error: dial tcp 10.100.195.170:5432: connect: connection timed out","logging-call-at":"queueProcessor.go:265"}
    {"level":"error","ts":"2022-07-28T17:29:02.221Z","msg":"Operation failed with internal error.","service":"history","error":"GetTimerTasks operation failed. Select failed. Error: dial tcp 10.100.195.170:5432: connect: connection timed out","metric-scope":30,"shard-id":2,"logging-call-at":"persistenceMetricClients.go:676","stacktrace":"go.temporal.io/server/common/log/loggerimpl.(*loggerImpl).Error\n\t/temporal/common/log/loggerimpl/logger.go:138\ngo.temporal.io/server/common/persistence.(*workflowExecutionPersistenceClient).updateErrorMetric\n\t/temporal/common/persistence/persistenceMetricClients.go:676\ngo.temporal.io/server/common/persistence..."}
  • r

    Roberto Malcotti

    08/09/2022, 3:15 PM
    it will affect all the tables
    m
    • 2
    • 2
  • s

    Simon Späti

    08/11/2022, 3:41 PM
    Tuesday was a big day for Dagster at Dagster Day. Huge congrats on the v1.0 release and the general availability of Dagster Cloud! We wrote a short recap of the announcements and newest features. It was not a typical launch with many features presented; instead, they showed off their rock-solid product, four years in the making with lots of user feedback, now production-ready for the public. Nevertheless, there was still one big feature announcement: the brand-new Branch Deployments. https://airbyte.com/blog/dagster-1-0-launch
  • f

    Federico Cipriani Corvalan

    08/16/2022, 3:24 PM
    o7
  • f

    Federico Cipriani Corvalan

    08/16/2022, 3:24 PM
    elbow salute
  • f

    Federico Cipriani Corvalan

    08/16/2022, 3:35 PM
    Hey, how’s it going? I hope you’re doing well…
  • f

    Federico Cipriani Corvalan

    08/16/2022, 3:37 PM
    Question: How should one configure the pod-sweeper service? I’m getting a lot of pods left in Completed and Error states…
    • 1
    • 1
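For the pod-sweeper question above: in the Kubernetes deployment, the pod sweeper is a small container that periodically deletes job pods that have reached a terminal phase and exceeded a TTL. A pure-Python sketch of that selection policy; the pod dicts and helper are hypothetical, shaped after the Kubernetes API:

```python
# Sketch of the policy a pod sweeper applies: delete job pods that are in a
# terminal phase and older than a TTL. Field names mirror the Kubernetes API,
# but this helper itself is illustrative, not Airbyte's actual code.
from datetime import datetime, timedelta

TERMINAL_PHASES = {"Succeeded", "Failed"}  # shown as Completed / Error by kubectl

def pods_to_sweep(pods, now, ttl=timedelta(hours=2)):
    """Return names of pods that a sweeper pass should delete."""
    return [
        p["name"]
        for p in pods
        if p["phase"] in TERMINAL_PHASES and now - p["finished_at"] > ttl
    ]
```

A one-off equivalent from the CLI is `kubectl -n <namespace> delete pod --field-selector=status.phase==Succeeded` (and likewise for `Failed`).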
  • f

    Federico Cipriani Corvalan

    08/16/2022, 4:01 PM
    Thanks in advance!
  • b

    Brian Nelson

    08/16/2022, 7:14 PM
    Anyone know if there's a way to detect sync failures via the API for jobs that weren't triggered from the API? I'd like to work in some notification of failures so we don't have to babysit the UI, but I'm not seeing it immediately when looking through the API docs.
    • 1
    • 1
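On detecting sync failures regardless of how the job was triggered: the config API's `POST /api/v1/jobs/list` returns jobs per connection whether they were started from the UI, the scheduler, or the API, so a small poller can filter for failures and push a notification. A hedged sketch; the base URL and connection id are placeholders:

```python
# Sketch: poll Airbyte's jobs API for a connection and pick out failures.
# The /api/v1/jobs/list endpoint is part of the config API; base URL and
# connection id below are placeholders.
import json
import urllib.request

def failed_jobs(jobs_payload):
    """Given a /api/v1/jobs/list response body, return the ids of failed jobs."""
    return [
        j["job"]["id"]
        for j in jobs_payload.get("jobs", [])
        if j["job"].get("status") == "failed"
    ]

def list_jobs(base_url, connection_id):
    body = json.dumps(
        {"configTypes": ["sync"], "configId": connection_id}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/v1/jobs/list",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires a reachable instance
        return json.load(resp)
```

Run `failed_jobs(list_jobs(...))` on a schedule and forward any non-empty result to your alerting channel.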
  • a

    Alex Banks

    08/18/2022, 8:02 PM
    It seems like Airbyte isn't releasing postgres connections when sync jobs finish. Has anyone encountered this before?
    • 1
    • 2
  • a

    Amadeus Nicoya

    08/21/2022, 8:27 PM
    Hello everyone! I need your guidance on my project. I already have a Docker project with Dagster, Dagit and dbt. Now I’m trying to integrate Airbyte to run all the services at the same time. I basically copy-pasted the docker-compose file from Airbyte into mine and I’m able to run all the services, but localhost:8000 returns an error. Digging into it, I saw that the bootloader service never completes the database migration; it just gets stuck here:
    2022-08-21 20:14:36 INFO i.a.c.EnvConfigs(getEnvOrDefault):977 - Using default value for environment variable CONFIG_DATABASE_USER: 'docker'
    2022-08-21 20:14:36 INFO i.a.c.EnvConfigs(getEnvOrDefault):977 - Using default value for environment variable CONFIG_DATABASE_PASSWORD: '*****'
    2022-08-21 20:14:36 INFO i.a.c.EnvConfigs(getEnvOrDefault):977 - Using default value for environment variable CONFIG_DATABASE_URL: 'jdbc:postgresql://db:5432/airbyte'
    2022-08-21 20:14:36 INFO c.z.h.HikariDataSource(<init>):80 - HikariPool-1 - Starting...
    2022-08-21 20:14:36 INFO c.z.h.HikariDataSource(<init>):82 - HikariPool-1 - Start completed.
    2022-08-21 20:14:36 INFO c.z.h.HikariDataSource(<init>):80 - HikariPool-2 - Starting...
    2022-08-21 20:14:36 INFO c.z.h.HikariDataSource(<init>):82 - HikariPool-2 - Start completed.
    2022-08-21 20:14:37 INFO i.a.c.EnvConfigs(getEnvOrDefault):977 - Using default value for environment variable SECRET_PERSISTENCE: 'TESTING_CONFIG_DB_TABLE'
    Any idea why this is happening? Or do you have a better suggestion for getting all these services into one docker-compose file? Thanks in advance!
    • 1
    • 2
  • t

    Tanmay Baxi

    08/22/2022, 5:54 PM
    Hello everyone! We have started using Airbyte recently and want to deploy a few connectors to AWS Fargate. Any help or guidance would be appreciated.
  • a

    Abba

    08/25/2022, 7:59 AM
    Hi everyone, how can I periodically prune the data in the mounted directories on my host (especially log files)? They tend to grow very large, and my disk is almost full.
  • p

    Pranit

    08/26/2022, 6:15 AM
    Hello, I am trying to integrate Prefect Cloud with Airbyte, but I am facing the error below:
    ModuleNotFoundError: No module named 'prefect.tasks.airbyte'; 'prefect.tasks' is not a package
    Have I missed installing a dependency? I didn’t see anything on the Airbyte/Prefect pages.
    s
    • 2
    • 4
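On the `ModuleNotFoundError` above: `prefect.tasks.airbyte` only exists in Prefect 1.x. Prefect 2.x moved integrations out of the core package into separate collections, so the fix is either pinning `prefect<2` or installing `prefect-airbyte` and importing from `prefect_airbyte` instead. A tiny helper capturing the rule; the 2.x module path reflects the collection layout as I recall it and is worth verifying against the `prefect-airbyte` docs:

```python
# Sketch: where the Airbyte task lives depends on the Prefect major version.
# Prefect 1.x ships it in the core task library; Prefect 2.x moved it to the
# separately installed prefect-airbyte collection.
def airbyte_task_module(prefect_version: str) -> str:
    major = int(prefect_version.split(".", 1)[0])
    if major < 2:
        return "prefect.tasks.airbyte"      # Prefect 1.x task library
    return "prefect_airbyte.connections"    # Prefect 2.x collection (pip install prefect-airbyte)
```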
  • d

    Daniel Meyer

    09/05/2022, 8:33 AM
    Is it possible for a source to change with each run, perhaps by supplying arguments to the connector?
  • a

    Abba

    09/09/2022, 2:42 PM
    Hi everyone, can anyone share a script for deleting old logs from a workspace? The server's storage disk is full.
    m
    • 2
    • 1
  • l

    Linh T T Phan

    09/14/2022, 11:06 AM
    Hello everyone, I am using Airbyte to synchronize data from Airtable to Azure Blob Storage. I tried all of these formats: “CSV root level flattening”, “CSV no flattening”, and "JSON Lines". My Airtable table has 19 columns, but “CSV root level flattening” only transfers 16 of the 19 columns. “CSV no flattening” and "JSON Lines" are worse; only 5 of the 19 columns are transferred. Can you please advise what I should do? Many thanks! Best, Linh
  • s

    Simon Thelin

    09/15/2022, 9:40 AM
    Hello. I am currently trying to upgrade to the latest Airbyte release. I have been using Airbyte to sync from Postgres to S3. The Postgres source still works, but when I try to set up my bucket destination, it gives me a 403 error, even though I know I am providing the correct creds. The Airbyte logs say the bucket does not exist, but it does. However, the bucket name starts with a digit, in this case 7b-lakehouse-dev. Is there a chance that the latest S3 destination does not like a digit as the first character of the bucket name?
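On the digit-leading bucket name above: AWS's published S3 naming rules do allow names to start with a digit (names must begin and end with a lowercase letter or number), so `7b-lakehouse-dev` is legal, and the 403 more likely points at credentials, region, or bucket policy. A hypothetical helper encoding those documented rules, useful for ruling the name out quickly:

```python
# Sketch: check a name against AWS's published S3 bucket-naming rules.
# Leading digits are allowed, so "7b-lakehouse-dev" is a legal name.
# This helper is illustrative; the regex encodes the documented rules
# (3-63 chars, lowercase letters/digits/dots/hyphens, alphanumeric at
# both ends, not formatted like an IP address, no "..").
import re

_BUCKET_RE = re.compile(r"^(?!\d+\.\d+\.\d+\.\d+$)[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    return bool(_BUCKET_RE.match(name)) and ".." not in name
```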
  • s

    Sushim Mukul Dutta

    09/19/2022, 7:11 AM
    Hello everyone! I was seeking some guidance regarding airbyte-workers task registration. Based on this, I can see that there are provisions for multiple task queues for SYNC jobs, whereas the other job types have only one (static) task queue. • I wanted to understand the reason for having multiple task queues for data sync in particular. • Also, why can only the queue name for data-sync tasks be overridden via environment variables, and not the other queue names?
  • m

    Marielby Soares

    09/19/2022, 5:24 PM
    Hello everyone! Is it possible yet to run a script after every sync?
    e
    • 2
    • 3
  • p

    Philippe Boyd

    09/23/2022, 3:06 PM
    Hello Airbyters, what kind of software/solution are you using to monitor your ELT pipelines on the ops side? I’m looking for tools such as Datadog or New Relic, but built for ELT pipelines. • What are the best practices to check whether a pipeline ran successfully? • Whether the data is corrupted? • Whether any error occurred? I know Airbyte has Slack notifications for syncs, but they’re too rudimentary, especially when you have dozens of syncs.
    a
    k
    • 3
    • 2
  • c

    Christopher Wu

    09/27/2022, 7:14 PM
    Hello Airbyte Team! We are experimenting with transitioning our Airbyte instances from single-node EC2 deployments to EKS with Fargate. We are seeing extremely long startup times when executing connection syncs: 8-9 minutes from a sync being initiated to the source beginning to pull data. One minute of that is for the Fargate node to launch, but then roughly 8 more minutes for what appears to be orchestrator-related processes. Is this the expected performance for Airbyte EKS deployments?
    m
    d
    • 3
    • 10
  • s

    Sujith Kumar.S

    09/29/2022, 9:55 AM
    Hi team, has anyone tried a workaround for using Hadoop/Hive as an Airbyte target?
    m
    • 2
    • 1