# ask-community-for-troubleshooting

    Kelvin Omereshone

    08/19/2021, 4:45 PM
Hey all, trying to run docker compose up on Windows and I keep getting "unauthorized: authentication required"

    Niclas Grahm

    08/20/2021, 7:21 AM
Hi everybody! Just getting started with Airbyte, and everything seems to be working fine so far! One question that comes to mind: when creating a new connection, why do I always have to create new sources and destinations (as opposed to choosing from already existing ones)?

    Phil Marius

    08/20/2021, 8:46 AM
Hey all, having a few problems with Airbyte on EC2 with a Redshift destination. I’m loading our MySQL database in, and it’s taking a very long time for ~20GB of data before failing. I started running it at 11:30 BST yesterday and it failed at 19:30. Looking at the logs I can’t tell what’s failing, because I’m getting "Completed successfully" from the dbt output. Since it failed, though, it’s now on its third attempt, close to 24 hours after I first tried running it. Has anyone come across this before? Airbyte is running on a t2.large instance and the Redshift cluster is a 2-node dc2.large. Screenshots attached of the “failing” logs and the long run times.

    Tom Hallett

    08/20/2021, 4:08 PM
Hi Airbyte community. My startup is trying to build its first data pipeline, and there are 2 use cases that have come up: 1. ELT: postgres/mixpanel/salesforce/etc => Snowflake, with dbt for analytics transforms. 2. Audit trails, so we can ask “what was the state of this object (and related objects) at a specific point in time?” (for use by operations, ML, etc.). The first one seems like a pretty clear implementation of Airbyte, where Snowflake has the latest snapshot of the Postgres data. 🙂 🎉 The second one sounds like a problem in the event sourcing space: an append-only log of object changes, so I can replay the changes at any time to get the state of an object at a specific time (not just right now). Is anyone using Airbyte to solve this problem (i.e. Postgres CDC => Kafka => Snowflake => query to calculate aggregates on the fly)? Or do I need to tackle this in the application tier (app => event store/Kafka => projection application)?

    Mindaugas Nižauskas

    08/21/2021, 12:16 PM
Hello, I set up Airbyte locally. My source is Redshift and my destination is Snowflake; for the destination I use AWS S3 staging. If I ran such an integration manually, I would use the UNLOAD command in Redshift to place data in S3 and then COPY it into Snowflake. But Airbyte just runs SELECT statements in Redshift (no UNLOAD). Can I set some parameter so UNLOAD is used? I can’t imagine how long it would take without the UNLOAD command for 1TB, 10TB or 100TB tables.
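For reference, the manual pattern described above boils down to two statements. A minimal sketch, assuming hypothetical cluster, bucket, IAM role, and stage names; the strings are only built and printed here — in practice you would run the first against Redshift (e.g. via psql) and the second against Snowflake (e.g. via snowsql):

```shell
# Hypothetical names throughout; this only assembles and prints the SQL.
unload_sql="UNLOAD ('select * from my_schema.my_table')
TO 's3://my-staging-bucket/my_table/'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload'
FORMAT AS PARQUET;"

copy_sql="COPY INTO my_db.my_schema.my_table
FROM @my_s3_stage/my_table/
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;"

echo "$unload_sql"   # run against Redshift
echo "$copy_sql"     # run against Snowflake
```

UNLOAD parallelizes the export across the Redshift slices, which is exactly the speedup per-row SELECTs miss at TB scale.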

    Corrensa

    08/23/2021, 11:47 AM
Sorry, I'm trying to set up a destination connector and I'm getting an error:
trap: ERR: bad trap
Removing previous generator if it exists

    gunu

    08/24/2021, 12:26 PM
Trying to understand CDC - I'm getting a final table row with mostly NULL values. Reviewing the _scd table I can see several rows with changes, but I'm not sure why they’re not persisting to the final deduped row: see the example in the photo, where the final row has OTHER = NULL.

    Noel Gomez

    08/24/2021, 7:57 PM
I have a JSON file on S3 that is gzipped. The files are in folders that follow a pattern: /search/year=<year>/month_<month>/day_<day>/file_<number>.gz. I don't think the File connector can handle compression or a pattern, the S3 connector can’t handle JSON, and I don't see an option to select the compression. Is this correct, or am I missing something?
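One stopgap while connector support is missing: pull the files down with the AWS CLI and decompress them yourself. A sketch under stated assumptions — the bucket name is hypothetical, the aws call is left commented so this runs offline, and the demo fabricates one file matching the poster's layout:

```shell
# Recreate one file matching the /search/year=.../month_.../day_... pattern.
mkdir -p search/year=2021/month_08/day_24
printf '{"q":"example"}\n' | gzip > search/year=2021/month_08/day_24/file_1.gz

# In practice, sync the real files down first (bucket name hypothetical):
# aws s3 sync s3://my-bucket/search/ ./search/ --exclude '*' --include '*.gz'

# Expand the whole pattern into a single JSON-lines file.
for f in search/year=*/month_*/day_*/file_*.gz; do
  gunzip -c "$f"
done > combined.jsonl
cat combined.jsonl
```

The resulting combined.jsonl is uncompressed and pattern-free, which the File connector can handle.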

    Ben Hadman

    08/26/2021, 1:35 PM
Hi Airbyte, I've been really impressed with how easy the product was to set up and use. I have the http-request source working great on my instance, but I would like to loop through many API requests on the same endpoint, each sending data to the same destination, e.g. https://endpoint.com/?id=1 , https://endpoint.com/?id=2 , https://endpoint.com/?id=3 . Is this possible with any current connectors?
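Absent a connector that iterates ids for you, a shell loop is the usual stopgap. A minimal sketch using the poster's placeholder URL; the curl call is commented out so the loop runs without network access:

```shell
# Build the per-id URLs and fetch each one to the same output file.
base="https://endpoint.com/?id="
for id in 1 2 3; do
  url="${base}${id}"
  echo "would fetch: $url"
  # curl -s "$url" >> responses.jsonl   # uncomment to actually pull data
done
```

The combined responses.jsonl could then be loaded with the File source, keeping everything in one destination table.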

    Charles

    08/26/2021, 4:13 PM
Hello everyone, I have a bug when using the File source connector. It doesn't detect the correct order of my columns... I supply a CSV with no headers, so I specify the column names in the reader options, but when it detects the schema the columns are not in the order I gave.

    Emily Cogsdill

    08/27/2021, 5:25 PM
Hi! I am trying to connect my Airbyte db to an external Postgres database, per these instructions. I understand how to change the environment variables to point to the new db - but how do I transfer the data that already exists in the Airbyte db over to the external Postgres db I want to use going forward? Is there something I need to do to get Airbyte to start sending data to the new db? Thanks in advance for your help 🙏

    Ashwin

    08/27/2021, 11:23 PM
    What’s the easiest way to connect to a rest API source and dump the outputs to a CSV flat file with little to no coding done whatsoever?

    Osinachi Chukwujama

    08/28/2021, 5:54 AM
Hello folks, I'm trying to set up Postgres as my source on Airbyte, but it keeps connecting forever. Attached are the logs from my Airbyte installation.
Attachments: airbyte-server.log, airbyte-scheduler.log

    Alex Hu

    08/30/2021, 1:09 AM
Hi guys, I have a question about submitting my changes. After I pushed the changes I made to my cloned repo, I found out that a few hundred files (.class, .json, .xml), all generated by running the gradlew build and acceptance test commands, got pushed alongside my changes. I'm wondering if there's a way to effectively avoid this?
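The standard git fix is to ignore the generated patterns and untrack the copies already committed. A sketch in a throwaway repo so nothing real is touched (assumes git is on PATH; the file names are illustrative):

```shell
# Demo repo with one generated file committed by accident.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email demo@example.com && git config user.name demo
echo compiled > Main.class && echo source > App.java
git add . && git commit -qm "generated file committed by accident"

echo '*.class' >> .gitignore     # ignore build output from now on
git rm --cached -q Main.class    # stop tracking it, but keep it on disk
git add .gitignore && git commit -qm "untrack generated files"

git ls-files                     # App.java and .gitignore remain tracked
```

The same pattern extends to build/ directories and generated .json/.xml; --cached is what keeps the working copies while removing them from the index.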

    Timothy Tian Yang

    08/30/2021, 7:35 AM
Has anyone tested a source for MinIO (object storage)?

    Affan Syed

    08/30/2021, 9:50 AM
@John (Airbyte) I have some architectural questions. 1. How does Airbyte handle the transfer of data between a source and a destination connector? 2. Do connectors support pulling data from a sharded source (Postgres shards or Kafka partitions)? If so, how does the data get moved to the same destination table?

    Timothy Tian Yang

    08/30/2021, 12:18 PM
Is ClickHouse on the roadmap as a destination?

    Ben la Grange

    08/31/2021, 10:24 AM
Hello - testing GCP Postgres -> BigQuery. How do I ensure my primary key is correct? At the source it’s “id”, an integer, but in BigQuery, id is a float.

    Ben la Grange

    08/31/2021, 1:02 PM
    Next hurdle - I’m testing with CDC of one table (Psql->BQ), initial sync worked, but incremental syncs are not picking up any changes.

    Phil Marius

    08/31/2021, 2:14 PM
    Hey folks, got two questions: 1. We currently use a MySQL DB as our warehouse with dbt transforming the data. I set up airbyte -> redshift last week and most tables are present, aside from the dbt transformed tables. Has anyone come across this before? 2. Airbyte is eating up a lot of resources on our Redshift instance, we’ve used ~$300 in a week with only the MySQL connection running. The same happened when we ran a trial with Snowflake, eating up ~50 credits of our 100 credit trial. Is there a way we can reduce the warehouse usage at all? Fivetran had a footprint MUCH smaller than Airbyte’s on our Snowflake trial

    Charles

    08/31/2021, 6:11 PM
Do you have any news regarding my bug? https://airbytehq.slack.com/archives/C021JANJ6TY/p1629994383018900

    Duncan

    08/31/2021, 6:32 PM
Hey, we’re looking to use Airbyte to ETL data from BigQuery to Postgres, but the tasks are taking a very long time. We currently have an in-house solution which extracts data to GCS and executes a COPY to load it into Postgres, which takes about 10 minutes. Unfortunately, the BigQuery source -> Postgres destination using Airbyte is taking over 3 hours. We’re not moving too much data: about 30M rows, about 3GB worth. Are there any ways of speeding this up using these connectors? Would S3 -> Postgres be faster?

    Jarrod Parkes

    08/31/2021, 6:49 PM
Can anyone point me to an Incremental Sync - Deduped History example?

    Eric Gagnon

    09/01/2021, 1:32 AM
I'm attempting to deploy locally using docker-compose. Due to filesystem restrictions I have to use host directories in a different location instead of letting Docker manage volumes; I intend to locate all volumes under /data/local-volumes/airbyte-docker. I thought I changed everything I needed to in .env, and docker-compose up -d works as expected, but when I try to set up a source I keep seeing an error: java.nio.file.NoSuchFileException: source_config.json. Does anyone have an example .env for using host mounts like this?
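A hedged sketch of the relevant .env entries, not a verified configuration: the variable names below are the ones in the .env Airbyte shipped around this era (verify against your copy), and the paths are the poster's layout. A mismatch between the workspace mount the scheduler writes into and the path the job containers see is a common cause of the source_config.json NoSuchFileException:

```shell
# Assumption: variable names per Airbyte's shipped .env of this era;
# paths follow the poster's /data/local-volumes/airbyte-docker layout.
WORKSPACE_ROOT=/tmp/workspace
WORKSPACE_DOCKER_MOUNT=/data/local-volumes/airbyte-docker/workspace
LOCAL_ROOT=/tmp/airbyte_local
LOCAL_DOCKER_MOUNT=/data/local-volumes/airbyte-docker/local
# Must point at the parent of the host directories so spawned job
# containers resolve the same paths as the scheduler.
HACK_LOCAL_ROOT_PARENT=/data/local-volumes/airbyte-docker
```

After editing, a full `docker-compose down && docker-compose up -d` is needed so the new mounts take effect.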

    Phil Marius

    09/02/2021, 4:03 PM
Getting this when trying to upgrade connectors:
The docker image cannot be found. Please ensure that the input tag points to a valid Airbyte docker image
Anyone come across this before? Did an upgrade earlier with the wget method.

    Jiyuan Zheng

    09/03/2021, 5:57 AM
Hi Airbyte team, which sync mode should I pick for CDC? I am a bit confused about what full refresh means in the context of CDC. I would like inserts, deletes, and updates to be reflected in the destination.

    Manel Rhaiem

    09/03/2021, 11:54 AM
Hello all, I didn't know where to ask. I am setting up Airbyte for a new company and most of us use MacBook M1s. When I run docker-compose up I am not able to connect to localhost, and I see a lot of log lines like this:
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.299Z","msg":"none","service":"history","shard-id":4,"address":"172.18.0.6:7234","shard-item":"0xc0019cbe80","component":"visibility-queue-processor","lifecycle":"Started","component":"transfer-queue-processor","logging-call-at":"queueProcessor.go:158"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.299Z","msg":"none","service":"history","shard-id":4,"address":"172.18.0.6:7234","shard-item":"0xc0019cbe80","component":"history-engine","lifecycle":"Started","logging-call-at":"historyEngine.go:344"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.299Z","msg":"none","service":"history","shard-id":4,"address":"172.18.0.6:7234","shard-item":"0xc0019cbe80","lifecycle":"Started","component":"shard-engine","logging-call-at":"controller_impl.go:439"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.389Z","msg":"temporal-sys-tq-scanner-workflow workflow successfully started","service":"worker","logging-call-at":"scanner.go:188"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.439Z","msg":"Get dynamic config","name":"system.advancedVisibilityWritingMode","value":"off","default-value":"off","logging-call-at":"config.go:79"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:04.439Z","msg":"Get dynamic config","name":"history.historyVisibilityOpenMaxQPS","value":"300","default-value":"300","logging-call-at":"config.go:79"}
airbyte-temporal   | {"level":"info","ts":"2021-09-03T11:52:51.560Z","msg":"none","service":"matching","component":"matching-engine","lifecycle":"Starting","wf-task-queue-name":"/_sys/temporal-sys-processor-parent-close-policy/2","wf-task-queue-type":"Workflow","logging-call-at":"matchingEngine.go:188"}
Also, all the services run with the poor performance of amd64 emulation. Anything you would recommend doing here? Thank you 🙏
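One commonly suggested mitigation on M1 Macs at this stage, assuming the images in question ship no arm64 builds, is to force amd64 emulation explicitly via Docker's standard environment variable rather than relying on per-image detection:

```shell
# Make docker / docker-compose pull and run amd64 images under emulation.
export DOCKER_DEFAULT_PLATFORM=linux/amd64
echo "platform forced to: $DOCKER_DEFAULT_PLATFORM"
# docker-compose up -d   # everything now starts as linux/amd64
```

Emulation stays slower than native, so this addresses the startup failures rather than the performance complaint; native arm64 images are the real fix once available.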

    Osinachi Chukwujama

    09/03/2021, 1:52 PM
Hi folks, I'm trying to sync data between PostgreSQL and Snowflake using Airbyte, but it results in some errors. Here's the log file from the sync.
Attachment: postgres-to-snowflake-sync-logs.txt

    Jiyuan Zheng

    09/03/2021, 5:36 PM
Hi Airbyte Team, I am using Airbyte to do CDC from Postgres to BigQuery. Do I need to create a new replication slot every time I add a new table to the existing connection? I have updated the settings to include more tables, and it seems like it’s only picking up new changes but doesn’t do an initial sync of the new tables.

    Jiyuan Zheng

    09/03/2021, 9:35 PM
Is there a way to limit the traffic during a sync so that Airbyte has minimal impact on the source DB?