# feedback-and-requests
  • Anatole Callies

    11/26/2021, 11:10 AM
    Hi, is there any guide on how to upgrade Airbyte? The doc seems empty: https://docs.airbyte.io/troubleshooting/on-upgrading I am interested in this fix: https://github.com/airbytehq/airbyte/pull/8065 and would like to know if it's included in the latest version and how to get it. Thanks.
  • Bruno Marra

    11/26/2021, 6:20 PM
    Hi guys, good afternoon. I have a task to extract some files every day from this public ANAC dataset: https://sistemas.anac.gov.br/dadosabertos/Voos%20e%20operações%20aéreas/Registro%20de%20serviços%20aéreos/2021/11%20-%20Novembro/ The new files follow a pattern based only on the date, but with the simple File connector I can't change the file URL. Is there a way I can define a pattern to extract, or do I have to create my own custom connector based on the File connector? Thanks.
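    One possible workaround, since the File source can't template its URL: compute the dated URL outside Airbyte and patch the source's config through the HTTP API before each run. A minimal sketch, assuming a local OSS deployment; the SOURCE_ID is a placeholder and the month names mirror the dataset's "11 - Novembro" folder style.

    ```python
    # Sketch: point a File source at today's ANAC folder via the Airbyte API,
    # then let the regularly scheduled sync pick the new URL up.
    from datetime import date
    import requests

    API = "http://localhost:8000/api/v1"
    SOURCE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

    MONTHS_PT = ["Janeiro", "Fevereiro", "Março", "Abril", "Maio", "Junho",
                 "Julho", "Agosto", "Setembro", "Outubro", "Novembro", "Dezembro"]

    def dated_url(d: date) -> str:
        base = ("https://sistemas.anac.gov.br/dadosabertos/"
                "Voos%20e%20operações%20aéreas/Registro%20de%20serviços%20aéreos/")
        return f"{base}{d.year}/{d.month:02d}%20-%20{MONTHS_PT[d.month - 1]}/"

    # Fetch the current source definition, swap in the dated URL, and update it.
    src = requests.post(f"{API}/sources/get", json={"sourceId": SOURCE_ID}).json()
    src["connectionConfiguration"]["url"] = dated_url(date.today())
    requests.post(f"{API}/sources/update", json={
        "sourceId": SOURCE_ID,
        "name": src["name"],
        "connectionConfiguration": src["connectionConfiguration"],
    })
    ```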
  • Andik Achmad

    11/28/2021, 4:30 AM
    Hi guys. I am currently using Airbyte for ingestion from Postgres to Postgres. I found that Airbyte converts timestamp columns to varchar in the destination. Is there anything we can do to keep them as timestamps? Thank you.
  • Terje Russka

    11/29/2021, 8:21 AM
    I have a question about connection schema changes. If I add a new table to be synced, will it reset the data in the other tables too (even if I haven't changed their configuration)? And if a new column is added to an existing configured table, will it be updated automatically?
  • Zach Brak

    11/30/2021, 9:16 PM
    Considering support for multi-table queries, has anyone considered allowing a table suffix to be defined in a synchronization rather than a prefix?
  • Remi Salmon

    12/01/2021, 4:41 PM
    Hi all, I am curious if the following trick is doable with Airbyte: we are currently syncing source A to destination B using <not Airbyte> and would like to switch to Airbyte to sync A to destination C. However, we are not able to complete a full sync of A using Airbyte when the connection is first run (A is a Heroku Postgres DB, and a full sync throws errors ending in Heroku killing the connection; B and C are Snowflake DBs). Is it possible to trick Airbyte by copying the existing data from B to C and having Airbyte start with incremental syncs from A to C using the last cursor available from the data in C, instead of starting with a full sync? Hope that all makes sense.
  • Alejo

    12/01/2021, 5:12 PM
    Hi all, we’re evaluating Airbyte and starting to use Databricks. I’ve seen you already support Databricks as a destination, but still only in append-only mode. We’d like to replicate relational databases (with CDC) into Databricks. Do you have this feature on your roadmap (to be able to do updates on Databricks)?
  • Christopher Wu

    12/01/2021, 7:17 PM
    Is it possible to run a manual sync of a connection passing something like a config-override JSON, so that we can run a one-off sync without having to explicitly reconfigure the actual source or connection? Example: we have a connection with a GitHub source, but we want to do a one-off sync on a different set of repositories while keeping the current set of repositories in the source configuration for the normal sync schedule.
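    There's no built-in config-override parameter on manual syncs that I know of, but a script can emulate the one-off: temporarily patch the source config via the API, trigger a sync, then restore the original. A rough sketch under those assumptions; the IDs are placeholders, and the "repository" key is assumed to match the GitHub source's config schema.

    ```python
    # Sketch: one-off sync against an overridden repository set, then restore
    # the normal configuration for the scheduled syncs.
    import requests

    API = "http://localhost:8000/api/v1"
    SOURCE_ID = "..."       # placeholder
    CONNECTION_ID = "..."   # placeholder

    src = requests.post(f"{API}/sources/get", json={"sourceId": SOURCE_ID}).json()
    original_cfg = src["connectionConfiguration"]

    try:
        override = {**original_cfg, "repository": "org/one-off-repo"}  # assumed key
        requests.post(f"{API}/sources/update", json={
            "sourceId": SOURCE_ID, "name": src["name"],
            "connectionConfiguration": override,
        })
        requests.post(f"{API}/connections/sync", json={"connectionId": CONNECTION_ID})
        # In practice, wait for the job to start (or finish) before restoring,
        # to avoid racing the override against the running sync.
    finally:
        requests.post(f"{API}/sources/update", json={
            "sourceId": SOURCE_ID, "name": src["name"],
            "connectionConfiguration": original_cfg,
        })
    ```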
  • han so

    12/02/2021, 10:40 PM
    Any plans to integrate with DataHub so that data lineage information can be centralized?
  • Anatole Callies

    12/03/2021, 4:17 PM
    This is not a big deal, but I noticed that hourly syncs shift by a couple of minutes every day. My syncs take more than 10 minutes each, so it's not due to the sync duration (a priori).
  • Jeff Crooks

    12/03/2021, 6:24 PM
    Does Airbyte support views in MongoDB?
  • Vikram Bhamidipati

    12/03/2021, 6:40 PM
    Hello, we are evaluating Airbyte and I have a few questions before I start running a full-fledged POC. Thank you in advance for taking the time to share your thoughts and knowledge:
    • The big use case for us is CDC from MySQL and Postgres. I see that Airbyte is using Debezium 1.4.2, and it seems like the upgrade to the latest version is not on the radar and potentially a big lift. For context, we have multiple databases with more on the way, processing data in the range of 500 million rows every day.
      ◦ I would love some feedback on what the community's experience has been with the CDC sources in general.
      ◦ The majority of issues I have seen reported with Debezium affecting versions >= 1.4.2, and likely to impact us, are about not being able to parse the logs when there are DDL statements in them. Have people run into similar issues?
    • I see that the PR to use a configurable backend for secrets has been merged. Is the functionality available in the open-source version?
    • I have not been able to confirm, but it seems like the way CDC works is that the first connection to the source will want to do a full snapshot of the data. This is a big no-no for us on the production DB, which is huge. Can the source connector be configured to only read from a given position in the transaction log (binlog/LSN)?
  • Pranav Hegde

    12/04/2021, 7:00 AM
    Hi, is it possible to schedule syncs based on cron expressions? If not, is there any plan to incorporate it?
  • Nemish Kanwar

    12/06/2021, 9:18 AM
    Can we name our connections as well? Right now we are syncing from Mongo to BigQuery, with each table being synced as a separate connection... Two questions: • Can we fine-tune all the tables within a connection based on some cron syntax? • Can we name our connection, since the current name doesn't say what is getting synced? @Devanshu Mishra @Tharun Singh
  • Charbel Seif

    12/07/2021, 10:41 AM
    Hey guys, are you planning to integrate Shopify visits data by any chance?
  • Matt Wright

    12/08/2021, 8:31 PM
    Is there any way to adjust the sync frequency to <5m? Or is there a way to manually trigger a refresh from an API?
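    Manual triggering is available through the OSS HTTP API. A minimal sketch, assuming a local deployment and a placeholder connection ID; pairing this with an external scheduler is one way around the UI's frequency floor.

    ```python
    # Sketch: trigger a manual sync for one connection via the Airbyte API.
    import requests

    API = "http://localhost:8000/api/v1"
    CONNECTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

    resp = requests.post(f"{API}/connections/sync", json={"connectionId": CONNECTION_ID})
    resp.raise_for_status()
    print(resp.json()["job"]["status"])  # e.g. "running"
    ```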
  • Olivier Girardot

    12/08/2021, 9:10 PM
    Hello, in the process of deploying on GKE it seems there's not really a way to deploy using workload identity; you have to specify a service account JSON file to load as a secret. Is there a limitation that would prevent us from just using the loaded service account?
  • Clovis Masson

    12/09/2021, 8:39 AM
    Hi everyone! Deployment: Kubernetes. Airbyte Version: 0.32.8-alpha. I can't find any trace of the AppsFlyer application in the sources (doc or in-app, cf. linked screenshot), despite the announcement made in this post. I use this 5-month-old Docker image in a custom way, but still I can't track the image's evolutions/updates (version, changelog, etc.) via Airbyte settings like the other sources. I'm not sure about the procedure and how I can help to do so, but would it be possible in a future release to add the AppsFlyer source connector to the "native" source list?
  • Vikram Bhamidipati

    12/09/2021, 11:10 PM
    Hi all, I am looking for info on how to instrument and add monitoring and alerting capabilities to Airbyte open source. I can see that there are two issues:
    • Instrument the scheduler · Issue #7154 · airbytehq/airbyte (github.com)
    • Instrument the worker · Issue #7155 · airbytehq/airbyte (github.com)
    Is this possible at all with the current open-source version? What is on the roadmap of the open-source version to add these capabilities?
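    Until those issues land, one stopgap is an external watchdog that polls the API. A sketch under that assumption; the endpoints are from the OSS API (verify against your version), and notify() is a placeholder for whatever alerting you use.

    ```python
    # Sketch: poll Airbyte's health endpoint and the latest sync job for a
    # connection, alerting on failure. notify() is a placeholder hook.
    import requests

    API = "http://localhost:8000/api/v1"
    CONNECTION_ID = "..."  # placeholder

    def notify(msg: str) -> None:
        print(f"ALERT: {msg}")  # wire up Slack/PagerDuty/etc. here

    health = requests.get(f"{API}/health")
    if not health.ok or not health.json().get("available", False):
        notify("Airbyte API is unhealthy")

    jobs = requests.post(f"{API}/jobs/list", json={
        "configTypes": ["sync"], "configId": CONNECTION_ID,
    }).json().get("jobs", [])
    if jobs and jobs[0]["job"]["status"] == "failed":
        notify(f"Latest sync for connection {CONNECTION_ID} failed")
    ```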
  • yu

    12/10/2021, 1:54 AM
    Hi team, according to some news, an Apache Log4j vulnerability was found. The affected versions are 2.0 <= Apache Log4j2 <= 2.14.1, and Airbyte uses one of them, so it would be great to upgrade Log4j immediately. • https://www.cyberkendra.com/2021/12/worst-log4j-rce-zeroday-dropped-on.html • https://www.spigotmc.org/threads/security-releases-%E2%80%94-1-8-8%E2%80%931-18.537204/
  • Eugene Krall

    12/13/2021, 11:11 AM
    Is there a chance this can be omitted without any performance and data integrity issues? "*WARNING! Updating the schema will delete all the data for this connection in your destination and start syncing from scratch.*" We are a growing startup, so sometimes our schema changes and old records get removed; when doing the data reset we can lose records that are present in BQ but no longer present in our database.
  • Ameya Bapat

    12/14/2021, 12:21 PM
    I am on 0.33.11-alpha but can't see Amazon SQS in the sources UI list. This was released as part of 0.30.24: https://airbytehq.slack.com/archives/C01A4CAP81L/p1635985029034300
  • mog

    12/14/2021, 7:54 PM
    Does anyone know if you can integrate Collibra with Airbyte? Or has anyone tried this?
  • Lenny Chase

    12/15/2021, 5:08 PM
    Hi there, cool work you guys are doing! Quick question: is the Shopify connector good for production on Airbyte Cloud? Saw a task on GitHub around that, so wanted to get some more details.
  • Ravi Kottu

    12/16/2021, 3:02 PM
    Hello all! We have a requirement like the one below; could you please help us understand if Airbyte is a good fit for our use case for the EL of ELT?
    1. We have 10,000+ customers, and they send data in CSV, XLSX, CSV/XLSX as email attachments, and EDI file formats.
    2. Each customer sends hundreds of files daily in one of the above file formats, and we receive around 100,000 files each day.
    3. We are evaluating Airbyte with File as the source and Postgres as the destination, with basic normalization enabled to write to Postgres.
    4. We are also evaluating dbt to run on top of the above basic normalized data to have a common data model as much as possible.
    We ran a basic validation/POC with Airbyte to evaluate the above requirements, but we could not get close to a solution so far due to the issues below. Could you please help us by addressing these current challenges:
    1. How do we process 100,000 unique files using a File (source) to Postgres (destination) connection?
    2. Do we need to create 10,000+ sources/connections, one per customer? (See the API sketch below for one way to script this.)
    3. Do we need to create 10,000+ destinations for Postgres schemas, one per customer?
    4. How do we handle receiving 100+ files from the same customer on the same day (currently only one file name can be configured per source/destination and connection, i.e. 1:1)?
    5. How can we handle the frequent data type and column (+/-) changes at the source with Airbyte?
    6. What is the connection limit in Airbyte?
    7. Can we manage this many sources/destinations and connections?
    8. Is there a better way to achieve the above requirement using Airbyte and dbt?
    9. What is the plan/roadmap for accepting EDI files as a File source?
    10. What is the plan/roadmap for accepting XML files as a File source?
    11. Is there any source connector to retrieve the mail attachments for the above files?
    12. We have converted an EDI file to JSON and sourced it into Airbyte, but we could not see the data in the tables created in Postgres with basic normalization; any help/video to understand the E2E flow?
    Could you please help us with a demo and/or a workshop on how Airbyte can be used to achieve our requirement above? Let us know if we can meet any of the Airbyte engineers/SMEs to learn more about this feature-rich, trending EL(T) tool. Thanks in anticipation! Thanks, Ravi Kottu
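    For challenges 1–3 and 7, the usual approach is to drive the API programmatically rather than click through the UI. A rough sketch of bulk-creating File sources and per-customer connections; every ID and config value here is a placeholder to adapt, and the syncCatalog would come from a discover-schema call.

    ```python
    # Sketch: script source/connection creation instead of managing 10,000+
    # of them by hand. All IDs are placeholders.
    import requests

    API = "http://localhost:8000/api/v1"
    WORKSPACE_ID = "..."        # placeholder
    FILE_SOURCE_DEF_ID = "..."  # definition id of the File source
    DESTINATION_ID = "..."      # shared Postgres destination

    def create_customer_source(customer: str, file_url: str) -> str:
        resp = requests.post(f"{API}/sources/create", json={
            "workspaceId": WORKSPACE_ID,
            "sourceDefinitionId": FILE_SOURCE_DEF_ID,
            "name": f"file-{customer}",
            "connectionConfiguration": {
                "dataset_name": customer,
                "format": "csv",
                "url": file_url,
                "provider": {"storage": "HTTPS"},
            },
        })
        return resp.json()["sourceId"]

    for customer, url in [("acme", "https://example.com/acme/today.csv")]:
        source_id = create_customer_source(customer, url)
        requests.post(f"{API}/connections/create", json={
            "sourceId": source_id,
            "destinationId": DESTINATION_ID,
            "status": "active",
            "prefix": f"{customer}_",  # keeps each customer's tables apart
            "syncCatalog": {"streams": []},  # fill from /sources/discover_schema
        })
    ```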
  • gunu

    12/17/2021, 4:38 AM
    Hey team, I’ve seen many users here request some variation of wanting to add another table to an existing connector that uses CDC. I’m aware of the strict configuration due to the pointer on the binlogs being applied to the entire connector. I imagine there’s a manual process: add the new table in a separate connector (full refresh), then update the existing connector via the API to add this new table without resetting the connector (and probably update the database), then continue syncing with the existing connector. I’d love it if there were a foolproof step-by-step for this detailed somewhere. Separately, has it been considered to create pointers for each table in a connector, such that none of them are impacted when others are added/removed?
  • Ravi Kottu

    12/17/2021, 10:55 AM
    Hello Airbyte team, could you please help us confirm whether we can continue to use the Airbyte local/Docker installation in compliance with our company security guidelines? Just checking if we are in any way affected by the recent Apache Log4j2 issue (CVE-2021-44228). Ref: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45046 https://www.docker.com/blog/apache-log4j-2-cve-2021-44228/
  • Brian Olsen

    12/17/2021, 10:31 PM
    I signed up for the Airbyte Cloud waitlist. Is it available in beta yet?
  • Titas Skrebė

    12/20/2021, 6:03 AM
    Hello, when upgrading from 0.34.1-alpha to 0.34.2-alpha, `kubectl apply -f` failed with:
    ```
    The Pod "airbyte-bootloader" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
    ```
    It appears the bootloader pod needs to be deleted manually (e.g. `kubectl delete pod airbyte-bootloader`) before re-running the command.
  • Boopathy Raja

    12/20/2021, 6:31 AM
    Got the error:
    ```
    Caused by: org.apache.kafka.connect.errors.DataException: Invalid value: null used for required field: "created_at", schema type: STRING
    ```
    Is there a way I can switch off the validation for null values, or ignore just the failing records?