# feedback-and-requests
  • a

    Andrew Groh

    01/07/2022, 4:39 PM
We are a SaaS company. Our customers need to integrate data from a variety of ad platforms with our system (generally ones that Airbyte currently supports, so that is good). From our perspective, essentially we want the customer to create the source, and we then automatically create the connection and the destination so the data can flow into our system. Our customers have no idea about our internal data structure (nor should they), so they cannot create the destination (we use S3 for storing customer data). We want to own this process as much as possible so that we can monitor this data ingestion and fix issues when they arise. Our solution was to write a wrapper on top of Airbyte: the customer enters their connection information (possibly via an OAuth flow), and then the wrapper sets up the destination and connection in Airbyte. Our customers never realize that they are interacting with Airbyte. Just thought folks might be interested in our use case.
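The wrapper step Andrew describes can be sketched as payload builders around Airbyte's Config API (POST /api/v1/destinations/create and /api/v1/connections/create). This is a minimal sketch, not Andrew's actual code: the helper names, the definition ID placeholder, and the S3 configuration fields shown are illustrative assumptions.

```python
# Sketch of a wrapper that provisions the Airbyte destination and connection
# after a customer registers a source. Endpoint paths follow Airbyte's Config
# API; the definition ID and S3 settings below are placeholders.

def build_destination_payload(workspace_id, customer_id, s3_bucket):
    """Request body for POST /api/v1/destinations/create (S3 destination)."""
    return {
        "workspaceId": workspace_id,
        "destinationDefinitionId": "S3-DESTINATION-DEFINITION-ID",  # placeholder
        "name": f"s3-{customer_id}",
        "connectionConfiguration": {
            "s3_bucket_name": s3_bucket,
            "s3_bucket_path": f"customers/{customer_id}",
        },
    }

def build_connection_payload(source_id, destination_id):
    """Request body for POST /api/v1/connections/create, linking the
    customer-created source to the wrapper-created destination."""
    return {
        "sourceId": source_id,
        "destinationId": destination_id,
        "status": "active",
        "schedule": {"units": 24, "timeUnit": "hours"},
    }
```

Because the customer only ever supplies source credentials, the wrapper keeps full control of where the data lands and can attach its own monitoring to the connection it created.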
  • p

    Pablo Tovar

    01/07/2022, 10:07 PM
Hi everyone, I’m curious whether Airbyte could be the right fit for what we want to do at our startup. We are building a platform that analyzes customer data and gives customers guidance on how they are performing. In our current flow we only work with Google Sheets, so we ask customers for the credentials needed to connect to their sheets. After they provide them, we run some scripts to extract the data and store it in our database, but after that it gets tricky because we have to run crons or other tasks to keep the data updated regularly. My question is: can I make Airbyte check whether there are new credentials in our database, then automatically pull from the Google Sheet and store the data in our database for us, and do so on a daily basis?
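One common shape for this is a thin scheduler around Airbyte's API: Airbyte handles the recurring daily sync once a connection exists, so the glue code only needs to onboard new credential rows. A minimal sketch, assuming a credentials table and Airbyte's POST /api/v1/sources/create endpoint; the table columns, definition ID, and helper names are hypothetical:

```python
# Sketch of the glue described above: find credential rows that have no
# Airbyte source yet, and build the API payloads needed to onboard them.
# Column names and the definition ID are illustrative assumptions; once the
# source and connection exist, Airbyte's own scheduler runs the daily sync.

def pending_credentials(rows):
    """Filter credential records that haven't been onboarded into Airbyte."""
    return [r for r in rows if r.get("airbyte_source_id") is None]

def build_sheets_source_payload(workspace_id, row):
    """Request body for POST /api/v1/sources/create (Google Sheets source)."""
    return {
        "workspaceId": workspace_id,
        "sourceDefinitionId": "GOOGLE-SHEETS-DEFINITION-ID",  # placeholder
        "name": f"sheets-{row['customer_id']}",
        "connectionConfiguration": {
            "spreadsheet_id": row["spreadsheet_id"],
            "credentials": row["credentials"],
        },
    }
```

A cron or small worker that runs this check every few minutes replaces the per-customer scripts; the daily refresh itself is then configured on the Airbyte connection rather than hand-rolled.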
  • f

    flow

    01/08/2022, 6:23 PM
Hello, I have some questions about the C# connector work at https://github.com/mrhamburg/airbyte.cdk.dotnet. Is anyone else into the C#/Airbyte thing? There is some strange behavior, including with the already-built test solution. Does anyone have a running C# connector solution I could look into? Thanks.
  • j

    Jeremy Owens

    01/10/2022, 3:38 PM
Good morning. Is there a central place where we can see which versions have working Helm charts? Right now we're successfully using 0.33.12-alpha, but we'd like to know if/when we can upgrade with Helm without simply testing each version.
  • m

    Maxime edfeed

    01/10/2022, 7:23 PM
Hello, is there a way to specify normalization pod resources in Helm? If not, where should I look to contribute?
  • j

    Jove Zhong

    01/12/2022, 12:24 AM
A minor UI issue with the Snowflake destination: the default option is truncated, with no hover tooltip.
  • j

    Jonas Bolin

    01/12/2022, 8:18 AM
For the FB Marketing connector, it seems to me the docs would benefit from something to the effect of: "The default tables from the Ads Insights table are very large and heavily throttled, so unless you currently have Advanced-level API access, selecting only the fields you require with Custom Insights is probably the option for you," combined with a screenshot of a common example. In other words, Custom Insights should be promoted as the primary option for everyone who wants to get started with the ads_insights table. The FB docs are so convoluted that it took me hours to figure out how to set up a basic report, and it appears more people than me are running into this, given how often the question appears in these channels.
  • i

    Ihor Holoviy

    01/12/2022, 10:28 AM
Hi there, can somebody advise on how we can set up a data import from Exasol in Airbyte? I see that this connector is not ready yet, but maybe there is a way to use a JDBC driver?
  • r

    Rajesh Koilpillai

    01/12/2022, 10:35 AM
When will the Airbyte Java HTTP CDK be available, as mentioned here: https://docs.airbyte.com/connector-development/cdk-python#coming-soon ?
  • j

    Jason Edwards

    01/12/2022, 6:47 PM
Hi, I don’t really know where to put this message, but feedback seems like the most appropriate place. I’m trying to sync a database with only a handful of tables (only 4, actually), but a couple of them are sizable: one with 50+ million rows and another with 10+ million. Using the “Incremental | Deduped + history” sync mode to load this into a Postgres database seems all but impossible. Part of the problem is how Airbyte/DBT loads and transforms in the destination, and the other part seems due to how Postgres handles updates (INSERT/UPDATE/DELETE statements).

First, Airbyte copies the rows over and writes them in raw/JSON format; that’s not so bad, taking about 4 hours on my AWS t3.xlarge/db.t3.xlarge instances. Next it creates the _stg tables. After that the _scd tables are created. Then the final tables are created. This means each row is processed at least 4 times (not counting indexing those tables). In my case, working with about 70 million records, 280 million records get processed. Incremental syncs should be better, but this problem is compounded by Postgres. Quickly, for anyone unfamiliar with Postgres: when you do an update, Postgres actually writes a new record to disk and marks the old one as dead. Later a process (autovacuum) scans through the table looking for dead records, catalogs them, then removes them. Deleting a record simply marks it as dead. The way DBT creates/updates/deletes records produces a lot of dead records; a significant portion of those 280 million records will need to be vacuumed. Vacuuming, of course, takes resources, and the database instance grinds to a crawl under the combined load of DBT transforms and vacuuming. So far the 50-million-record table has never completely synced, even after churning for, literally, days.

Would a database other than Postgres be a better choice for a warehouse? Probably, but for now it’s what I have to work with. Could tuning Postgres/autovacuum improve performance? Possibly, but that’s a bit beyond my Postgres skill/knowledge/experience. I’ve also wondered whether a different sync mode would work better. Sorry, that turned into a bit of a rant, but hopefully it gives you a sense of some of the pain points of, at least, an initial sync of a significant dataset. I don’t know if there’s any possibility in the future of cutting down on the amount of processing that happens in the destination database.
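The 280-million figure above follows directly from the four passes listed (raw, _stg, _scd, final). A trivial back-of-the-envelope helper makes the arithmetic explicit; the pass count comes from the message, and indexing and dead-tuple churn are deliberately excluded, as in the original estimate:

```python
# Back-of-the-envelope estimate of rows touched by a full
# "Incremental | Deduped + history" sync, per the description above:
# each source row is written to the raw table, then reprocessed into
# the _stg, _scd, and final tables.

PASSES = ("raw", "_stg", "_scd", "final")

def rows_processed(source_rows, passes=len(PASSES)):
    """Total row-writes across all normalization passes (indexing excluded)."""
    return source_rows * passes

# ~70M source rows across the 4 passes -> ~280M rows processed.
total = rows_processed(70_000_000)
```

The estimate also understates the real Postgres cost: every row rewritten in the _scd and final tables leaves behind a dead tuple that autovacuum must later reclaim, which is the second load the message describes.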
  • m

    Mijbel Alqattan

    01/12/2022, 8:56 PM
    Request: Login screen and user management 🙂
  • t

    Titas Skrebė

    01/13/2022, 6:03 AM
    Hello, is there any ETA on when we could expect cron string support for scheduling?
  • a

    Angie Marable

    01/13/2022, 7:45 PM
Hi - what is the status of this issue and a solution (for Postgres)? It's a business-impacting issue for our company, and without a target date for a fix we'll need to replace Airbyte with another solution within the next couple of weeks. https://github.com/airbytehq/airbyte/issues/8280
  • o

    Omar Ghalawinji

    01/14/2022, 9:59 AM
    Hello! Is there any documentation on how different configurations of BQ Destination would impact BQ query quota usage? Thanks!
  • j

    Joseph Reis

    01/15/2022, 12:26 AM
    Is there anything on the horizon for an Opensea connector?
  • s

    Surya Prakash

    01/15/2022, 11:27 PM
    I'm looking to connect with one of the employees of Airbyte
  • j

    Joël Luijmes

    01/17/2022, 11:10 AM
I’m running Airbyte using Helm on a Kubernetes cluster. In the architectural overview there is the worker component, and I noticed that this worker is always running, even when there aren’t any jobs. As I understand it, it is the worker that actually performs the syncs. I suspect the worker is currently a monolith that operates based on input from Temporal. However, since I’m running in Kubernetes, I was expecting a worker that is only active when there is a job to perform; that way I could dynamically scale up or down depending on load. Q: Am I correct in my understanding of the system? If so, would it be possible in the future for a worker to be active only for the job that needs to be performed?
  • r

    Ronny Ritongadi

    01/17/2022, 2:55 PM
Hi @Harshith (Airbyte), @Chris (deprecated profile), I'm running Airbyte 0.29.21-alpha, and it looks like Airbyte is stuck on one particular error for all connections to MySQL:
2022-01-17 14:52:33 ERROR () LineGobbler(voidCall):85 - Exception in thread "main" java.sql.SQLSyntaxErrorException: SELECT command denied to user ''@'%' for column 'organizationId' in table 'IDX_PUTTOLIGHT'
(A more complete log is attached.) I am very sure the DB user has sufficient permissions. Is it possible that Airbyte is stuck on one particular log entry, causing every connection to error? Thank you, Ronny
  • b

    Blake Enyart

    01/17/2022, 9:07 PM
Hi Airbyte Team, I've tried looking through the backlog of issues and can't quite find what I'm looking for, so I thought I would try here before creating an additional issue. In the UI, I have done the tedious work of selecting and configuring each table for an MS SQL Server source in a connection. With that, if I need to add an additional table, I have to perform the entire sequence of selecting and configuring everything again. Is there already an issue open for saving the configuration settings for any of the sources in Airbyte?
  • e

    Elias Djurfeldt

    01/18/2022, 5:32 PM
Hi there - what a great project you have here 👌 Quick question: are there any plans for being able to filter data at the source, before it leaves the source environment? Or perhaps an offset? For example, if I have some kind of SQL source and I don’t want to sync all of the data when first setting up my connection - I only want to sync data from now onwards. Cheers!
  • h

    Hitesh Khandelwal

    01/19/2022, 4:54 AM
Hi Airbyte Team, please support SCRAM-SHA-512 authentication for Kafka clusters.
  • r

    Ronny Ritongadi

    01/19/2022, 5:19 AM
Hi @Harshith (Airbyte), @Chris (deprecated profile), @Noah Kawasaki, I'm still using Airbyte 0.29.21-alpha. The last issue was solved, but another has emerged: during a sync from MySQL to BigQuery, the data comes back looking encrypted (attached). Any idea why? It was fine before.
  • j

    Jean-François Paccini

    01/19/2022, 9:56 AM
Hello Airbyte Team & Community! I have a question about your connectors for Google Ads, Facebook Ads, etc., and support for a large number of advertising accounts. My company (DeepReach) runs an automation SaaS enabling agencies/clients to operate advertising campaigns on several advertising channels. We collect detailed statistics on behalf of our clients with bespoke software, and obviously Airbyte would be a great tool for us. That means we collect data for thousands of accounts, and the list is updated on a daily basis. Do you have any plans to support auto-discovery of accounts? For Google that would be all accounts linked to an MCC, for Facebook all ad accounts linked to a Business Manager, etc. Thanks!
  • n

    Namer Medina

    01/19/2022, 12:19 PM
    Hey fellow airbyters... is there any way to set syncs to run weekly?
  • l

    Lukas Novotny

    01/21/2022, 11:24 AM
Hello Airbyte team, we're testing your product in Kubernetes and it's working great! However, we are concerned about the unencrypted connection config storage in the Postgres actor table. Are we missing some trivial config that switches to a different secret store? Can we load secrets from environment variables?
  • j

    Jens

    01/21/2022, 12:58 PM
Is there any way to use custom transformations to add a custom column to each source extract? We'd like to compute a datetime diff between max(_airbyte_normalized_at) and now() so that we can calculate the hours since the last successful ingestion, for monitoring and alerting purposes.
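The freshness column described above can be prototyped outside dbt first. This is a sketch of the same datetime-diff logic in Python: the _airbyte_normalized_at column name comes from the message, while the function names and the 24-hour alert threshold are purely illustrative assumptions.

```python
from datetime import datetime, timezone

# Prototype of the monitoring column described above: hours elapsed since the
# most recent _airbyte_normalized_at timestamp. In a dbt custom transformation
# this would be a datetime diff between max(_airbyte_normalized_at) and now();
# the 24-hour alert threshold here is illustrative.

def hours_since_last_ingestion(normalized_at_values, now=None):
    """Hours between the newest _airbyte_normalized_at value and now."""
    now = now or datetime.now(timezone.utc)
    latest = max(normalized_at_values)
    return (now - latest).total_seconds() / 3600

def is_stale(hours, threshold_hours=24):
    """Flag a stream whose last successful ingestion exceeds the threshold."""
    return hours > threshold_hours
```

Materializing the diff as a column in a dbt model (rather than computing it in the alerting tool) has the advantage that any SQL-capable monitor can read staleness directly from the warehouse.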
  • a

    Anouar Hnini

    01/21/2022, 1:46 PM
Hello Airbyte team, I'm working with a customer using PostgreSQL (on a private network). Would it be possible to replicate from the private network to an outside (public, this time) destination?
  • j

    Jens

    01/24/2022, 8:39 AM
Hello, we are testing the Amazon Ads source connector with a Postgres destination. Is "incremental deduped + history" the proper sync mode to use for the *_report__stream tables, with "reportDate" as the primary key?
  • y

    Yashasvi Chaudhary

    01/24/2022, 1:57 PM
Hi Airbyte team! I have set up Airbyte locally and was wondering if there is a login system available? I plan to host it on a server, as I want my other team members to access it too.
  • l

    Lihan

    01/24/2022, 9:21 PM
Hi Airbyte team, one feature suggestion: we have a somewhat large Airflow deployment (1000+ pipelines) and we leverage temporary workers in the cloud. Could Airbyte also use a Kubernetes Pod as a one-off worker to execute tasks?