shameless-plugs
  • Lauren Kaufman

    Lauren Kaufman

    09/08/2022, 11:28 AM
    Hey everyone, I'm Lauren, a data ethics nerd who wants to unlock sensitive data for research and other data science collaboration applications in a privacy-preserving way. Airbyte is a great way to enable data operations, and we're hoping to extend dataops capabilities to enable data scientists and owners to collaborate safely and remove barriers to collaboration. I'm part of the community and team at Bitfount working to get feedback on our open beta product for federated data collaboration. Most of our applications are in healthcare and financial services. Would love to give the Airbyte community a sneak peek ahead of our launch! Feedback always welcome 😄 Check it out here for free! 👀 🚀
  • Sama Carlos Samamé

    Sama Carlos Samamé

    09/12/2022, 4:14 PM
    Hello everyone! We are sharing (absolutely free) a new building block to help SaaS products be enterprise-ready: open source Directory Sync - plug and play. Besides SSO and directory sync, it would be great to know what else enterprise customers are asking for. Please share any insights that you may have, since they will help us shape things and, most importantly, help the ecosystem raise its security standards. Thank you!
  • Xiaofei Du

    Xiaofei Du

    09/12/2022, 4:14 PM
    Hello folks! Just want to share that we announced the release of our project VDP in Introducing VDP: open-source visual data ETL last month 🎉. Do you have a lot of visual data to be processed in your workflow? The goal of VDP is to streamline the end-to-end visual data processing pipeline:
    • Extract unstructured visual data from pre-built data sources such as cloud/on-prem storage, or IoT devices
    • Transform it into analysable structured data by Vision AI models imported from various ML platforms
    • Load the transformed data into warehouses, applications, or other destinations
    A BIG thanks to Airbyte as we’ve adopted Airbyte’s protocol in our project, so the transformed data can be synced into various destinations.
    VDP on GitHub
    📔 VDP documentation
    Tutorials:
    • Build a sharable object detection application with VDP and Streamlit
    • Built a Cow Counter dashboard with VDP, PostgreSQL & Metabase
  • r

    Ryhan Sunny

    09/14/2022, 9:48 PM
    Hello amazing people of Airbyte! My company Altinity is sponsoring Open Source Analytics Conference 2022 on November 15th. It is a virtual conference that brings together devs from across the analytic stack, including leaders like Max Beauchemin and Doug Cutting. The CfP is open and we would love to hear from your community. Check out the conference page: https://altinity.com/osa-con/. See you there!
  • Maria Silverhardt

    Maria Silverhardt

    09/20/2022, 1:02 AM
    Like 🆓 things? Care about the latest 💲#RevOps best practices? Want to see who's the best in the biz?🌟 #OpsStars is calling your name! Thought some of y'all might be interested... September 21–22, 2022 — The San Francisco Mint Sign up here: 👉 https://bit.ly/3z5hLpD
  • Sonal Goyal

    Sonal Goyal

    09/20/2022, 9:32 AM
    Hello Everyone, My talk at Databricks Data and AI Summit just went live. In this talk, I have covered the needs and challenges of identity resolution, as well as major design decisions we made while building our open source entity resolution framework Zingg. I hope you find this short talk informative.

    https://youtu.be/F5Dw1QH0idQ

  • Mateusz Klimek

    Mateusz Klimek

    09/21/2022, 4:37 PM
    Hey all 👋 at re_data we recently launched our cloud to host, share and in the future collaborate on reports from different "data apps" like: dbt-docs, great-expectations, re_data and others 😊 We are super interested in Airbyte community feedback! If you would be interested in giving it a spin, you can create a free account here! https://www.getre.io/
  • a

    Atharva Bondre

    09/23/2022, 11:36 AM
    Hello! I'm Atharva from Scoutflo. I work in Product Research and Sales. At Scoutflo, we are working on increasing the awareness and adoption of commercial open-source products! 🚀
    Our 3 stakeholders:
    1. COSS Products
    2. Businesses (enterprise customers)
    3. OSS Contributors
    We aim to build a series of products to solve the challenges of each stakeholder. Our 1st free product will go live in 2 months. Sign up for early access: https://www.scoutflo.com/
    If Scoutflo sounds interesting 👇🏻
    Learn about our long-term vision here: https://www.scoutflo.com/our-vision
    Follow us on our socials:
    - Twitter - https://twitter.com/scout_flo?t=_e4kjG3pKjG__51ByhuuBQ&s=31
    - LinkedIn - https://www.linkedin.com/company/scoutflo/
    - Medium - https://medium.com/scoutflo/whats-the-open-source-hype-all-about-6fb55d81e7e?s=31
    P.S. Our 1st blog on Medium went live yesterday!
  • Sama Carlos Samamé

    Sama Carlos Samamé

    09/26/2022, 12:52 PM
    List of free open source Developer Security tools. Feedback is welcome 🙂 https://www.producthunt.com/posts/awesome-oss-developer-security-tools
  • Matthew Tovbin

    Matthew Tovbin

    09/27/2022, 7:11 PM
    Resurfacing this again for the folks who missed it - https://airbytehq.slack.com/archives/C023W76QGE4/p1660340548555459 👋 Hi, folks! Here is a simple bash CLI we've built (kudos to @Christopher Wu) to run any Airbyte source & destination locally without the need to set up an Airbyte server. It can be very handy when you just have a single connection to run and don't want to bother with Airbyte server setup shenanigans, or perhaps when you're troubleshooting a new source/destination during development or testing. Get it here - https://github.com/faros-ai/airbyte-local-cli Usage example with ServiceNow source and Faros destination:
    ./airbyte-local.sh \
       --src 'farosai/airbyte-servicenow-source' \
       --src.username '<source_username>' \
       --src.password '<source_password>' \
       --src.url '<source_url>' \
       --dst 'farosai/airbyte-faros-destination' \
       --dst.faros_api_url '<faros_api_url>' \
       --dst.faros_api_key '<faros_api_key>' \
       --dst.graph 'default' \
       --state state.json \
       --check-connection
  • Sonal Goyal

    Sonal Goyal

    10/02/2022, 5:11 PM
    As we complete one year working on Zingg: open source identity resolution, penned some thoughts on the journey. I hope you enjoy reading it. https://www.learningfromdata.zingg.ai/p/zingg-turns-one?r=1adrn&utm_campaign=post&utm_medium=web&utm_source=direct
  • Doris Xin

    Doris Xin

    10/06/2022, 5:52 PM
    Hi everyone! Wanted to share LineaPy, an OSS project my team created 🙂 Is translating dev code to data pipelines a time-consuming mess for you? Is it impossible to share data science work with teammates when everyone has their own environments and scripts and file systems? Frustrated that you have no idea how a model was trained, who trained it, and what data was used? LineaPy can help with all of these problems. LineaPy works locally out-of-the-box, but it’s much more powerful when used to collaborate. We’re excited to share this demo that shows you how to make LineaPy work for your team to support collaboration! To spin it up yourself, read our demo tutorial here: https://bit.ly/3LXBL2d See our YouTube video walkthrough of it. Or, see it live during our October 14th workshop! Register here: https://bit.ly/3C1Pwbw What sets LineaPy apart from other tools? LineaPy automatically refactors code and generates pipelines, saves the code and value for artifacts (models, datasets, scalars, charts, etc.) in one place for easy retrieval, and provides a central artifact store for all your data science work instead of having it spread across many notebooks and file systems. It does so without requiring you to change how you work. LineaPy captures everything automatically and performs program analysis for semantic understanding of your workflow to infer the structure of the data pipeline and refactor code accordingly.
  • a

    Aditya Prakash

    10/07/2022, 4:40 PM
    Hi all, We recently had a community meetup where users showcased their solutions on identity resolution. In this clip from the event, Jimmy Steinmetz, Co-Founder, Crossroads CX, talks about how they use open source Zingg to identify donors and recipients and bring transparency to campaign funding. The campaign datasets comprise machine-readable and manually transcribed human-readable records. Manual transcription is necessary to digitize old, non-digital data, but it also introduces errors and inconsistencies in the data. Digital records also suffer from variations in names, addresses and other attributes, which pose a major challenge for traditional approaches to entity resolution. We hope this talk helps people looking to build identity solutions.

    https://youtu.be/AYWhsjRsJGA

  • r

    RK

    10/10/2022, 2:32 PM
    I thought someone here might find this interesting. This time we tried SQL Server to BigQuery using Airbyte and SQL Server CDC. https://blog.thecloudside.com/replicating-microsoft-sql-server-to-bigquery-using-airbyte-with-cdc-enabled-7993503e4739
  • r

    RK

    10/10/2022, 2:33 PM
    Multiple pipelines have been running for the past month or so in production without any issue! :octavia-loves: Airbyte team for adding Cron 🤗
  • a

    Alex Izydorczyk

    10/15/2022, 11:13 PM
    Curious about data science and engineering in the hedge fund space? Interested in working with lots of external data? Check out: https://magis.substack.com/p/what-makes-alternative-data-scientists
  • a

    Atharva Bondre

    10/20/2022, 7:23 PM
    Hello, people! We at Scoutflo (twitter.com/scout_flo) just went live with another blog - A beginner's guide to OSS contribution. It explains various code-based and no-code contribution projects to contribute to (for Hacktoberfest) and tips to get started on the open-source journey. I'd highly appreciate some feedback from the community! Check out the blog here: https://medium.com/scoutflo/a-beginners-guide-to-open-source-contribution-f79b7143bd6c Thanks a ton! 🙂
  • Ravit Jain

    Ravit Jain

    10/21/2022, 9:35 AM
    How is data used in different markets? https://bit.ly/3dlsrYs
  • Sonal Goyal

    Sonal Goyal

    10/31/2022, 2:13 AM
    If you like reading more than listening to podcasts/watching videos, this one is for you 😁. Transcript of my chat with Alexey Grigorev from DataTalksClub is out! We talk about building open source projects and startups. I hope you find some lessons from my journey. https://datatalks.club/podcast/s11e04-large-scale-entity-resolution.html
  • e

    Edward Louth

    11/07/2022, 10:30 AM
    Hello, we have recently launched Dasherry, a BI tool as code, and we are super interested in Airbyte community feedback! Built on top of Apache Superset, Dasherry allows you to store your BI config alongside your dbt project with review apps and automated tests, so no more broken dashboards. Please take a look; we're keen to hear your thoughts. https://dasherry.com
  • Till Haug (Veezoo)

    Till Haug (Veezoo)

    11/07/2022, 2:43 PM
    Hey everyone - Datavault Builder & Veezoo are hosting a Webinar on Wednesday the 9th of November. We are going to showcase how to effortlessly set up a State-of-the-Art Self-Service Analytics Layer on top of a clean data layer.
    Fact-based decision-making is complex and brings many challenges. Datavault Builder helps you to integrate all your data accurately, fast, and reliably, providing you with a resilient, high-quality data foundation. Veezoo is a self-service analytics tool that helps your business users get answers to complex questions from this data simply by asking for them in plain English.

    In this webinar, we will show you how Datavault Builder integrates data from different sources and how Veezoo allows you to ask questions about your sales data in plain English and instantly provides answers, accompanied by tailored visualizations.
    SIGN-UP FOR FREE HERE: https://us02web.zoom.us/webinar/register/3816674001185/WN_tIGIBlSTT9aB9FsC9EEH_A
  • e

    Everett Berry

    11/07/2022, 10:26 PM
    Heya folks, I'm a devrel for Vantage and we've been working with Datadog customers for a little while now on costs. Sharing some of the best practices we've learned here https://www.vantage.sh/blog/datadog-cost-optimization-tips
  • e

    Ethan Brouwer

    11/14/2022, 7:42 PM
    Hey everyone! I'm really unsure of the best place to put this, but settled on here as I guess I'm shamelessly plugging something I built haha. I built a Terraform Provider for Airbyte. It's somewhat limited, but lets you do most things with Airbyte in a terraform config. https://registry.terraform.io/providers/eabrouwer3/airbyte/0.1.12 Obviously I'm open to Issues and PRs and will try to get to things as I have time. I wrote this to use for a work project, but felt like it makes sense to share with the broader community, and maybe some day add it to the airbyte repo somewhere to be supported by Airbyte itself. Another note, I learned golang to write this and wrote it in about a week and a half but it's worked great for what we're doing at my company. Finally, the main reason I did this was because Octavia didn't totally handle everything that I needed. I needed to be able to handle custom connectors and didn't want to have to create those outside the IaC flow. Also, operations like dbt I don't think are supported by octavia-cli. And we're using terraform to deploy EKS and the Airbyte helm chart, so adding the airbyte config to terraform fit really nicely. Anyways, hope this is useful! And lmk if I should put this elsewhere!
  • a

    Alex Izydorczyk

    11/16/2022, 3:45 AM
    Shameless plug for those looking for data: https://magis.substack.com/p/simcity-and-data-commons
  • Sonal Goyal

    Sonal Goyal

    11/16/2022, 5:20 AM
    Entity resolution and fuzzy matching are powerful utilities for unifying data from multiple sources, but they have typically required custom development and training machine learning models. Do listen to this Data Engineering Podcast to see how open source Zingg can help build Customer 360 https://www.dataengineeringpodcast.com/zingg-open-source-entity-resolution-episode-339/
  • Zach Brak

    Zach Brak

    11/21/2022, 4:15 PM
    Hey all! Wanted to promote my talk at the move(data) conference! "Orchestrating Airbyte in GCP" If you're interested in Google Cloud and Airbyte, this would be a great talk for you!
  • Joey Taleño

    Joey Taleño

    11/22/2022, 4:23 AM
  • Sonal Goyal

    Sonal Goyal

    11/22/2022, 12:15 PM
    In case it is useful, I published a step-by-step guide for identity resolution using Python. Hope it helps! https://towardsdatascience.com/step-by-step-identity-resolution-with-python-and-zingg-e0895b369c50?source=rss-fde99eafef0f------2