# good-reads
  • v

    Vivek Dubey

    05/06/2024, 6:37 AM
    🌊 The Semantic Layer Movement: The Rise & Current State - Semantic Mistrust, The Reliable Semantic Stack, Data APIs & Products
    📃 In this article, the author delves into the emergence and current status of the semantic layer movement within the contemporary data landscape. Exploring a range of associated aspects, the article examines prevalent challenges, innovative solutions, and more.
    📧 Read the complete article here: https://moderndata101.substack.com/p/the-semantic-movement-the-story-of
    airbyte rocket 1
  • h

    Hugo Lu

    05/08/2024, 10:21 AM
    You just bought Snowflake. What next? https://medium.com/@hugolu87/you-just-bought-snowflake-what-next-your-top-5-priorities-38f01c39de78
  • v

    Vivek Dubey

    05/27/2024, 5:39 AM
    🎯 How to Create a Governance Strategy That Fits Decentralisation Like a Glove: Need to Shift Right, Federated Governance Fundamentals, & Resolving Policy Conflicts
    We should update the way we do Data Governance. The data products approach has changed everything. Centralized governance no longer works: the central data governance team becomes a bottleneck because it receives demands from all business teams. Let's decentralize some of the responsibilities to data domains with specific new roles:
    👨🏻‍🚀 A Data Domain Manager managing policies, standards and processes
    👥 Data Stewards to maintain data quality and metadata of the trusted sources
    👯‍♂️ Data Product Owners to ensure data products meet users' needs
    📃 In this article, Charlotte shared her expert insights on how a strong data governance strategy can drive successful decentralization in your organization. She emphasized three key areas:
    💠 Shifting Right
    💠 Federated Governance Fundamentals
    💠 Resolving Policy Conflicts
    💌 Read the complete article here: https://moderndata101.substack.com/p/how-to-create-a-governance-strategy
  • v

    Vivek Dubey

    06/03/2024, 6:41 AM
    🏗️ Build Data Products With Snowflake | Part 1: Leveraging Existing Stacks: Optimising Snowflake Cost, Integrating Snowflake Sources, and Driving Faster Business Results!
    📃 In this series, we want to highlight the ease of leveraging your existing stack to get going with Data Products. This piece is ideally for data leaders who want to adopt the data product approach while staying rooted in big investments like Snowflake, dbt, Databricks, or Tableau. We'll kick this off with a favourite: Snowflake!
    💌 Read the complete article here: https://moderndata101.substack.com/p/build-data-products-with-snowflake
  • g

    Gabriel Segers

    06/04/2024, 1:21 AM
    What is a good alternative to BigQuery as a data warehouse, taking into account:
    • Similar performance
    • Similar or better pricing
    • Using SQL to query huge amounts of data
    What are you personally using, and do you like it?
  • e

    Eric Zakariasson

    06/04/2024, 2:05 PM
    i wrote a little about using Cursor as IDE https://anyblockers.com/posts/how-i-became-3x-more-productive-in-30-minutes-with-cursor/
  • v

    Vivek Dubey

    06/10/2024, 6:36 AM
    🏗️ Snowflake for Data Products: Data Monetisation & Experience: Part 2 on Leveraging Existing Stacks by Optimising the Three Forks of Business: Cost Savings, Monetisation, and Experience of Data Citizens
    📃 In part 2 of Snowflake for Data Products, author Dr. Balaji Muthusubramanian discusses the following:
    • Monetisation & Experience of Data Citizens
    • How Snowflake Approaches Data Monetisation
    • How Data Products on Top of Snowflake Takes Off Limitations on Data Sharing + Seamless XP
    Within the scope of this piece, the author covers the monetisation and experience angle and its associated aspects in detail!
    💌 Read the complete article here: https://moderndata101.substack.com/p/snowflake-for-data-products-monetisation
  • w

    Willi

    06/11/2024, 12:02 PM
    Dear all, we made a case study where we took the Airbyte Zoom connector as a standard and replicated it using dlt's REST API Toolkit. It may be interesting for folks who like to compare the choices we have in open-source tools. Let me know if we missed something and what your viewpoints are! https://untitleddata.company/blog/How-to-create-a-dlt-source-with-a-custom-authentication-method-rest-api-vs-airbyte-low-code
  • j

    Jim Barlow

    06/12/2024, 9:00 AM
    Hi all, this is a guide I wrote recently, arguing that the Modern Data Stack landscape is so complex, with so much overlapping functionality, integration complexity and unpredictable pricing models that there has to be a simpler way. I outline how to leverage native Google Cloud functionality to build a Simple Data Stack with BigQuery at the epicentre. https://medium.com/decode-data/how-to-build-a-simple-data-stack-on-bigquery-7f63d744b81d
  • c

    C1sluo

    06/14/2024, 3:17 PM
    Hello team, we are using open-source Airbyte to load tables from 20+ MySQL servers to Snowflake. The MySQL servers all have the same table structure but different accounts. Is there an easier way to download the connection configuration (selected data fields and sync mode for each table) from one connection and apply it to all the other connections? It is very easy to make mistakes when doing this manually. Thanks for your suggestions!
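One way to picture the "copy from one connection, apply to the rest" step is the small sketch below. It is only an illustration under stated assumptions: the dictionary layout (`name`, `selected`, `sync_mode`, `selected_fields`) and the function `apply_template` are hypothetical and do not reflect Airbyte's actual API schema. Fetching and saving the real connection settings would have to go through whatever API or export mechanism your Airbyte version offers.

```python
from copy import deepcopy

def apply_template(template_streams, target_streams):
    """Copy per-table settings from a template connection onto another
    connection's streams, matching tables by name.

    Both arguments are lists of dicts in a HYPOTHETICAL layout:
    {"name": ..., "selected": ..., "sync_mode": ..., "selected_fields": [...]}
    """
    template_by_name = {s["name"]: s for s in template_streams}
    result = []
    for stream in target_streams:
        updated = deepcopy(stream)  # never mutate the caller's data
        template = template_by_name.get(stream["name"])
        if template is not None:
            updated["selected"] = template["selected"]
            updated["sync_mode"] = template["sync_mode"]
            updated["selected_fields"] = list(template["selected_fields"])
        result.append(updated)
    return result

# Hypothetical settings taken from the one connection configured by hand.
template = [
    {"name": "accounts", "selected": True, "sync_mode": "incremental",
     "selected_fields": ["id", "email"]},
    {"name": "audit_log", "selected": False, "sync_mode": "full_refresh",
     "selected_fields": []},
]

# Hypothetical default settings of another connection with the same tables.
other = [
    {"name": "accounts", "selected": True, "sync_mode": "full_refresh",
     "selected_fields": ["id", "email", "password_hash"]},
    {"name": "audit_log", "selected": True, "sync_mode": "full_refresh",
     "selected_fields": ["id"]},
]

patched = apply_template(template, other)
```

Tables missing from the template are left unchanged, so a server with an extra table does not silently lose its settings.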
  • v

    Vivek Dubey

    06/18/2024, 6:24 AM
    🚧 Usage Analytics Roadblocks: Solving with Model-First Data Products
    📃 The pangs of usage analytics, purpose-driven and measurable solutions, and a direct bridge between data and business like never before! In this article, the authors share their expert insights on the following aspects:
    💠 Where does Usage Analytics Falter?
    💠 The Solution: Enabling Strong Usage Analytics with Purpose-Driven Direction
    💠 Addressing the Problems with Model-First Data Products
    💠 Implementation of Model-First Data Products
    You'll find a detailed explanation of all the above aspects inside the article!
    ✉️ Read the complete article here: https://moderndata101.substack.com/p/usage-analytics-with-model-first-data-products
  • v

    Vivek Dubey

    06/24/2024, 7:18 AM
    🪙 Medallion Approach to Data Products: Beyond the Promised "Gold": Trustworthy Doesn't Mean Certified: The Different Degrees of Trust in a Data Mesh
    📃 In this article, our guest author Francesco De Cassai sheds light on the debate about the minimum trustworthiness required for data assets to be considered data products. In the article he talks about:
    • Fundamentals of “Trust”: Discoverable, Addressable, Trustworthy, Secure, Interoperable, and Self-Describing
    • Different Degrees of Trustworthy
    • Data Policies to Rule Them All
    Get a detailed perspective and explanation of these and related aspects within the article.
    💌 Read the complete article here: https://moderndata101.substack.com/p/medallion-approach-to-data-products
  • v

    Vivek Dubey

    07/01/2024, 7:47 AM
    💡 *dbt for Data Products: Cost Savings, Experience, & Monetisation*
    📃 This read is ideally suited for data leaders or data engineering leads who are focusing on optimising their dbt investments and want to enhance any of: cost savings, data monetisation efforts, overall experience of users and data consumers.
    In this article you'll learn:
    • The Need to Shift Conversations from ETL to Data Products + Gaps in dbt
    • Data Products: One of many outcomes of Self-Service Platforms, but an Important One
    • How to Leverage Your Existing Stack (with dbt) to Build Data Products
    • Cost Savings
      ▪ Large dbt Models May Lead to High Compute Costs
      ▪ Infrastructure Costs
      ▪ Maintenance, Support, & Operational Costs
    • Increasing Appetite for Revenue
      ▪ Scale & Performance
      ▪ How transformations/ETL gains a new stage and is ready for scale
    • Enhancing Experience for All (customers & business operatives)
    💌 Read the complete article here: https://moderndata101.substack.com/p/dbt-for-data-products-cost-monetisation-xp
  • e

    Eric Zakariasson

    07/04/2024, 9:54 AM
    hey! a bit off topic, but does anyone know of a simple, hosted, tool for monitoring data with sql queries and sending alerts when it hits a set threshold? can't seem to find any
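For readers wondering what such a check boils down to, here is a minimal sketch of the core loop: run a SQL query that returns one number and compare it against a threshold. This is not a hosted tool. The table `orders`, the query, the threshold, and the function `check_threshold` are all made up for illustration, and an in-memory SQLite database stands in for the real warehouse. A real setup would also need scheduling and a delivery channel for the alert.

```python
import sqlite3

def check_threshold(conn, query, threshold):
    """Run a query that returns a single numeric value and report
    whether that value exceeds the threshold."""
    row = conn.execute(query).fetchone()
    value = row[0] if row and row[0] is not None else 0
    return value, value > threshold

# In-memory database standing in for the real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "ok"), (2, "failed"), (3, "failed")],
)

# Hypothetical rule: alert when more than one order has failed.
value, breached = check_threshold(
    conn,
    "SELECT COUNT(*) FROM orders WHERE status = 'failed'",
    threshold=1,
)
if breached:
    # Replace print with an email, Slack webhook, or pager call.
    print(f"ALERT: failed orders = {value}")
```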
  • s

    Simon Thelin

    07/05/2024, 9:28 AM
    Not sure if it is the right forum, but I did a review video of airbyte after using it now for 2+ years in production.

    https://www.youtube.com/watch?v=gHTxq6advBE

    Hope it can help somebody potentially in evaluating if airbyte is a good fit for them.
    👍 2
  • v

    Vivek Dubey

    07/08/2024, 6:41 AM
    🗣️ *Does your LLMs speak the Truth: Ensure Optimal Reliability of LLMs with the Semantic Layer:* Are you building your LLMs wrong? Bring in the semantic layer as the cushioning interface for the LLM.
    📃 In this article, the authors try to provide a deeper understanding of why the common LLM challenges occur and how the semantic layer can improve their performance and reliability.
    💌 Read the complete article here: https://moderndata101.substack.com/p/does-your-llms-speak-the-truth-ensure
  • n

    Nazarii Moskal

    07/09/2024, 8:45 AM
    Hey everyone, 👋 I'm Nazarii and I work as a Senior Data Engineer at Preply, the online learning marketplace. Currently, we are working on a solution for logical replication to collect all the changes that happen in our application database. The existing solution is set up with Debezium and Kafka Connect. We have two goals:
    • reach full consistency of a table replica in the data warehouse, including updates and deletes that may be missed if another approach is used
    • collect the history of changes for further ML needs, like the ability to calculate point-in-time features
    We faced a few issues working with the application database, such as high-frequency updates in some tables that have no value for the future needs of ML and BI. But there's no easy or optimal way to filter out those change events. I wonder if there are other approaches, like Event Hub or anything else, used in different companies. 🤓 I'd appreciate any comments or recommendations ☺️
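To make the filtering problem concrete, here is a rough sketch of one possible rule: drop update events in which only "noisy" columns changed. It assumes each event is a dict shaped like Debezium's change-event envelope (`op`, `before`, `after`) and that both row images are present, which depends on how the source database is configured. The column names in `IGNORED_COLUMNS` and the function `is_meaningful` are hypothetical, and where this logic would run (a stream processor, a custom transform, or the warehouse load step) is left open.

```python
# Columns whose changes carry no value for ML/BI (hypothetical names).
IGNORED_COLUMNS = {"last_seen_at", "updated_at"}

def is_meaningful(event, ignored=IGNORED_COLUMNS):
    """Return False only for updates where every changed column is ignored.

    Creates, deletes, and snapshot reads are always kept, so the replica
    stays consistent with the source table.
    """
    if event.get("op") != "u":
        return True
    before = event.get("before") or {}
    after = event.get("after") or {}
    changed = {
        column
        for column in before.keys() | after.keys()
        if before.get(column) != after.get(column)
    }
    # If "before" is missing, most columns count as changed and the event
    # is kept, which errs on the side of not losing data.
    return bool(changed - ignored)

events = [
    {"op": "c", "before": None,
     "after": {"id": 1, "name": "Ann", "last_seen_at": "10:00"}},
    {"op": "u",
     "before": {"id": 1, "name": "Ann", "last_seen_at": "10:00"},
     "after": {"id": 1, "name": "Ann", "last_seen_at": "10:05"}},
    {"op": "u",
     "before": {"id": 1, "name": "Ann", "last_seen_at": "10:05"},
     "after": {"id": 1, "name": "Anna", "last_seen_at": "10:06"}},
    {"op": "d",
     "before": {"id": 1, "name": "Anna", "last_seen_at": "10:06"},
     "after": None},
]

kept = [event for event in events if is_meaningful(event)]
```

Only the second event, where nothing but `last_seen_at` changed, is dropped. One trade-off to keep in mind: the history then no longer contains the ignored columns' intermediate values, so point-in-time features must not depend on them.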
  • a

    Ari Bajo

    07/11/2024, 2:26 PM
    Great read by @Jimmy Ma (Airbyte) on scaling Airbyte workloads across multiple Kubernetes clusters! https://airbyte.com/blog/load-balancing-airbyte-workloads-across-multiple-kubernetes-clusters
    airbyte rocket 3
  • b

    Boris Staal

    07/12/2024, 2:39 PM
    I'm slowly starting a project to build a TS compiler that can emit ILASM bytecode (an object-oriented assembler for .NET VM). The main reason behind it is interoperability with any .NET language. There's already https://github.com/praeclarum/Netjs that can infer Typing for a standard library. At least in theory. If anyone is interested, please feel free to reach out. It could be fun. I'd say there's a lot of potential for indirect optimization too. I don't think we can really make JS faster itself but the ability to tap into existing libraries developed in system languages is a huge win to cut edges here and there. It's not very AB-related. Just a good place to check for smart heads 🙂
  • a

    Abhishek Singh

    07/15/2024, 12:45 AM
    https://medium.com/israeli-tech-radar/docker-image-for-building-custom-airbyte-connectors-07ef41685a9d This guide is very helpful for creating an environment for the Airbyte connector builder in Docker, so there's no need to install dependencies on your local machine.
  • v

    Vivek Dubey

    07/15/2024, 6:22 AM
    🛠️ Tearing Down the Monolith | The Rise of Microservices & Modular Architecture in Data Engineering: Common Anti-Patterns, the Most Immediate Alternatives, and Existing Stack Boosters
    📃 In this article, the author explains in detail the following topics:
    • Monolithic Application Basics in Data Engineering
    • The Rise of Distributed Computing and Microservices
    • The alternatives, and how can they be implemented?
    Get all these points in detail, along with associated elements, inside the article!
    💌 Read the complete article here: https://moderndata101.substack.com/p/tearing-down-monolith-the-rise-of-microservices
  • p

    Paweł Kociński

    07/15/2024, 7:48 AM
    Hello, I have been working on extending the Airflow Airbyte operator with OpenLineage (calling the config API). Here's the draft MR https://github.com/airbytehq/airbyte/discussions/41539. I also posted a discussion thread on that https://github.com/airbytehq/airbyte/discussions/41539. What do you think about adding this functionality natively to Airbyte?
  • v

    Vivek Dubey

    07/22/2024, 6:16 AM
    🚀 Semantics and Data Product Enablement - A Practitioner's Secret: Bare-bone Purpose & Key Components of the Semantic Layer, The Data Product Divergence and Impact, and Driving Business Value at Scale
    📃 In this article, guest author Frances O'Rafferty discusses how the semantic layer translates the physical data stored in transactions into the data the business cares about, such as revenue and profit. Its role in simplifying data consumption and ensuring consistency makes it essential in the data product ecosystem. She also shares her thoughts on semantics as an enabler for Data Products.
    💌 Read the complete article here: https://moderndata101.substack.com/p/semantics-and-data-product-enablement
  • a

    Arslan

    08/03/2024, 9:20 AM
    #C01AB7G87NE https://www.linkedin.com/posts/arslanali434343_dataengineering-bigdata-datascience-activity-7225429486429700096-noOt?utm_source=share&utm_medium=member_android
  • v

    Vivek Dubey

    08/05/2024, 7:03 AM
    📦 Where Exactly Data Becomes Product: Illustrated Guide to Data Products in Action - Concrete Step-by-Step Journey, Debunking Confusion and Dilution, and Importance of Social / Cultural Structures
    📃 In this piece, we want to take a cut at explaining the concept and essence of data products, keeping superficial definitions aside and targeting concrete value. There has been much dilution of the value of data products due to the favourable hype around them, but we hope this piece helps cut through the fog. You'll learn:
    • What's Important?
    • “Importance” is a Cultural Metric.
    • How to take account of “Importance” as a Metric
    • A Quick View of the Right-to-Left Product Journey
    and more!
    💌 Read the complete article here: https://moderndata101.substack.com/p/where-exactly-data-becomes-product
  • c

    Christopher Bergh

    08/06/2024, 7:35 PM
    Open Source Data Quality -- see it working https://datakitchen.io/datakitchens-data-quality-testgen-found-18-quality-issues-in-a-few-minutes-including-install-time-on-data-boston-gov-building-permit-data/
  • a

    Arslan

    08/08/2024, 8:09 AM
    https://www.linkedin.com/posts/arslanali434343_%3F%3F%3F%3F%3F%3F%3F%3F-%3F%3F-%3F%3F%3F%3F-%3F%3F%3F%3F%3F-activity-7227222085385822209-E8ZJ?utm_source=share&utm_medium=member_desktop
  • a

    Arslan

    08/10/2024, 8:49 AM
    I regularly share posts on Spark optimization in data engineering, featuring learning points and slides. This is the sixth installment in the series. I would appreciate your feedback and any additional insights you might have. Thank you! https://www.linkedin.com/posts/arslanali434343_%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F-activity-7227946790078042112-Dyv-?utm_source=share&utm_medium=member_android
  • v

    Vivek Dubey

    08/12/2024, 5:48 AM
    🤖 AI Augmentation to Scale Data Products to a Data Product Ecosystem: Where AI Augments the Data Product Lifecycle, The Significance of User Experience, and The Ability to Focus on Advanced Verticals with Less Resources at Hand
    📃 In this article, we focus on optimising Data Product development through AI to build and scale Data Products more quickly, naturally, and effectively. You'll learn:
    • Classes of AI
    • Challenges Orgs Face While Building Data Products
    • Key Areas Where AI Complements The Data Product Journey
    • 0-1. Polishing Processes at the Semantic Layer
    and more!
    💌 Read the complete article here: https://moderndata101.substack.com/p/ai-augmentation-to-scale-data-products
  • a

    Arslan

    08/15/2024, 10:09 AM
    Hey data engineers! I've compiled a recap of my Spark Optimization Series. It's been an insightful journey, and I hope these tips help you optimize your skills. Let's elevate our data engineering expertise together! https://www.linkedin.com/posts/arslanali434343_%3F%3F%3F%3F%3F-%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F%3F-%3F%3F%3F%3F%3F%3F-activity-7229789127334871046-jQIk?utm_source=share&utm_medium=member_desktop