# random
  • f

    few-carpenter-93837

    09/19/2022, 12:48 PM
    Hey, just wanted to get confirmation regarding Field Level Lineage: is this an existing capability, or is it on the roadmap?
  • a

    acceptable-judge-21659

    09/22/2022, 6:31 AM
    Hi everybody, does anyone know how I can get the GMS hostname and port from datahub-web-react?
  • c

    clean-cpu-43303

    09/22/2022, 4:19 PM
    Column-Level Lineage is coming to DataHub - My reaction --> excited blob excited
    yay bear 6
    💯 13
    nod 2
  • b

    big-carpet-38439

    09/23/2022, 10:22 PM
    Call for contribution! We've heard repeatedly that the process for upgrading DataHub on Helm is non-obvious. A step-by-step guide on updating DataHub would be a great start in addressing this problem. It could describe everything about the upgrade experience, from explaining image tag semantics to checking release notes for action items to downloading + installing new charts to verifying that everything looks good. This would be hugely impactful not only for existing organizations using DataHub, but also for the new cohort of adopters. Is this something you're interested in helping out on? Let me know! 🙂 Cheers, John
    thank you 1
    teamwork 1
  • s

    steep-soccer-91284

    09/27/2022, 2:36 AM
    Is there any plan to translate the documentation? I hope there is a chance to contribute to DataHub.
  • b

    better-orange-49102

    09/28/2022, 10:14 AM
    https://datahubproject.io/docs/quickstart/ still says to install Docker Compose v1 (i.e. docker-compose). However, in https://github.com/datahub-project/datahub/commit/20138a32e56eb7dae3db61f967364b591ddd8be0 the quickstart has shifted to using v2 (docker compose). The documentation needs to be updated.
  • m

    mysterious-application-34432

    09/28/2022, 4:01 PM
    Hi all, The CFP form is still open for 2 more days, if you want to present a talk at OSA Con 2022 and join industry leaders such as Peter Zaitsev, Maxime Beauchemin, and Michael Hausenblas in the discussion, then this is your last chance to share your ideas! Fill out the CFP Form and submit your session details for approval. The closing date for submitting a talk is September 30.
    - What is it? It’s an annual, free, single day, virtual conference that brings together devs and industry leaders from across the analytic stack to discuss Open Source projects and analytic applications. This year’s theme explores leveraging Open Source Analytics to deliver cost-efficient, fast business insights.
    - When is it? 15 November 2022
    - What can I do? Submit a talk, become a community partner, or join us as an attendee.
    Find out more at altinity.com/osa-con/
    Follow us for updates on social media: Slack | Twitter | Facebook | LinkedIn | YouTube
  • f

    few-carpenter-93837

    09/30/2022, 6:40 AM
    Hey guys, in the visual lineage, have you thought about adding the schema tag to the entity items? It's currently really hard to separate landing, transformation, and business-facing entities for the business side...
  • w

    worried-branch-76677

    10/01/2022, 6:48 AM
    Hi DataHub, I am looking at implementing image/file upload for documentation. After some research, I noticed that GraphQL multipart upload is discouraged by Apollo GraphQL. What choices do we have here? Any guidance would be nice. If at all possible, I would rather not spin up a new service just to manage file uploads.
  • b

    blue-boots-43993

    10/01/2022, 6:08 PM
    Hi guys... I have several topics to discuss with the community, so I would like to use this channel as a starting point, after which we could potentially create separate channels for more focused discussion if needed.
    Topic 1: QlikSense Source - I am preparing a contribution to OSS with my QS ingestion source. I am wondering how many people are using Qlik and what features you would like to see covered by this ingestion source.
    Topic 2: A lot of proprietary tools such as Precisely, Informatica etc. have a concept of a Business Process. Do you think it would be beneficial to you and your company to have such an entity inside DataHub? E.g. a Business Process could have child entities such as Business Process Stage/Step, somewhat similar to the DataFlow and DataJob entities. It could also have connections to datasets, where it could read/write datasets etc. The thing that would be different from DataFlow/Job would be the process diagram.
    Topic 3: How many of you use Azure? Specifically DataFactory. We are in the process of writing this as an additional contribution after Qlik. Would like to sync around that.
    💯 1
  • b

    better-orange-49102

    10/19/2022, 2:43 AM
    I'm using markdown to programmatically render some tables in description, hopefully not going to be disrupted 🙏. What's the rationale for doing away with markdown?
    plus1 1
  • r

    ripe-truck-65149

    10/20/2022, 3:22 PM
    Data Engineering From Notebook To Production
    A high-level introduction to all of Data Engineering in three hours. In this fast-paced survey course, you’ll learn how to take a data project from prototype to production quality data platform with the modern data stack.
    • Get a complete overview of today’s data engineering tools
    • Learn best practices for data architecture and design
    • Explore techniques to scale up to big data sets and scale out to more features and larger teams
    • Benefit whether you are new to data engineering or a practicing data engineer looking to keep up with current tools and methods
    Two class formats to fit your schedule: one hour sessions Wednesday afternoons in November, or a half day December 1st. Registration open now
  • h

    helpful-elephant-58485

    10/26/2022, 6:01 PM
    Will the DataHub October Town Hall tomorrow be recorded?
  • m

    mammoth-bear-12532

    10/26/2022, 6:26 PM
    Yup it always is (https://www.youtube.com/c/DataHubChannel)! However, the excitement of watching a live demo almost fail but then succeed spectacularly is something that we cannot really capture through YouTube 🙂
    😂 3
    plus1 4
  • b

    big-carpet-38439

    10/28/2022, 6:27 PM
    Any good Halloween costumes?
    🎃 3
  • a

    able-noon-34556

    11/02/2022, 12:39 AM
    Even between analytics folks, it feels like we'd benefit from a little lingua franca from time to time. 😄 👋 I'm Rhys (LinkedIn), in case you ever want to chat data or growth.
    😅 1
  • m

    mysterious-application-34432

    11/02/2022, 8:50 PM
    Hi folks! We’ve just locked down the full schedule for this year’s Open Source Analytics Conference (OSA Con 2022), and what a schedule it’s turned out to be! 20+ talks, 30+ speakers, 15 community sponsors, 2 keynotes, 2 panel discussions, and 2 tracks, all on the same day! Thanks to the many people who have already signed up to join us on November 15th. If you haven’t had a chance to register yet, please secure your spot for free as soon as possible!
  • c

    colossal-laptop-87082

    11/07/2022, 6:10 AM
    Hello team!! I'm new to DataHub. I want to ingest CSV files and set up these observability checks with the help of DataHub. Is this possible?
    • Freshness
    • Volume
    • Schema
  • b

    better-orange-49102

    11/11/2022, 1:36 AM
    I feel the feature-requests portal is not updated as promptly as GitHub issues labelled as feature requests, which can link to specific PRs and are more easily linked to other issues.
  • w

    worried-branch-76677

    11/15/2022, 4:48 AM
    Hi, regarding stateful ingestion: this calculation is incorrect, right? https://github.com/datahub-project/datahub/blob/3f17408fe8abb5cf134e40724bb9f6461b[…]/datahub/ingestion/source/state/stale_entity_removal_handler.py It should be
    return (1 - overlap_count_all / old_count_all) * 100.0
    instead of
    return (1 - overlap_count / old_count_all) * 100.0
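    To make the difference between the two return statements concrete, here is a minimal sketch. It assumes the handler accumulates per-entity-type counts into overlap_count_all and old_count_all inside a loop. The function names and the (overlap_count, old_count) input pairs are hypothetical, and this is not the actual DataHub code.

```python
from typing import Iterable, Tuple

# Hypothetical input: one (overlap_count, old_count) pair per entity type.
Counts = Tuple[int, int]


def percent_changed_reported(per_type: Iterable[Counts]) -> float:
    """Formula as quoted in the report: divides overlap_count, not the total.

    Assumption: overlap_count is a loop variable, so after the loop it
    holds only the value for the last entity type.
    """
    overlap_count = 0
    old_count_all = 0
    for overlap_count, old_count in per_type:
        old_count_all += old_count
    if old_count_all == 0:
        return 0.0  # nothing in the old state, so nothing changed
    return (1 - overlap_count / old_count_all) * 100.0


def percent_changed_proposed(per_type: Iterable[Counts]) -> float:
    """Proposed formula: total overlap divided by total old count."""
    overlap_count_all = 0
    old_count_all = 0
    for overlap_count, old_count in per_type:
        overlap_count_all += overlap_count
        old_count_all += old_count
    if old_count_all == 0:
        return 0.0
    return (1 - overlap_count_all / old_count_all) * 100.0


counts = [(3, 4), (1, 4)]  # two entity types, 8 old entities, 4 still present
print(percent_changed_reported(counts))  # 87.5
print(percent_changed_proposed(counts))  # 50.0
```

    With these made-up counts, half of the old entities are still present, so 50% changed is the expected answer. Under the stated assumption, the quoted formula counts only the last type's overlap and overstates the change.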
  • m

    mammoth-bear-12532

    11/17/2022, 7:33 AM
    A pretty technical guide to DataHub on AWS: https://aws.amazon.com/blogs/big-data/part-1-deploy-datahub-using-aws-managed-services-and-ingest-metadata-from-aws-glue-and-amazon-redshift/
    🎉 1
    🙏 1
    plus1 1
  • m

    miniature-painting-28571

    11/21/2022, 10:18 PM
    hi - what SCANNER is being used to find PII data with DataHub?
  • m

    miniature-painting-28571

    11/21/2022, 10:18 PM
    or is that something on the roadmap for the future?
  • f

    full-raincoat-68234

    11/25/2022, 1:48 PM
    Hi guys, is lineage with Snowflake automatic? And do I need the enterprise version?
  • m

    modern-artist-55754

    11/29/2022, 11:06 PM
    hi guys, does anyone know when v0.9.3 will be released?
  • m

    mammoth-bear-12532

    12/01/2022, 4:42 PM
    It's really bleak out here in the SF Bay Area today... only bright spot ... DataHub TownHall ☀️ datahubbbb
    😆 7
  • w

    worried-branch-76677

    12/02/2022, 5:35 AM
    Hi Team, regarding the stateful ingestion checkpoint: I realized that the Job ID is pretty rigid. We are not really able to change this value from the recipe config. Would it make more sense for it to be pipeline_name instead? The reason is that I am building a connector that creates a checkpoint for each workspace / namespace. Because of how JobID is created, I am not able to create a checkpoint for each individual workspace / namespace in one recipe. I think one workaround is to run 1 workspace per recipe in a separate Python process.
  • b

    better-orange-49102

    12/05/2022, 1:39 AM
    just stumbled upon https://www.linen.dev/s/datahubspace/ 🙂
  • w

    worried-branch-76677

    12/08/2022, 1:03 AM
    Do we have a channel for reporting bugs, like this one? I think it's hard for the team to track them in troubleshoot, since there are normally a lot of people asking questions there. https://datahubspace.slack.com/archives/C029A3M079U/p1670409819840909
  • a

    acceptable-honey-21072

    12/12/2022, 1:17 PM
    Everything changed when companies like Google and Facebook started to analyze clickstream data for product decisions. You should give it a read if you are interested in understanding how the role of analytics kept evolving: https://www.craft.do/s/klTxlmMfrsjWUK
    👍 1