# integrate-protobuf-datahub
  • l

    little-megabyte-1074

    01/13/2022, 7:53 PM
    set the channel description: Channel dedicated to conversation & collaboration to support protobuf in DataHub
  • l

    little-megabyte-1074

    01/13/2022, 7:55 PM
    Hi folks! Great to catch up this afternoon 🙂 I wasn’t able to find Rashmi - is she in the Slack community yet?
  • g

    gentle-night-56466

    01/13/2022, 7:55 PM
    Probably not yet
  • w

    white-lighter-1476

    01/13/2022, 7:57 PM
    👍
  • g

    green-football-43791

    01/13/2022, 8:00 PM
    blob wave
  • g

    gentle-night-56466

    01/13/2022, 8:00 PM
    I haven’t had the chance to really dig into the DataHub metadata model yet, but I anticipate some need to extend it to handle the nesting of schemas that is quite common in protobuf. While the nested structure could be represented as a struct in each dataset, that makes updating a nested schema painful, because you have to identify every dataset that contains that specific nesting. At a glance, the inter-dataset relationships mostly involve lineage or foreign-key relations. Has any work or thought been put into this?
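One way to picture the concern in this message: if each nested message type is tracked as its own entity and datasets only reference it by name, finding every dataset affected by a change becomes a reverse lookup instead of a scan over embedded structs. The following is a minimal Python sketch under that assumption. The dataset and message names are hypothetical, and this is not DataHub's metadata model.

```python
from collections import defaultdict

# Hypothetical datasets mapped to the nested message types they reference.
DATASET_REFERENCES = {
    "orders": ["Address", "Money"],
    "customers": ["Address"],
    "invoices": ["Money"],
}


def build_reverse_index(references):
    """Map each nested message type to the sorted list of datasets using it."""
    index = defaultdict(list)
    for dataset, nested_types in references.items():
        for nested in nested_types:
            index[nested].append(dataset)
    return {nested: sorted(datasets) for nested, datasets in index.items()}


index = build_reverse_index(DATASET_REFERENCES)
# Datasets that would need attention if the Address message changes.
print(index["Address"])
```

With the nesting embedded as a struct per dataset, the same question would require inspecting every dataset's schema.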
  • g

    gentle-night-56466

    03/09/2022, 4:56 PM
    Here was the output to the protobuf integration investigation: https://github.com/linkedin/datahub/pull/4178 see the README for overview.
  • g

    gentle-night-56466

    03/21/2022, 7:17 PM
    I am starting to consider the next steps around protobuf.
    1. Domain support. For example, annotating a protobuf message to associate the dataset with a domain.
    2. Ownership. Same idea, but adding the ownership.
    3. Improve flexibility of the protobuf metadata annotations, for example using annotations like `meta.ownership.team` and `meta.ownership.organization` instead of `meta.msg.team` or `meta.msg.organization`. This would work exactly the same way as before but provide a way to group related annotations visually in the proto schema text.
    4. List properties: allow repeated values to be collected into a JSON array string and stored as a property.
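Ideas 3 and 4 can be sketched roughly as follows. This is an illustration in Python, not the datahub-protobuf implementation: the `group_annotations` and `to_properties` helpers, the `meta.msg.tags` annotation, and all values are hypothetical. Only the `meta.ownership.*` and `meta.msg.*` naming pattern comes from the message above.

```python
import json
from collections import defaultdict


def group_annotations(options):
    """Group (dotted_name, value) pairs by the segment that follows 'meta.'."""
    grouped = defaultdict(dict)
    for name, value in options:
        parts = name.split(".")
        if len(parts) != 3 or parts[0] != "meta":
            continue  # ignore anything that is not meta.<group>.<key>
        _, group, key = parts
        grouped[group].setdefault(key, []).append(value)
    return grouped


def to_properties(grouped):
    """Flatten groups into string properties; repeated values become a JSON array string."""
    props = {}
    for group, keys in grouped.items():
        for key, values in keys.items():
            if len(values) == 1:
                props[f"{group}.{key}"] = values[0]
            else:
                props[f"{group}.{key}"] = json.dumps(values)
    return props


# Hypothetical annotations as they might be read from a proto message.
options = [
    ("meta.ownership.team", "data-platform"),
    ("meta.ownership.organization", "acme"),
    ("meta.msg.tags", "pii"),
    ("meta.msg.tags", "finance"),
]
props = to_properties(group_annotations(options))
print(props)
```

Storing the repeated values as a JSON array string keeps the property value a plain string while still letting a consumer recover the list with `json.loads`.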
  • g

    gentle-night-56466

    03/21/2022, 7:20 PM
    @mammoth-bear-12532 Interested in your perspective on the above ideas whenever you have a moment to consider. Thanks!
  • f

    future-smartphone-53257

    09/08/2022, 10:18 AM
    I don't quite understand why this file is checked into git: https://github.com/datahub-project/datahub/blob/b9068ffd2ef527a2b24b5d3828a9cc6c0f[…]gration/java/datahub-protobuf-example/libs/datahub-protobuf.jar
  • f

    future-smartphone-53257

    09/09/2022, 1:34 PM
    As far as I can tell, only the first message in any protobuf file is being ingested. Is this expected? Can I somehow change this?
  • f

    future-smartphone-53257

    09/09/2022, 1:35 PM
    I think ideally https://github.com/datahub-project/datahub/blob/master/metadata-integration/java/datahub-protobuf-example/schema/protobuf/meta/meta.proto should be in https://buf.build/explore - have you considered adding it there? And should I just log an issue for "feature" requests like this?
  • f

    future-smartphone-53257

    09/09/2022, 2:29 PM
    Also still, this is really annoying:
    java -cp /home/iwana/d.x/github.com/datahub-project/datahub/metadata-integration/java/datahub-protobuf/build/libs/datahub-protobuf-0.8.45-SNAPSHOT.jar datahub.protobuf.App var/generated/main.dsc proto/protobufmessage/price-scraper.proto
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    the SLF4J warnings
  • m

    mammoth-bear-12532

    09/20/2022, 6:00 AM
    @future-smartphone-53257: we have made some improvements to the ergonomics of using the standalone app based on some of your feedback. Specifically, it now supports custom platforms and subtypes, and you can point it to the root of the directory where your schema files live. Docs here. Latest jar is: here
    ❤️ 1
  • m

    microscopic-twilight-7661

    02/06/2023, 1:21 PM
    Hi everyone, I haven't got any answer in the `ingestion` channel so I hope I can get some here. Is there a way to specify which message to emit to DataHub if we have multiple non-nested schemas in a `*.proto` source file? I can see this line in the documentation:
    In addition, you can supply the root message in cases where a single protobuf source file includes multiple non-nested messages.
    However, I don't see any parameter to specify this. I am using version `0.8.45` of datahub-protobuf.
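The behaviour discussed in this thread (the first top-level message being picked unless a root message is supplied) can be pictured with the Python sketch below. It only models the selection logic. The `select_root_message` function, its `root_message` parameter, and the message names are hypothetical and do not correspond to any confirmed datahub-protobuf option.

```python
def select_root_message(top_level_messages, root_message=None):
    """Pick the message to emit from a proto file's top-level messages.

    With no explicit root, fall back to the first message declared in the file.
    """
    if not top_level_messages:
        raise ValueError("the file declares no top-level messages")
    if root_message is None:
        return top_level_messages[0]
    if root_message not in top_level_messages:
        raise ValueError(f"unknown root message: {root_message}")
    return root_message


# Hypothetical top-level messages, in declaration order.
messages = ["PriceRequest", "PriceResponse"]
default_choice = select_root_message(messages)
explicit_choice = select_root_message(messages, root_message="PriceResponse")
print(default_choice, explicit_choice)
```

Rejecting an unknown root name with an error, rather than silently falling back to the first message, makes a misspelled name visible at ingestion time.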
  • m

    mammoth-bear-12532

    02/07/2023, 4:00 PM
    @brainy-tent-14503 might know the answer to this
  • b

    bulky-shoe-65107

    10/16/2023, 12:40 AM
    has renamed the channel from "integration-protobuf" to "integrate-protobuf-datahub"
  • s

    some-activity-69873

    11/13/2023, 10:41 AM
    Hello! Has anyone managed to ingest proto schema recently?
  • s

    steep-vr-39297

    01/08/2024, 1:45 AM
    Hello, I created an issue because there was a bug in the UI when ingesting protobuf. Can you check it?
  • s

    steep-vr-39297

    01/18/2024, 5:34 AM
    @mammoth-bear-12532 I'm having trouble with the protobuf code and raised a PR, but haven't heard back yet. Is there any chance you can check it out? https://github.com/datahub-project/datahub/pull/9318