# ingestion
f
I'm trying to bring my Confluent Cloud instance into DataHub. Schemas are protobuf in the Confluent Schema Registry. Most of our message keys are UUIDs; the schema for a topic key in the registry is:
```protobuf
syntax = "proto3";
package atech.proto.shared;

message Uuid {
  string value = 1;
}
```
What I'm seeing is that it pulls in one topic, then errors on the second. I think it's telling me the shared proto objects are duplicated:
```
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "shared/uuid.proto":
  atech.proto.shared.Uuid.value: "atech.proto.shared.Uuid.value" is already defined in file "atech_app_journey_ape_completed-key.proto".
  atech.proto.shared.Uuid: "atech.proto.shared.Uuid" is already defined in file "atech_app_journey_ape_completed-key.proto".
```
It seems I might have hit this limitation, but I can't think of a workaround, so I'm not quite sure where to go from here. Any clues?
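For reference, the duplicate-symbol failure can be reproduced outside the connector. Below is a minimal sketch, assuming the connector compiles every subject's schema (plus its references) into one process-wide protobuf descriptor pool: registering the same fully-qualified message under two different file names triggers the same TypeError. The file names mirror the ones in the error above; the idea that the first topic's schema arrived with shared/uuid.proto inlined is an assumption.
```python
# Minimal repro sketch: register the same fully-qualified message under two
# different file names in the default descriptor pool.
from google.protobuf import descriptor_pb2, descriptor_pool


def uuid_file(file_name: str) -> descriptor_pb2.FileDescriptorProto:
    """Build a FileDescriptorProto equivalent to shared/uuid.proto."""
    fdp = descriptor_pb2.FileDescriptorProto()
    fdp.name = file_name
    fdp.package = "atech.proto.shared"
    fdp.syntax = "proto3"
    msg = fdp.message_type.add()
    msg.name = "Uuid"
    field = msg.field.add()
    field.name = "value"
    field.number = 1
    field.label = descriptor_pb2.FieldDescriptorProto.LABEL_OPTIONAL
    field.type = descriptor_pb2.FieldDescriptorProto.TYPE_STRING
    return fdp


pool = descriptor_pool.Default()
pool.Add(uuid_file("atech_app_journey_ape_completed-key.proto"))  # first topic: fine
pool.Add(uuid_file("shared/uuid.proto"))  # raises TypeError: symbol already defined
# (the exact TypeError message varies across protobuf runtimes)
```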
b
@helpful-optician-78938 Do you have any guidance for Chris? (I know you've worked on this previously)
@fresh-garage-83780 Are you able to ingest the topics themselves, i.e., ignore the proto schemas?
f
I could live with that for now, but I couldn't see a config property to turn off the schema stuff
b
Oh - neither do I. I would definitely expect that to exist. Let me sync with a few folks to understand this a bit better
cc @dazzling-judge-80093 as well
f
No worries, thanks for having a look. I'm knocking together a quick Python emitter as a workaround that pulls all the topic names and posts a simple `datasetProperties` aspect for each
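A minimal sketch of that kind of emitter, assuming the confluent_kafka AdminClient for listing topics and the DataHub Python SDK's REST emitter. The broker and credential values, the GMS address, and the "DEV" environment are placeholders; the actual gist may differ.
```python
# Sketch of the workaround: list Kafka topic names and emit a bare
# datasetProperties aspect for each, so the topics at least exist in DataHub.
from confluent_kafka.admin import AdminClient

import datahub.emitter.mce_builder as builder
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DatasetPropertiesClass

admin = AdminClient(
    {
        "bootstrap.servers": "<broker>.confluent.cloud:9092",  # placeholder
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "sasl.username": "<api-key>",     # placeholder
        "sasl.password": "<api-secret>",  # placeholder
    }
)
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # placeholder

# list_topics() returns cluster metadata; .topics maps topic name -> metadata.
for topic in admin.list_topics(timeout=10).topics:
    mcp = MetadataChangeProposalWrapper(
        entityUrn=builder.make_dataset_urn(platform="kafka", name=topic, env="DEV"),
        aspect=DatasetPropertiesClass(name=topic),
    )
    emitter.emit(mcp)
```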
b
okay got it
sounds good
f
I hacked this gist together to move the project along for now; it plugs the gap in our lineage