# contribute-code
delightful-beard-43126
05/09/2023, 6:11 PM
Hi team, I’m trying to consume Kafka topics with Avro schemas that were generated by Apache Flink. However, Flink creates Avro records with the name property set to “record”, which breaks the ingestion of Avro data into DataHub because “record” is treated as a reserved name (see https://github.com/datahub-project/datahub/issues/2565 ). This was indeed a limitation of the Python Avro library, but its recent versions have a feature that allows disabling this check. Here is the Apache JIRA ticket: https://issues.apache.org/jira/browse/AVRO-3680 . I was thinking, could we add this feature to DataHub? We could either disable the check entirely or add a configuration option that is passed to this function call: https://github.com/datahub-project/datahub/blob/94e7e51175660afbfb7b5cf198a3263f30d56f62/metadata-ingestion/src/datahub/ingestion/extractor/schema_util.py#L509
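A minimal sketch of the idea, not the actual DataHub change. It assumes the `validate_names` keyword that AVRO-3680 added to `avro.schema.parse` in recent releases of the Python `avro` package. The schema and the `parse_schema` helper are hypothetical, made up for illustration only.

```python
import json

# Hypothetical schema shaped like the ones described above:
# a record whose name is the reserved word "record".
FLINK_STYLE_SCHEMA = json.dumps(
    {
        "type": "record",
        "name": "record",
        "fields": [{"name": "id", "type": "long"}],
    }
)


def parse_schema(schema_json: str, validate_names: bool = True):
    """Parse an Avro schema string, optionally skipping name validation.

    Assumption: the installed `avro` package is recent enough to accept
    the `validate_names` keyword (added by AVRO-3680).
    """
    # Imported here so the sketch can be loaded without `avro` installed.
    import avro.schema

    return avro.schema.parse(schema_json, validate_names=validate_names)
```

With a recent enough `avro` installed, `parse_schema(FLINK_STYLE_SCHEMA, validate_names=False)` is the call that would skip the reserved-name check. Whether to hard-code `False` or expose it as a configuration option is the open question above.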
mammoth-bear-12532
05/09/2023, 9:42 PM
Thanks for raising this. cc @gray-shoe-75895
delightful-beard-43126
05/09/2023, 10:42 PM
I can work on a PR if that’s ok
mammoth-bear-12532
05/09/2023, 11:08 PM
that would be great. thanks!