Hi community, I was having a look at the kafka...
# contribute-code
b
Hi community, I've been looking at the Kafka connector code these days in order to add stateful ingestion to it, and I noticed that there are two cases that produce warnings:
1. Schema not found (https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/kafka.py#L165-L166)
2. Schema is not AVRO (https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/kafka.py#L175-L177)
While I can see specific scenarios where this behaviour could be useful, both of these warnings can be misleading when a commit policy of ON_NO_ERRORS_AND_NO_WARNINGS is selected, because they are the outcome of perfectly normal situations (i.e. a topic with no schema or one using a different schema strategy, and a topic with a non-AVRO schema). My proposal would be to either not report the warnings at all, or to add a couple of configuration parameters to the source:
• ignore_warnings_on_missing_schema
• ignore_warnings_on_schema_type
to selectively disable them. wdyt?
i
Hello Claudio! You bring up a very good point! I think the best way forward would be to add both of those configuration parameters, disabled by default so we don't break UX initially. As they get used we can think about switching the defaults. Thoughts?
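A rough sketch of how the two flags could gate the warnings, just to make the idea concrete. This is not the actual kafka.py code: `KafkaWarningConfig`, `Report` and `check_topic_schema` are hypothetical stand-ins, and only the two parameter names come from this thread. Both flags default to False, so nothing changes unless someone opts in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KafkaWarningConfig:
    # Hypothetical config holder; both flags default to False so that
    # existing recipes keep reporting warnings exactly as before.
    ignore_warnings_on_missing_schema: bool = False
    ignore_warnings_on_schema_type: bool = False


@dataclass
class Report:
    # Stand-in for the source report; it only collects warning messages.
    warnings: List[str] = field(default_factory=list)


def check_topic_schema(
    topic: str,
    schema_type: Optional[str],
    config: KafkaWarningConfig,
    report: Report,
) -> None:
    """Record a warning for a missing or non-AVRO schema unless suppressed.

    schema_type is None when no schema was found for the topic.
    """
    if schema_type is None:
        if not config.ignore_warnings_on_missing_schema:
            report.warnings.append(f"{topic}: schema not found")
    elif schema_type != "AVRO":
        if not config.ignore_warnings_on_schema_type:
            report.warnings.append(
                f"{topic}: schema type {schema_type} is not AVRO"
            )
```

With both flags set to True, a run over topics without schemas or with non-AVRO schemas would leave the report free of these warnings, so a commit policy of ON_NO_ERRORS_AND_NO_WARNINGS would not be blocked by them.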
b
Hi Pedro, yes I agree with your analysis, shouldn’t be hard to achieve 😉