# integrate-iceberg-datahub
m
I'm looking at DataHub and there is something I cannot explain. When you mouse over a type in the UI, I think you are supposed to see the native_data_type from Avro. At least that's what I see (I have attached a screenshot with a Time type and a tooltip of Timestampz, which is the Iceberg native_data_type). But there is a field that is mapped to Time and the tooltip shows Date. I was expecting to see the type as Date and not Time. Here is the Iceberg metadata for that field:
}, {
                "id" : 227,
                "name" : "date",
                "required" : false,
                "type" : "date"
              }, {
As you can see, its type is date and it will be mapped to DateType in Python. In my IcebergSource, I create the following Avro schema:
elif isinstance(type, IcebergTypes.DateType):
        dateType : IcebergTypes.DateType = type
        return {
            "type": "int",
            "logicalType": "date",
            "native_data_type": repr(dateType),
            "_nullable": True,
        }
where repr(dateType) is
def __repr__(self):
        return "date"
Is it because a logical Avro type of date is mapped to a Time type in the UI, or is there something broken on my side? I don't know if all of this makes sense without demoing it! Sorry if it's confusing.
h
Hi @modern-monitor-81461, it is hard to tell just from what you are seeing in the UI. I am assuming that you have integrated your changes with schema_util. When you run ingestion, can you check what the native_data_type is in the SchemaFields returned by the avro_schema_to_mce_fields call?
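A rough sketch of that check, under some assumptions: the helper find_native_type_mismatches is hypothetical, and it only assumes each returned field object exposes fieldPath and nativeDataType attributes with plain field names as paths. The stand-in objects below replace the real SchemaFields so the snippet runs without DataHub installed.

```python
from types import SimpleNamespace


def find_native_type_mismatches(fields, expected):
    """Return (fieldPath, expected, actual) for every field whose
    nativeDataType differs from the expected native type."""
    mismatches = []
    for field in fields:
        path = field.fieldPath
        if path in expected and field.nativeDataType != expected[path]:
            mismatches.append((path, expected[path], field.nativeDataType))
    return mismatches


# Stand-ins for the fields returned during ingestion (illustrative values only).
fields = [
    SimpleNamespace(fieldPath="date", nativeDataType="date"),
    SimpleNamespace(fieldPath="created_at", nativeDataType="date"),
]

# Native types expected from the Iceberg metadata (hypothetical for created_at).
expected = {"date": "date", "created_at": "timestamptz"}

mismatches = find_native_type_mismatches(fields, expected)
for path, want, got in mismatches:
    print(f"{path}: expected {want!r}, got {got!r}")
```

In a real run, the list passed as fields would be whatever the ingestion code gets back from the conversion call, and any reported mismatch points at the mapping code rather than the UI.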
m
That was a good idea, thanks @helpful-optician-78938. Found my problem, this was totally my fault!
h
Glad to hear that you were able to root-cause the issue! 🙂