# integrate-iceberg-datahub
f
@modern-monitor-81461 Hi! It seems like lots of people here are facing the same issue integrating DataHub with Apache Iceberg using a SQL or Hive meta catalog. As I understand it, at the end of summer the pyiceberg library was updated to 0.4.0 for DataHub, so I guess a connection should be possible at least with the Hive meta catalog. Could you please let us know if you have any suggestions on how to resolve this error: “Apache Hive support are not installed”? I was able to run “pip install pyiceberg[hive]==0.4.0” on the server itself and inside the “datahub-actions” docker container, if that makes sense, but the error remains the same.
Just FYI, in case someone is searching for the same question: I added the thrift library to a specific destination in the datahub-actions docker container and it worked. pip install thrift --target=/tmp/datahub/ingest/venv-iceberg-f3c1b67e57e34548/lib/python3.10/site-packages
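For anyone repeating this fix, a quick way to check whether the packages are visible to a given interpreter is sketched below. It is only a sketch: the helper name `missing_modules` is made up for illustration, and it assumes you run the script with the Python interpreter of the ingestion venv itself, since a package installed elsewhere on the server or container will not show up there.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the module names that cannot be located by the current interpreter."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or a module has no spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Shows which interpreter (and therefore which site-packages) is being checked.
    print("interpreter:", sys.executable)
    print("missing:", missing_modules(["thrift", "pyiceberg"]))
```

If `thrift` is listed as missing when run from the venv but not when run from the system Python, that would be consistent with the package having been installed into a different site-packages directory.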
v
Hi @full-alligator-99452, thanks for the suggestion. Were you able to ingest Iceberg data using the Hive catalog?
I was trying the same thing. Were there any extra options you needed to get it working?
Same for me, @lively-appointment-50242 @full-alligator-99452. I used the Hive catalog and the ingestion shows as Failed, but the metadata is populating in DataHub. Very strange. The ingestion logs show the error below:
'failures': {'general': ['Failed to create workunit: Property table_type missing, could not determine type: sf1000delta.store_sales', '
This error shows up for all tables.
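One possible reading of that message, offered as an assumption rather than a confirmed diagnosis: the Hive catalog appears to decide whether a metastore entry is an Iceberg table by looking at a `table_type` table parameter, and complains when that parameter is absent. The sketch below only mimics that kind of check on plain dictionaries; `classify_table` and the sample parameters are hypothetical and are not pyiceberg's or DataHub's actual code.

```python
def classify_table(parameters):
    """Classify a metastore table from its parameters dict (illustrative only)."""
    table_type = parameters.get("table_type")
    if table_type is None:
        # Corresponds to the "Property table_type missing" situation.
        return "missing table_type"
    if table_type.lower() == "iceberg":
        return "iceberg"
    return "not iceberg"


if __name__ == "__main__":
    # Hypothetical parameter sets, not taken from a real metastore.
    samples = {
        "iceberg_table": {"table_type": "ICEBERG"},
        "plain_hive_table": {"EXTERNAL": "TRUE"},
        "other_format_table": {"table_type": "EXTERNAL_TABLE"},
    }
    for name, params in samples.items():
        print(name, "->", classify_table(params))
```

If that reading is right, tables in the same metastore that were not created as Iceberg tables would produce this failure for each such table, while genuine Iceberg tables would still be ingested.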
Thanks to both of you for the insight on the solution @lively-appointment-50242 @full-alligator-99452 👍
f
@victorious-car-1170 Hi! I was on a short vacation :) Yes, I was able to ingest metadata from Iceberg with the Hive meta catalog. The ingestion tasks even have the status “success”. As mentioned, I just added the “thrift” library to a specific folder inside the “datahub-actions” docker container. I haven’t done anything else special.
v
Yeah, I did the same thing, but somehow my ingestion status is Failed even though the metadata ingestion is happening. Thanks.
Have you ever tried the Starburst integration with DataHub?