# ingestion
b
Hi, I think there is a problem ingesting table column comments using the Databricks Hive metastore and the Hive ingestor. If I use the Delta Lake ingestion it works, but the comments come through in JSON format; I'm not sure whether that is due to the Delta Lake ingestion or a Databricks standard. Does anyone know whether changing a parameter in the recipe would make it work with the Hive ingestion?
This is the recipe for hive + databricks
source:
  type: hive
  config:
    host_port: adb-4911762757593392.12.azuredatabricks.net:443
    username: token
    password: dapi5d4d8bbae4a81e8b4d52b28dd2c94390-2
    scheme: 'databricks+pyhive'
    database: silver
    platform: delta-lake
    profiling:
      enabled: true
    table_pattern:
      allow:
        - silver.calendar_calendars

    options:
      connect_args:
        http_path: 'sql/protocolv1/o/4911762757593392/0426-111319-t6tf09gf'

sink:
  type: datahub-rest
  config: ...............
m
Hi @billions-twilight-48559 thanks for sharing this, looks like a small bug fix needed on the connector for this.
b
should I report anywhere?
m
We already merged the fix for this. Can you try it out with
acryl-datahub[hive]==0.8.41rc2
b
sure!
I think it is not working
going to erase all metadata first
but incremental crawling is not getting the comments
nope, sorry @mammoth-bear-12532, it's not working on datahub 0.8.41
comments are not imported
m
Are you setting up ingestion in the UI or using the cli to ingest?
b
cli
source:
  type: hive
  config:
    host_port: adb-4911762757593392.12.azuredatabricks.net:443
    username: token
    password: dapi5d4d8bbae4a81e8b4d52b28dd2c94390-2
    scheme: 'databricks+pyhive'
    database: silver
    platform: delta-lake
    table_pattern:
      allow:
        - silver.calendar_calendars

    options:
      connect_args:
        http_path: 'sql/protocolv1/o/4911762757593392/0426-111319-t6tf09gf'

sink:
  type: datahub-rest
  config:
    server:
m
What does
datahub --version
say?
b
➜ crawler datahub --version
acryl-datahub, version 0.8.41rc2
m
@careful-pilot-86309 could you check on this one?
c
@billions-twilight-48559 The fix is not present in your datahub version. Please upgrade to at least 0.8.41.2rc1 (this is the current latest version).
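For anyone hitting the same thing, here is a minimal sketch of checking whether the installed package is at or above the version mentioned in this thread. The helper names `version_key` and `has_fix` are hypothetical, and the parser only understands plain `N.N.N[.N][rcN]` strings like the ones quoted here, so treat it as an illustration rather than a general version comparator.

```python
import re
from importlib import metadata

# Minimum version quoted in this thread as containing the fix.
MIN_VERSION = "0.8.41.2rc1"


def version_key(version: str, width: int = 6):
    """Turn 'N.N.N[.N][rcN]' into a sortable tuple (handles only that shape)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:rc(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = [int(part) for part in match.group(1).split(".")]
    release += [0] * (width - len(release))  # pad so 0.8.41 == 0.8.41.0
    # A release candidate sorts before the final release with the same number.
    is_final = match.group(2) is None
    rc_number = 0 if is_final else int(match.group(2))
    return (tuple(release), is_final, rc_number)


def has_fix(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Return True if `installed` is at or above `minimum`."""
    return version_key(installed) >= version_key(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("acryl-datahub")
    except metadata.PackageNotFoundError:
        print("acryl-datahub is not installed in this environment")
    else:
        status = "has the fix" if has_fix(installed) else "needs an upgrade"
        print(installed, status)
```

With the versions from this thread, `has_fix("0.8.41rc2")` is false and `has_fix("0.8.41.2rc1")` is true, which matches what was observed above.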
b
Nice @careful-pilot-86309 sorry
now it's working
many thanks
c
Great. teamwork