# troubleshoot
m
Hello team, I'm trying to use classification for Snowflake and I get this error:
Failed to classify table columns
cli_version and gms_version are both '0.9.6.1'. Everything seems to work except classification. What should I do? Any help would be appreciated. Thank you!
a
@microscopic-room-90690 please refrain from tagging users directly for support as it’s against our community guidelines
b
You can view our Slack Guidelines here: https://datahubproject.io/docs/slack/
a
Try updating to the most recent version, and be sure to review the Snowflake documentation again here: https://datahubproject.io/docs/generated/ingestion/sources/snowflake/#config-details
h
Hey @microscopic-room-90690 - if you continue to face the error, please rerun ingestion with debug logs enabled. The debug logs should give more details about the failure.
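For reference, one way to turn on debug logs for CLI-based ingestion is the CLI's global `--debug` flag. The sketch below only assembles and runs that command; it assumes the `datahub` CLI is on the PATH, and the recipe file name `recipe.yml` and the log file name are placeholders that do not come from this thread.

```python
import shlex
import subprocess


def debug_ingest_command(recipe_path: str) -> list[str]:
    """Build the CLI invocation that runs a recipe with debug logging enabled."""
    # `--debug` is a global option, so it goes before the `ingest` subcommand.
    return ["datahub", "--debug", "ingest", "-c", recipe_path]


def run_debug_ingest(recipe_path: str, log_path: str = "ingest_debug.log") -> int:
    """Run ingestion and write stdout/stderr to a log file (placeholder path)."""
    with open(log_path, "w") as log_file:
        result = subprocess.run(
            debug_ingest_command(recipe_path),
            stdout=log_file,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to execute.
    print(shlex.join(debug_ingest_command("recipe.yml")))
```

Searching the captured log for the failing column name is usually the quickest way to find the classification messages.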
m
Thank you @hundreds-photographer-13496. The debug log shows
Failed to extract info type due to Failed basic checks for infotype
while the other steps run successfully. Could you show me how to fix it?
h
Got it. In my experience, this usually happens when a table column has insufficient data (fewer than 50 non-null values) to be classified with confidence. If that is the case for your tables, you can ignore these warnings.
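To check whether a column falls under that limit, you could count non-null values on a sample of rows. This is a rough sketch, not the classifier's own logic: the threshold of 50 is taken from the message above rather than read from the library, and `non_null_counts` and `columns_below_threshold` are hypothetical helper names.

```python
from typing import Any, Mapping, Sequence

# Threshold quoted in this thread; an assumption, not read from the library.
MIN_NON_NULL = 50


def non_null_counts(rows: Sequence[Mapping[str, Any]]) -> dict[str, int]:
    """Count non-null (not None) values per column across sampled rows."""
    counts: dict[str, int] = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value is not None:
                counts[column] += 1
    return counts


def columns_below_threshold(
    rows: Sequence[Mapping[str, Any]], minimum: int = MIN_NON_NULL
) -> list[str]:
    """Return the columns that have fewer than `minimum` non-null values."""
    return sorted(
        column for column, count in non_null_counts(rows).items() if count < minimum
    )


if __name__ == "__main__":
    sample = [{"email": "user@example.com", "phone": None}] * 60
    print(columns_below_threshold(sample))
```

Columns reported by `columns_below_threshold` are the ones for which the warning would be expected under this explanation.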
m
Thanks @hundreds-photographer-13496. The classification config I used is the default, and at least the email field has more than 50 non-null values. How can I get classification to ingest successfully?
@hundreds-photographer-13496 Update: from the source table, I created a new table containing the email field with only non-null values, and classification returned the correct result. The source table itself still does not produce a result. So I guess the values of at least one field must all be non-null.
a
I am looking into this, will get back. I have a feeling that this was fixed but did not make it to the release.
h
Hey @microscopic-room-90690, we released classification library version 0.0.6, which has this fix. It will take some time to bundle this version with a DataHub release. However, if you are using CLI-based ingestion and have control over its Python environment, you can use the newer version right away by installing it manually in that environment:
pip install acryl-datahub-classify==0.0.6
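After installing, it can help to confirm that the environment running ingestion actually picked up the new version. A small sketch using the standard library's `importlib.metadata`; `installed_version` and `has_fix` are hypothetical helpers, and the simple numeric comparison treats pre-release strings such as `0.0.6rc1` conservatively as not containing the fix.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PACKAGE = "acryl-datahub-classify"
FIXED_IN = (0, 0, 6)  # version named in this thread as containing the fix


def installed_version(package: str = PACKAGE) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def has_fix(version_string: Optional[str]) -> bool:
    """Compare the leading numeric parts of a version against FIXED_IN."""
    if version_string is None:
        return False
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component
        parts.append(int(piece))
    return tuple(parts) >= FIXED_IN


if __name__ == "__main__":
    found = installed_version()
    print(f"{PACKAGE}: {found or 'not installed'} (has fix: {has_fix(found)})")
```

Run it with the same Python interpreter that the `datahub` CLI uses, otherwise it may report a different environment.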
m
Thank you @hundreds-photographer-13496, I will try it.