# ingestion
s
Hello friends, while playing with a Snowflake ingestion and its profiling configuration (profiling was making the ingestion super slow), I ran into the usual question that comes up with automated profiling: can we configure things so that profiling statistics are not calculated for columns containing ids (nominal variables encoded as numbers) or categorical variables encoded as numbers? Moreover, are there plans to support the configs profiling.profile_table_size_limit and profiling.profile_table_row_limit for Snowflake as well? As far as I can see, they currently work only for BigQuery.
c
Yes, there are plans to eventually include these configs in Snowflake and other sources as well. In the meantime, you can use profiling.profile_if_updated_since_days to reduce the execution time. A more optimised version of the Snowflake source is also in progress, so please keep an eye out for updates.
plus1 1
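For readers following along, here is a minimal sketch of where profiling.profile_if_updated_since_days would sit in a Snowflake recipe. It is written as a Python dict only for illustration (recipes are usually written as YAML with the same nesting); the value 7 is an arbitrary assumption, and connection settings are left out.

```python
# Sketch of the profiling section of a Snowflake recipe, written as a Python dict.
recipe = {
    "source": {
        "type": "snowflake",
        "config": {
            # Connection settings (account, credentials, ...) are omitted here.
            "profiling": {
                "enabled": True,
                # Assumed value: only profile tables updated in the last 7 days.
                "profile_if_updated_since_days": 7,
            },
        },
    },
}

profiling = recipe["source"]["config"]["profiling"]
print(profiling)
```

A larger value profiles more tables per run, and a smaller one trades coverage for speed.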
Also, if you already know the columns, you can use the profile_patterns field as described here: https://datahubspace.slack.com/archives/CUMUWQU66/p1656918807112199?thread_ts=1656568496.337289&cid=CUMUWQU66
plus1 1
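To illustrate the allow/deny idea behind pattern-based filtering, here is a rough sketch of how regex deny patterns could exclude id-like and numerically encoded categorical columns from profiling. This is not DataHub's implementation: the helper should_profile, the pattern lists, and the column names are all hypothetical, and the exact option name and matching rules should be checked against the DataHub docs.

```python
import re

# Hypothetical deny patterns for id-like and numerically encoded categorical columns.
DENY_PATTERNS = [
    r".*\.id$",
    r".*_id$",
    r".*_code$",
]
# Allow everything that is not explicitly denied.
ALLOW_PATTERNS = [r".*"]


def should_profile(qualified_name: str) -> bool:
    """Return True if the name passes the allow list and matches no deny pattern."""
    if any(re.match(p, qualified_name, re.IGNORECASE) for p in DENY_PATTERNS):
        return False
    return any(re.match(p, qualified_name, re.IGNORECASE) for p in ALLOW_PATTERNS)


# Made-up fully qualified column names (database.schema.table.column).
columns = [
    "analytics.public.orders.id",
    "analytics.public.orders.customer_id",
    "analytics.public.orders.status_code",
    "analytics.public.orders.amount",
]

profiled = [c for c in columns if should_profile(c)]
print(profiled)
```

Deny rules are checked first, so a column that matches both lists is skipped.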
s
Nice, thanks @careful-pilot-86309!