Slackbot
12/05/2022, 4:46 PM
Elijah Ben Izzy
12/05/2022, 4:51 PM
Seth Terrell
12/05/2022, 4:58 PM
Elijah Ben Izzy
12/05/2022, 4:59 PM
Seth Terrell
12/05/2022, 6:44 PM
Requirements.txt (I added some stuff):
dbt-core
dbt-snowflake
dbt-fal
dbt-python
scikit-learn
sf-hamilton
dataclasses
numpy
pandas
typing_inspect
Pandas is installed, but dbt run is giving an error: "No module named 'pandas'".
The error now:
18:41:15 Running with dbt=1.3.1
18:41:15 Found 2 models, 0 tests, 0 snapshots, 0 analyses, 303 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
18:41:15
18:41:17 Concurrency: 4 threads (target='dev')
18:41:17
18:41:17 1 of 2 START sql table model HAMILTON.raw_passengers ........................... [RUN]
18:41:19 1 of 2 OK created sql table model HAMILTON.raw_passengers ...................... [SUCCESS 1 in 1.95s]
18:41:19 2 of 2 START python table model HAMILTON.train_and_infer ....................... [RUN]
18:41:21 2 of 2 ERROR creating python table model HAMILTON.train_and_infer .............. [ERROR in 1.26s]
18:41:21
18:41:21 Finished running 2 table models in 0 hours 0 minutes and 5.23 seconds (5.23s).
18:41:21
18:41:21 Completed with 1 error and 0 warnings:
18:41:21
18:41:21 Database Error in model train_and_infer (models\train_and_infer.py)
18:41:21 100357 (P0000): Python Interpreter Error:
18:41:21 Traceback (most recent call last):
18:41:21 File "_udf_code.py", line 5, in <module>
18:41:21 ModuleNotFoundError: No module named 'pandas'
18:41:21 in function TRAIN_AND_INFER__DBT_SP with handler main
18:41:21 compiled Code at target\run\cq4ds\models\train_and_infer.py
18:41:21
18:41:21 Done. PASS=1 WARN=0 ERROR=1 SKIP=0 TOTAL=2
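One possible reading of this traceback is that the stored procedure dbt creates in Snowflake only sees packages declared for the model itself, not what is listed in the local Requirements.txt. A minimal sketch of a dbt Python model that declares pandas, assuming the dbt.config(packages=[...]) option for Python models. The body is a placeholder and not the actual train_and_infer model from this thread:

```python
def model(dbt, session):
    # Ask the warehouse to make pandas available inside the stored procedure.
    dbt.config(materialized="table", packages=["pandas"])

    # raw_passengers is the upstream SQL model from the run log above.
    raw_passengers = dbt.ref("raw_passengers")

    # Placeholder: a real model would train and infer here.
    return raw_passengers
```

If a declaration like this is already present and the error persists, the cause is probably elsewhere, for example in the Snowflake account configuration.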
Elijah Ben Izzy
12/05/2022, 6:46 PM
Seth Terrell
12/05/2022, 6:48 PM
Elijah Ben Izzy
12/05/2022, 6:52 PM
Elijah Ben Izzy
12/05/2022, 6:52 PM
Elijah Ben Izzy
12/05/2022, 6:52 PM
Seth Terrell
12/05/2022, 6:54 PM
Elijah Ben Izzy
12/05/2022, 6:54 PM
Elijah Ben Izzy
12/05/2022, 6:54 PM
Seth Terrell
12/05/2022, 6:56 PM
Elijah Ben Izzy
12/05/2022, 6:57 PM
Elijah Ben Izzy
12/05/2022, 6:57 PM
Seth Terrell
12/05/2022, 6:57 PM
Seth Terrell
12/05/2022, 6:58 PM
This is because, in the materialization code, we use pandas to determine whether the returned dataframe is a Snowpark dataframe or a pandas dataframe, and do slightly different things based on that.
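The kind of check described in that quote can be sketched roughly as follows. This is an illustration only, not the dbt-snowflake source: is_pandas_dataframe is a hypothetical helper name, and the guarded import is a choice made for this sketch, since the thread does not confirm how the real materialization handles a missing pandas.

```python
import importlib.util


def is_pandas_dataframe(obj) -> bool:
    """Return True if obj is a pandas DataFrame, False otherwise."""
    # If pandas is not importable at all, obj cannot be a pandas DataFrame.
    if importlib.util.find_spec("pandas") is None:
        return False
    import pandas

    return isinstance(obj, pandas.DataFrame)
```

A caller could then branch on the result, for example converting a pandas result to a Snowpark dataframe before writing it and writing a Snowpark result directly.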
Elijah Ben Izzy
12/05/2022, 6:59 PM
Elijah Ben Izzy
12/05/2022, 6:59 PM
Seth Terrell
12/05/2022, 7:00 PM
Elijah Ben Izzy
12/05/2022, 7:01 PM
Stefan Krawczyk
12/05/2022, 7:48 PM
Seth Terrell
12/05/2022, 7:52 PM
Always add pandas as the required dependencies. One doesn't need to install Python/Pandas with DBT directly in packages.yml.
Stefan Krawczyk
12/05/2022, 8:10 PM
Seth Terrell
12/05/2022, 10:25 PM
dbt run working with the sample data from Hamilton with DuckDB, instead of with Snowflake. (I have accepted the terms for Anaconda in Snowflake.)
This leads me to believe the issue is the way I have Snowflake configured, and I will try to work through that.
Seth Terrell
12/05/2022, 10:44 PM
Elijah Ben Izzy
12/05/2022, 10:46 PM
Seth Terrell
12/05/2022, 10:47 PM
Elijah Ben Izzy
12/05/2022, 10:54 PM
Seth Terrell
12/06/2022, 6:56 PM
pandas, snowflake-snowpark-python, hamilton) needed to run the query.
dbt run compiles the .py script and sends it to Snowflake, where it will fail with errors like:
Traceback (most recent call last):
File "_udf_code.py", line 6, in <module>
ModuleNotFoundError: No module named 'python_transforms'
in function TRAIN_AND_INFER__DBT_SP with handler main
According to various documentation, Snowflake should install the packages at run-time, but no matter what I try in the Snowflake UI, it will not recognize some of the packages, like hamilton or sf-hamilton.
I've accepted the Anaconda stuff in Snowflake. I've installed the required packages locally with pip install.
This is very new functionality, so there's not much out there on troubleshooting running Python on Snowflake. Ideas?
The first part of the script that gets run in Snowflake:
Elijah Ben Izzy
12/06/2022, 6:57 PM
Seth Terrell
12/06/2022, 6:58 PM
pandas and snowflake-snowpark-python but other packages like hamilton it cannot see.
Elijah Ben Izzy
12/06/2022, 6:59 PM
hamilton as hamilton or sf-hamilton (the PyPI name)?
Elijah Ben Izzy
12/06/2022, 6:59 PM
Seth Terrell
12/06/2022, 6:59 PM
hamilton as a "package" in the above code yields:
100357 (P0000): Cannot create a Python function with the specified packages. Please check your packages specification and try again.
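This error is consistent with the requested name not being available in Snowflake's Anaconda channel. One way to check is to query the INFORMATION_SCHEMA.PACKAGES view. A small sketch that only builds the query text, under the assumption that the view is reachable from the current database; package_lookup_sql is a hypothetical helper, and actually running the query (in a worksheet or through a Snowpark session) is left out:

```python
import re


def package_lookup_sql(package_name: str) -> str:
    """Build a query listing the available versions of a Python package."""
    # Allow only characters that appear in package names, since the
    # name is interpolated into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", package_name):
        raise ValueError(f"unexpected package name: {package_name!r}")
    return (
        "select * "
        "from information_schema.packages "
        "where language = 'python' "
        f"and package_name = '{package_name.lower()}'"
    )


print(package_lookup_sql("sf-hamilton"))
```

An empty result for a given name would suggest that the package has to be supplied some other way than the packages specification.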
Seth Terrell
12/06/2022, 6:59 PM
hamilton and sf-hamilton
Seth Terrell
12/06/2022, 7:00 PM
Stefan Krawczyk
12/06/2022, 7:01 PM
sf-hamilton (PyPI) and https://anaconda.org/conda-forge/sf-hamilton
Elijah Ben Izzy
12/06/2022, 7:01 PM
snowcli? https://github.com/Snowflake-Labs/snowcli
Seth Terrell
12/06/2022, 7:02 PM
Elijah Ben Izzy
12/06/2022, 7:02 PM
Stefan Krawczyk
12/06/2022, 7:02 PM
Seth Terrell
12/06/2022, 7:03 PM
Elijah Ben Izzy
12/06/2022, 7:03 PM
Seth Terrell
12/06/2022, 7:04 PM
Seth Terrell
12/06/2022, 7:06 PM
sf-hamilton package and it'll likely need to be installed manually.
I'd suspected as much, but I'll need to work through uploading and installing.
Elijah Ben Izzy
12/06/2022, 7:07 PM
Seth Terrell
12/06/2022, 7:09 PM
Elijah Ben Izzy
12/06/2022, 7:11 PM
Seth Terrell
12/06/2022, 7:12 PM
Seth Terrell
12/09/2022, 8:45 PM
sf-hamilton package not being part of the supported Anaconda packages, and it may not be readily possible to add additional third-party packages in Snowflake (see comment here). It may be possible to import the package as .tar.gz, but I've been unable to get that working in a Python function created in Snowflake. I've tried various paths to get this working but haven't been successful.
At this point we're looking into DBT's out-of-the-box Python functionality. If we can get that functionality working, I think it'll become apparent to us how Hamilton could take us to the next step. But right now, I think we need to get something simple working with respect to getting Python executable in Snowflake, with the models managed in our DBT project.
I greatly appreciate everyone's help and great attitude here! Hopefully we'll be circling back to Hamilton in the near future: it's really pretty cool (Snowflake packages notwithstanding). Thank you!
Elijah Ben Izzy
12/09/2022, 9:35 PM