This message was deleted Hamilton Open Source #general

# general

Slackbot

02/16/2023, 12:20 AM

This message was deleted.

👀 1

Stefan Krawczyk

02/16/2023, 12:31 AM

If you want DBT to orchestrate the process, dbt-fal seemed like the simplest way to integrate and have python dependency management taken care of.

Stefan Krawczyk

02/16/2023, 12:32 AM

Agreed, you don't need dbt-fal to run Hamilton with DBT. But you do need to package the Hamilton code and make it accessible. Open to suggestions/ideas, we could have missed something.

Elijah Ben Izzy

02/16/2023, 12:42 AM

How are you currently orchestrating DBT? DBT cloud, airflow, etc..?

naoto

02/16/2023, 1:36 AM

We're not. Currently we run pandas in CI and do not use dbt, but I'm looking to refactor so that we no longer simply pass the downstream views as monolithic SQL scripts in variables into our Python scripts.

naoto

02/16/2023, 1:39 AM

Seems to me that dbt python support is still beta, and that the real gains on my team would be for repackaging our transforms with Hamilton. I'm just wondering how to take care of the SQL scripts, can't think of a better way than dbt

naoto

02/16/2023, 1:41 AM

I guess this is an ETLT stack with Python. We try to avoid excessive SQL transforms after ingest, but they are often unavoidable. So better to keep our SQL organized and visible in a DAG.

Stefan Krawczyk

02/16/2023, 2:05 AM

Interesting. Would you be interested in a call tomorrow? We are looking to provide better SQL support with Hamilton natively. We'd love to walk you through some ideas and get your feedback.

Elijah Ben Izzy

02/16/2023, 10:32 PM

Hey @naoto — we’d love thoughts on this — also outlines how you could currently use SQL within hamilton: https://hamilton-opensource.slack.com/archives/C03MANME6G5/p1676585637469939

👍 1

naoto

02/17/2023, 4:38 AM

I'm not sure where this fits into deploying it in CI. Because we currently run our pipelines in CI as single .py scripts which ETL + refresh materialized views, I assumed that to add dbt we would run dbt through these same pipelines instead of piping the SQL through Python files.

naoto

02/17/2023, 4:39 AM

Which is why I was confused about why anyone would run through fal, which I think expects an ELT setup. That isn't an option for me.

Elijah Ben Izzy

02/17/2023, 6:11 AM

Ok, I think I get the confusion: FAL is a company, but also an open source dbt plugin that makes Python easier to use. We used it because it let us install a requirements.txt file within dbt for our example. E.g. if you can run dbt, you can use fal to make it run pandas/Hamilton. So, your options are (all within some CI space):

1. Run all SQL in dbt, then save somewhere, then load/transform in Hamilton
2. Run all SQL in dbt, use dbt + Hamilton (with maybe the fal plugin to make your life easier)
3. Run all in Hamilton

IMO you should use (3) if your SQL is simple and (2) if it's more complex. Does this make sense?

naoto

02/17/2023, 6:21 AM

Yeah, demoing it out as (1) for the time being, but I think a transition to (2) would be ideal.

Elijah Ben Izzy

02/17/2023, 6:43 AM

Yeah I think that makes sense for a proof of concept, especially if loading/saving the data isn’t too expensive. Good luck! We’ll be here to help if you need it.
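
For readers following along, here is a rough sketch of what option (1) from the thread could look like: the SQL step runs first and saves its result, then Python loads and transforms it. This is only an illustration under stated assumptions. An in-memory SQLite database stands in for the warehouse, `build_sql_model` stands in for whatever dbt would run, and every table, view, and function name is hypothetical. The transform functions follow the Hamilton convention (the function name is the output, the parameter names are its upstream dependencies), but they are called by hand here so the sketch needs only the standard library. With Hamilton installed, its driver would normally resolve that wiring from the parameter names.

```python
import sqlite3


def build_sql_model(conn: sqlite3.Connection) -> None:
    """Hypothetical stand-in for the SQL that dbt would run and save (option 1)."""
    conn.executescript(
        """
        CREATE TABLE raw_orders (customer TEXT, amount REAL);
        INSERT INTO raw_orders VALUES ('a', 10.0), ('a', 5.0), ('b', 7.5);
        CREATE VIEW customer_totals AS
            SELECT customer, SUM(amount) AS total
            FROM raw_orders
            GROUP BY customer;
        """
    )


# Hamilton-style functions: name = output, parameters = upstream dependencies.
def customer_totals(conn: sqlite3.Connection) -> dict:
    """Load the saved SQL result into Python."""
    rows = conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()
    return dict(rows)


def grand_total(customer_totals: dict) -> float:
    """Sum of all per-customer totals."""
    return sum(customer_totals.values())


def customer_share(customer_totals: dict, grand_total: float) -> dict:
    """Each customer's fraction of the grand total."""
    return {c: t / grand_total for c, t in customer_totals.items()}


def run_pipeline():
    """Wire the steps by hand; a DAG framework would infer this order."""
    conn = sqlite3.connect(":memory:")
    build_sql_model(conn)
    totals = customer_totals(conn)
    total = grand_total(totals)
    shares = customer_share(totals, total)
    conn.close()
    return totals, total, shares


if __name__ == "__main__":
    print(run_pipeline())
```

Moving toward option (2) would mostly change where `build_sql_model` lives (inside dbt rather than in the Python script); the transform functions could stay as they are.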
