This message was deleted Hamilton Open Source #hamilton-help


# hamilton-help

Slackbot

11/10/2023, 11:11 PM

This message was deleted.

👀 1

Alec Hewitt

11/10/2023, 11:15 PM

Another approach I've also considered is passing the value of the DbConnection into the Hamilton config instead of the `injector`. Eg:

Copy code

driver = Driver({'db_connection': injector.get(DbConnection)}, data_loaders)

However, if there are a lot of variables, this will make the config quite long.

Elijah Ben Izzy

11/10/2023, 11:25 PM

Interesting. In a way they overlap in what they do, but in a way they may be compatible. The easiest solution would be passing in as inputs.

Copy code

driver = Driver({...}, data_loaders)
driver.execute([...], inputs={'db_connection': injector.get(DbConnection)})

This is similar to what you suggested. If you have only a few simple dependencies this is easy, but with many you’ll have to pass in a lot at runtime. If you want to instantiate it once then run it many times, you could also pass it in as config if that works. The next (slightly more complex) solution is pretty nifty IMO. You can define a `context.py` module (or something like that) with all the variables you’ll want. This effectively replaces the complexity of the injector with Python functions.

Copy code

# context.py
def db_connection(env: str) -> DbConnection:
    if env == "prod":
        return create_prod_connection(...)
    else:
        return create_staging_connection(...)

def other_thing_that_you_might_need(...) -> ...:
    ...

...

This way things get loaded dynamically (only if you need them). All you have to do is ensure that the right module gets added:

Copy code

driver = Driver({'env': 'staging'}, data_loaders, context)
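To make the "only if you need them" point concrete, here is a minimal pure-Python sketch of name-based lazy resolution. It is a toy illustration of the idea, not Hamilton's implementation; `resolve`, `expensive_client`, and the `calls` list are hypothetical names used only for this example, and connections are represented by plain strings.

```python
import inspect
from typing import Any, Callable, Dict, Optional

calls = []  # records which functions actually ran


def db_connection(env: str) -> str:
    calls.append("db_connection")
    return f"{env}-connection"


def expensive_client(env: str) -> str:
    # Never requested below, so it should never run.
    calls.append("expensive_client")
    return f"{env}-client"


def load_data(db_connection: str) -> list:
    calls.append("load_data")
    return [db_connection]


def resolve(
    name: str,
    functions: Dict[str, Callable],
    config: Dict[str, Any],
    cache: Optional[Dict[str, Any]] = None,
) -> Any:
    """Compute `name` by recursively resolving its parameters by name."""
    cache = {} if cache is None else cache
    if name in config:
        return config[name]
    if name not in cache:
        fn = functions[name]
        kwargs = {
            param: resolve(param, functions, config, cache)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**kwargs)
    return cache[name]


FUNCTIONS = {f.__name__: f for f in (db_connection, expensive_client, load_data)}
result = resolve("load_data", FUNCTIONS, {"env": "staging"})
```

Only `db_connection` and `load_data` run here; `expensive_client` is defined but never called because nothing requested it.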

Note these don’t use the injector library at all, but you could easily adapt it to do what you want, exactly as you suggested: use the injector to pass it in. While I agree that it’s generally bad practice to use `injector` just to pass it along, I think in this case it’s purely for bridging the two. That said, it could easily be somewhat verbose. You could use a parameterization to get this to be one function (just have one entry for each field):

Copy code

@parameterize(
    db_connection={"field" : value("db_connection")},
    second_thing_from_injector={"field" : value("second_thing_from_inject")},
    and_so_on=...
)
def injected_fields(injector: Injector) -> Any:
    ...

And `extract_columns` will do the same thing; it’ll just resolve them non-dynamically. It would allow you to do different types, I think.

Elijah Ben Izzy

11/10/2023, 11:33 PM

Then you can probably do something crazier if you want, but I think any of these will bridge it cleanly for you. That said, it’s more a question of why you’re doing this:

1. If you’re doing this just to move fast and not refactor any code, then great: do whatever’s the quickest and you’re comfortable with.
2. If you want to use Hamilton + injector together, then I’d be careful to ensure you know exactly what should go in which framework. They do overlap some, and when developing down the road you’ll want to leave yourself with as little choice as possible.
3. If you want to use just Hamilton longer term, then you’ll want to do something like the `context` module.

All this said, I think you could do another pretty easy approach that you might like:

Copy code

vars_ = dr.list_available_variables()
normal_inputs = {...} # your current inputs
external_inputs = [item for item in vars_ if item.is_external_input]
inject = {
    item.name: injector.get(item.type) for item in external_inputs
}
dr.execute([...], inputs = {**inject, **normal_inputs})

And bam! You just have to be careful to ensure that you’re just asking for types you have injections for.
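As a rough sketch of that last caveat, the lookup can be guarded so that only types with a known provider are injected. Everything here is a stand-in so the example runs on its own: `Variable` mimics the `name`, `type`, and `is_external_input` attributes used above, `providers` plays the role of the injector, and `build_injected_inputs` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Type


@dataclass
class Variable:
    """Stand-in for the items returned by dr.list_available_variables()."""
    name: str
    type: Type
    is_external_input: bool


class DbConnection:
    """Hypothetical class that the injector knows how to build."""


def build_injected_inputs(
    variables: List[Variable],
    providers: Dict[Type, Callable[[], Any]],
    normal_inputs: Dict[str, Any],
) -> Dict[str, Any]:
    """Inject external inputs by type, skipping types without a provider."""
    injected = {}
    for var in variables:
        if not var.is_external_input or var.name in normal_inputs:
            continue  # computed by the DAG, or already supplied by hand
        if var.type in providers:
            injected[var.name] = providers[var.type]()
    # Hand-supplied inputs win over injected ones, as in the snippet above.
    return {**injected, **normal_inputs}


variables = [
    Variable("db_connection", DbConnection, True),
    Variable("start", str, True),
    Variable("loaded_data", dict, False),
]
inputs = build_injected_inputs(
    variables, {DbConnection: DbConnection}, {"start": "2023-11-10"}
)
```

With a real injector the `var.type in providers` check would need an equivalent test for whether a binding exists; how to do that is left as an assumption here.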

Alec Hewitt

11/10/2023, 11:39 PM

Nice, thank you. That 3rd approach looks like it would work well. I hadn't considered the 2nd parameterize approach before either. For that, would you envisage it looking something like the below?

Copy code

@parameterize(
    db_connection={"field" : value(DbConnection)},
    table_name={"field" : value(TableName)},
)
def injected_fields(injector: Injector, field: Any) -> Any:
    return injector.get(field)

Elijah Ben Izzy

11/10/2023, 11:41 PM

Yep! The only problem with that is the typing: sometimes we don’t parse `Any` perfectly, but it’s worth a try, and we’re more than happy to fix it if the behavior is weird. Otherwise yeah, I think doing it before execution makes a lot of sense. It depends on whether you want it to be lazily executed at all.

Alec Hewitt

11/10/2023, 11:47 PM

OK, thank you, this has been really helpful. I am going to give a bit more thought to why we are using both Injector and Hamilton in the same code base as well. Currently we are using the Injector to return non-DataFrame objects, e.g. instances of classes. We are using Hamilton to load data in and perform transformations on the data frames. However, we may want to rethink that.

Elijah Ben Izzy

11/10/2023, 11:51 PM

Yep, there’s no one-size-fits-all tool. One other thing you might want to consider is using data adapters. It’s slightly orthogonal to this, but it’s an ergonomic way of representing data loading. They work with `source`/`value`, so you can pass in the db connections, etc., and you can create your own. https://hamilton.dagworks.io/en/latest/reference/io/available-data-adapters/

Alec Hewitt

11/10/2023, 11:59 PM

Thank you, that also looks pretty helpful. I'll take a closer look at that.

👍 1

Alec Hewitt

11/22/2023, 10:31 PM

Following up on this, just wanted to let you know I found another solution that I think lets the two frameworks work pretty well together. I refactored the Python Injector out into a different file from the Hamilton Driver. This then allows the Hamilton `inject` decorator to be used with `value`. Eg:

Copy code

# my_module/main_injector.py
from injector import Injector
from my_module import DbConnectionModule

injector = Injector([DbConnectionModule])

Copy code

# my_module/data_loaders.py
import pandas as pd
from hamilton.function_modifiers import inject, value
from my_module import DbConnection  # assumed location of DbConnection
from my_module.main_injector import injector

@inject(db_connection=value(injector.get(DbConnection)))
def load_data(db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...

Copy code

# my_module/main.py
from hamilton.driver import Driver
from my_module import data_loaders

driver = Driver({}, data_loaders)
result = driver.execute(...)

Thank you again for your help on this. Really appreciate how responsive you are on this Slack channel.

Stefan Krawczyk

11/22/2023, 10:43 PM

@Alec Hewitt that’s cool. Question on the design decision. Are you trying to hide what’s being injected? E.g. why not do it this way?

Copy code

# my_module/main_injector.py
from injector import Injector
from my_module import DbConnectionModule

injector = Injector([DbConnectionModule])

Copy code

# my_module/data_loaders.py
import pandas as pd
from my_module import DbConnection  # assumed location of DbConnection

def load_data(db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...

Copy code

# my_module/main.py
from hamilton.driver import Driver
from my_module import DbConnection, data_loaders  # DbConnection location assumed
from my_module.main_injector import injector

driver = Driver({}, data_loaders)
result = driver.execute(...,
    inputs={"db_connection": injector.get(DbConnection)})

Alec Hewitt

11/22/2023, 10:51 PM

Yep, that was another solution I considered. I think that would work well if you only had a few variables that you need to pass to the inputs/config. However, with a lot of variables that dictionary will become very long, and it is an extra step that you have to configure. I also quite like the way you can easily see above the function which variables are coming from the Inject Modules vs a Hamilton input. Eg:

Copy code

@inject(db_connection=value(injector.get(DbConnection)))
def load_data(start: datetime, db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...

Here it is quite obvious that `start` is a parameter passed through from Hamilton, whereas `db_connection` is coming from the `injector`.
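The split between bound and runtime parameters can be illustrated with a small pure-Python sketch. The `bind` decorator below is a toy stand-in written for this example, not Hamilton's `inject`; it only shows how fixing one argument at definition time leaves the remaining parameters as the ones the framework still has to supply. The string value stands in for whatever `injector.get(DbConnection)` would return.

```python
import functools
import inspect
from typing import Any, Callable


def bind(**bound: Any) -> Callable:
    """Toy stand-in for @inject(name=value(obj)): fix some arguments up front."""

    def decorator(fn: Callable) -> Callable:
        params = inspect.signature(fn).parameters
        unknown = set(bound) - set(params)
        if unknown:
            raise TypeError(f"{fn.__name__} has no parameters named {sorted(unknown)}")

        @functools.wraps(fn)
        def wrapper(**runtime: Any) -> Any:
            # Bound values are merged with whatever is supplied at call time.
            return fn(**bound, **runtime)

        # Parameters that still have to come from the caller.
        wrapper.runtime_params = [name for name in params if name not in bound]
        return wrapper

    return decorator


@bind(db_connection="connection-from-injector")
def load_data(start: str, db_connection: str) -> dict:
    return {"start": start, "db_connection": db_connection}


loaded = load_data(start="2023-11-22")
```

Reading the decorator line tells you `db_connection` is pre-bound, while `load_data.runtime_params` lists `start` as the only remaining input.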

Stefan Krawczyk

11/22/2023, 11:41 PM

Cool. Makes sense and thanks for explaining.

Alec Hewitt

11/22/2023, 11:41 PM

no problem. Happy Thanksgiving!

🦃 1

Stefan Krawczyk

11/22/2023, 11:41 PM

You too! Thanks.

gratitude thank you 1

Elijah Ben Izzy

11/22/2023, 11:42 PM

Looks cool! Happy Thanksgiving. Only tip is to ensure you have a test environment for your injector so you can import your functions for use in testing/whatnot 🙂

Alec Hewitt

11/22/2023, 11:43 PM

Thanks! Yes, makes sense Elijah 🙂

🫡 1
