# hamilton-help
a
Another approach I've also considered is passing the value of the DbConnection into the Hamilton config instead of the injector. E.g.:
driver = Driver({'db_connection': injector.get(DbConnection)}, data_loaders)
However, if there are a lot of variables, this will make the config quite long.
e
Interesting. In a way they overlap in what they do, but in a way they may be compatible. The easiest solution would be passing it in as inputs.
driver = Driver({...}, data_loaders)
driver.execute([...], inputs={'db_connection': injector.get(DbConnection)})
This is similar to what you suggested. If you have a few simple dependencies this is easy, but you’ll have to pass in a lot at runtime. If you want to instantiate it once then run it many times, you could happily also pass it in as config if that works. The next (slightly more complex) solution is pretty nifty IMO. You can define a context.py module (or something like that) with all the variables you’ll want. This effectively replaces the complexity of the injector with Python functions.
# context.py
def db_connection(env: str) -> DbConnection:
    if env == "prod":
        return create_prod_connection(...)
    else:
        return create_staging_connection(...)

def other_thing_that_you_might_need(...) -> ...:
    ...

...
This way things get loaded dynamically (only if you need them). All you have to do is ensure that the right module gets added:
driver = Driver({'env': 'staging'}, data_loaders, context)
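To make the "only if you need them" point concrete, here is a toy, stdlib-only sketch of name-based resolution. It is not how Hamilton is implemented; `resolve`, `report_bucket`, and the string "connections" are hypothetical stand-ins chosen for illustration.

```python
import inspect
from typing import Any, Callable, Dict, List

# Records which providers actually ran, so laziness is observable.
calls: List[str] = []


def db_connection(env: str) -> str:
    # Hypothetical provider: a real one would open a connection for `env`.
    calls.append("db_connection")
    return f"{env}-connection"


def report_bucket(env: str) -> str:
    # Hypothetical second provider that this run never asks for.
    calls.append("report_bucket")
    return f"{env}-reports"


def resolve(name: str,
            providers: Dict[str, Callable[..., Any]],
            config: Dict[str, Any]) -> Any:
    """Return config[name] if present; otherwise call the provider named
    `name`, resolving each of its parameters the same way."""
    if name in config:
        return config[name]
    provider = providers[name]
    kwargs = {
        param: resolve(param, providers, config)
        for param in inspect.signature(provider).parameters
    }
    return provider(**kwargs)


providers = {"db_connection": db_connection, "report_bucket": report_bucket}
conn = resolve("db_connection", providers, {"env": "staging"})
print(conn)   # staging-connection
print(calls)  # ['db_connection']
```

Only `db_connection` runs because only it was requested; `report_bucket` stays untouched until something asks for it by name.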
Note these don’t use the injector library at all, but you could easily adapt it to do what you want, exactly as you suggested, so use the injector to pass it in. While I agree that it’s generally bad practice to use injector just to pass it along, I think in this case it’s purely for bridging the two. That said, it could easily be somewhat verbose. You could use a parameterization to get this to be one function (just have one for each field):
@parameterize(
    db_connection={"field": value("db_connection")},
    second_thing_from_injector={"field": value("second_thing_from_inject")},
    and_so_on=...
)
def injected_fields(injector: Injector, field: str) -> Any:
    ...
And extract_fields will do the same thing; it’ll just resolve them non-dynamically. It would allow you to do different types, I think.
Then you can probably do something crazier if you want, but I think any of these will bridge it cleanly for you. That said, it’s more a question of why you’re doing this:
1. If you’re doing this just to move fast and not refactor any code, then great: do whatever’s quickest and that you’re comfortable with.
2. If you want to use Hamilton + injector together, then I’d be careful to ensure you know exactly what should go in which framework. They do overlap some, and when developing down the road you’ll want to leave yourself with as little choice as possible.
3. If you want to use just Hamilton longer term, then you’ll want to do something like the context module.
All this said, I think you could do another pretty easy approach that you might like:
vars_ = dr.list_available_variables()
normal_inputs = {...} # your current inputs
external_inputs = [item for item in vars_ if item.is_external_input]
inject = {
    item.name: injector.get(item.type) for item in external_inputs
}
dr.execute([...], inputs = {**inject, **normal_inputs})
And bam! You just have to be careful to ensure that you’re just asking for types you have injections for.
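One way to act on that caveat is an explicit allowlist of injectable types. The sketch below is stdlib-only and uses stand-ins: `ExternalInput` mimics just the `name` and `type` attributes of the items used above, `StubInjector` mimics only `get()`, and `injectable` is a hypothetical parameter, not part of Hamilton or injector.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Set, Type


@dataclass(frozen=True)
class ExternalInput:
    """Stand-in for one external-input variable (name plus declared type)."""
    name: str
    type: Type[Any]


class DbConnection:
    """Hypothetical dependency that the injector knows how to provide."""
    def __init__(self, dsn: str) -> None:
        self.dsn = dsn


class StubInjector:
    """Stand-in exposing only get(); bindings are a plain dict."""
    def __init__(self, bindings: Dict[Type[Any], Any]) -> None:
        self._bindings = bindings

    def get(self, type_: Type[Any]) -> Any:
        return self._bindings[type_]


def injected_inputs(external_inputs: Iterable[ExternalInput],
                    injector: StubInjector,
                    injectable: Set[Type[Any]]) -> Dict[str, Any]:
    """Resolve only the inputs whose declared type is in the allowlist."""
    return {
        item.name: injector.get(item.type)
        for item in external_inputs
        if item.type in injectable
    }


injector = StubInjector({DbConnection: DbConnection("staging-dsn")})
external = [
    ExternalInput("db_connection", DbConnection),
    ExternalInput("start", str),  # a normal input, not an injected one
]
inject = injected_inputs(external, injector, injectable={DbConnection})
normal_inputs = {"start": "2023-11-01"}
inputs = {**inject, **normal_inputs}
print(sorted(inputs))
```

Because `str` is not in the allowlist, `start` is never requested from the injector and has to come from `normal_inputs`, which also wins on any name clash.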
a
Nice, thank you. That 3rd approach looks like it would work well. I hadn't considered the 2nd parameterize approach before either. For that, would you envisage it looking something like the below:
@parameterize(
    db_connection={"field" : value(DbConnection)},
    table_name={"field" : value(TableName)},
)
def injected_fields(injector: Injector, field: Any) -> Any:
    return injector.get(field)
e
Yep! The only problem with that is the typing: sometimes we don’t parse Any perfectly, but it’s worth a try, and we’re more than happy to fix it if the behavior is weird. Otherwise yeah, I think doing it before execution makes a lot of sense. It depends on whether you want it to be lazily executed at all.
a
OK, thank you, this has been really helpful. I am going to give a bit more thought to why we are using both Injector and Hamilton in the same code base as well. Currently we are using the Injector to return non-DataFrame objects, e.g. instances of classes. We are using Hamilton to load data in and perform transformations on the data frames. However, we may want to rethink that.
e
Yep, no one-size-fits-all tool. One other thing you might want to consider is using data adapters. It’s slightly orthogonal to this, but it’s an ergonomic way of representing data loading. It works with source/value, so you can pass in the db connections, etc., and you can create your own. https://hamilton.dagworks.io/en/latest/reference/io/available-data-adapters/
a
Thank you, that also looks pretty helpful. I'll take a closer look at that.
👍 1
Following up on this, just wanted to let you know I found another solution that I think lets the two frameworks work pretty well together. I refactored the Python Injector out into a different file from the Hamilton Driver. This then allows the Hamilton inject decorator to be used with the value function. E.g.:
# my_module/main_injector.py
from injector import Injector
from my_module import DbConnectionModule

injector = Injector([DbConnectionModule])

# my_module/data_loaders.py
import pandas as pd
from hamilton.function_modifiers import inject, value
from my_module import DbConnection  # assumed location of DbConnection
from my_module.main_injector import injector

@inject(db_connection=value(injector.get(DbConnection)))
def load_data(db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...

# my_module/main.py
from hamilton.driver import Driver
from my_module import data_loaders

driver = Driver({}, data_loaders)
result = driver.execute(...)
Thank you again for your help on this. Really appreciate how responsive you are on this Slack channel.
s
@Alec Hewitt that’s cool. Question on the design decision. Are you trying to hide what’s being injected? E.g. why not do it this way?
# my_module/main_injector.py
from injector import Injector
from my_module import DbConnectionModule

injector = Injector([DbConnectionModule])

# my_module/data_loaders.py
import pandas as pd
from my_module import DbConnection  # assumed location of DbConnection

def load_data(db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...

# my_module/main.py
from hamilton.driver import Driver
from my_module import DbConnection, data_loaders  # DbConnection location assumed
from my_module.main_injector import injector

driver = Driver({}, data_loaders)
result = driver.execute(...,
    inputs={"db_connection": injector.get(DbConnection)})
a
Yep, that was another solution I considered. I think that would work well if you only had a few variables that you need to pass to the inputs/config. However, with a lot of variables that dictionary will become very long, and it is an extra step that you have to configure. I also quite like the way you can easily see above the function which variables are coming from the Injector modules vs a Hamilton input. E.g.:
@inject(db_connection=value(injector.get(DbConnection)))
def load_data(start: datetime, db_connection: DbConnection) -> pd.DataFrame:
    # Load data from the DbConnection
    ...
Here it is quite obvious that start is a parameter passed through from Hamilton, whereas db_connection is coming from the injector.
s
Cool. Makes sense and thanks for explaining.
a
no problem. Happy Thanksgiving!
🦃 1
s
You too! Thanks.
gratitude thank you 1
e
Looks cool! Happy Thanksgiving. Only tip is to ensure you have a test environment for your injector so you can import your functions for use in testing/whatnot 🙂
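The reason this tip matters is that decorator arguments are evaluated when the module is imported, not when the function runs, so importing the data loaders already calls the injector. A toy, stdlib-only sketch of that timing follows; `bind_value` and `StubInjector` are hypothetical stand-ins, not Hamilton's inject/value or the real Injector.

```python
import functools
from typing import Any, Callable, List

# Records every lookup so we can see exactly when the injector is hit.
lookups: List[str] = []


class StubInjector:
    """Stand-in injector that returns a label instead of a real object."""
    def __init__(self, env: str) -> None:
        self.env = env

    def get(self, key: str) -> str:
        lookups.append(key)
        return f"{self.env}:{key}"


def bind_value(**fixed: Any) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Toy decorator that passes already-computed values to the function."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def wrapper(**kwargs: Any) -> Any:
            return fn(**fixed, **kwargs)
        return wrapper
    return decorator


# In tests, this is the object you would want to build for a test environment.
injector = StubInjector(env="test")


@bind_value(db_connection=injector.get("DbConnection"))  # runs at definition time
def load_data(start: str, db_connection: str) -> str:
    return f"rows since {start} via {db_connection}"


lookups_at_definition = list(lookups)
result = load_data(start="2023-11-01")
print(lookups_at_definition)  # ['DbConnection']
print(result)                 # rows since 2023-11-01 via test:DbConnection
```

The lookup has already happened before `load_data` is ever called, and calling it does not trigger another one. So whichever injector the module imports is the one your tests get, unless a test-specific injector is in place before the import.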
a
Thanks! Yes, makes sense Elijah 🙂
🫡 1