# hamilton-help
a
Hello. I wondered whether it is possible to exclude functions from using the Ray backend. I had a single pipeline with Elasticsearch queries and transformation code. I had to separate out the Elasticsearch queries because they didn't work correctly with Ray. Now I have two services: the Elasticsearch queries running with the default driver and the transformation code running with the Ray driver. I believe Hamilton works on the backend by adding a Ray decorator to each function. Is it possible to exclude functions from getting this decorator? This is more for future reference.
e
Hey! So currently this is the case (that they’re all running on ray), however, it should be an easy(ish) fix. First, do you mind sharing how you’re calling your code? I assume you’re using the Ray Graph Adapter?
At a high level, if you’re using the `RayGraphAdapter` we could easily add a flag `only_use_ray_if_decorated` (with a better name), then change this line to respect it, running locally if we haven’t gotten anything from the ray decorator: https://github.com/DAGWorks-Inc/hamilton/blob/d89b03e059143eb9581c50b624265185848ca782/hamilton/plugins/h_ray.py#L122. It feels like a nice extension. We also have the task-based orchestration, which can help group/assign different ones, but it’s a bit more complex.
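To make the idea concrete, here is a minimal sketch of that opt-in behaviour in plain Python. It does not use Hamilton or Ray: a thread pool stands in for Ray, and `remote`, `SelectiveAdapter`, and `only_remote_if_decorated` are hypothetical names made up for illustration, not existing API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List

# Hypothetical marker attribute; not part of Hamilton's or Ray's API.
REMOTE_ATTR = "_run_remote"


def remote(fn: Callable) -> Callable:
    """Hypothetical decorator that opts a function in to remote execution."""
    setattr(fn, REMOTE_ATTR, True)
    return fn


class SelectiveAdapter:
    """Stand-in for a graph adapter with an opt-in flag (assumed design)."""

    def __init__(self, only_remote_if_decorated: bool = False) -> None:
        self.only_remote_if_decorated = only_remote_if_decorated
        self._pool = ThreadPoolExecutor(max_workers=2)  # stands in for Ray
        self.remote_calls: List[str] = []  # names of functions sent to the pool

    def execute(self, fn: Callable, **kwargs: Any) -> Any:
        decorated = getattr(fn, REMOTE_ATTR, False)
        if self.only_remote_if_decorated and not decorated:
            # Flag is on and the function was not opted in: run in-process.
            return fn(**kwargs)
        # Otherwise dispatch to the pool. We block on the result here for
        # simplicity; a real adapter would likely pass futures along instead.
        self.remote_calls.append(fn.__name__)
        return self._pool.submit(fn, **kwargs).result()


def query_elastic(index: str) -> List[str]:
    # Placeholder for an Elasticsearch query that should stay local.
    return [f"{index}-doc-1", f"{index}-doc-2"]


@remote
def transform(docs: List[str]) -> List[str]:
    # Placeholder for transformation code that should run remotely.
    return [d.upper() for d in docs]


adapter = SelectiveAdapter(only_remote_if_decorated=True)
docs = adapter.execute(query_elastic, index="logs")
result = adapter.execute(transform, docs=docs)
```

With the flag off (the default in this sketch), every function goes to the pool, which mirrors the current behaviour where everything runs on Ray.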
a
Hi Elijah, this is the relevant part of my run script:
@main.command()
def run():

    if config["ray_backend"]:
        output_type = h_ray.RayGraphAdapter(result_builder=base.PandasDataFrameResult())
    else:
        output_type = base.SimplePythonDataFrameGraphAdapter()

    logger_hook = lifecycle.default.PrintLn(print_fn=logger.info)

    dr = (
        driver.Builder()
        .with_modules(
            pipe_load_data,
            pipe_prep_data,
            pipe_entity_features,
            pipe_add_features,
            pipe_risk_metric,
        )
        .with_config(config)
        .with_adapters(logger_hook, output_type)
        .build()
    )
What you suggest sounds great.
I don't think I have looked at "task-based orchestration", will take a look at docs
e
Got it! OK, so yeah, this should be an easy change. If you’re interested in contributing we’d love contributions, but we can also take a stab at it at some point soon. FWIW I think this is a common pattern, and it nicely extends what @Fran Boon did. Task-based orchestration is much more powerful but less well documented TBH.
Task-based uses the following: https://hamilton.dagworks.io/en/latest/concepts/parallel-task/. But it doesn’t need a dynamic # of tasks. I think that the ray one is good unless you really want something more powerful TBH (or need a dynamic set of tasks/nodes).
a
Thanks Elijah, I will definitely take a look at the task-based docs soon. I'd like to contribute if I find some time. I am currently deep into some integration testing but will return to this when I have some headspace.
e
Great! Just reach out when you need it and we can help you out.