# contribute
  • e

    Elijah Ben Izzy

    07/02/2022, 6:56 PM
    Hey folks! Cool new release upcoming that should make it easier for everyone to contribute. We're adding the ability to build ad-hoc DAGs from functions (rather than modules). This makes it super easy to draft up a unit test without handling resources.
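The idea in the message above can be illustrated without Hamilton itself. This is a toy sketch, assuming each parameter name refers either to another function or to a supplied input; `execute` and the example functions are hypothetical stand-ins, not Hamilton's API. It only shows why a unit test can be drafted from a few inline functions rather than a module.

```python
import inspect
from typing import Any, Callable, Dict


def execute(functions: Dict[str, Callable], target: str, inputs: Dict[str, Any]) -> Any:
    """Compute `target` by resolving parameter names against functions or inputs."""
    cache: Dict[str, Any] = dict(inputs)

    def resolve(name: str) -> Any:
        if name in cache:
            return cache[name]
        if name not in functions:
            raise KeyError(f"no function or input named {name!r}")
        fn = functions[name]
        # Each parameter name is looked up as an upstream node or an input.
        kwargs = {p: resolve(p) for p in inspect.signature(fn).parameters}
        cache[name] = fn(**kwargs)
        return cache[name]

    return resolve(target)


def doubled(x: int) -> int:
    return x * 2


def doubled_plus_one(doubled: int) -> int:
    return doubled + 1


# Functions defined inline in a test are collected by name; no module is needed.
graph = {fn.__name__: fn for fn in (doubled, doubled_plus_one)}
result = execute(graph, "doubled_plus_one", inputs={"x": 3})
```

The sketch has no cycle detection and no type checking; it is only meant to show the name-based wiring.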
  • e

    Elijah Ben Izzy

    07/02/2022, 6:56 PM
    Here's the PR -- planning to merge this weekend! https://github.com/stitchfix/hamilton/pull/145
    👍 1
  • e

    Elijah Ben Izzy

    07/13/2022, 9:56 PM
    Hey folks! Released the RC version for data quality, planning to release tomorrow. Wanted to give y'all a chance to test it out!
    pip install sf-hamilton==1.9.0rc0
  • e

    Elijah Ben Izzy

    08/16/2022, 3:40 AM
    Hey folks! New RC version -- would love testers! Some features include:
    • AsyncDriver for hamilton in a web service
    • New default validators (allowing None for the output)
    • Support for Union type
    • A refactor of the parametrize* family -- this includes a new @parameterize decorator that can parameterize across dependency sources/values
    • Misc. bug fixes
    Reach out if you have any questions! Installing is easy -- just run:
    pip install sf-hamilton==1.10.0rc0
  • s

    Slackbot

    01/10/2023, 5:48 PM
    This message was deleted.
  • s

    Slackbot

    12/22/2023, 9:00 PM
    This message was deleted.
  • k

    Konstantin Tyapochkin

    03/18/2024, 8:11 PM
    @Stefan Krawczyk @Elijah Ben Izzy Hi guys! Is it ok that I created a draft PR about integrations with AWS so I can continue working on it? If not, I can just remove it and recreate it after it is ready. The PR: https://github.com/DAGWorks-Inc/hamilton/pull/768
    👀 2
    🔥 2
    👍 1
  • t

    Tom Barber

    03/21/2024, 2:26 PM
    So on this polars lazyframe stuff @Stefan Krawczyk: I've basically added another polars plugin to the source which uses LazyFrame instead of DataFrame. a) Is that the correct path for what's missing in Hamilton? It seems to work here: I can now toss lazyframes around, run collect at the end, and have it materialize them. b) If I can get the other readers and writers squared away, do you want it as a PR into the codebase?
  • f

    Fran Boon

    04/01/2024, 4:13 PM
    As per my Intro, we need to compile our models to several different platforms. We want to be able to set ray.remote options to target the right nodes (currently using Custom Resources, e.g. resources={"A2": 1}). Currently it seems that this could be pretty easily achieved by (ab)using Hamilton's Tags feature:
    @tag(**{"ray.resources": json.dumps({"A2": 1})})
    def my_hamilton_node_fn_which_needs_an_A2(...) -> ...:
       ...
    RayGraphAdapter.execute_node() would be modified to:
    ray_options = {tag[4:]: json.loads(value) for tag, value in tags.items() if tag.startswith("ray.")}
    return ray.remote(raify(node.callable)).options(**ray_options).remote(**kwargs)
    Any concerns with taking this approach? Any better options?
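The tag-parsing half of this proposal can be tried on its own, without Ray or Hamilton installed. A minimal sketch, assuming the "ray." prefix convention from the message and that tag values are JSON-encoded strings; `parse_ray_options` is a hypothetical helper name.

```python
import json
from typing import Any, Dict

RAY_PREFIX = "ray."


def parse_ray_options(tags: Dict[str, str]) -> Dict[str, Any]:
    """Collect tags named 'ray.<option>' and JSON-decode their values."""
    return {
        name[len(RAY_PREFIX):]: json.loads(value)
        for name, value in tags.items()
        if name.startswith(RAY_PREFIX)
    }


tags = {
    "ray.resources": json.dumps({"A2": 1}),
    "ray.num_cpus": json.dumps(2),
    "owner": "ml-platform",  # unrelated tag, ignored by the parser
}
ray_options = parse_ray_options(tags)
```

Using `len(RAY_PREFIX)` instead of a literal `4` keeps the slice correct if the prefix ever changes.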
  • s

    Stefan Krawczyk

    04/02/2024, 5:23 PM
    @Konstantin Tyapochkin some of the code we sketched for reference:
    dr = driver.Builder().with_modules(data_loading, feature_engineering, model_training, model_evaluation).with_adapter(...).build()
    
    # one sagemaker job on small machine
    data_set = dr.execute(["data_set_v1"], inputs={...})
    
    # one sagemaker job on large machine with GPU
    model = dr.execute(["model_v1"], inputs={...}, override={"data_set_v1": data_set})
    
    # one sagemaker job on small machine
    evaluation = dr.execute(["evaluation_v1"], inputs={...}, override={"model_v1": model})
    
    # some ideas on config structure?
    config = {
        "tasks": [{"name": "data_set_v1", "sagemaker": ["machine.small"], "artifacts": ["data_set_v1"]},
                  {"name": "model_v1", "sagemaker": ["machine.gpu"], "artifacts": ["model1"]},
                  {"name": "model_v2", "sagemaker": ["machine.gpu"], "artifacts": ["model2"]},
                  {"name": "evaluation_v1", "sagemaker": ["machine.small"]}]
    }
    
    sagemaker_pipeline_code = SageMakerPipelineBuilder(dr, config).compile()
    airflow_pipeline_code = AirflowPipelineBuilder(dr, config).compile()
    👍 1
    👀 1
  • j

    Jay

    05/15/2024, 4:29 PM
    Hi, is there a way to get all the graphs that are loaded into the driver?
  • s

    Stefan Krawczyk

    05/21/2024, 7:05 PM
    @Thierry Jean @Gilad Rubin we can chat here around the experimentation and hyper parameter stuff
    👍 1
  • j

    Jernej Frank

    07/24/2024, 2:37 PM
    Hello, I needed to make a small change to the backend Docker for our orchestration system: https://github.com/DAGWorks-Inc/hamilton/pull/1065 let me know if I should change anything to get it merged. Thanks!
  • i

    Iliya R

    08/07/2024, 4:33 PM
    I've just created a (draft) PR for adding a pyproject.toml. Will appreciate any feedback, especially with regard to testing this.
    ❤️ 2
  • i

    Iliya R

    08/07/2024, 8:08 PM
    Are you guys particularly attached to flake8, or can we switch to ruff?
  • i

    Iliya R

    08/08/2024, 6:53 AM
    Re import sorting - do we want hamilton_sdk to be its own section, or 1st party (i.e. grouped with hamilton) or 3rd party (grouped with pytest etc)? There's some inconsistency in the files in that regard.
  • i

    Iliya R

    08/08/2024, 11:12 PM
    I created another PR to enable ruff. The sdk unit tests are failing, but I'm not sure why.
    👀 2
  • i

    Iliya R

    08/20/2024, 7:11 PM
    I've been looking at parameterize_extract_columns and saw that it requires ParameterizedExtract objects. Suggestion: have it accept some sort of named/ordered fields (e.g. list of dicts, or list of tuples) then wrap them internally in ParameterizedExtract. It saves an import and is a tiny bit more elegant imho. wdyt?
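The suggestion amounts to a small normalization step at the decorator boundary. A rough sketch, assuming a two-field shape for the wrapped object; the `ParameterizedExtract` below is a local stand-in whose field names are assumptions, and `as_extract` is a hypothetical helper, not part of Hamilton.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Sequence, Tuple, Union


@dataclass(frozen=True)
class ParameterizedExtract:
    """Stand-in for Hamilton's class; field names here are assumptions."""

    outputs: Tuple[str, ...]
    input_mapping: Dict[str, Any]


ExtractSpec = Union[ParameterizedExtract, Mapping[str, Any], Sequence[Any]]


def as_extract(spec: ExtractSpec) -> ParameterizedExtract:
    """Accept an existing object, a dict, or an (outputs, input_mapping) pair."""
    if isinstance(spec, ParameterizedExtract):
        return spec
    if isinstance(spec, Mapping):
        return ParameterizedExtract(tuple(spec["outputs"]), dict(spec["input_mapping"]))
    outputs, input_mapping = spec  # tuple or list of exactly two items
    return ParameterizedExtract(tuple(outputs), dict(input_mapping))


specs = [
    (("a", "b"), {"source_col": "raw_ab"}),
    {"outputs": ["c"], "input_mapping": {"source_col": "raw_c"}},
]
extracts = [as_extract(s) for s in specs]
```

Existing callers that already pass objects keep working, since those are returned unchanged.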
  • s

    Slackbot

    08/22/2024, 6:52 PM
    This message was deleted.
  • j

    Jernej Frank

    08/23/2024, 11:49 PM
    I installed the new pre-commit hooks and keep running into a weird ruff error. I'm not familiar with ruff, any ideas?
    ruff.....................................................................Failed
    - hook id: ruff
    - exit code: 2
    
    error: TOML parse error at line 176, column 1
        |
    176 | [tool.ruff.format]
        | ^^^^^^^^^^^^^^^^^^
    wanted exactly 1 element, more than 1 element
    
    black....................................................................Passed
    trim trailing whitespace.................................................Passed
    fix end of files.........................................................Passed
    fix requirements.txt.................................(no files to check)Skipped
    check python ast.........................................................Passed
  • f

    Fran Boon

    08/26/2024, 9:08 AM
    Custom CA cert support for [Async]HamiltonTracker: https://github.com/DAGWorks-Inc/hamilton/pull/1105 I wonder if we should share a single session object across all the functions in both these...reusing the session object generally improves performance. Also wonder if we can switch to modern-style Type Hints by using:
    from __future__ import annotations
  • f

    Fran Boon

    09/01/2024, 3:01 PM
    The AsyncDriver currently only works with the AsyncGraphAdapter. I would like it to work with a RayGraphAdapter. I am aware that Ray Tasks cannot themselves be async (if they wish to benefit from this, they need to start an async event loop inside the task). However, I can see some benefit (not yet measurable, so I may be wrong!) in having all the coordination be async: HamiltonTracker, MLFlow, Ray task submission. Is this something that you are already considering? I am happy to take a look if not. Would your guidance be to extend the RayGraphAdapter to auto-detect when it is running in an event loop, or to subclass it as AsyncRayGraphAdapter?
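For the auto-detect option, the standard library already offers a check. A minimal sketch using `asyncio.get_running_loop()`, which raises `RuntimeError` when no loop is running in the current thread; `running_in_event_loop` is a hypothetical helper name, and how an adapter would branch on it is left open.

```python
import asyncio


def running_in_event_loop() -> bool:
    """Return True when called from code executing inside an asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


async def _check_from_coroutine() -> bool:
    # Called from a coroutine, so a loop is running in this thread.
    return running_in_event_loop()


outside = running_in_event_loop()
inside = asyncio.run(_check_from_coroutine())
```

A separate subclass avoids this runtime branching at the cost of a second adapter to maintain.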
  • j

    Jernej Frank

    09/12/2024, 7:10 PM
    Added the ability to override nodes from later imported modules: https://github.com/DAGWorks-Inc/hamilton/pull/1134 The only part I am unsure about is how to update docs.
  • i

    Iliya R

    09/19/2024, 7:08 AM
    Quick question - why do we have "sqlalchemy==1.4.49; python_version == '3.7.*'", in pyproject.toml, if the minimum supported python version is 3.8?
  • i

    Iliya R

    09/19/2024, 7:43 AM
    Second question - regarding python 3.13 support - what are your plans for adding that (to CI + docs)? According to the python website, the released RC2 "is expected to become the final 3.13.0 release" - so any tests can already be done with that version.
    Call to action
    We strongly encourage maintainers of Python projects to prepare their projects for 3.13 compatibilities during this phase
  • v

    Viktor

    10/03/2024, 11:15 AM
    I have found this collection of Python DE resources. Hamilton is not listed there yet. This may be a good spot to be featured for free. It has 10 forks and 73 stars. https://github.com/vajol/python-data-engineering-resources/blob/main/resources/orchestration-tools.md
    👀 1
    🙌 1
  • i

    Iliya R

    10/20/2024, 1:39 PM
    Hi guys, can you please fix this very minor issue (this is a warning raised by pytest):
    hamilton\function_modifiers\macros.py:1522: SyntaxWarning: invalid escape sequence '\*'
    What needs to be done is add an r to the beginning of the docstring (i.e. """ -> r""") and change \*\* on the aforementioned line to **.
    👀 1
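The warning is easy to reproduce in isolation. A small sketch that compiles two throwaway source strings and counts invalid-escape warnings; the docstring text is made up for illustration, and `escape_warnings` is a hypothetical helper. Older Python versions report this as a DeprecationWarning rather than a SyntaxWarning, so both are counted.

```python
import warnings


def escape_warnings(source: str) -> int:
    """Compile `source` and count invalid-escape warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return sum(
        issubclass(w.category, (SyntaxWarning, DeprecationWarning)) for w in caught
    )


# The doubled backslashes below produce a single backslash in the compiled source.
plain_docstring = 'def f():\n    """Forwards \\*\\*kwargs."""\n'
raw_docstring = 'def f():\n    r"""Forwards \\*\\*kwargs."""\n'

plain_count = escape_warnings(plain_docstring)
raw_count = escape_warnings(raw_docstring)
```

Either change on its own (the raw prefix, or dropping the backslashes) silences the warning; the raw prefix keeps the backslashes in the docstring text.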
  • j

    Jernej Frank

    11/18/2024, 1:11 AM
    Hey, quick question: is there a guide somewhere on how to add Google Colab and GitHub badges to example notebooks?