This message was deleted.
# hamilton-help
s
are you explicitly setting a name on the series?
or?
j
I'm not
s
can you share the code snippet? and how you’re getting the series back?
since you shouldn’t need to explicitly set the name
j
Looks something like this:
def remove_profanity(
    strip_whitespace: pd.Series, profanity_list_path: pathlib.Path
) -> pd.Series:
    profanity_list = _read_file_as_list(profanity_list_path)
    profanity.load_censor_words(profanity_list)
    return strip_whitespace.apply(profanity.censor)


def dlp_remove_pii(
    remove_profanity: pd.Series, google_dlp_service: GoogleDlpService
) -> pd.Series:
    return google_dlp_service.deidentify_series(remove_profanity)


def response_value(dlp_remove_pii: pd.Series) -> pd.Series:
    # TODO For some reason the rename is required as the name isn't being set - why?
    print(dlp_remove_pii.name)
    return _apply_regex_substitutes(
        dlp_remove_pii, pii_regex
    )  # .rename("response_value")
The GoogleDlpService is responsible for turning the series into a request to a Google API and turning the response back into a series.
When I print(dlp_remove_pii.name) I get None
s
yep that seems to make sense. So while the DAG is executing, the Series objects that are passed don’t have to have a name attached.
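As a rough illustration of how a name can go missing along the way (a sketch only: rebuild_from_response is a hypothetical stand-in, not the real GoogleDlpService), Series.apply carries the name through, but wrapping plain values in a fresh Series does not:

```python
import pandas as pd


def rebuild_from_response(values: pd.Series) -> pd.Series:
    """Hypothetical stand-in for a service round trip.

    The values leave as a plain list and come back wrapped in a new Series,
    so nothing carries the original name across.
    """
    payload = values.tolist()
    return pd.Series(payload, index=values.index)


named = pd.Series(["hello", "world"], name="strip_whitespace")

applied = named.apply(str.upper)            # apply keeps the original name
rebuilt = rebuild_from_response(applied)    # a freshly built Series has name None
renamed = rebuilt.rename("response_value")  # rename sets the name explicitly

print(applied.name, rebuilt.name, renamed.name)
```

Whether this is what happens inside deidentify_series is an assumption; it only shows that a None name is normal for a Series built from raw values.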
what’s the driver code?
and where is this causing problems for you?
j
Good question
config = {
    "google_dlp_service": dlp_service,
    "profanity_list_path": profanity_list_path,
    "sentiment_service": sentiment_service,
}

dr = driver.Driver(input_df, transforms)
output_columns = list(FeedbackContainer.__fields__)
output_data = dr.execute(inputs=config, final_vars=output_columns)
That's the code to execute the dag
The problem it causes is that my final transform looks like this:
def _nest_series(**series: pd.Series) -> pd.Series:
    df = pd.concat(my_series, axis=1)
    return df.apply(pd.Series.to_dict, axis=1)

@does(_nest_series)
def feedback(
    prompt_value: pd.Series,
    prompt_type: pd.Series,
    response_type: pd.Series,
    response_value: pd.Series,
    sentiment: pd.Series,
) -> pd.Series:
    pass
The purpose is to turn a given set of series into a single series of dictionaries, where each dictionary holds the name:value pairs for one row of the input series
And the issue is that if the name portion of the name:value pair is missing, I can't export to BigQuery
To give an example of the effect I'm after:
Col A     Col B     Col C           Target output
"A"       "B"       "C"             {"Col A":"A", "Col B":"B", "Col C":"C"}
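For illustration, a minimal sketch of what the missing names do to the nested output, assuming the series are concatenated positionally from a list (col_a and col_b are made-up sample data): with no names to work from, pandas falls back to integer column labels, so the dictionary keys end up as 0 and 1 instead of field names.

```python
import pandas as pd

# Two unnamed series, standing in for transform outputs whose name is None.
col_a = pd.Series(["A"])
col_b = pd.Series(["B"])

# Concatenating a plain list gives pandas nothing to label the columns with,
# so it uses positional integers.
df = pd.concat([col_a, col_b], axis=1)

# Each row becomes a dict keyed by those integer labels.
nested = df.apply(pd.Series.to_dict, axis=1)

print(list(df.columns))
print(nested.iloc[0])
```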
s
We should be able to use the kwarg keys and set the names then
let me write some code
j
Sure thanks mate
s
🤔 this should do as expected (you had a minor typo in the code above)
def _nest_series(**series: pd.Series) -> pd.Series:
    df = pd.concat(series, axis=1)
    return df.apply(pd.Series.to_dict, axis=1)
because this is what it should be getting in:
a = pd.Series([1,2,3])
b = pd.Series([4,5,6])
# first line creates a dataframe
pd.concat({'a': a, 'b': b}, axis=1)
   a  b
0  1  4
1  2  5
2  3  6
# next line creates a series of dicts, where the dict keys relate to the series/column names
pd.concat({'a': a, 'b': b}, axis=1).apply(pd.Series.to_dict, axis=1)
0    {'a': 1, 'b': 4}
1    {'a': 2, 'b': 5}
2    {'a': 3, 'b': 6}
dtype: object
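A self-contained version of the same idea, as a sketch under assumptions (nest_series, col_a, col_b and the sample values are made up for illustration): because the kwarg keys are passed to pd.concat as a dict, they become the column labels regardless of whether the incoming series have no name or a different one.

```python
import pandas as pd


def nest_series(**series: pd.Series) -> pd.Series:
    """Combine keyword-named series into one series of row dictionaries."""
    # Dict keys become the column labels, so Series.name is not consulted.
    df = pd.concat(series, axis=1)
    return df.apply(pd.Series.to_dict, axis=1)


unnamed = pd.Series(["A", "X"])                               # name is None
differently_named = pd.Series(["B", "Y"], name="something_else")

nested = nest_series(col_a=unnamed, col_b=differently_named)
records = nested.tolist()
print(records)
```

With @does, the parameter names of the decorated function are what arrive as the kwarg keys, so no rename should be needed upstream.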
j
Spot on mate that's fixed it. Thanks!!