charles

05/21/2021, 4:46 PM
@Jared Rhizor (Airbyte) I was reviewing our truncation / hashing logic in normalization and I had a question come up. We have these two conditions:
    # if everything fits except for the parent, just truncate the parent
    elif (len(norm_child) + len(json_path_hash) + len(norm_suffix)) < (max_length - min_parent_length):
        max_parent_length = max_length - len(norm_child) - len(json_path_hash) - len(norm_suffix)
        return f"{norm_parent[:max_parent_length]}_{json_path_hash}_{norm_child}{norm_suffix}"
    # otherwise first truncate parent to the minimum length and middle truncate the child
    else:
        norm_child_max_length = max_length - min_parent_length - len(json_path_hash) - len(norm_suffix)
        trunc_norm_child = name_transformer.truncate_identifier_name(norm_child, norm_child_max_length)
        return f"{norm_parent[:min_parent_length]}_{json_path_hash}_{trunc_norm_child}{norm_suffix}"
https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/bases/base-n[…]rmalization/normalization/transform_catalog/stream_processor.py
Could you help me understand why we would ever want to prefer this elif case over the else case? I think anything that the elif case handles can be handled by the else case, and it has the added benefit of including at least some part of the parent.

user

05/25/2021, 8:36 AM
1. The elif part tries to include as many characters as possible from the parent, and thus truncates just enough to stay under the destination character limits without touching the child names.
2. The else case always truncates the parent to MINIMUM_PARENT_LENGTH, which is 10 characters (or fewer if the parent doesn't have 10 characters), and then truncates the child name too.
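
To make the difference between the two branches concrete, here is a minimal sketch of just the elif and else cases quoted above. It is not the actual stream_processor.py code: MAX_LENGTH, the sample names, and middle_truncate (a stand-in for name_transformer.truncate_identifier_name) are all hypothetical. The quoted arithmetic does not count the two "_" separators, so the sketch assumes max_length has already been reduced to leave room for them.

```python
MAX_LENGTH = 30          # hypothetical limit, assumed to already reserve room for the two "_"
MIN_PARENT_LENGTH = 10   # the value mentioned in the thread


def middle_truncate(name: str, max_length: int) -> str:
    """Hypothetical stand-in for truncate_identifier_name: keep head and tail, join with '__'."""
    if len(name) <= max_length:
        return name
    if max_length < 2:
        return name[:max_length]
    keep = max_length - 2            # characters left after the "__" marker
    head = keep - keep // 2
    tail = keep // 2
    return name[:head] + "__" + name[len(name) - tail:]


def nested_name(norm_parent, json_path_hash, norm_child, norm_suffix="",
                max_length=MAX_LENGTH, min_parent_length=MIN_PARENT_LENGTH):
    """Return (branch, name) for the two quoted cases only; earlier cases are omitted."""
    fixed = len(norm_child) + len(json_path_hash) + len(norm_suffix)
    if fixed < max_length - min_parent_length:
        # "elif" case: the child is untouched, the parent gets all remaining room
        max_parent_length = max_length - fixed
        return "elif", f"{norm_parent[:max_parent_length]}_{json_path_hash}_{norm_child}{norm_suffix}"
    # "else" case: the parent is cut to the minimum and the child is middle-truncated
    norm_child_max_length = max_length - min_parent_length - len(json_path_hash) - len(norm_suffix)
    trunc_norm_child = middle_truncate(norm_child, norm_child_max_length)
    return "else", f"{norm_parent[:min_parent_length]}_{json_path_hash}_{trunc_norm_child}{norm_suffix}"


print(nested_name("customer_billing_addresses", "a1b", "city"))
print(nested_name("customer_billing_addresses", "a1b", "shipping_and_handling_preferences"))
```

With a short child ("city") the first call takes the elif branch and keeps 23 characters of the parent; with the long child the second call falls through to else, keeps only 10 parent characters, and shortens the child as well.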

user

05/25/2021, 4:16 PM
oh. i think i was misreading 1

user

05/25/2021, 4:18 PM
would it be accurate if i changed this comment:
# if everything fits except for the parent, just truncate the parent
to:
# if everything fits except for the parent, just truncate the parent (still guarantees parent is of length MINIMUM_PARENT_LENGTH)
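
The guarantee in the proposed comment follows from the elif condition itself: if child + hash + suffix < max_length - min_parent_length, then max_length - child - hash - suffix > min_parent_length, so the parent budget is always strictly larger than the minimum. A small brute-force check of that arithmetic, as a sketch only (elif_parent_budget and the ranges tried are hypothetical, not part of the normalization code):

```python
MIN_PARENT_LENGTH = 10  # the value mentioned in the thread


def elif_parent_budget(max_length, child_len, hash_len, suffix_len,
                       min_parent_length=MIN_PARENT_LENGTH):
    """Return the parent budget when the quoted elif condition holds, else None."""
    fixed = child_len + hash_len + suffix_len
    if fixed < max_length - min_parent_length:
        return max_length - fixed
    return None


# Try a range of hypothetical limits and component lengths.
budgets = [
    elif_parent_budget(max_length, child_len, hash_len, suffix_len)
    for max_length in range(11, 65)
    for child_len in range(0, 65)
    for hash_len in (0, 3, 8)
    for suffix_len in (0, 4)
]
assert all(b is None or b > MIN_PARENT_LENGTH for b in budgets)
```

The budget is an upper bound on the slice, so a parent shorter than MINIMUM_PARENT_LENGTH still appears in full rather than being padded.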

user

05/25/2021, 4:20 PM
yes

user

05/25/2021, 4:21 PM
thanks. i'll add that and then stop asking dumb questions 🤪

user

05/25/2021, 4:21 PM
why are you adding things? are you reworking that code anyway?

user

05/25/2021, 4:22 PM
otherwise i can add it but i might change it with the hashing ticket

user

05/25/2021, 4:29 PM
oh

user

05/25/2021, 4:29 PM
you can add it.