# advice-metadata-modeling
f
Hello, is there a helper function in the Python SDK that can figure out a column’s field_path from the database+table+column combination?
a
Hi @flat-engineer-75197, I don’t believe there is, but it would make a great contribution!
f
Thanks Paul, we opted to use the DataHubGraph client to extract this information. Here’s a rough snippet of our solution:
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import SchemaMetadataClass

gms_server = ""
table_urn = "urn:li:dataset:(urn:li:dataPlatform:platform_name,platform_instance.database_name.table_name,env)"
column_name = "test_column"

graph = DataHubGraph(DatahubClientConfig(server=gms_server))
current_schema_metadata = graph.get_aspect(
    entity_urn=table_urn,
    aspect_type=SchemaMetadataClass,
)

# get_aspect returns None when the dataset has no schema metadata aspect
if current_schema_metadata is None:
    raise ValueError(f"No schema metadata found for {table_urn}")

# Verify that column_name has a corresponding field in the graph and set field_path
column_found = False
for field_info in current_schema_metadata.fields:
    field_path = field_info.fieldPath

    # If it's not a v2 path, we assume this is a simple path. This might be bugged but it seems to work for us.
    # Otherwise, strip the [version=2.0] and [type=...] tokens
    if not field_path.startswith("[version=2.0]"):
        simple_field_path = field_path
    else:
        tokens = [t for t in field_path.split(".") if not (t.startswith("[") or t.endswith("]"))]
        simple_field_path = ".".join(tokens)

    if column_name == simple_field_path:
        column_found = True
        break

if not column_found:
    raise ValueError(f"Column {column_name} not found in {table_urn}")

# Now we can work with the field_path
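
For anyone who wants to reuse the matching logic without a running GMS server, here is a minimal sketch of the same token-stripping approach as two standalone functions. The names simplify_field_path and find_field_path are hypothetical helpers, not part of the DataHub SDK, and the sample paths in the usage note are made up for illustration.

```python
from typing import Iterable, Optional

V2_PREFIX = "[version=2.0]"


def simplify_field_path(field_path: str) -> str:
    """Strip the [version=2.0] and [type=...] tokens from a v2 field path.

    Hypothetical helper, not a DataHub SDK function. Paths that do not
    start with the v2 prefix are assumed to be simple already.
    """
    if not field_path.startswith(V2_PREFIX):
        return field_path
    # Splitting on "." also breaks "[version=2.0]" into "[version=2" and "0]",
    # so drop every token that starts with "[" or ends with "]".
    tokens = [
        t for t in field_path.split(".")
        if not (t.startswith("[") or t.endswith("]"))
    ]
    return ".".join(tokens)


def find_field_path(field_paths: Iterable[str], column_name: str) -> Optional[str]:
    """Return the first full field path whose simplified form equals column_name."""
    for field_path in field_paths:
        if simplify_field_path(field_path) == column_name:
            return field_path
    return None
```

With the snippet above, this could be called as find_field_path((f.fieldPath for f in current_schema_metadata.fields), column_name). It returns the full fieldPath, or None when no field matches, so the caller decides how to handle a missing column.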