#troubleshoot

witty-butcher-82399

09/29/2022, 8:12 AM
Hi datahubers! I have just enabled stateful ingestion in the dbt connector, and the process failed with the following exception:
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/state/sql_common_state.py", line 35, in _get_lightweight_repr
    31   def _get_lightweight_repr(dataset_urn: str) -> str:
    32       """Reduces the amount of text in the URNs for smaller state footprint."""
    33       SEP = BaseSQLAlchemyCheckpointState._get_separator()
    34       key = dataset_urn_to_key(dataset_urn)
--> 35       assert key is not None
    36       return f"{key.platform}{SEP}{key.name}{SEP}{key.origin}"
    ..................................................
     dataset_urn = 'urn:li:assertion:98375e72b6e0e0303961b3ac35fa3559'
     SEP = '||'
     key = None
    ..................................................
Looking at the code of dataset_urn_to_key, it requires a dataset URN, and the assertion URN definitely does not match the pattern:
def dataset_urn_to_key(dataset_urn: str) -> Optional[DatasetKeyClass]:
    pattern = r"urn:li:dataset:\(urn:li:dataPlatform:(.*),(.*),(.*)\)"
    results = re.search(pattern, dataset_urn)
    if results is not None:
        return DatasetKeyClass(platform=results[1], name=results[2], origin=results[3])
    return None
This is with DataHub v0.8.40 and looks like a bug when committing the checkpoint. Has this been fixed in later versions, or do you want me to create an issue on GitHub? Thank you!
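For anyone who wants to reproduce the mismatch without a DataHub install, here is a minimal sketch that uses only the regex quoted above. The DatasetKey tuple is a stand-in for DatasetKeyClass, the sample dataset URN is made up for illustration, and lightweight_repr_or_none is a hypothetical guard of my own; it is not necessarily how later DataHub versions handle this.

```python
import re
from typing import NamedTuple, Optional


class DatasetKey(NamedTuple):
    """Stand-in for DatasetKeyClass so the sketch runs without datahub installed."""
    platform: str
    name: str
    origin: str


# Same pattern as in the dataset_urn_to_key snippet quoted above.
DATASET_URN_PATTERN = r"urn:li:dataset:\(urn:li:dataPlatform:(.*),(.*),(.*)\)"


def dataset_urn_to_key(urn: str) -> Optional[DatasetKey]:
    """Return the parsed key for a dataset URN, or None for anything else."""
    results = re.search(DATASET_URN_PATTERN, urn)
    if results is not None:
        return DatasetKey(platform=results[1], name=results[2], origin=results[3])
    return None


def lightweight_repr_or_none(urn: str, sep: str = "||") -> Optional[str]:
    """Hypothetical guard: return None for non-dataset URNs instead of asserting."""
    key = dataset_urn_to_key(urn)
    if key is None:
        return None
    return f"{key.platform}{sep}{key.name}{sep}{key.origin}"


# The URN from the traceback, plus a made-up dataset URN for comparison.
assertion_urn = "urn:li:assertion:98375e72b6e0e0303961b3ac35fa3559"
dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:dbt,db.schema.table,PROD)"

print(dataset_urn_to_key(assertion_urn))      # the assertion URN does not match
print(lightweight_repr_or_none(dataset_urn))  # the dataset URN does
```

The assertion URN yields None, which is exactly the value that trips the assert on line 35 of the traceback.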

mammoth-bear-12532

09/29/2022, 9:55 PM
Has been fixed in recent versions via https://github.com/datahub-project/datahub/pull/5540