witty-butcher-82399
09/29/2022, 8:12 AM
File "/usr/local/lib/python3.9/site-packages/datahub/ingestion/source/state/sql_common_state.py", line 35, in _get_lightweight_repr
    31 def _get_lightweight_repr(dataset_urn: str) -> str:
    32     """Reduces the amount of text in the URNs for smaller state footprint."""
    33     SEP = BaseSQLAlchemyCheckpointState._get_separator()
    34     key = dataset_urn_to_key(dataset_urn)
--> 35     assert key is not None
    36     return f"{key.platform}{SEP}{key.name}{SEP}{key.origin}"
..................................................
dataset_urn = 'urn:li:assertion:98375e72b6e0e0303961b3ac35fa3559'
SEP = '||'
key = None
..................................................
Looking at the code of dataset_urn_to_key, it requires a dataset URN, and the assertion URN clearly does not match the pattern:
def dataset_urn_to_key(dataset_urn: str) -> Optional[DatasetKeyClass]:
    pattern = r"urn:li:dataset:\(urn:li:dataPlatform:(.*),(.*),(.*)\)"
    results = re.search(pattern, dataset_urn)
    if results is not None:
        return DatasetKeyClass(platform=results[1], name=results[2], origin=results[3])
    return None
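To illustrate the mismatch, here is a minimal standalone sketch that applies the same regular expression to the assertion URN from the traceback. It does not import DataHub: `DatasetKey` is a hypothetical stand-in for `DatasetKeyClass`, and the dataset URN is a made-up example for comparison.

```python
import re
from typing import NamedTuple, Optional


class DatasetKey(NamedTuple):
    """Hypothetical stand-in for DataHub's DatasetKeyClass (not the real class)."""

    platform: str
    name: str
    origin: str


def dataset_urn_to_key(dataset_urn: str) -> Optional[DatasetKey]:
    # Same regular expression as in the function quoted above.
    pattern = r"urn:li:dataset:\(urn:li:dataPlatform:(.*),(.*),(.*)\)"
    results = re.search(pattern, dataset_urn)
    if results is not None:
        return DatasetKey(platform=results[1], name=results[2], origin=results[3])
    return None


# Made-up dataset URN, used only for comparison.
dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:mysql,db.table,PROD)"
# Assertion URN copied from the traceback.
assertion_urn = "urn:li:assertion:98375e72b6e0e0303961b3ac35fa3559"

dataset_key = dataset_urn_to_key(dataset_urn)
assertion_key = dataset_urn_to_key(assertion_urn)

print(dataset_key)
# The assertion URN yields None, which is what trips `assert key is not None`.
print(assertion_key)
```

So any non-dataset URN that reaches _get_lightweight_repr would fail the same way.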
This is with DataHub v0.8.40 and looks like a bug when committing the checkpoint.
Has this been fixed in later versions, or do you want me to create an issue on GitHub?
mammoth-bear-12532