Hi, I have issues with Looker `soft` delete. Since...
# advice-metadata-modeling
c
Hi, I have issues with Looker `soft` delete. Since DataHub still does not support stateful ingestion for Looker and LookML, I'm thinking of a way to build my own "stateful" ingestion by soft-deleting the metadata and then ingesting it back. However, I noticed that after the `soft` delete, not all Looker components show up as expected. For example, before the `soft` delete there were 6.6k items, and only 500+ showed up afterwards. Is something wrong with the `elasticsearch` indexes that is causing this? Could this also be another bug? Regardless, what is the best way to manage Looker metadata ingestion?
m
Hi @crooked-rose-22807, this message might be better suited to the #ingestion channel. When you click into the Looker card, what entities do you see?
Since the Looker platform spans datasets, charts, and dashboards, it is possible you cleaned out some of those entity types but not all.
c
This is my command to soft delete them
datahub delete --entity_type dashboard --platform looker --force && datahub delete --entity_type chart --platform looker --force && datahub delete --platform looker --force
They're ALL SOFT deleted as expected. However, when I ingest the data back after the soft delete, the count of dashboard and chart entities drops a lot. It seems the ingestion has trouble restoring dashboard and chart entities after they have been soft deleted. I don't see this with datasets, only with dashboards and charts. I think it's a bug.
Attached: before and after the soft delete and re-ingestion.
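For anyone scripting the same sequence, here is a minimal sketch that builds the three delete commands quoted above and runs them in order, stopping at the first failure. The flags are copied verbatim from the command in this thread; the helper names `build_delete_commands` and `run_all` are hypothetical, and the dry-run default is an assumption added for safety.

```python
import subprocess
from typing import List, Optional, Sequence


def build_delete_commands(
    platform: str, entity_types: Sequence[Optional[str]]
) -> List[List[str]]:
    """Build one `datahub delete` argv per entity type.

    An entity type of None omits --entity_type, matching the last
    command in the thread (`datahub delete --platform looker --force`).
    """
    commands = []
    for entity_type in entity_types:
        cmd = ["datahub", "delete"]
        if entity_type is not None:
            cmd += ["--entity_type", entity_type]
        cmd += ["--platform", platform, "--force"]
        commands.append(cmd)
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError, so later steps are skipped
            subprocess.run(cmd, check=True)


# Same order as in the thread: dashboards, charts, then everything else.
LOOKER_DELETES = build_delete_commands("looker", ["dashboard", "chart", None])
```

Calling `run_all(LOOKER_DELETES)` only prints the commands; pass `dry_run=False` to actually execute them.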
m
Thanks! We are taking a look
m
+1
m
@crooked-rose-22807 and @miniature-policeman-55414: this PR should fix the issue when merged.
c
Great! Thank you!
a
Hi everyone, and thanks for the update @mammoth-bear-12532. We tried to run the above deletion and ingestion with the latest Python package version, 0.8.43.6 (assuming the above PR fix is included there). While it seems to solve the above issue with re-ingestion after soft deletions, the Looker ingestion produced a lot of warnings for dashboards and charts (although it seems successful judging by the number of artifacts ingested), with the message `java.lang.RuntimeException: Unknown aspect inputFields for entity dashboard` or `java.lang.RuntimeException: Unknown aspect inputFields for entity chart`, which did not seem to be the case before. Any clues about these warnings and their impact? Thanks
...
...

{'warning': 'Unable to emit metadata to DataHub GMS',
               'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
                        'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: java.lang.RuntimeException: Unknown '
                                      'aspect inputFields for entity chart\n'
                                      '\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)\n'
                                      '\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)',
                        'message': 'java.lang.RuntimeException: Unknown aspect inputFields for entity chart',
                        'status': '500',
                        'id': 'urn:li:chart:(looker,dashboard_elements.5d477df629d88a5e1bd3e94f4c11db00)'}},
              {'warning': 'Unable to emit metadata to DataHub GMS',
               'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
                        'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: java.lang.RuntimeException: Unknown '
                                      'aspect inputFields for entity dashboard\n'
                                      '\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:42)\n'
                                      '\tat com.linkedin.metadata.restli.RestliUtil.toTask(RestliUtil.java:50)',
                        'message': 'java.lang.RuntimeException: Unknown aspect inputFields for entity dashboard',
                        'status': '500',
                        'id': 'urn:li:dashboard:(looker,dashboards.finance::seu_executive_gemstone)'}}],
 'failures': [],
 'start_time': '2022-08-31 16:11:53.025862',
 'current_time': '2022-08-31 16:35:37.729545',
 'total_duration_in_seconds': '1424.7',
 'gms_version': 'v0.8.43',
 'pending_requests': '0'}

Pipeline finished with 8487 warnings ; produced 14109 events
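With thousands of warnings, it can help to group them before reading individual stack traces. The following is a rough sketch that counts warnings by entity type and message, assuming the warnings are available as a list of dicts shaped like the two entries shown above. The function name `summarize_warnings` is hypothetical, and the sample data is copied from the report with the stack traces omitted.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_warnings(warnings: List[Dict]) -> Counter:
    """Count warnings by (entity type, message).

    The entity type is taken from the third segment of the URN in
    info['id'], e.g. 'urn:li:chart:(...)' gives 'chart'.
    """
    counts: Counter = Counter()
    for warning in warnings:
        info = warning.get("info", {})
        parts = info.get("id", "").split(":", 3)
        if len(parts) > 2 and parts[:2] == ["urn", "li"]:
            entity_type = parts[2]
        else:
            entity_type = "unknown"
        key: Tuple[str, str] = (entity_type, info.get("message", ""))
        counts[key] += 1
    return counts


# Two entries copied from the report above (stack traces omitted).
sample = [
    {
        "warning": "Unable to emit metadata to DataHub GMS",
        "info": {
            "message": "java.lang.RuntimeException: Unknown aspect inputFields for entity chart",
            "status": "500",
            "id": "urn:li:chart:(looker,dashboard_elements.5d477df629d88a5e1bd3e94f4c11db00)",
        },
    },
    {
        "warning": "Unable to emit metadata to DataHub GMS",
        "info": {
            "message": "java.lang.RuntimeException: Unknown aspect inputFields for entity dashboard",
            "status": "500",
            "id": "urn:li:dashboard:(looker,dashboards.finance::seu_executive_gemstone)",
        },
    },
]

summary = summarize_warnings(sample)
for (entity_type, message), count in summary.items():
    print(f"{count:5d}  {entity_type:10s}  {message}")
```

If every warning falls into the two `inputFields` buckets, the 8487 warnings likely share a single cause rather than indicating many separate failures.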
In addition to the above update: this also seems to affect the downstream lineage visualization (including LookML and Snowflake metadata), which wasn't the case before. It seems that no Looker explores could be ingested by the latest Looker ingestion (everything seems fine if we ingest using the previous 0.8.43 version).
Hey there @mammoth-bear-12532, would appreciate your take on the above issues when you have time. Tagging you for visibility. Thanks a ton 🙏🏼
m
Hey @adamant-van-21355, the warning on `inputFields` is expected, as the server (I'm assuming you are running 0.8.43) doesn't have that model. This warning should go away if you upgrade to 0.8.44.
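The mismatch described here (a 0.8.43.6 client emitting to a `v0.8.43` server) could be flagged before running an ingestion. Below is a small sketch that compares the two version strings from this thread. It assumes, as a heuristic only, that a client newer than the server may emit aspects the server does not know; the function names are hypothetical and this is not a DataHub API.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.8.43.6' or 'v0.8.43' into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def client_is_newer(client_version: str, server_version: str) -> bool:
    """True when the client version sorts after the server version.

    Tuples compare element by element, and a longer tuple with an
    equal prefix sorts later, so (0, 8, 43, 6) > (0, 8, 43).
    """
    return parse_version(client_version) > parse_version(server_version)


# Versions taken from the thread: CLI 0.8.43.6, gms_version 'v0.8.43'.
if client_is_newer("0.8.43.6", "v0.8.43"):
    print("Client is newer than the server; unknown-aspect warnings are possible.")
```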