square-activity-64562
08/04/2021, 4:52 AM('Unable to emit metadata to DataHub GMS', {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: java.lang.NullPointerException: Cannot set field lastObserved of com.linkedin.mxe.SystemMetadata to null
This field should be optional according to this file:
https://github.com/linkedin/datahub/blob/aa253f5b3b6c92dc919a0037008ec54c23a50a95/[…]ata-models/src/main/pegasus/com/linkedin/mxe/SystemMetadata.pdl
Am I incorrect?
square-activity-64562
08/04/2021, 4:54 AM('Unable to emit metadata to DataHub GMS', {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: java.lang.NullPointerException: Cannot set field lastObserved of com.linkedin.mxe.SystemMetadata to null\\n\\tat com.linkedin.metadata.restli.RestliUtils.toTask(RestliUtils.java:39)\\n\\tat com.linkedin.metadata.resources.entity.EntityResource.ingest(EntityResource.java:165)\\n\\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\\n\\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\\n\\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\\n\\tat java.lang.reflect.Method.invoke(Method.java:498)\\n\\tat com.linkedin.restli.internal.server.RestLiMethodInvoker.doInvoke(RestLiMethodInvoker.java:172)\\n\\tat com.linkedin.restli.internal.server.RestLiMethodInvoker.invoke(RestLiMethodInvoker.java:326)\\n\\tat com.linkedin.restli.internal.server.filter.FilterChainDispatcherImpl.onRequestSuccess(FilterChainDispatcherImpl.java:47)\\n\\tat com.linkedin.restli.internal.server.filter.RestLiFilterChainIterator.onRequest(RestLiFilterChainIterator.java:86)\\n\\tat com.linkedin.restli.internal.server.filter.RestLiFilterChainIterator.lambda$onRequest$0(RestLiFilterChainIterator.java:73)\\n\\tat java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670)\\n\\tat java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:683)\\n\\tat java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:2010)\\n\\tat com.linkedin.restli.internal.server.filter.RestLiFilterChainIterator.onRequest(RestLiFilterChainIterator.java:72)\\n\\tat com.linkedin.restli.internal.server.filter.RestLiFilterChain.onRequest(RestLiFilterChain.java:55)\\n\\tat com.linkedin.restli.server.BaseRestLiServer.handleResourceRequest(BaseRestLiServer.java:218)\\n\\tat 
com.linkedin.restli.server.RestRestLiServer.handleResourceRequestWithRestLiResponse(RestRestLiServer.java:242)\\n\\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:211)\\n\\tat com.linkedin.restli.server.RestRestLiServer.handleResourceRequest(RestRestLiServer.java:181)\\n\\tat com.linkedin.restli.server.RestRestLiServer.doHandleRequest(RestRestLiServer.java:164)\\n\\tat com.linkedin.restli.server.RestRestLiServer.handleRequest(RestRestLiServer.java:120)\\n\\tat com.linkedin.restli.server.RestLiServer.handleRequest(RestLiServer.java:132)\\n\\tat com.linkedin.restli.server.DelegatingTransportDispatcher.handleRestRequest(DelegatingTransportDispatcher.java:70)\\n\\tat com.linkedin.r2.filter.transport.DispatcherRequestFilter.onRestRequest(DispatcherRequestFilter.java:70)\\n\\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\\n\\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\\n\\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\\n\\tat com.linkedin.r2.filter.transport.ServerQueryTunnelFilter.onRestRequest(ServerQueryTunnelFilter.java:58)\\n\\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\\n\\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\\n\\tat com.linkedin.r2.filter.TimedNextFilter.onRequest(TimedNextFilter.java:55)\\n\\tat 
com.linkedin.r2.filter.message.rest.RestFilter.onRestRequest(RestFilter.java:50)\\n\\tat com.linkedin.r2.filter.TimedRestFilter.onRestRequest(TimedRestFilter.java:72)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:146)\\n\\tat com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnRequest(FilterChainIterator.java:132)\\n\\tat com.linkedin.r2.filter.FilterChainIterator.onRequest(FilterChainIterator.java:62)\\n\\tat com.linkedin.r2.filter.FilterChainImpl.onRestRequest(FilterChainImpl.java:96)\\n\\tat com.linkedin.r2.filter.transport.FilterChainDispatcher.handleRestRequest(FilterChainDispatcher.java:75)\\n\\tat com.linkedin.r2.util.finalizer.RequestFinalizerDispatcher.handleRestRequest(RequestFinalizerDispatcher.java:61)\\n\\tat com.linkedin.r2.transport.http.server.HttpDispatcher.handleRequest(HttpDispatcher.java:101)\\n\\tat com.linkedin.r2.transport.http.server.AbstractR2Servlet.service(AbstractR2Servlet.java:105)\\n\\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\\n\\tat com.linkedin.restli.server.spring.ParallelRestliHttpRequestHandler.handleRequest(ParallelRestliHttpRequestHandler.java:63)\\n\\tat org.springframework.web.context.support.HttpRequestHandlerServlet.service(HttpRequestHandlerServlet.java:73)\\n\\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\\n\\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852)\\n\\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\\n\\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:536)\\n\\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\\n\\tat 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\\n\\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\\n\\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)\\n\\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\\n\\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)\\n\\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\\n\\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)\\n\\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\\n\\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\\n\\tat org.eclipse.jetty.server.Server.handle(Server.java:494)\\n\\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374)\\n\\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268)\\n\\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\\n\\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\\n\\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\\n\\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\\n\\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\\n\\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\\n\\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\\n\\tat 
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367)\\n\\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)\\n\\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)\\n\\tat java.lang.Thread.run(Thread.java:748)\\nCaused by: java.lang.NullPointerException: Cannot set field lastObserved of com.linkedin.mxe.SystemMetadata to null\\n\\tat com.linkedin.data.template.RecordTemplate.checkPutNullValue(RecordTemplate.java:475)\\n\\tat com.linkedin.data.template.RecordTemplate.putDirect(RecordTemplate.java:175)\\n\\tat com.linkedin.mxe.SystemMetadata.setLastObserved(SystemMetadata.java:105)\\n\\tat com.linkedin.metadata.entity.ebean.EbeanEntityService.lambda$ingestAspectToLocalDB$7(EbeanEntityService.java:242)\\n\\tat com.linkedin.metadata.entity.ebean.EbeanAspectDao.runInTransactionWithRetry(EbeanAspectDao.java:475)\\n\\tat com.linkedin.metadata.entity.ebean.EbeanEntityService.ingestAspectToLocalDB(EbeanEntityService.java:227)\\n\\tat com.linkedin.metadata.entity.ebean.EbeanEntityService.ingestAspect(EbeanEntityService.java:197)\\n\\tat com.linkedin.metadata.entity.EntityService.lambda$ingestSnapshotUnion$7(EntityService.java:316)\\n\\tat java.util.ArrayList.forEach(ArrayList.java:1259)\\n\\tat com.linkedin.metadata.entity.EntityService.ingestSnapshotUnion(EntityService.java:314)\\n\\tat com.linkedin.metadata.entity.EntityService.ingestEntity(EntityService.java:273)\\n\\tat com.linkedin.metadata.resources.entity.EntityResource.lambda$ingest$4(EntityResource.java:166)\\n\\tat com.linkedin.metadata.restli.RestliUtils.toTask(RestliUtils.java:27)\\n\\t... 81 more\\n', 'message': 'java.lang.NullPointerException: Cannot set field lastObserved of com.linkedin.mxe.SystemMetadata to null', 'status': 500}
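For a trace this long, the useful part is the last `Caused by:` section, which here points at `SystemMetadata.setLastObserved` being called from `EbeanEntityService.ingestAspectToLocalDB`. A minimal sketch for pulling that section out of the `stackTrace` string is below. The function name `root_cause` is hypothetical (it is not part of DataHub), and the sketch assumes the trace contains real newlines rather than the escaped `\\n` shown in the log output.

```python
from typing import List


def root_cause(stack_trace: str, max_frames: int = 4) -> List[str]:
    """Return the last 'Caused by:' line plus the first few frames under it."""
    lines = [line.strip() for line in stack_trace.splitlines() if line.strip()]
    # Find the last "Caused by:" line; fall back to the top of the trace.
    start = 0
    for i, line in enumerate(lines):
        if line.startswith("Caused by:"):
            start = i
    return lines[start : start + 1 + max_frames]


# Abbreviated sample built from frames in the trace above.
sample = (
    "com.linkedin.restli.server.RestLiServiceException [HTTP Status:500]: "
    "java.lang.NullPointerException: Cannot set field lastObserved of "
    "com.linkedin.mxe.SystemMetadata to null\n"
    "\tat com.linkedin.metadata.restli.RestliUtils.toTask(RestliUtils.java:39)\n"
    "Caused by: java.lang.NullPointerException: Cannot set field lastObserved of "
    "com.linkedin.mxe.SystemMetadata to null\n"
    "\tat com.linkedin.data.template.RecordTemplate.checkPutNullValue(RecordTemplate.java:475)\n"
    "\tat com.linkedin.data.template.RecordTemplate.putDirect(RecordTemplate.java:175)\n"
    "\tat com.linkedin.mxe.SystemMetadata.setLastObserved(SystemMetadata.java:105)\n"
    "\tat com.linkedin.metadata.entity.ebean.EbeanEntityService."
    "lambda$ingestAspectToLocalDB$7(EbeanEntityService.java:242)\n"
)

for line in root_cause(sample):
    print(line)
```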
mammoth-bear-12532
square-activity-64562
08/04/2021, 4:56 AM
import logging
from typing import List

import datahub.emitter.mce_builder as builder
import datahub.metadata.schema_classes as models
from datahub.emitter.rest_emitter import DatahubRestEmitter

logger = logging.getLogger(__name__)


def send_lineage(gms_url, base_url, dag_id, task_id, upstream_urns: List[str], downstream_urns: List[str]):
    # _get_dataset_urn is a helper defined elsewhere in this file (not shown in this message).
    input_datasets = [_get_dataset_urn(string) for string in upstream_urns]
    output_datasets = [_get_dataset_urn(string) for string in downstream_urns]

    flow_urn = builder.make_data_flow_urn("airflow", dag_id)
    job_urn = builder.make_data_job_urn_with_flow(flow_urn, task_id)

    flow_url = f"{base_url}/airflow/tree?dag_id={dag_id}"
    job_url = f"{base_url}/taskinstance/?flt0_dag_id_equals={dag_id}&flt3_task_id_equals={task_id}"

    # Snapshot for the Airflow DAG (data flow).
    flow_mce = models.MetadataChangeEventClass(
        proposedSnapshot=models.DataFlowSnapshotClass(
            urn=flow_urn,
            aspects=[
                models.DataFlowInfoClass(
                    name=dag_id,
                    externalUrl=flow_url,
                )
            ],
        )
    )

    # Snapshot for the Airflow task (data job), including its lineage.
    job_mce = models.MetadataChangeEventClass(
        proposedSnapshot=models.DataJobSnapshotClass(
            urn=job_urn,
            aspects=[
                models.DataJobInfoClass(
                    name=task_id,
                    type=models.AzkabanJobTypeClass.COMMAND,
                    externalUrl=job_url,
                ),
                models.DataJobInputOutputClass(
                    inputDatasets=input_datasets,
                    outputDatasets=output_datasets,
                    inputDatajobs=[],
                ),
            ],
        )
    )

    # Emit a Status aspect for every dataset so the entities exist in DataHub.
    force_entity_materialization = [
        models.MetadataChangeEventClass(
            proposedSnapshot=models.DatasetSnapshotClass(
                urn=iolet,
                aspects=[
                    models.StatusClass(removed=False),
                ],
            )
        )
        for iolet in input_datasets + output_datasets
    ]

    mces = [
        flow_mce,
        job_mce,
        *force_entity_materialization,
    ]

    emitter = DatahubRestEmitter(gms_url)
    for mce in mces:
        logger.info(f"mce is {mce}")
        emitter.emit_mce(mce)
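A side note on the function above: both list comprehensions iterate over their argument, so `upstream_urns` and `downstream_urns` each need to be a list. Passing a bare string would be iterated character by character and produce one bogus URN per character. The sketch below shows one way to guard against that. `_fake_dataset_urn` is a hypothetical stand-in for the `_get_dataset_urn` helper, which is not shown in this thread, and the `hive` platform and `PROD` environment are assumptions for illustration only.

```python
from typing import List


def _fake_dataset_urn(name: str) -> str:
    # Hypothetical stand-in for _get_dataset_urn; platform and env are assumed.
    return f"urn:li:dataset:(urn:li:dataPlatform:hive,{name},PROD)"


def to_urns(names: List[str]) -> List[str]:
    """Convert dataset names to URNs, rejecting a bare string."""
    if isinstance(names, str):
        # Without this check, "db.table" would yield one URN per character.
        raise TypeError("expected a list of dataset names, got a single string")
    return [_fake_dataset_urn(name) for name in names]


print(to_urns(["db.table"]))
```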
mammoth-bear-12532
square-activity-64562
08/04/2021, 4:58 AMFROM linkedin/datahub-ingestion:v0.8.7
mammoth-bear-12532
square-activity-64562
08/04/2021, 4:59 AMclick
and added this Python file. It was working until 0.8.6. It is possible that https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub_provider/_lineage_core.py is also broken, because most of the code is taken from there.
square-activity-64562
08/04/2021, 5:05 AMmammoth-bear-12532
green-football-43791
08/04/2021, 5:45 PMgreen-football-43791
08/04/2021, 5:46 PM