<@U01GCJKA8P9> Facing issues in Actions container ...
# troubleshoot
n
@big-carpet-38439 Facing issues in Actions container due to the schema change as part of this PR - https://github.com/datahub-project/datahub/commit/1a31f7888adaef954a066d62c3aa7b21ac7be7ed Error logs in thread 🧵
b
Unfortunately aware of the issue
It can be solved by upgrading acryl-datahub to 0.8.38
thankyou 1
n
[2022-06-10 17:30:29,204] ERROR    {datahub_actions.pipeline.pipeline:205} - Failed to process event after 0 retries. event type: MetadataChangeLogEvent_v1, pipeline name: ingestion_executor. Handling failure...
[2022-06-10 17:30:29,205] ERROR    {datahub_actions.pipeline.pipeline_manager:44} - Caught exception while running pipeline with name ingestion_executor: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/datahub_actions/pipeline/pipeline.py", line 283, in _append_failed_event_to_file
    json = enveloped_event.as_json()
  File "/usr/local/lib/python3.9/site-packages/datahub_actions/event/event.py", line 65, in as_json
    result = f'{{ "event_type": "{self.event_type}", "event": {self.event.as_json()}, "meta": {meta_json if meta_json is not None else "null"} }}'
  File "/usr/local/lib/python3.9/site-packages/datahub_actions/event/event_registry.py", line 46, in as_json
    return json.dumps(self.to_obj())
avro.io.AvroTypeException: The datum MetadataChangeLogEvent({'auditHeader': None, 'entityType': 'chart', 'entityUrn': 'urn:li:chart:(looker,dashboard_elements.00652013f76ce03119b766ad39c5c178)', 'entityKeyAspect': None, 'changeType': 'RESTATE', 'aspectName': 'browsePaths', 'aspect': GenericAspectClass({'value': b'{"paths":["/looker/dashboard_elements.00652013f76ce03119b766ad39c5c178"]}', 'contentType': 'application/json'}), 'systemMetadata': SystemMetadataClass({'lastObserved': 1653405120262, 'runId': '2e37ccc2-0acb-4074-8c2b-9866ee4df4bc', 'registryName': None, 'registryVersion': None, 'properties': None}), 'previousAspectValue': None, 'previousSystemMetadata': None, 'created': AuditStampClass({'time': 1654882228799, 'actor': 'urn:li:corpuser:__datahub_system', 'impersonator': None})}) is not an example of the schema {
  "type": "record",
  "name": "MetadataChangeLog",
  "namespace": "com.linkedin.pegasus2avro.mxe",
  "fields": [
    {
      "type": [
        "null",
        {
          "type": "record",
          "name": "KafkaAuditHeader",
          "namespace": "com.linkedin.events",
          "fields": [
            {
              "compliance": [
                {
                  "policy": "EVENT_TIME"
                }
              ],
              "type": "long",
              "name": "time",
              "doc": "The time at which the event was emitted into kafka."
            },
            {
              "compliance": "NONE",
              "type": "string",
              "name": "server",
              "doc": "The fully qualified name of the host from which the event is being emitted."
            },
            {
              "compliance": "NONE",
              "type": [
                "null",
                "string"
              ],
              "name": "instance",
              "default": null,
              "doc": "The instance on the server from which the event is being emitted. e.g. i001"
            },
            {
              "compliance": "NONE",
              "type": "string",
              "name": "appName",
              "doc": "The name of the application from which the event is being emitted. see go/appname"
            },
            {
              "compliance": "NONE",
              "type": {
                "type": "fixed",
                "name": "UUID",
                "namespace": "com.linkedin.events",
                "size": 16
              },
              "name": "messageId",
              "doc": "A unique identifier for the message"
            },
            {
              "compliance": "NONE",
              "type": [
                "null",
                "int"
              ],
              "name": "auditVersion",
              "default": null,
              "doc": "The version that is being used for auditing. In version 0, the audit trail buckets events into 10 minute audit windows based on the EventHeader timestamp. In version 1, the audit trail buckets events as follows: if the schema has an outer KafkaAuditHeader, use the outer audit header timestamp for bucketing; else if the EventHeader has an inner KafkaAuditHeader use that inner audit header's timestamp for bucketing"
            },
            {
              "compliance": "NONE",
              "type": [
                "null",
                "string"
              ],
              "name": "fabricUrn",
              "default": null,
              "doc": "The fabricUrn of the host from which the event is being emitted. Fabric Urn in the format of urn:li:fabric:{fabric_name}. See go/fabric."
            },
            {
              "compliance": "NONE",
              "type": [
                "null",
                "string"
              ],
              "name": "clusterConnectionString",
              "default": null,
              "doc": "This is a String that the client uses to establish some kind of connection with the Kafka cluster. The exact format of it depends on specific versions of clients and brokers. This information could potentially identify the fabric and cluster with which the client is producing to or consuming from."
            }
          ],
          "doc": "This header records information about the context of an event as it is emitted into kafka and is intended to be used by the kafka audit application.  For more information see go/kafkaauditheader"
        }
      ],
      "name": "auditHeader",
      "default": null,
      "doc": "Kafka audit header. Currently remains unused in the open source."
    },
    {
      "type": "string",
      "name": "entityType",
      "doc": "Type of the entity being written to"
    },
    {
      "java": {
        "class": "com.linkedin.pegasus2avro.common.urn.Urn"
      },
      "type": [
        "null",
        "string"
      ],
      "name": "entityUrn",
      "default": null,
      "doc": "Urn of the entity being written"
    },
    {
      "type": [
        "null",
        {
          "type": "record",
          "name": "GenericAspect",
          "namespace": "com.linkedin.pegasus2avro.mxe",
          "fields": [
            {
              "type": "bytes",
              "name": "value",
              "doc": "The value of the aspect, serialized as bytes."
            },
            {
              "type": "string",
              "name": "contentType",
              "doc": "The content type, which represents the fashion in which the aspect was serialized.\nThe only type currently supported is application/json."
            }
          ],
          "doc": "Generic record structure for serializing an Aspect"
        }
      ],
      "name": "entityKeyAspect",
      "default": null,
      "doc": "Key aspect of the entity being written"
    },
    {
      "type": {
        "type": "enum",
        "symbolDocs": {
          "CREATE": "NOT SUPPORTED YET\ninsert if not exists. otherwise fail",
          "DELETE": "NOT SUPPORTED YET\ndelete action",
          "PATCH": "NOT SUPPORTED YET\npatch the changes instead of full replace",
          "UPDATE": "NOT SUPPORTED YET\nupdate if exists. otherwise fail",
          "UPSERT": "insert if not exists. otherwise update"
        },
        "name": "ChangeType",
        "namespace": "com.linkedin.pegasus2avro.events.metadata",
        "symbols": [
          "UPSERT",
          "CREATE",
          "UPDATE",
          "DELETE",
          "PATCH"
        ],
        "doc": "Descriptor for a change action"
      },
      "name": "changeType",
      "doc": "Type of change being proposed"
    },
    {
      "type": [
        "null",
        "string"
      ],
      "name": "aspectName",
      "default": null,
      "doc": "Aspect of the entity being written to\nNot filling this out implies that the writer wants to affect the entire entity\nNote: This is only valid for CREATE, UPSERT, and DELETE operations."
    },
    {
      "type": [
        "null",
        "com.linkedin.pegasus2avro.mxe.GenericAspect"
      ],
      "name": "aspect",
      "default": null,
      "doc": "The value of the new aspect."
    },
    {
      "type": [
        "null",
        {
          "type": "record",
          "name": "SystemMetadata",
          "namespace": "com.linkedin.pegasus2avro.mxe",
          "fields": [
            {
              "type": [
                "long",
                "null"
              ],
              "name": "lastObserved",
              "default": 0,
              "doc": "The timestamp the metadata was observed at"
            },
            {
              "type": [
                "string",
                "null"
              ],
              "name": "runId",
              "default": "no-run-id-provided",
              "doc": "The run id that produced the metadata. Populated in case of batch-ingestion."
            },
            {
              "type": [
                "null",
                "string"
              ],
              "name": "registryName",
              "default": null,
              "doc": "The model registry name that was used to process this event"
            },
            {
              "type": [
                "null",
                "string"
              ],
              "name": "registryVersion",
              "default": null,
              "doc": "The model registry version that was used to process this event"
            }
b
But the container for actions does indeed need a re-release
With the higher version
n
I was guessing the same, but I couldn't find where datahub-actions depends on the metadata schema version - https://github.com/acryldata/datahub-actions/blob/main/build.gradle Thanks for the quick response. We are working on upgrading the version.
b
In the dockerfile
I believe we install some version of acryl-datahub
n
So with every datahub version will the actions docker image have to be released as well?
b
no
this is a one time fix
👍 1
At least, that's the intention. The library should be tolerant of backwards-compatible changes
New enum symbols are an interesting example
Which may be a bit more challenging than other types
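To illustrate why a new enum symbol is harder than, say, a new optional field, here is a minimal sketch in plain Python. The symbol list is copied from the `ChangeType` enum in the schema printed in the error log, and `RESTATE` is the `changeType` of the failing event. The helper names (`is_valid_enum_symbol`, `read_enum_tolerantly`) and the fallback behaviour are hypothetical, for illustration only; this is not how datahub-actions or the avro library implements validation.

```python
from typing import Sequence

# Symbols of the ChangeType enum as listed in the schema from the error log
# (the older schema bundled with the installed SDK).
OLD_CHANGE_TYPE_SYMBOLS = ("UPSERT", "CREATE", "UPDATE", "DELETE", "PATCH")


def is_valid_enum_symbol(value: str, symbols: Sequence[str]) -> bool:
    """Strict check: a value is valid only if the reader's schema lists it."""
    return value in symbols


def read_enum_tolerantly(value: str, symbols: Sequence[str], fallback: str) -> str:
    """Hypothetical tolerant reader: map unknown symbols to a fallback."""
    return value if value in symbols else fallback


if __name__ == "__main__":
    # Known to the old schema, so it validates.
    print(is_valid_enum_symbol("UPSERT", OLD_CHANGE_TYPE_SYMBOLS))
    # Emitted by the newer server but unknown to the old schema, so it is rejected.
    print(is_valid_enum_symbol("RESTATE", OLD_CHANGE_TYPE_SYMBOLS))
```

A missing optional field can be filled from its default, but an unknown enum symbol has no obvious stand-in unless the reader is given one, which is why the consumer has to know about the new symbol (or have an explicit fallback) before the producer starts emitting it.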
n
Thanks for clarifying!
@big-carpet-38439 Updated the datahub-gms and datahub-frontend releases to 0.8.38. Still seeing the same issue in the actions container, even after restarting it... Any idea what should be done to fix this?
b
This may not have been clear - but the actions lib itself depends on acryl-datahub
the python sdk
and that sdk needs to be 0.8.38
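A quick way to confirm which SDK version is actually installed inside the actions container is to ask Python's package metadata. This is a rough sketch, assuming it is run with the same interpreter the actions process uses; the `parse_version` and `sdk_is_new_enough` helpers are made up for illustration, and only the `acryl-datahub` package name and the 0.8.38 threshold come from this thread.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum SDK version mentioned in this thread.
REQUIRED = (0, 8, 38)


def parse_version(text: str) -> tuple:
    """Parse the leading numeric dotted components, e.g. '0.8.38.1' -> (0, 8, 38, 1).

    Parsing stops at the first non-numeric component, so a pre-release
    such as '0.8.38rc1' becomes (0, 8) and is treated as older.
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def sdk_is_new_enough(installed: str, required: tuple = REQUIRED) -> bool:
    """Return True if the installed version is at least the required one."""
    return parse_version(installed) >= required


if __name__ == "__main__":
    try:
        installed = version("acryl-datahub")
    except PackageNotFoundError:
        print("acryl-datahub is not installed in this environment")
    else:
        status = "OK" if sdk_is_new_enough(installed) else "too old"
        print(f"acryl-datahub {installed}: {status}")
```

Upgrading datahub-gms and datahub-frontend does not change this value, since the SDK is baked into the actions image.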
n
I can't find a reference to the Python SDK in the actions Dockerfile - https://github.com/acryldata/datahub-actions/blob/main/docker/datahub-actions/Dockerfile
@big-carpet-38439 Is this a change that I can make through the Dockerfile, or should I wait for the release to have this fixed?
b
Ah good catch!
And then re-releasing
n
@big-carpet-38439 https://github.com/acryldata/datahub-actions/pull/19 I hope this automatically triggers a build and pushes an image, right?