# troubleshoot

    best-eve-12546

    03/02/2023, 4:20 PM
Hi! I’ve been working on an OpenAPI integration on v0.9.5. Looking at the raw OpenAPI JSON Schema, I’ve noticed some OneOf schemas which seem to be missing the OneOf:
    "OneOfSchemaMetadataPlatformSchema":
    {
        "required":
        [
            "__type"
        ],
        "type": "object",
        "properties":
        {
            "__type":
            {
                "type": "string"
            }
        },
        "description": "The native schema in the dataset's platform.",
        "discriminator":
        {
            "propertyName": "__type"
        }
    },
It seems like this is missing a OneOf — I see in the Golden Test Data that this is actually a nested structure, with no “__type” field. I see the same for a bunch of other OneOf types like OneOfSchemaFieldDataTypeType. I checked out the YAML schema and it’s the same. Am I supposed to be instantiating these somehow, or should I be sending raw JSON in “__type”, or what?
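If the generated schema behaves like a standard OpenAPI discriminator, then `__type` is what selects the concrete subtype, and the subtype's own fields are inlined next to it rather than nested. A hedged sketch of what a platformSchema value might look like (the subtype name `MySqlDDL` and its `tableSchema` field come from DataHub's PDL models; treat the exact shape as an assumption to verify against the golden test data):

```json
{
  "platformSchema": {
    "__type": "MySqlDDL",
    "tableSchema": "CREATE TABLE foo (id BIGINT, name VARCHAR(50))"
  }
}
```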

    nice-match-35259

    03/02/2023, 4:23 PM
Hello all! I am facing some issues adding a validation layer to DataHub with Great Expectations. I added the following in the checkpoint config:
    yaml_config = f"""
    name: {my_checkpoint_name}
    config_version: 1.0
    class_name: SimpleCheckpoint
    run_name_template: "%Y%m%d-%H%M%S-my-run-name-template"
    validations:
      - batch_request:
          datasource_name: bigquery_datasource
          data_connector_name: default_inferred_data_connector_name
          data_asset_name: raw_prod.applicative_database_deposit
          data_connector_query:
              index: -1
        expectation_suite_name: suite_deposit_test2
        action_list:
          - name: datahub_action
            action:
                module_name: datahub.integrations.great_expectations.action
                class_name: DataHubValidationAction
                server_url: {server_url}
                token: {gms_token}
                extra_headers:
                    - Proxy-Authorization: Bearer {iap_token}
    """
The checkpoint runs correctly, but the metadata is not sent to the DataHub GMS. The error I got is in the attached .png. Has anybody faced the same problem?
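One thing worth double-checking (an assumption, since the attached .png isn't visible here): `extra_headers` on `DataHubValidationAction` is a mapping of header name to value, so the YAML list syntax (`- Proxy-Authorization: ...`) may parse into an unexpected structure. A sketch of the action block with `extra_headers` as a plain mapping:

```yaml
action_list:
  - name: datahub_action
    action:
        module_name: datahub.integrations.great_expectations.action
        class_name: DataHubValidationAction
        server_url: {server_url}
        token: {gms_token}
        extra_headers:
            Proxy-Authorization: Bearer {iap_token}
```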

    nutritious-bird-77396

    03/02/2023, 6:02 PM
    Looking to get some thoughts to unit test this PR - https://github.com/datahub-project/datahub/pull/7476

    cuddly-butcher-39945

    03/02/2023, 10:01 PM
    Anyone out there who could tell me what's wrong with this? I am trying to get a list of all the AD groups I ingested into Datahub after setting up SSO, I want to get a list of Groups in a query, then create a mutation to delete them.
    query ListCorpGroups {
      search(input: { type: corpgroup, query: "*"}) {
        total
        count
        searchResults {
          entity {
            urn
            type
            ... on corpgroup {
              properties {
                name
              }
            }
          }
        }
      }
    }
    Getting the following error:
    {
      "errors": [
        {
          "message": "Validation error (WrongType@[search]) : argument 'input.type' with value 'EnumValue{name='corpgroup'}' is not a valid 'EntityType' - Expected enum literal value not in allowable values -  'EnumValue{name='corpgroup'}'.",
          "locations": [
            {
              "line": 2,
              "column": 10
            }
          ],
          "extensions": {
            "classification": "ValidationError"
          }
        },
        {
          "message": "Validation error (UnknownType@[search/searchResults/entity]) : Unknown type 'corpgroup'",
          "locations": [
            {
              "line": 9,
              "column": 16
            }
          ],
          "extensions": {
            "classification": "ValidationError"
          }
        }
      ],
      "data": null,
      "extensions": {}
    }
Thanks in advance!
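Both validation errors point at casing: `EntityType` enum values are upper snake case and GraphQL type names are CamelCase. A hedged rewrite of the query along those lines (assuming `CORP_GROUP` / `CorpGroup` and a `displayName` property, per the DataHub GraphQL schema):

```graphql
query ListCorpGroups {
  search(input: { type: CORP_GROUP, query: "*" }) {
    total
    count
    searchResults {
      entity {
        urn
        type
        ... on CorpGroup {
          properties {
            displayName
          }
        }
      }
    }
  }
}
```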

    numerous-account-62719

    03/03/2023, 4:33 AM
Hi Team, I am working on dataset-to-dataset lineage. It supports one dataset as input and one as output, but I have 2 inputs and 1 output. How do I handle this? Please help me out here.
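The upstreams of a dataset are modeled as a list in the `upstreamLineage` aspect, so two (or more) inputs feeding one output is a single aspect attached to the output dataset. A minimal stdlib-only sketch that builds such a proposal and posts it to GMS (the platform, dataset names, and GMS URL are placeholders; the DataHub Python SDK's `UpstreamLineageClass` plus `MetadataChangeProposalWrapper` is the more idiomatic route):

```python
import json
from urllib import request


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN: urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# upstreamLineage is a list-valued aspect on the *output* dataset,
# so two inputs are just two entries in the "upstreams" array.
upstream_lineage = {
    "upstreams": [
        {"dataset": dataset_urn("hive", "db.input_a"), "type": "TRANSFORMED"},
        {"dataset": dataset_urn("hive", "db.input_b"), "type": "TRANSFORMED"},
    ]
}

proposal = {
    "entityType": "dataset",
    "entityUrn": dataset_urn("hive", "db.output"),
    "changeType": "UPSERT",
    "aspectName": "upstreamLineage",
    "aspect": {
        "contentType": "application/json",
        "value": json.dumps(upstream_lineage),
    },
}


def emit(gms_url: str = "http://localhost:8080") -> None:
    """POST the proposal to the GMS ingestProposal endpoint (URL is a placeholder)."""
    req = request.Request(
        f"{gms_url}/aspects?action=ingestProposal",
        data=json.dumps({"proposal": proposal}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-RestLi-Protocol-Version": "2.0.0",
        },
    )
    request.urlopen(req)
```

Note that UPSERT replaces the whole aspect, so include every upstream each time you emit.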

    microscopic-room-90690

    03/03/2023, 8:49 AM
Hello team, I am trying to use classification for Snowflake and get this error:
Failed to classify table columns
cli_version and gms_version are both '0.9.6.1'. It seems everything is OK except classification. What should I do? Any help will be appreciated. Thank you!

    shy-dog-84302

    03/03/2023, 2:02 PM
Hi! I have noticed the following exception in (🧵) DataHub Metadata Service. A little digging reveals a possible bug here with seeking to negative offsets when the current offset on a partition is 0. I have configured all my backend Kafka topics with 3 partitions. Has anyone else experienced a similar error?

    best-umbrella-88325

    03/06/2023, 7:41 AM
Hello community! We've been trying to install the latest 0.10.0 version of DataHub and we think there is a bug. Please correct us if we are wrong. We currently have the metadata_service_authentication flag enabled in values.yaml in the Helm installation. We are now moving to 0.10.0 using chart version 0.2.154. When metadata_service_authentication is true, the system-update job fails with CreateConfigError since the 'datahub-auth-secrets' secret doesn't get created. On the other hand, if we set metadata_service_authentication to false, the system-update job passes and the secret is also created successfully. Maybe an issue with the Helm templates which create the secret. We're unsure about the root cause, but this could be a potential problem. Please let me know if this is the desired behavior or if we are missing something. Thanks in advance.

    gifted-diamond-19544

    03/06/2023, 12:13 PM
Hello all. Sorry for crossposting, not sure whether to post here or in #all-things-deployment. We are having trouble running the update container. More details in the link below: https://datahubspace.slack.com/archives/CV2UVAPPG/p1678104384250809

    future-dog-77968

    03/06/2023, 6:17 PM
hey y’all! We ran into a DataHub issue where we can’t select text from a search result unless you very carefully start from before/after the text itself.
• We’re on DataHub 0.10, and this video recording is from the DataHub demo instance.
• Is it fair to say that this is a bug, or is it by design?
◦ If the former, we couldn’t find any GitHub issues and we’re happy to open one (and even take a stab at fixing it!)
    Screen Recording 2023-03-06 at 1.17.16 PM.mov

    handsome-football-66174

    03/06/2023, 9:25 PM
Hi Team, trying to upgrade to the 0.9.3 version of DataHub, but getting the following error for the datahub-datahub-upgrade-job pod (we use a k8s deployment). Any suggestions?
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See <http://www.slf4j.org/codes.html#StaticLoggerBinder> for further details.
    ERROR SpringApplication Application run failed
     org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'upgradeCli': Unsatisfied dependency expressed through field 'noCodeUpgrade'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in class path resource [com/linkedin/gms/factory/entity/EbeanServerFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException

    proud-soccer-58887

    03/07/2023, 6:08 AM
Hi Team, I have completed the integration between DataHub and Apache Ranger and am currently testing it. I have confirmed the platform-level privileges, but I'm not sure how to set policies for metadata privileges in Apache Ranger. Any examples?

    rich-pager-68736

    03/07/2023, 6:39 AM
Hi all, while trying to restore our indices from the DB to a fresh OpenSearch cluster, some messages could not be processed due to:
2023-02-27 15:06:41.598 ERROR 1 --- [ool-10-thread-1] c.l.m.dao.producer.KafkaHealthChecker    : Failed to emit MCL for entity urn:li:dataHubExecutionRequest:Snowflake-2023_02_20-09_43_34

org.apache.kafka.common.errors.RecordTooLargeException: The message is 1633361 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.
I've already increased the allowed message size for the topic (max.message.bytes) and the Kafka cluster (replica.fetch.max.bytes). However, I cannot find any config parameter to adjust the producer's max.request.size, i.e., for datahub-upgrade. Same for the consumer side - how do I increase max.partition.fetch.bytes for the MCL consumer? Any help here?
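GMS, datahub-upgrade, and the standalone consumers are Spring Boot apps, so one hedged option is passing the Kafka client properties through Spring's relaxed env-var binding, the same pattern DataHub documents for SASL settings (`SPRING_KAFKA_PROPERTIES_...`). The exact variable names below are an assumption to verify against your chart version, and they would need to be set on each producing/consuming component, including the datahub-upgrade job:

```yaml
extraEnvs:
  # maps to spring.kafka.properties.max.request.size (producer side);
  # must stay <= the topic's max.message.bytes
  - name: SPRING_KAFKA_PROPERTIES_MAX_REQUEST_SIZE
    value: "5242880"
  # maps to spring.kafka.properties.max.partition.fetch.bytes (consumer side)
  - name: SPRING_KAFKA_PROPERTIES_MAX_PARTITION_FETCH_BYTES
    value: "5242880"
```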

    great-branch-515

    03/07/2023, 7:36 AM
Hi Team, the GMS service is not getting stable; we are getting repetitive warnings in the logs:
    2023-03-07 07:27:35,738 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2775 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
    2023-03-07 07:27:35,738 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2775 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
    2023-03-07 07:27:35,839 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2776 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
    2023-03-07 07:27:35,839 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2776 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
    2023-03-07 07:27:35,940 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2777 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
    2023-03-07 07:27:35,940 [ThreadPoolTaskExecutor-1] WARN  o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2777 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
When we try to log in on the frontend we get the error:
    Failed to perform post authentication steps. Error message: Failed to provision user with urn
which is caused by:
    java.lang.RuntimeException: Failed to provision user with urn urn:li:corpuser:atul.atri@chegg.com.
    Any ideas?

    busy-analyst-35820

    03/07/2023, 10:33 AM
Hi, we use v0.9.2 of DataHub. We set the expiry for the Bearer token to "never", but it still expired. The token is no longer working and it gives Forbidden. Is this "never" option for bearer token expiry valid? Even though we opt for the "never" option under expiry, the next screen shows it as in the screenshot given below. cc: @melodic-match-38516

    elegant-salesmen-99143

    03/07/2023, 11:34 AM
Hi. We can't figure out a working CLI command to get all dataset URNs from within a certain container. There is such functionality in the UI - the Download button in a container gives you a CSV file with info for datasets: URNs and metadata for them (like owners, tags, terms, domains). How do I get the same result using the CLI?
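In lieu of a dedicated CLI command, one hedged option is the GraphQL endpoint the UI itself uses: POST a search filtered on the container to `/api/graphql` with your access token. A sketch (the `container` filter field name and the placeholder URN are assumptions to verify against your version):

```graphql
query DatasetsInContainer {
  searchAcrossEntities(
    input: {
      types: [DATASET]
      query: "*"
      start: 0
      count: 1000
      orFilters: [
        { and: [{ field: "container", values: ["urn:li:container:<container-id>"] }] }
      ]
    }
  ) {
    total
    searchResults {
      entity {
        urn
      }
    }
  }
}
```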

    delightful-sugar-63810

    03/07/2023, 1:47 PM
Hey team 👋🏻 I am not sure if this is valuable feedback, but we observed that DataHub fails to return the impact analysis (transitive downstream consumers) of an entity if that entity has more than around 3K downstreams. The effect becomes more visible with entities with 5-7K downstream dependencies. I know these numbers seem very high, but I think it makes sense when you want to get the downstreams of a very core table that also feeds Looker (Looker has many entity types, such as dashboards). I don't think the infrastructure we serve DataHub on is the bottleneck here, but that is always a possibility.

    hallowed-shampoo-52722

    03/07/2023, 2:53 PM
Hi Team, I have an issue with an ingestion in the QA instance. It's been pending for 3 days. Other environments are working fine! I don't see any issues with the existing pods. Could you please help with how I can debug this?

    microscopic-application-63745

    03/07/2023, 2:56 PM
Hi team, I hope you are all doing great! I am working on DataHub 0.9.5 and I am trying to run an S3 Data Lake custom recipe. According to the documentation I can use the config property verify_ssl, but whenever I add it I get the following error:
    [2023-03-07 15:10:23,049] ERROR    {logger:26} - Please set env variable SPARK_VERSION
    [2023-03-07 15:10:23,543] ERROR    {datahub.ingestion.run.pipeline:127} - 1 validation error for DataLakeSourceConfig
    verify_ssl
      extra fields not permitted (type=value_error.extra)
Please note that without verify_ssl the recipe ingests just fine.
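For reference, the `extra fields not permitted` pydantic error means this `DataLakeSourceConfig` version simply has no `verify_ssl` field even though newer docs list it, so the fix is likely a CLI upgrade rather than a recipe change. A hedged sketch of the recipe once the field is supported (bucket path and region are placeholders):

```yaml
source:
  type: s3
  config:
    path_specs:
      - include: "s3://my-bucket/data/*/*.parquet"
    aws_config:
      aws_region: us-east-1
    # assumption: top-level verify_ssl per the docs; rejected on 0.9.5,
    # so upgrade acryl-datahub before enabling it
    verify_ssl: false
```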

    green-hamburger-3800

    03/07/2023, 4:32 PM
Hello folks! I wanted to create a policy to allow some users to edit everything about one specific entity type. Is that possible? I tried to use the resource part for it, but it wasn't possible to do it for MLMODEL. This was my query:
    mutation CreatePolicy($input: PolicyUpdateInput!) {
      createPolicy(input: $input)
    }
    This was my payload:
    {
      "input": {
        "type": "METADATA",
        "name": "MLP - Service Account",
        "state": "ACTIVE",
        "description": "Test",
        "privileges": [
          "EDIT_ENTITY_TAGS",
          "EDIT_ENTITY_GLOSSARY_TERMS",
          "EDIT_ENTITY_OWNERS",
          "EDIT_ENTITY_DOCS",
          "EDIT_ENTITY_DOC_LINKS",
          "EDIT_ENTITY_STATUS",
          "EDIT_DOMAINS_PRIVILEGE",
          "EDIT_DEPRECATION_PRIVILEGE",
          "EDIT_ENTITY",
          "EDIT_DATASET_COL_DESCRIPTION",
          "EDIT_DATASET_COL_TAGS",
          "EDIT_DATASET_COL_GLOSSARY_TERMS",
          "EDIT_ENTITY_ASSERTIONS",
          "EDIT_LINEAGE",
          "EDIT_ENTITY_EMBED",
          "EDIT_TAG_COLOR"
        ],
        "actors": {
          "users": [
            "urn:li:corpuser:mlp_user"
          ],
          "allUsers": false,
          "allGroups": false,
          "resourceOwners": false
        },
        "resources": {
          "type": "MLMODEL",
          "allResources": true
        }
      }
    }

    important-processor-44077

    03/07/2023, 11:37 PM
@astonishing-answer-96712 added a workflow to this channel: *Community Support Bot*.

    best-wire-59738

    03/08/2023, 11:44 AM
Hi Team, we are facing a Kafka client re-balancing issue. At this point our UI is also frozen, as the consumer is going in a re-balancing loop and not consuming offsets, and the offset lag keeps increasing as ingestion pulls in more info. Upon debugging we found that DataHub uses the MetadataChangeLog_Versioned_v1 topic both for all the changes made to the metadata graph via the UI and when using the Kafka sink for ingestion. For this reason our UI stays frozen until the consumer (generic-mae-consumer-job-client) reads all the partitions from the topic, as the change made in the UI is also somewhere in the queue in the Kafka topic. 1. Can we use a separate topic for all the changes made via the UI, so that our UI is free from the freezing issue? 2. Also, how can we get out of the group re-balancing issue and speed up our ingestion, since Kafka is asynchronous and the MCE consumers are slow in reading the offsets? We have yet to create standalone MCE and MAE consumers. We hope that increases the speed of ingestion, but we have yet to find a solution for the re-balancing issue.
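On the standalone-consumer point: the Helm chart can split MAE/MCE consumption out of GMS, and within one consumer group parallelism is capped by the topic's partition count, so scaling replicas past the partitions of MetadataChangeLog_Versioned_v1 won't help. A hedged values.yaml sketch (the flag names are assumptions to check against your chart version):

```yaml
global:
  datahub_standalone_consumers_enabled: true

datahub-mae-consumer:
  enabled: true
  replicaCount: 2   # at most the partition count of the MCL topic

datahub-mce-consumer:
  enabled: true
  replicaCount: 2
```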

    gentle-camera-33498

    03/08/2023, 12:47 PM
Hello everyone, I'm having problems with my 0.10.0 deployment. Context: before updating the version, I decided to soft-delete datasets, charts and dashboards. With this, I could delete all entities and force reingestion to ingest new ones. Problem: I'm receiving a lot of exception messages from the BulkListener, like the one below:
    [I/O dispatcher 2] ERROR c.l.m.s.e.update.BulkListener:44 - Failed to feed bulk request. Number of events: 5 Took time ms: -1 Message: failure in bulk execution:
    [1]: index [datasetindex_v2_1678278613797], type [_doc], id [urn...], message [[datasetindex_v2_1678278613797/YrBRraPeT6OLr7JvUNdy6A][[datasetindex_v2_1678278613797][0]] ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn...]: document missing]]]
I'm unsure if it is the cause, but I do not see any datasets in the UI. NOTE: yes, I tried to run the restoreIndices job, but nothing changed.

    nice-river-27843

    03/08/2023, 1:47 PM
Hey, I am trying to connect OIDC using Azure and Keycloak. After setting everything up, I am redirected to the Azure login page, which finishes successfully (according to Azure), but when redirecting back to my local frontend it looks like it fails and retries several times, and in the frontend logs I see:
    2023-03-08 13:37:28,465 [application-akka.actor.default-dispatcher-13] ERROR o.p.core.engine.DefaultCallbackLogic - Unable to renew the session. The session store may not support this feature

    acceptable-evening-60358

    03/08/2023, 2:58 PM
    Unable to run quickstart - the following issues were detected: - quickstart.sh or dev.sh is not running If you think something went wrong, please file an issue at https://github.com/datahub-project/datahub/issues or send a message in our Slack https://slack.datahubproject.io/ Be sure to attach the logs from C:\Users\lozza\AppData\Local\Temp\tmpl_pih9ix.log

    acceptable-evening-60358

    03/08/2023, 2:59 PM
Hi all, newbie here. I attempted to deploy my first instance but hit this error; any support would be a great help!

    able-city-76673

    03/09/2023, 6:24 AM
Hello, we have deployed DataHub in Azure Kubernetes Service. We aren't able to configure ingress; we're getting a 404. Is there any document on deploying DataHub on Azure, or help with ingress configuration for Azure Application Gateway?

    agreeable-belgium-70840

    03/09/2023, 9:35 AM
Hello, I am trying to update DataHub from 0.9.5 to 0.10.0. I ran the system upgrade job, and now GMS is giving me this error:
    2023-03-09 09:29:44,122 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 1 Took time ms: -1
    2023-03-09 09:30:23,729 [R2 Nio Event Loop-1-1] WARN c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
    io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
    Caused by: java.net.ConnectException: Connection refused
    at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at java.base/java.lang.Thread.run(Thread.java:829)
    Any ideas?

    freezing-architect-85960

    03/09/2023, 9:58 AM
Hello team, I am trying to emit Airflow data to DataHub using the Kafka-based hook, but the Airflow task reports some errors. It looks like the producer was terminated by the task and did not have enough time to flush messages to Kafka:
%4|1678351593.679|TERMINATE|rdkafka#producer-2| [thrd:app]: Producer terminating with 23 messages (9368 bytes) still in queue or transit: use flush() to wait for outstanding message delivery
Any ideas about this? I didn't find the flush action in the datahub-airflow-plugin emit function.