# troubleshoot

    full-computer-98125

    06/15/2023, 4:34 PM
Hello, I'm trying to troubleshoot an issue where the MAE consumer is failing to process messages and commit offsets:
```
2023-06-15 15:02:33,033 [kafka-coordinator-heartbeat-thread | generic-mae-consumer-job-client] INFO  o.a.k.c.c.i.AbstractCoordinator:979 - [Consumer clientId=consumer-generic-mae-consumer-job-client-5, groupId=generic-mae-consumer-job-client] Member consumer-generic-mae-consumer-job-client-5-b17bdbb2-720c-4813-9e33-6ad46574892c sending LeaveGroup request to coordinator "coordinator" (id: 2147483646 rack: null) due to consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records
```
Reading the docs, I see it may be possible to set this config via Spring Boot. I tried adding
```yaml
- name: SPRING_KAFKA_PROPERTIES_CONSUMER_MAX_POLL_RECORDS
  value: "10"
```
    as the env var to configure it, but I receive:
```
2023-06-15 16:21:15,835 [main] WARN  o.a.k.c.consumer.ConsumerConfig:355 - The configuration 'consumer.max.poll.records' was supplied but isn't a known config.
```
Does anyone here know the proper way to adjust that config value? Spring reference here
    plus1 1
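A hedged reading of that warning: `spring.kafka.properties.*` is a pass-through map, so the env var above produced the literal Kafka key `consumer.max.poll.records`, which the consumer rejects as unknown. Dropping the `CONSUMER` segment should yield the real `max.poll.records` key. A minimal sketch:

```yaml
# Hedged sketch: spring.kafka.properties.* keys are forwarded verbatim to the
# Kafka clients, so this becomes max.poll.records on the consumer.
- name: SPRING_KAFKA_PROPERTIES_MAX_POLL_RECORDS
  value: "10"
# The other knob the log message suggests: raise the allowed gap between polls.
- name: SPRING_KAFKA_PROPERTIES_MAX_POLL_INTERVAL_MS
  value: "600000"
```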

    adorable-lawyer-88494

    06/16/2023, 6:22 AM
Hey all, I was upgrading the DataHub project from Java 11 to Java 17 and I am stuck with
```
The :li-utils:compileMainGeneratedDataTemplateJava task failed.
```
I was thinking it is coming from Pegasus, so can anyone please tell me whether the latest Pegasus version supports Java 17? If yes, which version?
    ✅ 1

    melodic-lighter-39433

    06/16/2023, 10:04 AM
Hello guys! I am new to DataHub, and when I run the `python -m datahub docker quickstart --quickstart-compose-file /root/.docker/quickstart/docker-compose.quickstart.yml` command, it always shows this error. I googled and tried to solve it, but none of that worked. I've pasted a snapshot.
    ✅ 1

    lemon-yacht-62789

    06/16/2023, 10:06 AM
Hi all, I am running DataHub `v0.10.1` and am having some difficulties setting up a Looker ingestion source. Our ingestion has started failing and I assumed this might be down to an outdated config, so I have tried setting up a new connection from scratch via the UI. When entering the base URL, client id and secret I am able to validate the connection OK - all ticks are returned green. However, when actually triggering the pipeline, the following error appears in the log, which seems to indicate it's the API version at issue:
```
PipelineInitError: Failed to configure the source (looker): Failed to connect/authenticate with looker - check your configuration: b'{"message":"API 3.x requests are prohibited. Request: POST /api/3.1/login","documentation_url":"https://cloud.google.com/looker/docs/"}'
```
DataHub `v0.10.1` release notes indicate support for the v4 Looker API, so I'm wondering if it's perhaps the credentials 🤔 As in, these were originally generated for a v3 Looker connection, so my theory is I need to generate new credentials for the v4 API. I do not have admin access to Looker in our organisation, so I am unable to test this theory yet. I am curious if anyone has had similar issues.
    ✅ 1
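For reference, a minimal sketch of the shape of a `looker` source recipe (placeholder values; whether v3-era credentials carry over to the v4 API is exactly the open question in this thread):

```yaml
source:
  type: looker
  config:
    base_url: "https://<instance>.cloud.looker.com"   # placeholder instance URL
    client_id: "${LOOKER_CLIENT_ID}"                  # API credentials from Looker's admin panel
    client_secret: "${LOOKER_CLIENT_SECRET}"
```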

    freezing-oxygen-20989

    06/16/2023, 10:15 AM
hi guys, we're trying to set up DataHub on AWS managed services, including AWS managed Elasticsearch. In the process of setup, we have a successful `datahub-elasticsearch-setup-job`, but then `datahub-system-update-job` fails with the following error message:
```
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [HOSTNAME], URI [/datahubpolicyindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 404 Not Found]
```
DataHub version: `v0.10.4`
OpenSearch version: OpenSearch 2.5
Has anyone come across similar issues before? Thanks
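One hedged thing to check with AWS-managed OpenSearch: the elasticsearch-setup image has an AWS mode flag, and without it the follow-up jobs can 404 on indices that were never created. A sketch against the helm values (key names assume the standard datahub chart):

```yaml
elasticsearchSetupJob:
  extraEnvs:
    - name: USE_AWS_ELASTICSEARCH   # switches the setup scripts to OpenSearch-compatible calls
      value: "true"
```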

    swift-dream-78272

    06/16/2023, 12:14 PM
Hey all! Is there a possibility to fetch all dataset URNs? I'm trying to do it with the GraphQL API and it works for small subsets, but if I try to fetch a bigger subset it throws a `503` error code. My API query looks like below; I tried to paginate using the `start` and `count` parameters, but at some point it also throws a `503`. Ideally I'd want to not pass a query parameter, but even if I narrow it down to the snowflake platform, I cannot get all dataset URNs.
```graphql
    {
      searchAcrossEntities(
        input: {types: DATASET, query: "snowflake", start: 0, count: 5000}
      ) {
        start
        count
        total
        searchResults {
          entity {
            urn
          }
        }
      }
}
```
    API error response:
```json
    {
      "servlet": "apiServlet",
      "message": "Service Unavailable",
      "url": "/api/graphql",
      "status": "503"
}
```
DataHub version: `0.9.6.1`
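A hedged workaround sketch: deep `start`/`count` pagination through `searchAcrossEntities` is exactly where Elasticsearch-backed search tends to fall over, while newer `acryl-datahub` releases expose `get_urns_by_filter` on the Python client, which pages with the scroll API under the hood (assumes a reachable GMS; 0.9.6.1 may predate this method):

```python
from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

# Connect to GMS; pass a token here if metadata-service authentication is enabled.
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

# Lazily iterates over matching URNs instead of requesting one huge page.
for urn in graph.get_urns_by_filter(entity_types=["dataset"], platform="snowflake"):
    print(urn)
```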

    handsome-football-66174

    06/16/2023, 7:39 PM
Team - facing the following issue in prod (GMS is unable to come up). Using k8s deployment via Helm charts (schema registry is up and running). Version 10.1:
```
    org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
    Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching <schema-registry URL> found.
            at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[na:na]
            at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:353) ~[na:na]
            at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:296) ~[na:na]
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:291) ~[na:na]
```
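A hedged way to confirm the diagnosis from outside the pod: dump the certificate the schema registry actually serves and check whether the hostname GMS connects to appears in its SAN list (host and port below are placeholders):

```bash
openssl s_client -connect <schema-registry-host>:8081 \
    -servername <schema-registry-host> </dev/null 2>/dev/null \
  | openssl x509 -noout -text \
  | grep -A1 "Subject Alternative Name"
```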

    flat-table-17463

    06/17/2023, 5:34 PM
Hello everyone. I'm trying `get_urns_by_filter`, but I'm confused by the results when using the `query` parameter like below:
```python
for tag in graph.get_urns_by_filter(entity_types=["dataset"], query="reserved"):
    print(tag)
```
outputs:
```
urn:li:dataset:(urn:li:dataPlatform:postgres,customerservice.public.customer_reserved,PROD)
urn:li:dataset:(urn:li:dataPlatform:postgres,accountservice.public.account_blocked,PROD)
urn:li:dataset:(urn:li:dataPlatform:postgres,accountservice.public.account_blocked_transaction,PROD)
```
Why does the result contain names with "blocked"?
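A hedged note on why this can happen: the `query` argument is a full-text search across many indexed fields (descriptions, column names, and so on), not an exact match on the dataset name, so a hit on any field brings the URN back. If the goal is name matching only, one sketch is to filter client-side:

```python
from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

# Keep only URNs whose name actually contains the token; the server-side query
# may have matched other fields such as descriptions or column names.
for urn in graph.get_urns_by_filter(entity_types=["dataset"], query="reserved"):
    if "reserved" in urn:
        print(urn)
```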

    average-nail-72662

    06/17/2023, 10:11 PM
Hi guys, I'm getting an error when running ownership ingestion.
    ✅ 1
  • b

    better-sunset-65466

    06/19/2023, 9:04 AM
Hello, I am trying to activate data profiling for some tables in BigQuery. However, after activating the feature during the ingestion step, the Stats/Validation tabs are greyed out. Is there anything I am missing?
    ✅ 1

    jolly-tent-78213

    06/19/2023, 12:25 PM
Hello @here, I'm facing an issue while deploying DataHub on Kubernetes using Helm chart version 0.2.145 with Fleet. I followed the instructions here. In my case, all the dependencies (prerequisite services) have already been deployed. Right now I'm facing two main issues, both related to Elasticsearch: • The first is that the elastic setup job didn't manage to create the indexes needed by the GMS backend, for example `datahubpolicyindex_v2` (maybe it isn't the role of the job to create indexes), so the indexes listed here haven't been created in my ES. See the attached image as well. • The second is that the GMS pod is failing at querying ES: queries against the `datahub_usage_event` index fail, preventing the pod from entering the READY state. I have attached the logs of the GMS pod as well. I would be happy to have some help with this issue.
    logs-190623-4.txt

    incalculable-portugal-45517

    06/19/2023, 11:59 PM
A separate issue (which we were trying to resolve by upgrading, as mentioned ^): CLI ingestion for Tableau and Superset in our running instance was completing successfully but returning `"events_produced": "0"` with no assets ingested, using version 0.9.3.

    nutritious-salesclerk-57675

    06/20/2023, 5:25 AM
Good day. I am trying to run my BigQuery metadata ingestion pipeline on Cloud Composer in a k8s pod. The pipeline produces events and extracts all data, but towards the end I get the following error:
```
[2023-06-20, 04:03:24 UTC] {pod_manager.py:197} INFO - ERROR:root:('Unable to get metadata from DataHub', {'message': '401 Client Error: Unauthorized for url: https://<url-to-gms>/aspects?action=getTimeseriesAspectValues'})
```
Is this related to permissions? Can someone help me understand the cause of this error? PS: I have token-based authentication enabled.
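A hedged sketch of the usual fix shape: with token auth enabled, every client that talks to GMS needs a personal access token, including the recipe's sink (and any reporting callback the CLI uses). Placeholder values below:

```yaml
sink:
  type: datahub-rest
  config:
    server: "https://<url-to-gms>"
    token: "${DATAHUB_GMS_TOKEN}"   # personal access token generated in the UI
```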

    enough-football-92033

    06/20/2023, 11:09 AM
Hello Team! During the `datahub-ingestion-base` image build I started to get the following error: `Package 'openjdk-11-jre-headless' has no installation candidate`. Details:
```
#8 4.213 E: Package 'openjdk-11-jre-headless' has no installation candidate
#8 ERROR: executor failed running [/bin/sh -c apt-get update && apt-get install -y     && apt-get install -y -qq     make     python3-ldap     libldap2-dev     libsasl2-dev     libsasl2-modules     libaio1     libsasl2-modules-gssapi-mit     krb5-user     wget     zip     unzip     ldap-utils     openjdk-11-jre-headless     && python -m pip install --upgrade pip wheel setuptools==57.5.0     && python -m pip install --upgrade awscli     && curl -Lk -o /root/librdkafka-${LIBRDKAFKA_VERSION}.tar.gz https://github.com/edenhill/librdkafka/archive/v${LIBRDKAFKA_VERSION}.tar.gz     &&  tar -xzf /root/librdkafka-${LIBRDKAFKA_VERSION}.tar.gz -C /root     &&  cd /root/librdkafka-${LIBRDKAFKA_VERSION}     &&  ./configure --prefix /usr && make && make install && make clean && ./configure --clean     && apt-get remove -y make]: exit code: 100
```
There were no changes from my side in the code base. Can anyone help me resolve it?
    ✅ 1
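A hedged explanation and sketch: Debian bookworm dropped `openjdk-11-jre-headless`, so a floating base image that recently moved from bullseye to bookworm would break exactly like this with no code change on your side. Two possible fixes (base image name below is an assumption for illustration):

```dockerfile
# Option 1: pin the base image to a Debian release that still ships OpenJDK 11.
FROM python:3.10-slim-bullseye

# Option 2: stay on the newer base and install the JRE it does ship.
# RUN apt-get update && apt-get install -y openjdk-17-jre-headless
```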

    dazzling-airport-31275

    06/20/2023, 11:35 AM
Hey all, we are facing an issue that has already been addressed previously on this channel, but none of the proposed solutions worked for us. We have deployed a fresh and clean DataHub installation (using helm chart version 0.2.175, but customizing the following versions - ES: 7.17.3 and 7.10.2 (we are using the latter right now), Kafka: Confluent Cloud, PostgreSQL: 11.19, DataHub version: 0.10.3). But the datahub user does not have any permissions; we also tried to create additional users, but none of them has any kind of permission either. Do you have any suggestions on how to solve this issue?
    ✅ 1
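A hedged re-sync step that has resolved similar everyone-has-no-permissions reports: policies live in SQL but are evaluated from the search index, so rebuilding the index can bring them back. The helm chart ships a restore-indices job template (exact name depends on your release name):

```bash
# Kick off a one-shot restore-indices run from the chart's cronjob template.
kubectl create job --from=cronjob/<release>-datahub-restore-indices-job-template \
    datahub-restore-indices-manual
```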

    salmon-exabyte-77928

    06/20/2023, 12:35 PM
Hello all. An issue with logout with SSO (Keycloak): DataHub version 0.10.4 (helm), Keycloak version 21.1.1. I used a sample config similar to https://www.syscrest.com/2022/11/datahub-oidc-identity-group-managment-with-keycloak/. All is OK and the IdP authorizes users, but if I press the "Sign Out" button in the UI it just redirects to the /login page. If I then click "Sign in with SSO", I don't need to enter a password in Keycloak and log straight back into DataHub. Any ideas on how users can log out?

    adorable-airline-30358

    06/20/2023, 12:45 PM
Hello team, I performed a hard-delete operation on metadata from Looker, but I can still see Looker-related metadata in the recently edited tabs. Clicking on one of those entries says `Sorry, we are unable to find this entity in DataHub`, which is expected.
    ✅ 1

    better-sunset-65466

    06/20/2023, 1:47 PM
Hello! I am trying to add a transformer to add a domain to all the BigQuery tables from a defined source. However, when adding:
```yaml
transformers:
    type: simple_add_dataset_domain
    config:
        replace_existing: true
        domains:
            - 'urn:li:domain:data_observatory'
```
    I keep on getting this error:
```
    ~~~~ Execution Summary - RUN_INGEST ~~~~
    Execution finished with errors.
    {'exec_id': 'edebb278-baf5-4497-aac6-73d520af6af9',
     'infos': ['2023-06-20 13:44:44.482415 INFO: Starting execution for task with name=RUN_INGEST',
               "2023-06-20 13:44:48.557174 INFO: Failed to execute 'datahub ingest'",
               '2023-06-20 13:44:48.557409 INFO: Caught exception EXECUTING task_id=edebb278-baf5-4497-aac6-73d520af6af9, name=RUN_INGEST, '
               'stacktrace=Traceback (most recent call last):\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 122, in execute_task\n'
               '    task_event_loop.run_until_complete(task_future)\n'
               '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete\n'
               '    return future.result()\n'
               '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 231, in execute\n'
               '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
               "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
     'errors': []}
    
    ~~~~ Ingestion Logs ~~~~
    Obtaining venv creation lock...
    Acquired venv creation lock
    venv setup time = 0
    This version of datahub supports report-to functionality
    datahub  ingest run -c /tmp/datahub/ingest/edebb278-baf5-4497-aac6-73d520af6af9/recipe.yml --report-to /tmp/datahub/ingest/edebb278-baf5-4497-aac6-73d520af6af9/ingestion_report.json
    [2023-06-20 13:44:46,739] INFO     {datahub.cli.ingest_cli:173} - DataHub CLI version: 0.10.4
    1 validation error for PipelineConfig
    transformers
  value is not a valid list (type=type_error.list)
```
    ✅ 1
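The validation error at the end is the whole story here: `PipelineConfig` expects `transformers` to be a YAML list, and the snippet above defines a single mapping. A minimal corrected sketch (note the leading dash):

```yaml
transformers:
  - type: simple_add_dataset_domain
    config:
      replace_existing: true
      domains:
        - 'urn:li:domain:data_observatory'
```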

    purple-forest-88570

    06/21/2023, 4:02 AM
    Hello everyone, I am concerned about the performance for concurrent users. I am conducting a performance test with 100 concurrent users on the "query search" API.
    Test 1
I sent 100 search requests to GMS using JMeter.
    It took 4 seconds.
    When monitoring the thread pool of ElasticSearch, max 2 active threads were observed.
    Test 2
I sent 100 search requests to Elasticsearch using JMeter; the search request was the same one sent by GMS in Test 1.
    It took 0.2 seconds.
    When monitoring the thread pool of ElasticSearch, max 15 active threads were observed.
    Based on these results, it seems that GMS is only sending search requests to ElasticSearch in batches of 2. I also checked the connection between GMS and ElasticSearch using tcpdump and netstat, and found that they are only connected through 2 ports. Could you please provide any advice or suggestions regarding this issue? Thank you.
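If the bottleneck really is the GMS-side Elasticsearch client, a hedged knob to experiment with: recent GMS images read an I/O thread-count setting for the ES REST client. Treat the variable name below as an assumption to verify against your image's application.yml before relying on it:

```yaml
# Hypothetical GMS env override (helm values style); verify the exact variable
# name in your GMS configuration before use.
datahub-gms:
  extraEnvs:
    - name: ELASTICSEARCH_THREAD_COUNT
      value: "10"
```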

    better-gigabyte-38217

    06/21/2023, 6:29 AM
I am currently installing DataHub (newbie) using this link: https://datahubproject.io/docs/developers/. However, after the installation, it seems that the default 'datahub' user doesn't have the necessary permissions to perform ingestion or glossary creation. Below are my docker-compose file and my DataHub image. I would appreciate your support.

    acoustic-quill-54426

    06/21/2023, 10:12 AM
Howdy! We enabled retention but it seems it's not doing anything. I can see the `dataHubRetentionConfig` and `dataHubRetentionKey` aspects correctly created in the DB, but after restarting the GMS containers we still have thousands of aspects that should have been deleted.
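A hedged note on semantics plus a policy sketch: DataHub retention is applied lazily when new versions of an aspect are written (plus an optional apply-on-bootstrap pass), so pre-existing rows won't disappear just because GMS restarted. Retention policy files follow this shape:

```yaml
# Keep at most 20 versions of every aspect of every entity.
- entity: "*"
  aspect: "*"
  config:
    retention:
      version:
        maxVersions: 20
```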

    colossal-waitress-83487

    06/21/2023, 10:49 AM
When I use the `python3 -m datahub docker quickstart` command, I encounter the following situation. Has anyone else encountered it?
    bandicam 2023-06-21 18-30-46-055.mp4

    cuddly-dinner-641

    06/21/2023, 12:55 PM
There seem to be a couple of scenarios where the search/graph indexes can become out of sync with the SQL database: 1) if the MCL publish fails, entityService seems to increment a metric and move on without rolling back the SQL persist; 2) if something fails during the Elasticsearch updates, the mae-consumer increments a metric and skips past the MCL event. Am I overlooking anything there, or should we be monitoring those metrics carefully to ensure we manually re-index any failed aspects?

    adamant-furniture-37835

    06/21/2023, 3:04 PM
Hi, I made an earlier post on the same topic but didn't get answers. Here is a detailed observation of the VIEW functionality and our expectations. Please clarify if our expectations are invalid or whether we need some extra config that we missed.

I created a public VIEW with two filters, i.e. 1. Platform is any of Vertica, Tableau; 2. Tag is any of MY_TAG, and selected "show results that match all filters". I made this view my default view. Then I log out, log in, and land at the home page.

Homepage behavior: At the homepage, I see all the platforms, datasets, and respective dataset counts. Results on the homepage aren't filtered according to the default view. Under platforms, I can navigate inside Vertica and Tableau, but for every other platform this message is flashed on screen: "No results found for """. Expectation: only Vertica & Tableau should be visible, in accordance with the default view. When I intercept the graphql API call from the browser, I see that filters and orFilters are sent empty.

Datasets navigation behavior: When I navigate inside datasets, I am able to navigate to all the datasets and their entity details pages, even though the view is selected in the top header dropdown. Expectation: I shouldn't see datasets that don't pass the filters defined in the view, i.e. I should only see datasets belonging to Vertica or Tableau and tagged with MY_TAG. When I navigate to datasets -> {ENV} -> {PLATFORM}, I see that the filter attribute sent to the graphql query is empty.

Besides this, I don't see any errors in the log files. Please let me know if you need further details. DataHub version is 0.10.3 and it's running on a Kubernetes cluster. Big thanks!
    plus1 2

    shy-dog-84302

    06/21/2023, 3:10 PM
Hi! I have recently upgraded from 0.10.1 to 0.10.4. With the upgrade, I am missing ownership-types info in the frontend UI. More about the issue in 🧵
    ✅ 1

    important-minister-98629

    06/21/2023, 6:37 PM
Hi everyone, dummy question: does anyone know how to actually change the environment of the assets we ingest?
    ✅ 1
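A hedged sketch: the environment (fabric) is set at ingestion time via the recipe's `env` field and is baked into each URN, so changing it means re-ingesting under the new environment (and deleting the old URNs) rather than editing in place. Example with a hypothetical source:

```yaml
source:
  type: postgres              # hypothetical example source; most connectors accept env
  config:
    host_port: "db:5432"      # placeholder connection details
    env: DEV                  # fabric segment of the emitted URNs (default: PROD)
```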

    worried-solstice-95319

    06/21/2023, 8:31 PM
Hi! I was wondering if anyone would be able to help me troubleshoot why I'm unable to declare Custom Ownership Types. I have full admin access but the option is unavailable to me.
    ✅ 1

    quaint-belgium-35390

    06/22/2023, 2:48 AM
hi everyone, I am trying to integrate Airflow (with Celery workers), Great Expectations, and DataHub, so that we can see the `Validation` tab in DataHub, but there are some errors that make the DataHubValidationAction fail. Errors:
```
Sql parser failed on {query} with daemonic processes are not allowed to have children
```
These are my requirements:
```
acryl-datahub[great-expectations]==0.10.3.2
acryl-datahub-airflow-plugin==0.10.3.2
great-expectations==0.15.41
airflow-provider-great-expectations==0.2.6
```
This is my action list:
```python
"action_list": [
    {
        "name": "datahub_action",
        "action": {
            "module_name": "datahub.integrations.great_expectations.action",
            "class_name": "DataHubValidationAction",
            "server_url": "http://host_IP:9002/",
            "parse_table_names_from_sql": True,
            "retry_max_times": 1,
            "graceful_exceptions": False,
            "env": "STG",
        },
    },
]
```
If any of you have encountered this issue, please help me solve it. Thank you!
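A hedged workaround, assuming the failure comes from the SQL-lineage parser forking a helper process, which daemonic Celery workers forbid: turn off SQL parsing in the action (at the cost of SQL-derived lineage):

```python
"action_list": [
    {
        "name": "datahub_action",
        "action": {
            "module_name": "datahub.integrations.great_expectations.action",
            "class_name": "DataHubValidationAction",
            "server_url": "http://host_IP:8080/",  # assumption: GMS endpoint; 9002 is usually the frontend
            "parse_table_names_from_sql": False,   # avoid spawning child processes under Celery
            "retry_max_times": 1,
            "graceful_exceptions": False,
            "env": "STG",
        },
    },
]
```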

    helpful-student-10263

    06/22/2023, 7:16 AM
Hello, I want to hide the Jetty version in responses (DataHub v0.10.0 in EKS). I found that in the datahub-gms pod, /datahub/datahub-gms/scripts/jetty.xml needs the following change:
```xml
as is)
...
    <New id="httpConfig" class="org.eclipse.jetty.server.HttpConfiguration">
      <Set name="requestHeaderSize"><Property name="jetty.httpConfig.requestHeaderSize" deprecated="jetty.request.header.size" default="16384" /></Set>
    </New>
...

to be)
...
    <New id="httpConfig" class="org.eclipse.jetty.server.HttpConfiguration">
      <Set name="requestHeaderSize"><Property name="jetty.httpConfig.requestHeaderSize" deprecated="jetty.request.header.size" default="16384" /></Set>
      <Set name="sendDateHeader"><Property name="jetty.httpConfig.sendDateHeader" deprecated="jetty.send.date.header" default="false" /></Set>
    </New>
...
```
But how do I apply this using the helm chart?
    ✅ 1
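A hedged sketch for the helm side, assuming the standard chart's `extraVolumes`/`extraVolumeMounts` hooks: ship the patched `jetty.xml` in a ConfigMap and mount it over the default path:

```yaml
datahub-gms:
  extraVolumes:
    - name: jetty-config
      configMap:
        name: custom-jetty-xml        # hypothetical ConfigMap holding the edited jetty.xml
  extraVolumeMounts:
    - name: jetty-config
      mountPath: /datahub/datahub-gms/scripts/jetty.xml
      subPath: jetty.xml
```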

    ancient-policeman-73437

    06/22/2023, 8:22 AM
Dear DataHub support, we are developing automatic logic to fill the Business Glossary with descriptions imported from Looker, and we hit an issue using GraphiQL. We assigned one of the Terms to a Looker Explore manually and tried to get its URL via GraphiQL; the system returns null, and the same happens if we query the Term itself. You can find examples in the pictures. Thank you in advance!