bitter-waitress-17567
12/26/2022, 8:56 AM
gentle-camera-33498
12/26/2022, 7:32 PM
***************************
APPLICATION FAILED TO START
***************************
Description:
Field kafkaHealthChecker in com.linkedin.gms.factory.kafka.DataHubKafkaEventProducerFactory required a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' that could not be found.
The injection point has the following annotations:
- @javax.inject.Inject()
- @javax.inject.Named(value="noCodeUpgrade")
Action:
Consider defining a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' in your configuration.
millions-hydrogen-95879
12/27/2022, 3:39 AM
datahub version
DataHub CLI version: 0.9.4
datahub docker quickstart --arch m1
I'm getting the following errors:
Unable to run quickstart - the following issues were detected:
- kafka-setup is still running
- datahub-gms is still starting
- zookeeper is not running
powerful-cat-68806
12/27/2022, 8:10 AM
echoing-needle-51090
12/27/2022, 8:19 AM
echoing-needle-51090
12/27/2022, 8:23 AM
elegant-salesmen-99143
12/27/2022, 12:19 PM
late-ability-59580
12/27/2022, 2:56 PM
1. With entities_enabled.sources: 'NO', the sources, which are then considered as datasets, don't get the snowflake symbol.
2. My dbt entities always have their urns in uppercase. Oddly, when ingesting snowflake with convert_urns_to_lowercase: False, the snowflake entities are separate from their dbt counterparts. When not using that flag, they merge together.
3. When ingesting snowflake with lowercase urns, some tables appear twice in the lineage view as the up/downstream of each other (two identical squares - same stats, urns, etc.).
I would appreciate any tip regarding any of these issues.
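For context, the flags mentioned above live in the ingestion recipes; a rough sketch (account, credentials, and paths are placeholders, one recipe per file):

# snowflake recipe (sketch)
source:
  type: snowflake
  config:
    account_id: my_account            # placeholder
    username: '${SNOWFLAKE_USER}'
    password: '${SNOWFLAKE_PASS}'
    convert_urns_to_lowercase: false  # the flag from issue 2

# dbt recipe (sketch)
source:
  type: dbt
  config:
    manifest_path: ./target/manifest.json  # placeholder paths
    catalog_path: ./target/catalog.json
    target_platform: snowflake
    entities_enabled:
      sources: 'NO'                   # the setting from issue 1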
glamorous-wire-83850
12/28/2022, 11:43 AM
Validation error (FieldUndefined@[nonSiblingDatasetFields/privileges]) : Field 'privileges' in type 'Dataset' is undefined (code undefined)
gentle-camera-33498
12/28/2022, 2:33 PM
1. ERROR c.l.m.s.e.query.ESSearchDAO:72 - Search query failed
java.lang.RuntimeException: error while performing request
...
Caused by: java.util.concurrent.TimeoutException: Connection lease request time out
2. ERROR c.l.d.g.r.r.ListRecommendationsResolver:66 - Failed to get recommendations for input com.linkedin.datahub.graphql.generated.ListRecommendationsInput@361dc5ce
java.lang.RuntimeException: error while performing request
...
Caused by: java.util.concurrent.TimeoutException: Connection lease request time out
3. [ThreadPoolTaskExecutor-1] INFO c.l.m.k.t.DataHubUsageEventTransformer:74 - Invalid event type: HomePageViewEvent
140700.296 [ThreadPoolTaskExecutor-1] WARN c.l.m.k.DataHubUsageEventsProcessor:56 - Failed to apply usage events transform to record: {"type":"HomePageViewEvent","actorUrn":"urn:li:corpuser:patrick.braz","timestamp":1672236418541,"date":"Wed Dec 28 2022 11:06:58 GMT-0300 (Horário Padrão de Brasília)","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","browserId":"a449948d-c76e-4545-8bf1-6aac1047583d"}
Could anyone please help me understand why?
powerful-cat-68806
12/28/2022, 3:34 PM
astonishing-animal-7168
12/28/2022, 4:15 PM
rhythmic-lock-29204
12/28/2022, 10:01 PM
source:
type: mssql
config:
password: '${secretPass}'
database: DatabaseName2
host_port: '192.168.1.1:9999'
username: '${secretUser}'
The sink configuration is not the problem. I've tinkered with it extensively, and at one point it was working fine for ingesting data. Unfortunately, after composing DataHub again, we lost this functionality on the current image.
When running UI ingestion, this is the relevant error I see:
[2022-12-28 19:53:58,671] ERROR {datahub.ingestion.run.pipeline:127} - mssql is disabled; try running: pip install 'acryl-datahub[mssql]'
We have installed the plugin as requested by the error message and also followed the steps to set up the ODBC driver, as well as pyodbc just in case that is an issue.
I have attached three files here:
1. The Error Log
2. Output of datahub check plugins --verbose
3. Config showing ODBC driver installed
Any idea what I'm doing wrong, or advice on how to further diagnose the issue?
best-rose-86507
12/29/2022, 10:41 AM
I have set include_table_lineage & include_column_lineage to true in the yaml recipe.
The lineage appears fine in databricks unity catalog as seen in the image attached, but when looking at the lineage for the same table in datahub, the lineage is not visible 😕
Would really appreciate it if someone can direct me on how to fix this.
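For reference, the recipe being described is roughly the following sketch (workspace URL and token are placeholders):

source:
  type: unity-catalog
  config:
    workspace_url: 'https://my-workspace.cloud.databricks.com'  # placeholder
    token: '${DATABRICKS_TOKEN}'
    include_table_lineage: true
    include_column_lineage: true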
glamorous-wire-83850
12/30/2022, 9:28 AM
flaky-camera-29314
12/30/2022, 6:15 PM
python3 -m datahub version
Traceback (most recent call last):
File "/Users/obritto/opt/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/Users/obritto/opt/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/__main__.py", line 1, in <module>
from datahub.entrypoints import main
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/entrypoints.py", line 11, in <module>
from datahub.cli.check_cli import check
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/cli/check_cli.py", line 7, in <module>
from datahub.cli.json_file import check_mce_file
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/cli/json_file.py", line 3, in <module>
from datahub.ingestion.source.file import GenericFileSource
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/ingestion/source/file.py", line 17, in <module>
from datahub.ingestion.api.common import PipelineContext
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/ingestion/api/common.py", line 7, in <module>
from datahub.emitter.mce_builder import set_dataset_urn_to_lower
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/emitter/mce_builder.py", line 13, in <module>
from datahub.configuration.source_common import DEFAULT_ENV as DEFAULT_ENV_CONFIGURATION
File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/configuration/source_common.py", line 49, in <module>
class DatasetSourceConfigBase(PlatformSourceConfigBase, EnvBasedSourceConfigBase):
File "pydantic/main.py", line 324, in pydantic.main.ModelMetaclass.__new__
File "/Users/obritto/opt/anaconda3/lib/python3.7/abc.py", line 126, in __new__
cls = super().__new__(mcls, name, bases, namespace, **kwargs)
TypeError: multiple bases have instance lay-out conflict
powerful-cat-68806
01/01/2023, 10:25 AM
(it's a vpce-xxxxx-xxxx, and not a standard RS endpoint)
Also - to which pod should I connect for routing to DSs?
10x 🙂
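For what it's worth, a sketch of how a VPC endpoint might be supplied to the Redshift source (endpoint, database, and credentials are placeholders; that host_port accepts the vpce DNS name is an assumption to verify):

source:
  type: redshift
  config:
    # assumption: the VPC endpoint DNS name goes straight into host_port
    host_port: 'vpce-xxxxx-xxxx.redshift.us-east-1.vpce.amazonaws.com:5439'
    database: dev                  # placeholder
    username: '${REDSHIFT_USER}'
    password: '${REDSHIFT_PASS}'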
powerful-cat-68806
01/02/2023, 7:31 AM
proud-policeman-19830
01/02/2023, 7:58 AM
Is it possible to customize the quickstart deployment (datahub docker quickstart)?
Specifically, I'd like to add a root cert to the db connection (postgres). I think I could do it by setting sslrootcert on a connection URI, but how do I get the cert into DataHub so it gets picked up, and what would be the path for sslrootcert? If I have to build one of the images myself to do this, which one would it be?
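One possible direction, sketched as a docker-compose override to pass via datahub docker quickstart --quickstart-compose-file (the service name, mount path, and EBEAN_DATASOURCE_URL variable are assumptions to verify against your deployment):

services:
  datahub-gms:
    environment:
      # assumption: GMS reads its JDBC URL from EBEAN_DATASOURCE_URL
      EBEAN_DATASOURCE_URL: 'jdbc:postgresql://postgres:5432/datahub?sslmode=verify-full&sslrootcert=/etc/datahub/certs/root.crt'
    volumes:
      # mount the root cert at the path referenced in the URI above
      - ./certs/root.crt:/etc/datahub/certs/root.crt:ro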
rough-gold-15434
01/02/2023, 1:41 PM
rough-gold-15434
01/02/2023, 1:42 PM
powerful-cat-68806
01/02/2023, 4:11 PM
Is it the datahub-frontend-xxx pod? If so, I can't find the path docker/datahub-frontend/env/docker.env
Pls. assist 🙂
brave-waitress-14748
01/03/2023, 5:25 AM
We have deployed 0.9.3 to a GKE cluster (with an Istio service mesh) and are noticing some strange behaviour when trying to run CLI ingestion commands.
When I configure DATAHUB_GMS_URL to point directly to the GMS service (via an ingress, or port forwarding), all works as expected.
But when I point the CLI to the frontend service (via an ingress), with suffix /api/gms, I get a 401 error.
<html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>\n<title>Error 401 Unauthorized to perform this action.</title>\n</head>\n<body><h2>HTTP ERROR 401 Unauthorized to perform this action.</h2>\n<table>\n<tr><th>URI:</th><td>/entities</td></tr>\n<tr><th>STATUS:</th><td>401</td></tr>\n<tr><th>MESSAGE:</th><td>Unauthorized to perform this action.</td></tr>\n<tr><th>SERVLET:</th><td>restliRequestHandler</td></tr>\n</table>\n<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 9.4.46.v20220331</a><hr/>\n\n</body>\n</html>
This is initially hard to see, as the HTML response causes the JSON parser to barf (see cli_utils.py, L207), but I narrowed it down as only occurring under the above conditions. Note that I am using a token generated by the root datahub user, and have set METADATA_SERVICE_AUTH_ENABLED to "true" for both the frontend and gms deployments.
I can work with things as they stand and communicate with the GMS service directly, but would like to know if this is intended behaviour, as it contradicts the documentation, which suggests I should be able to use the frontend proxy:
... we will be shifting to the recommendation that folks direct all traffic, whether it's programmatic or not, to the DataHub Frontend Proxy, as routing to Metadata Service endpoints is currently available at the path /api/gms
Thanks in advance!
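For reference, the CLI configuration being tested looks roughly like this in ~/.datahubenv (server and token are placeholders):

gms:
  server: 'https://datahub.example.com/api/gms'  # frontend proxy route
  token: '<token generated by the root datahub user>'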
quick-student-61408
01/03/2023, 1:42 PM
agreeable-belgium-70840
01/03/2023, 3:26 PM
Error opening zip file or JAR manifest missing : opentelemetry-javaagent-all.jar
Error occurred during initialization of VM
agent library failed to init: instrument
2023/01/03 15:25:19 Command exited with error: exit status 1
quiet-smartphone-60119
01/03/2023, 3:46 PM
melodic-dress-7431
01/03/2023, 4:39 PM
melodic-dress-7431
01/03/2023, 4:39 PM
datahub-frontend-react | play.api.UnexpectedException: Unexpected exception[NullPointerException: Null stream]
melodic-dress-7431
01/03/2023, 4:40 PM
docker/dev.sh
melodic-dress-7431
01/03/2023, 4:40 PM