brief-ability-41819
02/01/2023, 7:33 AM
Upgraded from 0.9.1 to 0.9.2 (or newer). It seems that datahub-acryl-datahub-actions is the problem; it throws:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub_actions/pipeline/pipeline_manager.py", line 42, in run_pipeline
pipeline.run()
File "/usr/local/lib/python3.10/site-packages/datahub_actions/pipeline/pipeline.py", line 166, in run
for enveloped_event in enveloped_events:
File "/usr/local/lib/python3.10/site-packages/datahub_actions/plugin/source/kafka/kafka_event_source.py", line 154, in events
msg = self.consumer.poll(timeout=2.0)
File "/usr/local/lib/python3.10/site-packages/confluent_kafka/deserializing_consumer.py", line 139, in poll
raise ValueDeserializationError(exception=se, kafka_message=msg)
confluent_kafka.error.ValueDeserializationError: KafkaError{code=_VALUE_DESERIALIZATION,val=-159,str="HTTPConnectionPool(host='prerequisites-cp-schema-registry', port=8081): Max retries exceeded with url: /schemas/ids/2 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8c1ad1e8f0>: Failed to establish a new connection: [Errno 111] Connection refused'))"}
%4|1675236299.041|MAXPOLL|rdkafka#consumer-1| [thrd:main]: Application maximum poll interval (10000ms) exceeded by 170ms (adjust max.poll.interval.ms for long-running message processing): leaving group
Ingestions seem to go into a pending state and nothing happens. I haven't changed anything apart from the app version.
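For reference, a minimal connectivity check, assuming it is run from inside the datahub-actions container; the registry host, port, and schema id come from the error above, everything else is illustrative:
```python
# Minimal diagnostic sketch (not part of DataHub): check whether the schema
# registry named in the ValueDeserializationError is reachable from the
# datahub-actions container and whether schema id 2 resolves.
import json
import urllib.request

REGISTRY = "http://prerequisites-cp-schema-registry:8081"

try:
    with urllib.request.urlopen(f"{REGISTRY}/schemas/ids/2", timeout=5) as resp:
        schema = json.load(resp).get("schema", "")
        print("schema registry reachable, schema 2 starts with:", schema[:80])
except OSError as exc:
    # "Connection refused" here matches the error in the traceback and points
    # at a networking/startup problem with the registry, not at datahub-actions.
    print("cannot reach schema registry:", exc)
```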
great-computer-16446
02/01/2023, 10:46 AM
gentle-camera-33498
02/01/2023, 12:18 PM
microscopic-machine-90437
02/01/2023, 1:26 PM
lemon-scooter-69730
02/01/2023, 2:34 PM
Your client version 0.8.43.5 is newer than your server version 0.9.6. Downgrading the cli to 0.9.6 is recommended.
damp-ambulance-34232
02/01/2023, 5:12 PM
fierce-garage-74290
02/01/2023, 7:07 PM
datahub ingest -c recipes/glossaries/glossary_recipe.yml
I'd like to learn from the output how many definitions actually changed. But I am afraid that whenever I run this recipe all the terms get overwritten, and total_records_written always equals the number of records in the recipe.
Question: how can I determine whether any glossary term got modified? I need this to configure notifications for the business team (they would like to be informed whenever any glossary changes).
Thanks!
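One way to answer this is to diff the stored aspect against the recipe before ingesting. A minimal sketch using the DataHub Python SDK, assuming a glossaryTermInfo comparison is enough; the term URN, server address, and new definition are placeholders:
```python
# Minimal sketch: compare the glossaryTermInfo aspect currently stored in DataHub
# with the definition you are about to ingest, so a change can trigger a
# notification. URN, server address, and the new definition are placeholders.
from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig
from datahub.metadata.schema_classes import GlossaryTermInfoClass

graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

term_urn = "urn:li:glossaryTerm:example.term"             # placeholder
new_definition = "Definition text taken from the recipe"  # placeholder

current = graph.get_aspect_v2(
    entity_urn=term_urn,
    aspect="glossaryTermInfo",
    aspect_type=GlossaryTermInfoClass,
)

if current is None or current.definition != new_definition:
    print(f"{term_urn} is new or changed -> notify the business team")
else:
    print(f"{term_urn} unchanged")
```
Looping this over every term in the recipe before running the ingest gives an actual changed count, which total_records_written will not tell you.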
numerous-ram-92457
02/01/2023, 9:20 PM
gentle-camera-33498
02/01/2023, 10:08 PM
refined-energy-76018
02/01/2023, 10:35 PM
There is no failure_mode set, which means it should default to CONTINUE. Any ideas how to debug this?
wooden-breakfast-17692
02/02/2023, 8:53 AM
./gradlew build -x test -x yarnTest -x testQuick
Everything seems to work fine, but at 99% of the build it fails at the task :metadata-ingestion:docGen
I seem to get a seg fault: ./scripts/docgen.sh: line 10: 51078 Segmentation fault: 11 python scripts/docgen.py --out-dir ${DOCS_OUT_DIR} --extra-docs ${EXTRA_DOCS_DIR} $@
Now the strange thing is that the script actually succeeds in generating the docs and exits with 0. I'm using my system's Python, which is 3.9.6. Any suggestions? Cheers!
rich-policeman-92383
02/02/2023, 1:45 PM
metrics_com_linkedin_metadata_resources_entity_EntityResource_search_Count
metrics_com_linkedin_metadata_resources_entity_EntityResource_search_failed_Count
...... and a few others
rough-car-65301
02/02/2023, 3:23 PM
Unable to run quickstart - the following issues were detected:
- datahub-gms is still starting
- elasticsearch-setup is still running
- elasticsearch is running but not healthy
rough-car-65301
02/02/2023, 3:24 PM
handsome-football-66174
02/02/2023, 3:38 PM
File "/root/.venvs/airflow/lib/python3.7/site-packages/airflow/lineage/__init__.py", line 103, in apply_lineage
_backend = get_backend()
File "/root/.venvs/airflow/lib/python3.7/site-packages/airflow/lineage/__init__.py", line 52, in get_backend
clazz = conf.getimport("lineage", "backend", fallback=None)
File "/root/.venvs/airflow/lib/python3.7/site-packages/airflow/configuration.py", line 675, in getimport
f'The object could not be loaded. Please check "{key}" key in "{section}" section. '
airflow.exceptions.AirflowConfigException: The object could not be loaded. Please check "backend" key in "lineage" section. Current value: "datahub_provider.lineage.datahub.DatahubLineageBackend".
We are using the following configuration:
[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {
"enabled": true,
"datahub_conn_id": "datahub_rest_default",
"cluster": "prod",
"capture_ownership_info": true,
"capture_tags_info": true,
"graceful_exceptions": true }
wide-afternoon-79955
02/02/2023, 4:03 PM
The Editor - Metadata policy is not editable and comes with the default source package. It gives Editors "All" privileges even on objects they don't own. We have managed to make this policy editable and deactivate it via an update query on the DB (query is in the thread). The problem we are facing is that every time the pod restarts, it re-loads the default policies from policy.json, overwriting our updated value. Is there a trick where I can either
1. Deactivate the Editor - Metadata policy by default, or
2. Make the Editor - Metadata policy editable?
Note: I am trying to avoid forking the project and building a new custom image just for this tiny config change.
gentle-portugal-21014
02/02/2023, 5:35 PM
miniature-exabyte-80137
02/02/2023, 8:37 PM
Unable to run quickstart - the following issues were detected:
- broker is not running
- datahub-gms is still starting
- zookeeper is not running
If you think something went wrong, please file an issue at https://github.com/datahub-project/datahub/issues
or send a message in our Slack https://slack.datahubproject.io/
Be sure to attach the logs from /tmp/tmp3qcg5vb6.log
I killed all containers and ran docker system prune, but I still get this error. Still debugging this, but let me know if you have any ideas, thanks!
great-toddler-2251
02/03/2023, 12:21 AM
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
and that it was fixed. Well, not in Java datahub-client. I am using the latest and greatest
implementation 'io.acryl:datahub-client:0.9.6-3'
and yet (trivial Boot app from start.spring.io)
$ ./gradlew bootRun
> Task :bootRun
. ____ _ __ _ _
/\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
\\/ ___)| |_)| | | | | || (_| | ) ) ) )
' |____| .__|_| |_|_| |_\__, | / / / /
=========|_|==============|___/=/_/_/_/
:: Spring Boot :: (v3.0.2)
2023-02-02T16:11:01.813-08:00 INFO 58440 --- [ main] c.e.demo.DemoLoggingIssueApplication : Starting DemoLoggingIssueApplication using Java 17.0.1 with PID 58440 (/private/tmp/demo-logging-issue/build/classes/java/main started by raysuliteanu in /private/tmp/demo-logging-issue)
2023-02-02T16:11:01.815-08:00 INFO 58440 --- [ main] c.e.demo.DemoLoggingIssueApplication : No active profile set, falling back to 1 default profile: "default"
2023-02-02T16:11:02.110-08:00 INFO 58440 --- [ main] c.e.demo.DemoLoggingIssueApplication : let's create a DataHub RestEmitter!
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2023-02-02T16:11:02.436-08:00 INFO 58440 --- [ main] c.e.demo.DemoLoggingIssueApplication : Started DemoLoggingIssueApplication in 0.877 seconds (process running for 1.126)
2023-02-02T16:11:02.437-08:00 INFO 58440 --- [ main] c.e.demo.DemoLoggingIssueApplication : that was fun
BUILD SUCCESSFUL in 2s
4 actionable tasks: 4 executed
I have attached the example project; just unzip and run gradlew. Needless to say, without creating a RestEmitter, there is no SLF4J error. Suggestions?
microscopic-room-90690
02/03/2023, 6:22 AM
ERROR {datahub.ingestion.run.pipeline:112} - failed to write record with workunit container-urn:li:container:73b796f6a931c3fbf572bf7a011dfca8-to-urn:li:dataset:(urn:li:dataPlatform:database.table,PROD) with Expecting value: line 1 column 1 (char 0) and info {}
Any help will be appreciated. Thank you!
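"Expecting value: line 1 column 1 (char 0)" is a JSON parse error, so the sink received a non-JSON reply from the server. A minimal sketch for inspecting what the endpoint actually returns, assuming a datahub-rest sink pointed at localhost:8080 (adjust the URL and auth to your recipe):
```python
# Minimal sketch: fetch the GMS /config endpoint (the same one the CLI checks)
# and print the raw response. If the body is not JSON -- for example an HTML
# error page from a proxy or load balancer -- that explains the
# "Expecting value: line 1 column 1" failure above. The URL is a placeholder.
import requests

GMS = "http://localhost:8080"

resp = requests.get(f"{GMS}/config", timeout=10)
print("status:", resp.status_code)
print("content-type:", resp.headers.get("content-type"))
print("body starts with:", resp.text[:200])
```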
bland-appointment-45659
02/03/2023, 7:15 AM
rich-pager-68736
02/03/2023, 8:46 AM
projects:
- 'Common Analytics Domain/.*'
to no avail. Any idea how to narrow the ingestion down to selected top-level projects? Thanks!
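Many DataHub source filters are AllowDenyPattern regexes, and a common gotcha is that the pattern is matched against a different string than expected (bare project name versus full path, depending on the source and version). A minimal sketch for testing the regex locally; the candidate strings are made up:
```python
# Minimal sketch: check how DataHub's AllowDenyPattern evaluates the regex from
# the recipe against candidate project strings. The candidates are illustrative;
# which form the source actually matches against depends on the source/version.
from datahub.configuration.common import AllowDenyPattern

pattern = AllowDenyPattern(allow=["Common Analytics Domain/.*"])

for candidate in [
    "Common Analytics Domain/Sales",  # path-style value
    "Common Analytics Domain",        # bare top-level project name
    "Sales",                          # bare sub-project name
]:
    print(f"{candidate!r}: allowed={pattern.allowed(candidate)}")
```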
rhythmic-quill-75064
02/03/2023, 8:59 AM
helm search repo does not show chart version 0.2.115:
$ helm search repo datahub --versions
[...]
datahub/datahub 0.2.116 0.9.1
datahub/datahub 0.2.114 0.9.1
datahub/datahub 0.2.113 0.9.1
[...]
Then helm commands fail, for example:
$ helm template --debug datahub datahub/datahub -n <NS> --version 0.2.115
[...]
install.go:192: [debug] Original chart version: "0.2.115"
Error: chart "datahub" matching 0.2.115 not found in datahub index. (try 'helm repo update'): no chart version found for datahub-0.2.115
The repo is up to date. There are other "holes" in the versions.
Is this normal?
many-solstice-66904
02/03/2023, 9:30 AM
The documentation mentions a gms.graphql file under resources, but I am unable to locate this file anywhere in the repository. Could it be that this page is out-of-date?
tall-dentist-87295
02/03/2023, 1:09 PM
Validation error (FieldUndefined@[searchResultFields/datasetProfiles/sizeInBytes]) : Field 'sizeInBytes' in type 'DatasetProfile' is undefined (code undefined)
incalculable-manchester-41314
02/03/2023, 1:27 PM
calm-balloon-31412
02/03/2023, 4:49 PM
AvroException: ('Datum union type not in schema: %s', None)
when running graph.get_aspect_v2(entity_urn=urn, aspect="dataJobInfo", aspect_type=DataJobInfoClass)
I see someone brought this up in the past, but I'm not sure if it was ever resolved. I'm trying to write a job that updates the dataJobInfo aspect of a data job instead of overwriting it, so I need to access this aspect. Any help would be appreciated! cc @big-carpet-38439 who looked at this issue before.
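For reference, the read-modify-write flow being described looks roughly like the sketch below, assuming the get_aspect_v2 call itself can be made to work; the job URN, server address, and updated field are placeholders:
```python
# Minimal sketch of the read-modify-write flow described above: fetch the current
# dataJobInfo aspect, change only what is needed, and emit it back so the rest of
# the aspect is preserved. The URN, server address, and new description are
# placeholders.
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig
from datahub.metadata.schema_classes import DataJobInfoClass

graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))
job_urn = "urn:li:dataJob:(urn:li:dataFlow:(airflow,example_dag,prod),example_task)"

info = graph.get_aspect_v2(
    entity_urn=job_urn,
    aspect="dataJobInfo",
    aspect_type=DataJobInfoClass,
)

if info is not None:
    info.description = "Updated description"  # change only the field you need
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
    emitter.emit(MetadataChangeProposalWrapper(entityUrn=job_urn, aspect=info))
```
Recent SDK versions also expose graph.emit_mcp, which avoids creating a separate emitter; either way, emitting the full aspect back preserves the fields you did not touch.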
rapid-hamburger-95729
02/03/2023, 4:50 PM
gentle-lifeguard-88494
02/04/2023, 6:16 PM
datahub ingest list-runs
and got the following error. Any ideas on how to solve this? Thanks
cuddly-ram-44320
02/05/2023, 11:19 AM