numerous-account-62719
12/08/2022, 5:48 AM
~~~~ Execution Summary ~~~~
RUN_INGEST - exec_id: bcaca377-5b8d-4957-bca5-74e68bf71e3d, errors: []
2022-12-08 05:46:36.817335 [exec_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d] INFO: Starting execution for task with name=RUN_INGEST
2022-12-08 05:46:36.921081 [exec_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d] INFO: Caught exception EXECUTING task_id=bcaca377-5b8d-4957-bca5-74e68bf71e3d, name=RUN_INGEST, stacktrace:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task
    task_event_loop.run_until_complete(task_future)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 73, in execute
    SubProcessTaskUtil._write_recipe_to_file(exec_out_dir, file_name, recipe)
  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_task_common.py", line 105, in _write_recipe_to_file
    os.makedirs(dir_path, mode=0o777, exist_ok=True)
  File "/usr/local/lib/python3.10/os.py", line 215, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/usr/local/lib/python3.10/os.py", line 225, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/tmp/datahub/ingest'
Execution finished with errors.
Can someone please help me out?
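A minimal sketch of one possible fix, assuming the UI-ingestion executor runs inside a datahub-actions container whose user cannot write to /tmp/datahub (the container name is an assumption; adjust to your deployment):

    # pre-create the executor's working directory with write access for the container user
    docker exec -u root datahub-actions sh -c 'mkdir -p /tmp/datahub/ingest && chmod -R 777 /tmp/datahub'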
numerous-account-62719
12/08/2022, 8:28 AM
lively-action-8308
12/08/2022, 11:59 AM
adamant-van-21355
12/08/2022, 2:15 PM
Are there any plans to support column-level-lineage functionality when DBT metadata is involved? We are ingesting metadata from Snowflake, DBT, and Looker (latest version), and currently it is not possible to use this for merged Snowflake & DBT entities. It would be really nice to unlock this feature for cases where DBT nodes are part of the lineage (which probably covers most cases). Thanks 🙂
bitter-furniture-95993
12/08/2022, 3:35 PM
dazzling-appointment-34954
12/08/2022, 5:43 PM
ancient-apartment-23316
12/08/2022, 6:04 PM
2022/12/08 14:37:50 Waiting for: <http://datahub-datahub-gms:8080/health>
2022/12/08 14:37:50 Received 200 from <http://datahub-datahub-gms:8080/health>
No user action configurations found. Not starting user actions.
[2022-12-08 14:37:50,955] INFO {datahub_actions.cli.actions:68} - DataHub Actions version: unavailable (installed editable via git)
[2022-12-08 14:37:51,012] INFO {datahub_actions.cli.actions:98} - Action Pipeline with name 'ingestion_executor' is now running.
Can you please help me figure out how to fix the ingestion?
I have already seen this: https://datahubproject.io/docs/ui-ingestion/#i-see-na-when-i-try-to-run-ingestion-what-do-i-do
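A hedged step that often surfaces the underlying error (the recipe path is illustrative): run the same recipe through the CLI, which prints failures directly instead of routing them through the UI executor.

    # run the recipe outside the UI executor to see the real error
    datahub ingest -c recipe.yml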
little-breakfast-38102
12/08/2022, 9:26 PM
brief-dream-8019
12/08/2022, 10:32 PM
quick-student-61408
12/09/2022, 1:52 PM
gentle-portugal-21014
12/09/2022, 4:30 PM
brainy-piano-85560
12/11/2022, 11:12 AM
cool-translator-98249
12/12/2022, 12:38 AM
"failures": {"Stateful Ingestion": ["Fail safe mode triggered, entity difference percent: 66.66666666666667 > fail_safe_threshold: {self.stateful_ingestion_config.fail_safe_threshold}"]}
How can I troubleshoot this?
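If the 66% drop in entities is actually expected (for example, the source scope changed between runs), a minimal sketch is to raise fail_safe_threshold in the source's stateful_ingestion block; the source type and threshold value below are illustrative:

    # recipe sketch: loosen the stale-entity fail-safe for this source
    cat > recipe.yml <<'EOF'
    source:
      type: snowflake          # illustrative source type
      config:
        # ... connection settings elided ...
        stateful_ingestion:
          enabled: true
          fail_safe_threshold: 75.0   # percent; raise only if the entity drop is intentional
    EOF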
acceptable-alarm-65116
12/12/2022, 1:28 PM
gentle-camera-33498
12/12/2022, 6:41 PM
bland-lighter-26751
12/12/2022, 7:29 PM
web_profile_clicks that doesn't actually exist anywhere, and 01_transform instead of 01_TRANSFORM.
Any ideas?
melodic-telephone-26568
12/13/2022, 7:57 AM
datahub docker nuke
python -m pip install --upgrade pip wheel setuptools
python -m pip install acryl-datahub==0.8.43
datahub version
DataHub CLI version: 0.8.43
Python version: 3.7.15 (default, Nov 24 2022, 18:44:54) [MSC v.1916 64 bit (AMD64)]
datahub docker quickstart --version v0.8.43 --mysql-port 53306
But it fails to start after a few minutes, telling me that datahub-gms is not running.
The logs for the datahub-gms container are as follows.
I don't know what the problem is (the latest version worked without any problems).
2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <http://elasticsearch:9200>
2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <tcp://mysql:3306>
2022-12-13 16:52:58 2022/12/13 07:52:58 Waiting for: <tcp://broker:29092>
2022-12-13 16:52:58 2022/12/13 07:52:58 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:52:58 2022/12/13 07:52:58 Connected to <tcp://mysql:3306>
2022-12-13 16:52:58 2022/12/13 07:52:58 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:52:59 2022/12/13 07:52:59 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:52:59 2022/12/13 07:52:59 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:00 2022/12/13 07:53:00 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:53:00 2022/12/13 07:53:00 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:01 2022/12/13 07:53:01 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:53:01 2022/12/13 07:53:01 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:02 2022/12/13 07:53:02 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:53:02 2022/12/13 07:53:02 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:03 2022/12/13 07:53:03 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:03 2022/12/13 07:53:03 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:53:04 2022/12/13 07:53:04 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:04 2022/12/13 07:53:04 Problem with dial: dial tcp 172.28.0.5:29092: connect: connection refused. Sleeping 1s
2022-12-13 16:53:05 2022/12/13 07:53:05 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:05 2022/12/13 07:53:05 Connected to <tcp://broker:29092>
2022-12-13 16:53:06 2022/12/13 07:53:06 Problem with request: Get "<http://elasticsearch:9200>": dial tcp 172.28.0.3:9200: connect: connection refused. Sleeping 1s
2022-12-13 16:53:07 2022/12/13 07:53:07 Received 200 from <http://elasticsearch:9200>
2022-12-13 16:53:08 2022-12-13 07:53:08.265:INFO::main: Logging initialized @277ms to org.eclipse.jetty.util.log.StdErrLog
2022-12-13 16:53:08 WARNING: jetty-runner is deprecated.
2022-12-13 16:53:08 See Jetty Documentation for startup options
2022-12-13 16:53:08 <https://www.eclipse.org/jetty/documentation/>
2022-12-13 16:53:08 ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources
2022-12-13 16:53:08 Usage: java [-Djetty.home=dir] -jar jetty-runner.jar [--help|--version] [ server opts] [[ context opts] context ...]
2022-12-13 16:53:08 Server opts:
2022-12-13 16:53:08 --version - display version and exit
2022-12-13 16:53:08 --log file - request log filename (with optional 'yyyy_mm_dd' wildcard
2022-12-13 16:53:08 --out file - info/warn/debug log filename (with optional 'yyyy_mm_dd' wildcard
2022-12-13 16:53:08 --host name|ip - interface to listen on (default is all interfaces)
2022-12-13 16:53:08 --port n - port to listen on (default 8080)
2022-12-13 16:53:08 --stop-port n - port to listen for stop command (or -DSTOP.PORT=n)
2022-12-13 16:53:08 --stop-key n - security string for stop command (required if --stop-port is present) (or -DSTOP.KEY=n)
2022-12-13 16:53:08 [--jar file]*n - each tuple specifies an extra jar to be added to the classloader
2022-12-13 16:53:08 [--lib dir]*n - each tuple specifies an extra directory of jars to be added to the classloader
2022-12-13 16:53:08 [--classes dir]*n - each tuple specifies an extra directory of classes to be added to the classloader
2022-12-13 16:53:08 --stats [unsecure|realm.properties] - enable stats gathering servlet context
2022-12-13 16:53:08 [--config file]*n - each tuple specifies the name of a jetty xml config file to apply (in the order defined)
2022-12-13 16:53:08 Context opts:
2022-12-13 16:53:08 [[--path /path] context]*n - WAR file, web app dir or context xml file, optionally with a context path
2022-12-13 16:53:08 2022/12/13 07:53:08 Command exited with error: exit status 1
2022-12-13 16:53:27 + echo
2022-12-13 16:53:27 + grep -q ://
2022-12-13 16:53:27 + NEO4J_HOST=http://
2022-12-13 16:53:27 + [[ ! -z '' ]]
2022-12-13 16:53:27 + [[ -z '' ]]
2022-12-13 16:53:27 + ELASTICSEARCH_AUTH_HEADER='Accept: */*'
2022-12-13 16:53:27 + [[ '' == true ]]
2022-12-13 16:53:27 + ELASTICSEARCH_PROTOCOL=http
2022-12-13 16:53:27 + WAIT_FOR_EBEAN=
2022-12-13 16:53:27 + [[ '' != true ]]
2022-12-13 16:53:27 + [[ '' == ebean ]]
2022-12-13 16:53:27 + [[ -z '' ]]
2022-12-13 16:53:27 + WAIT_FOR_EBEAN=' -wait <tcp://mysql:3306> '
2022-12-13 16:53:27 + WAIT_FOR_CASSANDRA=
2022-12-13 16:53:27 + [[ '' == cassandra ]]
2022-12-13 16:53:27 + WAIT_FOR_KAFKA=
2022-12-13 16:53:27 + [[ '' != true ]]
2022-12-13 16:53:27 ++ echo broker:29092
2022-12-13 16:53:27 ++ sed 's/,/ -wait tcp:\/\//g'
2022-12-13 16:53:27 + WAIT_FOR_KAFKA=' -wait <tcp://broker:29092> '
2022-12-13 16:53:27 + WAIT_FOR_NEO4J=
2022-12-13 16:53:27 + [[ elasticsearch != elasticsearch ]]
2022-12-13 16:53:27 + OTEL_AGENT=
2022-12-13 16:53:27 + [[ '' == true ]]
2022-12-13 16:53:27 + PROMETHEUS_AGENT=
2022-12-13 16:53:27 + [[ '' == true ]]
2022-12-13 16:53:27 + auth_resource_dir=/etc/datahub/plugins/auth/resources
2022-12-13 16:53:27 + COMMON='
2022-12-13 16:53:27 -wait <tcp://mysql:3306> -wait <tcp://broker:29092> -timeout 240s java -Xms1g -Xmx1g -jar /jetty-runner.jar --jar jetty-util.jar --jar jetty-jmx.jar --classes /etc/datahub/plugins/auth/resources --config /datahub/datahub-gms/scripts/jetty.xml /datahub/datahub-gms/bin/war.war'
2022-12-13 16:53:27 + [[ '' != true ]]
2022-12-13 16:53:27 + exec dockerize -wait <http://elasticsearch:9200> -wait-http-header 'Accept: */*' -wait <tcp://mysql:3306> -wait <tcp://broker:29092> -timeout 240s java -Xms1g -Xmx1g -jar /jetty-runner.jar --jar jetty-util.jar --jar jetty-jmx.jar --classes /etc/datahub/plugins/auth/resources --config /datahub/datahub-gms/scripts/jetty.xml /datahub/datahub-gms/bin/war.war
2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <http://elasticsearch:9200>
2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <tcp://mysql:3306>
2022-12-13 16:53:27 2022/12/13 07:53:27 Waiting for: <tcp://broker:29092>
2022-12-13 16:53:27 2022/12/13 07:53:27 Connected to <tcp://mysql:3306>
2022-12-13 16:53:27 2022/12/13 07:53:27 Connected to <tcp://broker:29092>
2022-12-13 16:53:27 2022/12/13 07:53:27 Received 200 from <http://elasticsearch:9200>
2022-12-13 16:53:27 2022-12-13 07:53:27.370:INFO::main: Logging initialized @224ms to org.eclipse.jetty.util.log.StdErrLog
2022-12-13 16:53:27 WARNING: jetty-runner is deprecated.
2022-12-13 16:53:27 See Jetty Documentation for startup options
2022-12-13 16:53:27 <https://www.eclipse.org/jetty/documentation/>
2022-12-13 16:53:27 ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources
2022-12-13 16:53:27 2022/12/13 07:53:27 Command exited with error: exit status 1
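The repeated "ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources" line is what kills GMS here: jetty-runner is handed a --classes directory that doesn't exist, so the command exits before the server starts. A hedged workaround, assuming quickstart mounts ~/.datahub/plugins into the container (the mount path is an assumption; check your compose files): create the directory on the host and re-run quickstart.

    # create the auth plugin resources directory the GMS entrypoint expects
    mkdir -p ~/.datahub/plugins/auth/resources
    datahub docker quickstart --version v0.8.43 --mysql-port 53306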
melodic-dress-7431
12/13/2022, 12:29 PM
melodic-dress-7431
12/13/2022, 12:29 PM
error: error reading /opt/datahub/metadata-auth/auth-api/build/libs/auth-api-0.9.4-SNAPSHOT.jar; zip file is empty
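A hedged recovery sketch for a corrupt or empty build artifact (the Gradle module path is inferred from the jar's location, so treat it as an assumption): delete the bad jar and rebuild that module from a clean state.

    # remove the empty artifact and rebuild the module
    rm -f metadata-auth/auth-api/build/libs/auth-api-*-SNAPSHOT.jar
    ./gradlew :metadata-auth:auth-api:clean :metadata-auth:auth-api:build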
melodic-dress-7431
12/13/2022, 12:29 PM
melodic-dress-7431
12/13/2022, 12:31 PM
melodic-dress-7431
12/13/2022, 12:33 PM
acoustic-rose-68681
12/13/2022, 2:03 PM
microscopic-mechanic-13766
12/13/2022, 3:04 PM
transformers:
  - type: "pattern_add_dataset_tags"
    config:
      tag_pattern:
        rules:
          '.*030902.*': ['urn:li:tag:030902']
          '.*050501.*': ['urn:li:tag:050501']
  - type: "pattern_add_dataset_domain"
    config:
      domain_pattern:
        rules:
          '.*libros.*': ['urn:li:domain:c4c94633-96cf-4a93-baa7-15562905f8f0']
          '.*050501.*': ['urn:li:domain:97faf5f3-4494-4620-abcf-a6a9eeea9fbe']
I am using 0.9.0 for both GMS and the frontend, and 0.0.8 for actions.
Both the tags and the domains do exist, and the ingestion run completed successfully.
I think the problem could be in pattern_add_dataset_tags (although I am not really sure), as I have used the latter transformer on previous occasions and didn't have any problem with it.
Thanks in advance!
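One quick sanity check, since tag_pattern and domain_pattern rules are matched against the full dataset URN: confirm the regex actually matches one of your URNs. The URN below is made up for illustration.

    # verify a rule's regex against a dataset URN (illustrative URN)
    python3 -c "import re; print(bool(re.match('.*030902.*', 'urn:li:dataset:(urn:li:dataPlatform:hive,db.table_030902,PROD)')))"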
purple-printer-15193
12/13/2022, 4:55 PM
ancient-library-85500
12/13/2022, 9:16 PM
* What went wrong:
Execution failed for task ':li-utils:compileMainGeneratedDataTemplateJava'.
> Could not find tools.jar. Please check that /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.352.b08-2.el7_9.x86_64/jre contains a valid JDK installation.
Running java -version outputs this:
openjdk version "11.0.17" 2022-10-18 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.17.0.8-2.el7_9) (build 11.0.17+8-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.17.0.8-2.el7_9) (build 11.0.17+8-LTS, mixed mode, sharing)
My JAVA_HOME variable is set to the Java 11 location. It seems that when I run the build, Gradle is picking up a different, older version of Java that I had been using previously.
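A hedged sketch of two ways to force the build onto Java 11 (the JDK path is illustrative): stop any Gradle daemons that were started under the old JVM, then point Gradle at JAVA_HOME explicitly.

    ./gradlew --stop                                  # kill daemons holding the old JVM
    export JAVA_HOME=/usr/lib/jvm/java-11-openjdk     # illustrative path; use your Java 11 install
    ./gradlew -Dorg.gradle.java.home="$JAVA_HOME" build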
lemon-lock-92370
12/14/2022, 3:50 AM
# Backend (gms)
./gradlew :metadata-service:war:build
(cd docker && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub -f docker-compose-without-neo4j.yml -f docker-compose-without-neo4j.override.yml -f docker-compose.dev.yml up -d --no-deps --force-recreate datahub-gms)
# Frontend
./gradlew :datahub-frontend:dist -x yarnTest -x yarnLint
(cd docker && COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub -f docker-compose-without-neo4j.yml -f docker-compose-without-neo4j.override.yml -f docker-compose.dev.yml up -d --no-deps --force-recreate datahub-frontend-react)
Then how could I do this for metadata-ingestion? 😮
I modified some code in the metadata-ingestion/src/datahub/ingestion/source/aws/glue.py file.
I want to build it and update my Docker deployment so the modification takes effect. I tried to build it as below:
./gradlew :metadata-ingestion:build
How can I apply this modification to the existing Docker container? Please help 🙏 Thank you 🙇
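A hedged sketch, not the documented workflow: for UI-triggered ingestion the connector source code runs inside the actions container, so one way to test a local change is to install the modified metadata-ingestion tree into that container (the container name is an assumption; check docker ps).

    # copy the modified package into the running container and install it over the release
    docker cp metadata-ingestion datahub-actions:/tmp/metadata-ingestion
    docker exec datahub-actions pip install --no-deps /tmp/metadata-ingestion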
astonishing-cartoon-6079
12/14/2022, 8:32 AM
namespace com.mycompany.dq
/**
* Details about dataset Storage.
*/
@Aspect = {
  "name": "storage",
  "autoRender": true,
  "renderSpec": {
    "displayType": "properties",
    "displayName": "Storage Info"
  }
}
record Storage {
  format: optional string
  compression: optional string
  sizeInBytes: optional long
  fileNum: optional long
}
I can insert the storage aspect successfully, but there is no Storage Info tab on the web page. Does anybody know how to solve this problem?
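For an autoRender aspect to appear as a tab, the aspect generally also has to be registered against the entity in the custom model's entity registry and the model redeployed to GMS. A hedged sketch following the structure of the extending-the-metadata-model docs (the file path and id are illustrative):

    # register the new aspect against the dataset entity in the custom model
    cat > registry/entity-registry.yaml <<'EOF'
    id: mycompany-dq-model
    entities:
      - name: dataset
        aspects:
          - storage
    EOF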
brainy-piano-85560
12/14/2022, 8:46 AM
strong-kite-83354
12/14/2022, 2:06 PM