loud-camera-71352
11/29/2021, 3:53 PM
curl 'http://localhost:8080/entities?action=ingest' -X POST --data '{
  "entity": {
    "value": {
      "com.linkedin.metadata.snapshot.DatasetSnapshot": {
        "urn": "urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.test2,PROD)",
        "aspects": [
          {
            "com.linkedin.schema.EditableSchemaMetadata": {
              "editableSchemaFieldInfo": [
                {
                  "fieldPath": "member_id",
                  "globalTags": { "tags": [{ "tag": "urn:li:tag:PII" }] }
                }
              ]
            }
          }
        ]
      }
    }
  }
}'
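A minimal sketch of the same write via the acryl-datahub Python emitter, as an alternative to the raw curl call above; class names follow the library's generated models, so treat exact signatures as approximate for your installed version:

from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    ChangeTypeClass,
    EditableSchemaFieldInfoClass,
    EditableSchemaMetadataClass,
    GlobalTagsClass,
    TagAssociationClass,
)

# Build the editableSchemaMetadata aspect: tag the member_id column as PII.
aspect = EditableSchemaMetadataClass(
    editableSchemaFieldInfo=[
        EditableSchemaFieldInfoClass(
            fieldPath="member_id",
            globalTags=GlobalTagsClass(
                tags=[TagAssociationClass(tag="urn:li:tag:PII")]
            ),
        )
    ]
)

# Emit an upsert proposal for the dataset to the local GMS endpoint.
emitter = DatahubRestEmitter("http://localhost:8080")
emitter.emit_mcp(
    MetadataChangeProposalWrapper(
        entityType="dataset",
        changeType=ChangeTypeClass.UPSERT,
        entityUrn="urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.test2,PROD)",
        aspectName="editableSchemaMetadata",
        aspect=aspect,
    )
)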
cool-painting-92220
11/29/2021, 7:22 PM
I want to run datahub docker nuke, but I'd like to save all the metadata I had previously ingested so that I don't have to run the ingestion again when I start DataHub back up. What would be the best way to approach this? Additionally, if I had used DataHub's UI to add some descriptions and documentation to a few tables, how could this data be saved if the server were to be shut down?
Thank you for any guidance that can be provided! 😄
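For reference, the CLI also supports tearing down the containers while keeping the data volumes; a minimal sketch, assuming a CLI version that has the flag:

datahub docker nuke --keep-data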
polite-flower-25924
11/29/2021, 9:51 PM
plain-farmer-27314
11/30/2021, 3:55 PM
cool-painting-92220
11/30/2021, 9:09 PM
numerous-translator-7230
12/01/2021, 3:26 AM
red-pizza-28006
12/01/2021, 10:52 AM
I get this error when running datahub delete -n --query "fivetran_headscarf_hurray_staging". Any ideas?
---- (full traceback above) ----
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/datahub/entrypoints.py", line 95, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/datahub/cli/delete_cli.py", line 148, in delete
    deletion_result = delete_with_filters(
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/datahub/cli/delete_cli.py", line 211, in delete_with_filters
    batch_deletion_result.merge(one_result)
File "/Users/ajaykumarmuppuri/opt/anaconda3/lib/python3.8/site-packages/datahub/cli/delete_cli.py", line 51, in merge
    self.sample_records.extend(another_result.sample_records)
AttributeError: 'FieldInfo' object has no attribute 'extend'
handsome-football-66174
12/01/2021, 6:07 PM
{
  listUsers(input: { start: 0, count: 10 }) {
    start
    count
    total
    users {
      urn
      type
      username
      status
      properties {
        displayName
        email
        title
        departmentId
        departmentName
        firstName
        lastName
        fullName
        countryCode
      }
      editableProperties {
        aboutMe
        pictureLink
      }
    }
  }
}
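A minimal sketch of executing this query with Python's requests library, assuming GMS is reachable at localhost:8080 (if metadata service authentication is enabled, an Authorization: Bearer <token> header would also be needed):

import requests

# Trimmed listUsers query; extend the selection set as needed.
query = """
{
  listUsers(input: { start: 0, count: 10 }) {
    total
    users { urn username }
  }
}
"""

resp = requests.post("http://localhost:8080/api/graphql", json={"query": query})
resp.raise_for_status()
print(resp.json())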
bland-orange-13353
12/01/2021, 7:33 PM
refined-branch-44251
12/02/2021, 1:03 AM
curl --location --request POST 'http://localhost:8080/entities?action=search' \
--header 'X-RestLi-Protocol-Version: 2.0.0' \
--header 'Content-Type: application/json' \
--data-raw '{
    "input": "glossaryTerms:Classification.Sensitive",
    "entity": "dataset",
    "start": 0,
    "count": 10
}'
This returns all datasets with the glossary term 'Classification.Sensitive'. Is there a way to search datasets where this glossary term has been applied to fields of the dataset (and not to the dataset itself)?
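A hedged sketch of one thing to try: the dataset search index also carries field-level term information, assuming your version exposes a fieldGlossaryTerms searchable field (verify against your dataset index mappings before relying on it):

curl --location --request POST 'http://localhost:8080/entities?action=search' \
--header 'X-RestLi-Protocol-Version: 2.0.0' \
--header 'Content-Type: application/json' \
--data-raw '{
    "input": "fieldGlossaryTerms:Classification.Sensitive",
    "entity": "dataset",
    "start": 0,
    "count": 10
}'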
abundant-flag-19546
12/02/2021, 3:56 AM
I created:
• MLExperimentUrn.java and MLExperimentUrn.pdl in li-utils,
• MLExperimentKey.pdl, MLExperimentSnapshot.pdl, MLExperimentProperties.pdl, and MLExperimentAspect.pdl,
• and registered these items in Aspect.pdl and Snapshot.pdl.
When I tried to build with the command
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub build
I got this error:
#10 235.9 > Task :metadata-service:restli-impl:checkRestModel FAILED
#10 235.9 [checker]
#10 235.9 [checker] idl compatibility report:
#10 235.9 [checker] Incompatible changes:
#10 235.9 [checker] 1) /collection/actions/batchIngest/parameters/entities/type: new union added members com.linkedin.metadata.snapshot.MLExperimentSnapshot
#10 235.9 [checker] 2) com.linkedin.entity.Entity/value/com.linkedin.metadata.snapshot.Snapshot/ref/union: new union added members com.linkedin.metadata.snapshot.MLExperimentSnapshot, breaks old readers
#10 235.9 [checker]
#10 235.9 [checker] [RS-COMPAT]: false
#10 235.9 [checker] [MD-COMPAT]: false
#10 235.9 [checker] [RS-I]:/collection/actions/batchIngest/parameters/entities/type: new union added members com.linkedin.metadata.snapshot.MLExperimentSnapshot
#10 235.9 [checker] [MD-I]:com.linkedin.entity.Entity/value/com.linkedin.metadata.snapshot.Snapshot/ref/union: new union added members com.linkedin.metadata.snapshot.MLExperimentSnapshot, breaks old readers
#10 235.9 [checker]
#10 235.9
#10 235.9 FAILURE: Build failed with an exception.
So I tried this workaround (found in the official docs):
./gradlew :gms:impl:build -Prest.model.compatibility=ignore
but it fails with a 'project gms not found' error.
1. Is this the correct way to implement a new entity (creating the custom urn Java code and registering the items in Aspect.pdl and Snapshot.pdl)?
2. How can I ignore that error while building the Docker images?
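On question 2, a minimal sketch of the updated workaround, assuming the gms module was simply renamed to metadata-service (consistent with the failing task name in the log above):

./gradlew :metadata-service:restli-impl:build -Prest.model.compatibility=ignore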
quaint-branch-37931
12/02/2021, 10:15 AM
full-area-6720
12/02/2021, 11:14 AM
refined-apple-6340
12/02/2021, 2:58 PM
aloof-forest-55926
12/02/2021, 7:31 PM
datahub as both the username and password
bulky-controller-34643
12/03/2021, 2:04 AM
ambitious-guitar-89068
12/03/2021, 6:35 AM
best-crayon-19865
12/03/2021, 1:26 PM
http://datahub.dwh-stage.corp.loc
Has anyone seen a similar error?
ERROR - ('Unable to emit metadata to DataHub GMS', {'message': "Invalid URL 'datahub.dwh-stage.corp.loc/entities?action=ingest': No schema supplied. Perhaps you meant http://datahub.dwh-stage.corp.loc/entities?action=ingest?"})
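The error suggests the sink URL is missing its http:// scheme. A minimal sketch of the rest sink block with the scheme included, assuming the URL is set in the recipe rather than via an environment variable:

sink:
  type: datahub-rest
  config:
    server: "http://datahub.dwh-stage.corp.loc"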
ambitious-vegetable-3452
12/03/2021, 2:06 PM
{
  "type": "server",
  "timestamp": "2021-12-03T14:03:22,365Z",
  "level": "WARN",
  "component": "r.suppressed",
  "cluster.name": "elasticsearch",
  "node.name": "elasticsearch-master-1",
  "message": "path: /_cluster/health, params: {wait_for_status=green, timeout=1s}",
  "stacktrace": [
    "org.elasticsearch.discovery.MasterNotDiscoveredException: null",
    "at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$2.onTimeout(TransportMasterNodeAction.java:220) [elasticsearch-7.9.3.jar:7.9.3]",
    "at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:325) [elasticsearch-7.9.3.jar:7.9.3]",
    "at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:252) [elasticsearch-7.9.3.jar:7.9.3]",
    "at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:605) [elasticsearch-7.9.3.jar:7.9.3]",
    "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) [elasticsearch-7.9.3.jar:7.9.3]",
    "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]",
    "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]",
    "at java.lang.Thread.run(Thread.java:832) [?:?]"
  ]
}
Does anyone know how I could solve this?
refined-apple-6340
12/03/2021, 7:34 PM
plain-farmer-27314
12/03/2021, 8:06 PM
best-planet-6756
12/03/2021, 8:11 PM
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub build
But I'm getting the following error:
#9 320.0 > Task :metadata-io:compileJava FAILED
#9 320.0
#9 320.0 FAILURE: Build failed with an exception.
#9 320.0
#9 320.0 * What went wrong:
#9 320.0 Execution failed for task ':metadata-io:compileJava'.
#9 320.0 > Could not resolve all files for configuration ':metadata-io:compileClasspath'.
#9 320.0 > Could not resolve com.linkedin.datahub-gma:ebean-dao:0.2.81.
#9 320.0 Required by:
#9 320.0 project :metadata-io
#9 320.0 > Could not resolve com.linkedin.datahub-gma:ebean-dao:0.2.81.
#9 320.0 > Could not get resource 'https://plugins.gradle.org/m2/com/linkedin/datahub-gma/ebean-dao/0.2.81/ebean-dao-0.2.81.pom'.
#9 320.0 > Could not GET 'https://jcenter.bintray.com/com/linkedin/datahub-gma/ebean-dao/0.2.81/ebean-dao-0.2.81.pom'.
#9 320.0 > Connect to jcenter.bintray.com:443 [jcenter.bintray.com/34.95.74.180] failed: connect timed out
#9 320.0 > Could not resolve com.linkedin.datahub-gma:restli-resources:0.2.81.
#9 320.0 Required by:
#9 320.0 project :metadata-io
#9 320.0 > Could not resolve com.linkedin.datahub-gma:restli-resources:0.2.81.
#9 320.0 > Could not get resource 'https://plugins.gradle.org/m2/com/linkedin/datahub-gma/restli-resources/0.2.81/restli-resources-0.2.81.pom'.
#9 320.0 > Could not GET 'https://jcenter.bintray.com/com/linkedin/datahub-gma/restli-resources/0.2.81/restli-resources-0.2.81.pom'.
#9 320.0 > Connect to jcenter.bintray.com:443 [jcenter.bintray.com/34.95.74.180] failed: connect timed out
#9 320.0 > Could not resolve com.linkedin.datahub-gma:elasticsearch-dao-7:0.2.81.
#9 320.0 Required by:
#9 320.0 project :metadata-io
#9 320.0 > Could not resolve com.linkedin.datahub-gma:elasticsearch-dao-7:0.2.81.
#9 320.0 > Could not get resource 'https://plugins.gradle.org/m2/com/linkedin/datahub-gma/elasticsearch-dao-7/0.2.81/elasticsearch-dao-7-0.2.81.pom'.
#9 320.0 > Could not GET 'https://jcenter.bintray.com/com/linkedin/datahub-gma/elasticsearch-dao-7/0.2.81/elasticsearch-dao-7-0.2.81.pom'.
#9 320.0 > Connect to jcenter.bintray.com:443 [jcenter.bintray.com/34.95.74.180] failed: connect timed out
#9 320.0
#9 320.0 * Try:
#9 320.0 Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
#9 320.0
#9 320.0 * Get more help at https://help.gradle.org
#9 320.0
#9 320.0 Deprecated Gradle features were used in this build, making it incompatible with Gradle 6.0.
#9 320.0 Use '--warning-mode all' to show the individual deprecation warnings.
#9 320.0 See https://docs.gradle.org/5.6.4/userguide/command_line_interface.html#sec:command_line_warnings
#9 320.0
#9 320.0 BUILD FAILED in 5m 19s
#9 320.0 74 actionable tasks: 74 executed
#9 320.0
------
failed to solve with frontend dockerfile.v0: failed to build LLB: executor failed running [/bin/sh -c cd /datahub-src && ./gradlew :metadata-service:war:build -x test]: runc did not terminate sucessfully
ERROR: Service 'datahub-gms' failed to build : Build failed
Anyone run into this before?
lemon-greece-73651
12/04/2021, 1:32 AM
full-area-6720
12/06/2021, 7:05 AM
bulky-controller-34643
12/07/2021, 3:50 AM
nice-country-99675
12/07/2021, 12:56 PM
I ran datahub delete --platform quicksight and some of the datasets were not deleted, so I had to delete them by urn. But when I try to re-ingest these datasets, none appear in the UI. The ingestion process didn't fail and I see no errors in the logs, but nothing shows up. As a matter of fact, when I try a new datahub delete --platform quicksight it tells me there are 0 records. When I checked the Postgres tables in DataHub, I still see the upstream lineage to the QuickSight datasets, so the datasets are in some way still in the DB, but not available to the UI. When I remove them one by one using the urn, I'm able to re-ingest them and the UI displays them properly.
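A minimal sketch of a hard delete by urn, on the hypothesis that the leftover rows are soft-deleted records (the CLI's default delete is a soft delete); this assumes a CLI version that supports --hard, and the urn below is a placeholder:

datahub delete --urn "urn:li:dataset:(urn:li:dataPlatform:quicksight,<dataset name>,PROD)" --hard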
broad-crowd-13788
12/07/2021, 9:10 PM
ValueSerializationError: KafkaError{code=_VALUE_SERIALIZATION,val=-161,str="Schema being registered is incompatible with an earlier schema for subject "MetadataChangeEvent_v4-value" (HTTP status code 409, SR code 409)"}
cool-painting-92220
12/08/2021, 12:57 AM
When I tried to configure database_pattern, view_pattern, and schema_pattern, I got the error message shown below. Any thoughts on why I might be running into these issues?
Error:
3 validation errors for SnowflakeConfig
schema_pattern -> deny
  value is not a valid list (type=type_error.list)
view_pattern -> deny
  value is not a valid list (type=type_error.list)
database_pattern -> deny
  value is not a valid list (type=type_error.list)
Ingestion File:
source:
  type: snowflake
  config:
    host_port: ****
    warehouse: ****
    username: ****
    password: ****
    role: ****
    database_pattern:
      deny: ****
    view_pattern:
      deny: ****
    schema_pattern:
      deny: ****
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
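The three validation errors all say the same thing: each deny value must be a list. A minimal sketch of the corrected pattern blocks, with hypothetical regexes standing in for the masked values:

source:
  type: snowflake
  config:
    database_pattern:
      deny:
        - "dev_.*"              # hypothetical pattern; original value was masked
    view_pattern:
      deny:
        - ".*_tmp"              # hypothetical
    schema_pattern:
      deny:
        - "information_schema"  # hypothetical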
cool-painting-92220
12/08/2021, 1:21 AM
refined-apple-6340
12/08/2021, 3:56 AM