gifted-queen-61023
09/08/2021, 11:20 AM
When running ./gradlew build I keep getting stuck at 99% due to the installation of dependencies (with pip).
It seems to have difficulties with docutils and dill from metadata-ingestion's installDev.
Should they have stricter version intervals in metadata-ingestion/setup.py or something of that sort? Am I doing something wrong?
Screenshot from my 4th attempt.

adorable-judge-53430
09/08/2021, 11:46 AM
source:
  type: datahub-business-glossary
  config:
    # Coordinates
    file: ~/business_glossary.yml
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
If I then run the ingestion, I get the following error:
KeyError: 'Did not find a registered class for datahub-business-glossary'
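[Editor's note: this KeyError typically means the installed acryl-datahub version predates the business-glossary source; upgrading the package and re-running `datahub check plugins` to confirm the source is listed usually resolves it. A hypothetical sketch (not DataHub's actual code) of why a plugin registry raises exactly this message:]

```python
# Minimal sketch of an ingestion-plugin registry; class and method names
# here are illustrative, not DataHub's real implementation.

class SourceRegistry:
    def __init__(self):
        self._registry = {}

    def register(self, name, cls):
        self._registry[name] = cls

    def get(self, name):
        if name not in self._registry:
            # Mirrors the error above: the source type was never registered,
            # usually because the installed package does not ship that plugin.
            raise KeyError(f"Did not find a registered class for {name}")
        return self._registry[name]

registry = SourceRegistry()
registry.register("postgres", object)  # postgres works...

try:
    registry.get("datahub-business-glossary")  # ...but the glossary source is absent
except KeyError as e:
    print(e)
```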
Ingesting data from Postgres and other sources worked great though. Any ideas what's happening here?

curved-sandwich-81699
09/09/2021, 7:59 PM
source:
  type: "snowflake"
  config:
    username: ...
    password: ...
    host_port: ...
    database_pattern:
      ignoreCase: true
      allow:
        - "database"
    schema_pattern:
      ignoreCase: true
      allow:
        - "schema"
    table_pattern:
      ignoreCase: true
      deny:
        - ".*"
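[Editor's note: the allow/deny values are regular expressions matched against the fully qualified name (e.g. database.schema.table for table_pattern), so an unescaped dot matches any character. A rough sketch of allow/deny semantics, under the assumption that deny rules win over allow rules and patterns are matched from the start of the name; if tables still come through with deny [".*"], the pattern may not be applied by the installed version, or the config key may not match what that version expects.]

```python
import re

# Rough sketch of allow/deny pattern matching as commonly implemented by
# ingestion frameworks (an assumption, not DataHub's exact code).

def allowed(name, allow=(".*",), deny=(), ignore_case=True):
    flags = re.IGNORECASE if ignore_case else 0
    # Deny rules win over allow rules; each value is a regex matched
    # from the start of the fully qualified name.
    if any(re.match(p, name, flags) for p in deny):
        return False
    return any(re.match(p, name, flags) for p in allow)

# With deny [".*"] every table name should be rejected:
print(allowed("database.schema.my_table", deny=[".*"]))  # False
# The dot is a regex metacharacter; escape it to match literally:
print(allowed("databaseXschema.t", allow=[r"database\.schema\..*"]))  # False
```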
The tables from database.schema are still getting ingested. Same thing if using database.* or database.schema.* as table_pattern.deny... Or am I missing something?

handsome-belgium-11927
09/10/2021, 8:20 AM
Caused by: java.net.URISyntaxException: Urn entity type should be 'dataset'.: urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.h_car,PROD)
The urn is correct for sure; I used it for other examples, like profiling.
Any help would be much appreciated.

fresh-carpet-31048
09/10/2021, 10:03 PM

square-activity-64562
09/13/2021, 8:24 AM

square-activity-64562
09/13/2021, 9:18 AM
_ but the complete one does

square-activity-64562
09/13/2021, 9:33 AM
+ Add Description
is shown. If we move the cursor over the schema descriptions (all empty) it feels like the schema is jumping. Probably need to increase the height of the row.

square-activity-64562
09/13/2021, 9:41 AM

microscopic-musician-99632
09/13/2021, 9:53 AM

cool-state-20157
09/13/2021, 10:10 PM

curved-jordan-15657
09/14/2021, 10:02 AM
09:57:26.739 [Thread-6973] ERROR c.l.d.g.a.service.AnalyticsService - Search query failed: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
09:57:26.739 [Thread-6973] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler - Failed to execute DataFetcher
java.lang.RuntimeException: Search query failed:
at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:245)
at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.getHighlights(AnalyticsService.java:216)
at com.linkedin.datahub.graphql.analytics.resolver.GetHighlightsResolver.getHighlights(GetHighlightsResolver.java:50)
at com.linkedin.datahub.graphql.analytics.resolver.GetHighlightsResolver.get(GetHighlightsResolver.java:29)
at com.linkedin.datahub.graphql.analytics.resolver.GetHighlightsResolver.get(GetHighlightsResolver.java:19)
at graphql.execution.ExecutionStrategy.fetchField(ExecutionStrategy.java:270)
at graphql.execution.ExecutionStrategy.resolveFieldWithInfo(ExecutionStrategy.java:203)
at graphql.execution.AsyncExecutionStrategy.execute(AsyncExecutionStrategy.java:60)
at graphql.execution.Execution.executeOperation(Execution.java:165)
at graphql.execution.Execution.execute(Execution.java:104)
at graphql.GraphQL.execute(GraphQL.java:557)
at graphql.GraphQL.parseValidateAndExecute(GraphQL.java:482)
at graphql.GraphQL.executeAsync(GraphQL.java:446)
at graphql.GraphQL.execute(GraphQL.java:377)
at com.linkedin.datahub.graphql.GraphQLEngine.execute(GraphQLEngine.java:88)
at com.datahub.metadata.graphql.GraphQLController.lambda$postGraphQL$0(GraphQLController.java:82)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:187)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1869)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1626)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:240)
... 17 common frames omitted
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [https://vpc-datahub-o67waaz2xr5zttbor35tgmlksa.us-east-1.es.amazonaws.com:443], URI [/datahub_datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"datahub_datahub_usage_event","node":"M5OibEC5ThKefEm2b1wR4Q","reason":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.","caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}},"status":400}
at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
... 21 common frames omitted
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:603)
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:179)
... 24 common frames omitted
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
... 28 common frames omitted
09:57:26.740 [Thread-6973] ERROR c.d.m.graphql.GraphQLController - Errors while executing graphQL query: "query getHighlights {\n getHighlights {\n value\n title\n body\n __typename\n }\n}\n", result: {errors=[{message=An unknown error occurred., locations=[{line=2, column=3}], path=[getHighlights], extensions={code=500, classification=DataFetchingException}}], data=null}, errors: [DataHubGraphQLError{path=[getHighlights], code=SERVER_ERROR, locations=[SourceLocation{line=2, column=3}]}]
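[Editor's note: the actionable part of this trace is the suppressed 400 response. The analytics query aggregates on browserId, which is mapped as a text field in the datahub_datahub_usage_event index, and Elasticsearch refuses to build fielddata for text fields. The usual remedies are recreating the index with the mapping DataHub ships (so browserId is a keyword) or, as a stopgap, aggregating on a keyword sub-field if the mapping defines one. A sketch of the request and mapping shapes involved; the aggregation name is illustrative:]

```python
import json

# The failing aggregation targets a text field (sketch of the request shape):
failing = {"aggs": {"users": {"cardinality": {"field": "browserId"}}}}

# If the index mapping defines a keyword sub-field (a common ES convention,
# assumed here, not guaranteed for this index), aggregating on it works:
working = {"aggs": {"users": {"cardinality": {"field": "browserId.keyword"}}}}

# A mapping that stores browserId as a keyword avoids the problem at the
# source; this is what a corrected index template could carry:
mapping = {"mappings": {"properties": {"browserId": {"type": "keyword"}}}}

print(json.dumps(working))
```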
How do I resolve the issue?

bland-orange-13353
09/14/2021, 3:48 PM

better-orange-49102
09/15/2021, 6:50 AM

millions-soccer-98440
09/15/2021, 8:56 AM
from datahub.ingestion.run.pipeline import Pipeline

def ingest_metadata(**kwargs):
    """
    :param ingest_param: source & sink datahub param
    :type ingest_param: json/struct
    """
    ingest_param = kwargs.get('ingest_param')
    pipeline = Pipeline.create(ingest_param)
    pipeline.run()
    pipeline.raise_from_status()

kafka_connect = {
    "source": {
        "type": "kafka-connect",
        "config": {
            "connect_uri": "http://127.0.0.1:8083",
            "cluster_name": "ts-connect",
        },
    },
    "sink": {
        "type": "datahub-kafka",
        "config": {
            "connection": {
                "bootstrap": "127.0.0.1:19092",
                "schema_registry_url": "http://127.0.0.1:17081"
            }
        },
    },
}

ingest_metadata(ingest_param=kafka_connect)
I get this error after running the code:
Skipping connector saleordering-postcodes. Sink Connector not yet implemented
Skipping connector thestreet-image-receipts. Sink Connector not yet implemented
Traceback (most recent call last):
File "kafkaconnect.py", line 36, in <module>
ingest_metadata(ingest_param=kafka_connect)
File "kafkaconnect.py", line 14, in ingest_metadata
pipeline.run()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/datahub/ingestion/run/pipeline.py", line 108, in run
for wu in self.source.get_workunits():
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/datahub/ingestion/source/kafka_connect.py", line 468, in get_workunits
connectors_manifest = self.get_connectors_manifest()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/datahub/ingestion/source/kafka_connect.py", line 308, in get_connectors_manifest
connector_manifest.topic_names = topics[c]["topics"]
KeyError: 'sales-ordering-prod-v5'
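[Editor's note: the traceback shows the source assuming every connector returned by the Connect REST API also appears in its topics map; the connector sales-ordering-prod-v5 is missing there (perhaps paused or without active tasks), so topics[c]["topics"] raises KeyError. A sketch of a tolerant lookup as a local workaround; the variable shapes are assumed from the traceback, not copied from DataHub's source:]

```python
# Sketch of a defensive version of the failing lookup in kafka_connect.py.

topics = {
    "working-connector": {"topics": ["orders-v1"]},
    # "sales-ordering-prod-v5" is absent, e.g. a paused connector
}

def topic_names_for(connector_name, topics):
    # .get() with a default avoids the KeyError and simply yields no topics
    # for connectors the Connect API did not report topic state for.
    return topics.get(connector_name, {}).get("topics", [])

print(topic_names_for("working-connector", topics))       # ['orders-v1']
print(topic_names_for("sales-ordering-prod-v5", topics))  # []
```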
cool-state-20157
09/15/2021, 5:22 PM
ERROR: for datahub-frontend-react Cannot start service datahub-frontend-react: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "datahub-frontend/bin/playBinary": stat datahub-frontend/bin/playBinary: no such file or directory: unknown
ERROR: for datahub-frontend-react Cannot start service datahub-frontend-react: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "datahub-frontend/bin/playBinary": stat datahub-frontend/bin/playBinary: no such file or directory: unknown
ERROR: Encountered errors while bringing up the project.
hundreds-twilight-96303
09/15/2021, 7:18 PM
[root@QgY85nPtI2 ~]# python3 -m datahub check plugins
Sources:
athena
azure-ad
bigquery (disabled)
bigquery-usage (disabled)
datahub-business-glossary
dbt
druid (disabled)
feast
file
glue (disabled)
hive (disabled)
kafka (disabled)
kafka-connect
ldap (disabled)
looker (disabled)
lookml (disabled)
mongodb (disabled)
mssql (disabled)
mysql
okta (disabled)
oracle (disabled)
postgres (disabled)
redash (disabled)
redshift (disabled)
sagemaker (disabled)
snowflake (disabled)
snowflake-usage (disabled)
sqlalchemy
superset
Sinks:
console
datahub-kafka (disabled)
datahub-rest
file
Transformers:
add_dataset_ownership
add_dataset_tags
mark_dataset_status
pattern_add_dataset_ownership
set_dataset_browse_path
simple_add_dataset_ownership
simple_add_dataset_tags
simple_remove_dataset_ownership
And when I try to ingest data with the sql-profile option enabled, I encounter an error telling me: Table profiles requested but profiler plugin is not enabled. Try running: pip install 'acryl-datahub[sql-profiles]'
File "/usr/local/python3/lib/python3.6/site-packages/datahub/entrypoints.py", line 91, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 1062, in main
rv = self.invoke(ctx)
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 1668, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/python3/lib/python3.6/site-packages/click/core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "/usr/local/python3/lib/python3.6/site-packages/datahub/cli/ingest_cli.py", line 52, in run
pipeline = Pipeline.create(pipeline_config)
File "/usr/local/python3/lib/python3.6/site-packages/datahub/ingestion/run/pipeline.py", line 103, in create
return cls(config)
File "/usr/local/python3/lib/python3.6/site-packages/datahub/ingestion/run/pipeline.py", line 72, in __init__
self.config.source.dict().get("config", {}), self.ctx
File "/usr/local/python3/lib/python3.6/site-packages/datahub/ingestion/source/sql/mysql.py", line 23, in create
return cls(config, ctx)
File "/usr/local/python3/lib/python3.6/site-packages/datahub/ingestion/source/sql/mysql.py", line 18, in __init__
super().__init__(config, ctx, "mysql")
File "/usr/local/python3/lib/python3.6/site-packages/datahub/ingestion/source/sql/sql_common.py", line 278, in __init__
    "Table profiles requested but profiler plugin is not enabled. "
ConfigurationError: Table profiles requested but profiler plugin is not enabled. Try running: pip install 'acryl-datahub[sql-profiles]'
Could someone help me out? Many thanks in advance.

square-activity-64562
09/16/2021, 10:29 AM

adamant-pharmacist-61996
09/17/2021, 4:17 AM

handsome-belgium-11927
09/20/2021, 4:01 PM
/browse/chart/tableau. I've got 2 charts there and I can find them via search, but not through browsing. Where can I look for this error description? I've tried searching the docker logs but no luck yet.

millions-soccer-98440
09/20/2021, 5:21 PM

millions-soccer-98440
09/20/2021, 5:53 PM
{
    "source": {
        "type": "postgres",
        "config": {
            "username": login,
            "password": password,
            "database": "user_activity",
            "host_port": host,
            "schema_pattern": {
                "deny": ["information_schema"]
            }
        },
    },
    "sink": {
        "type": "datahub-kafka",
        "config": {
            "connection": {
                "bootstrap": "prerequisites-kafka.datahub:9092",
                "schema_registry_url": "http://prerequisites-cp-schema-registry.datahub:8081"
            }
        },
    },
}
straight-dentist-7439
09/21/2021, 7:50 AM
KeyError: 'Did not find a registered class for datahub-business-glossary'
Any ideas?

handsome-belgium-11927
09/22/2021, 3:26 PM

proud-jelly-46237
09/22/2021, 8:29 PM
Failed to pull image "acryldata/datahub-mysql-setup:v0.8.14": rpc error: code = Unknown desc = Error response from daemon: manifest for acryldata/datahub-mysql-setup:v0.8.14 not found: manifest unknown: manifest unknown
careful-artist-3840
09/23/2021, 12:10 AM
2021-09-22T20:07:53-04:00 00:07:53 [application-akka.actor.default-dispatcher-197] WARN auth.sso.oidc.OidcCallbackLogic - Failed to extract groups: No OIDC claim with name groups found
What would cause this error?

adamant-van-40260
09/23/2021, 10:43 AM

colossal-furniture-76714
09/23/2021, 3:31 PM

colossal-furniture-76714
09/23/2021, 3:57 PM

rough-garage-43684
09/24/2021, 8:44 AM
mutation updateDataset($input: DatasetUpdateInput!) {
It runs successfully in the datahub-frontend-react GraphiQL (localhost:9002/api/graphiql)
but fails in the metadata-service GraphiQL (localhost:8080/api/graphiql),
with this log in the datahub-gms backend. Am I missing something?
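[Editor's note: a likely explanation, offered as an assumption rather than something the log confirms: the frontend's GraphiQL runs with the browser session's authentication, while calls against GMS on port 8080 hit a separate endpoint that, depending on the DataHub version, may require an explicit actor or authorization header. The mutation body itself is the same either way; a sketch of the JSON payload such a mutation posts, with the selection set left elided as in the original message and the variables purely hypothetical:]

```python
import json

# Sketch of the GraphQL request body for the updateDataset mutation above.
# The selection set ("...") and the urn value are placeholders, not real data.
payload = {
    "query": "mutation updateDataset($input: DatasetUpdateInput!) { ... }",
    "variables": {
        "input": {
            # Hypothetical example urn; the actual DatasetUpdateInput fields
            # depend on the deployed schema version.
            "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,example.table,PROD)",
        }
    },
}

print(json.dumps(payload)[:60])
```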