kind-psychiatrist-76973
04/14/2022, 10:34 AM
08:48:31.973 [pool-17-thread-1] ERROR c.d.m.a.AuthorizationManager - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
kind-psychiatrist-76973
04/14/2022, 10:50 AM
[application-akka.actor.default-dispatcher-184316] WARN auth.sso.oidc.OidcCallbackLogic - Failed to extract groups: No OIDC claim with name groups found
13:41:09 [application-akka.actor.default-dispatcher-184316] ERROR auth.sso.oidc.OidcCallbackLogic - Failed to perform post authentication steps. Redirecting to error page.
java.lang.RuntimeException: Failed to provision user with urn urn:li:corpuser:robert.last-name.
Caused by: com.linkedin.r2.message.rest.RestException: Received error 500 from server for URI <http://datahub-datahub-gms:8080/entities/urn:li:corpuser:robert.last-name>
at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:76)
quick-pizza-8906
04/14/2022, 2:19 PM
swift-breakfast-25077
04/14/2022, 8:25 PM
I ran: datahub docker quickstart --quickstart-compose-file docker-compose-without-neo4j.quickstart.yml
However, when I go to http://localhost:9002/callback/oidc I get the message "Failed to perform SSO callback. SSO is not enabled for protocol: oidc". Any ideas?
PS: Configurations added in docker-compose-without-neo4j.quickstart.yml:
AUTH_OIDC_ENABLED=true
AUTH_OIDC_CLIENT_ID= "myclientid"
AUTH_OIDC_CLIENT_SECRET= "myclientsecret"
AUTH_OIDC_DISCOVERY_URI=<https://accounts.google.com/.well-known/openid-configuration>
AUTH_OIDC_BASE_URL=<http://localhost:9002>
AUTH_OIDC_SCOPE="openid profile email"
AUTH_OIDC_USER_NAME_CLAIM=email
AUTH_OIDC_USER_NAME_CLAIM_REGEX=([^@]+)
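One thing worth checking, given that exact error: in the quickstart compose file these variables only take effect if they are set in the environment of the datahub-frontend-react service, and with list-style environment entries there must be no spaces or quotes around the values, or the frontend reads them verbatim and starts with SSO disabled. A minimal sketch of how that block might look (the client id and secret are placeholders; adapt to your file):

datahub-frontend-react:
  environment:
    # keep the existing entries; append the OIDC settings without spaces or quotes
    - AUTH_OIDC_ENABLED=true
    - AUTH_OIDC_CLIENT_ID=myclientid
    - AUTH_OIDC_CLIENT_SECRET=myclientsecret
    - AUTH_OIDC_DISCOVERY_URI=https://accounts.google.com/.well-known/openid-configuration
    - AUTH_OIDC_BASE_URL=http://localhost:9002
    - AUTH_OIDC_SCOPE=openid profile email
    - AUTH_OIDC_USER_NAME_CLAIM=email
    - AUTH_OIDC_USER_NAME_CLAIM_REGEX=([^@]+)

After editing, the frontend container needs to be recreated for the new environment to apply.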
tall-fall-45442
04/15/2022, 1:48 AM
datahub docker check shows that there are no issues detected.
Here is the specification that I'm using for the MongoDB ingestion source:
source:
  type: mongodb
  config:
    connect_uri: 'mongodb://localhost'
    username: '${MONGO-DB-USERNAME}'
    password: '${MONGO-DB-PASSWORD}'
    enableSchemaInference: true
    useRandomSampling: true
    maxSchemaSize: 300
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
But I am getting an error about a refused connection.
[2022-04-15 01:37:45,418] INFO {datahub.cli.ingest_cli:88} - DataHub CLI version: 0.8.32.1
[2022-04-15 01:37:45,423] WARNING {urllib3.connectionpool:810} - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86471ada00>: Failed to establish a new connection: [Errno 111] Connection refused')': /config
[2022-04-15 01:37:49,424] WARNING {urllib3.connectionpool:810} - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86471add00>: Failed to establish a new connection: [Errno 111] Connection refused')': /config
[2022-04-15 01:37:57,411] WARNING {urllib3.connectionpool:810} - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f86471ad520>: Failed to establish a new connection: [Errno 111] Connection refused')': /config
[2022-04-15 01:37:57,658] ERROR {datahub.entrypoints:152} - File "/tmp/datahub/ingest/venv-1f3b70e4-8933-4643-ad3e-d9279e37c6cd/lib/python3.9/site-packages/urllib3/connection.py", line 174, in
better-orange-49102
04/15/2022, 7:18 AM
query {
  autoCompleteForMultiple(input: {
    types: CONTAINER
    query: "long"
  }) {
    query
    suggestions {
      type
      suggestions
      entities {
        urn
        type
      }
    }
  }
}
Expected to see a suggestion for the "long_tail_companions" container, but it just returns error 500.
handsome-football-66174
04/15/2022, 8:30 PM
damp-ambulance-34232
04/16/2022, 2:31 AM
brave-insurance-80044
04/18/2022, 8:45 AM
./docker/dev-without-neo4j.sh
Error response from daemon: manifest for linkedin/datahub-elasticsearch-setup:debug not found: manifest unknown: manifest unknown
Error response from daemon: manifest for linkedin/datahub-kafka-setup:debug not found: manifest unknown: manifest unknown
Error response from daemon: manifest for linkedin/datahub-frontend-react:debug not found: manifest unknown: manifest unknown
Seems like the corresponding Docker images with the debug tag are missing on Docker Hub. Could anyone help?
eager-oxygen-76249
04/18/2022, 10:03 AM
datahub docker quickstart
Unable to run quickstart - the following issues were detected:
- datahub-gms is running but not healthy
orange-coat-2879
04/19/2022, 12:27 AM
When I use table_pattern.allow to ingest a specific MSSQL table, only the database and schema are ingested, not the table. But when I remove table_pattern.allow, DataHub successfully ingests all tables, including the specific one. Is it a bug, or did I miss something? I am sure the table name is correct. Thanks!
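For what it's worth, in DataHub's SQL-based sources the table_pattern regexes are matched against the fully qualified table name (for MSSQL, database.schema.table), so an allow entry containing only the bare table name matches nothing. A sketch with hypothetical names (mydb, dbo, customers):

source:
  type: mssql
  config:
    # ...connection settings...
    table_pattern:
      allow:
        # match the fully qualified name, not just the table name
        - 'mydb\.dbo\.customers'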
red-napkin-59945
04/19/2022, 4:07 AM
When I search with *, the UI shows a blank page:
[Thread-54989] INFO c.l.m.s.e.q.r.AutocompleteRequestHandler:127 - No highlighted field for query *, hit
microscopic-mechanic-13766
04/19/2022, 7:41 AM
07:32:34.075 [ForkJoinPool.commonPool-worker-0] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/*index_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.080 [ForkJoinPool.commonPool-worker-1] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/*index_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.092 [ForkJoinPool.commonPool-worker-4] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.103 [ForkJoinPool.commonPool-worker-2] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/*index_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.104 [ForkJoinPool.commonPool-worker-0] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/*index_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.106 [ForkJoinPool.commonPool-worker-5] WARN org.elasticsearch.client.RestClient:65 - request [POST <http://elasticcluster_master1-elastic:9200/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-8.0.0-1b6a7ece17463df5ff54a3e1302d825889aa1161 "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
07:32:34.115 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
java.io.IOException: Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=<http://elasticcluster_master1-elastic:9200>, response=HTTP/1.1 200 OK}
at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1764)
at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onSuccess(RestClient.java:609)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:352)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:346)
at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException: null
at java.util.Objects.requireNonNull(Objects.java:203)
at org.elasticsearch.action.DocWriteResponse.<init>(DocWriteResponse.java:127)
at org.elasticsearch.action.index.IndexResponse.<init>(IndexResponse.java:54)
at org.elasticsearch.action.index.IndexResponse.<init>(IndexResponse.java:39)
at org.elasticsearch.action.index.IndexResponse$Builder.build(IndexResponse.java:107)
at org.elasticsearch.action.index.IndexResponse$Builder.build(IndexResponse.java:104)
at org.elasticsearch.action.bulk.BulkItemResponse.fromXContent(BulkItemResponse.java:159)
at org.elasticsearch.action.bulk.BulkResponse.fromXContent(BulkResponse.java:196)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
at org.elasticsearch.client.RestHighLevelClient.lambda$performRequestAsyncAndParseEntity$10(RestHighLevelClient.java:1680)
at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1762)
... 18 common frames omitted
I am using Elasticsearch 8.0.0. When I had Elasticsearch 7.9.3 (as in the quickstart docker-compose) I didn't get these messages.
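For context: at this point DataHub's GMS bundles the Elasticsearch 7.x high-level REST client, and a NullPointerException while parsing a bulk response is consistent with that client talking to an 8.0 server. A hedged workaround sketch is to pin the cluster to the version the quickstart ships (values below roughly mirror the quickstart compose; adjust heap sizing to taste):

elasticsearch:
  image: elasticsearch:7.9.3  # the version the quickstart compose uses
  environment:
    - discovery.type=single-node
    - ES_JAVA_OPTS=-Xms256m -Xmx256m  # heap sizing; adjust as needed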
brainy-wall-41694
04/19/2022, 11:17 AM
gentle-father-80172
04/19/2022, 7:48 PM
salmon-rose-54694
04/20/2022, 2:37 AM
fast-ability-23281
04/20/2022, 2:46 AM
NAME READY STATUS RESTARTS AGE
datahub-acryl-datahub-actions-74c674fb9d-rm5rq 1/1 Running 0 6h17m
datahub-datahub-frontend-bd5c8677c-84nrf 1/1 Running 0 6h17m
datahub-datahub-gms-54f994fdf5-vgjzc 1/1 Running 0 6h17m
datahub-datahub-upgrade-job-rsln9 0/1 Error 0 6h17m
datahub-datahub-upgrade-job-zx2sk 0/1 Completed 0 6h16m
datahub-elasticsearch-setup-job-t59k2 0/1 Completed 0 6h18m
datahub-kafka-setup-job-zxsjc 0/1 Completed 0 6h18m
datahub-mysql-setup-job-7glgk 0/1 Completed 0 6h17m
elasticsearch-master-0 1/1 Running 0 6h21m
elasticsearch-master-1 1/1 Running 0 6h21m
elasticsearch-master-2 1/1 Running 0 6h21m
prerequisites-cp-schema-registry-cf79bfccf-mx25m 2/2 Running 0 6h21m
prerequisites-kafka-0 1/1 Running 1 6h21m
prerequisites-mysql-0 1/1 Running 0 6h21m
prerequisites-neo4j-community-0 1/1 Running 0 6h21m
prerequisites-zookeeper-0 1/1 Running 0 6h21m
ubuntu 1/1 Running 0 71m
square-solstice-69079
04/20/2022, 7:20 AM
brave-forest-5974
04/20/2022, 12:41 PM
kind-psychiatrist-76973
04/20/2022, 2:46 PM
11:52:28 [application-akka.actor.default-dispatcher-10670] WARN o.p.o.profile.creator.TokenValidator - Preferred JWS algorithm: null not available. Using all metadata algorithms: [RS256]
11:52:29 [application-akka.actor.default-dispatcher-10670] ERROR auth.sso.oidc.OidcCallbackLogic - Unable to renew the session. The session store may not support this feature
I have configured SSO with Google.
ripe-apple-36185
04/20/2022, 6:45 PM
When I place custom models under ~/.datahub/plugins/models, the metadata service tries to load the '.DS_Store' file as a model. Am I doing something wrong? This is what I see when I query the config endpoint:
lemon-terabyte-66903
04/20/2022, 7:03 PM
important-wire-73
04/21/2022, 5:03 AM
MetadataChangeProposalWrapper(entityType='corpGroup', changeType='UPSERT', entityUrn='urn:li:corpGroup:Data Platform--001', entityKeyAspect=None, auditHeader=None, aspectName='corpGroupInfo', aspect=CorpGroupInfoClass({'displayName': 'Data Platform', 'email': None, 'admins': ['urn:li:corpuser:aa.bb', 'urn:li:corpuser:aa.cc', 'urn:li:corpuser:aab', 'urn:li:corpuser:apal'], 'members': ['urn:li:corpuser:aa.bb', 'urn:li:corpuser:aa.cc', 'urn:li:corpuser:aab', 'urn:li:corpuser:apal'], 'groups': [], 'description': ' '}), systemMetadata=None)
The group is created and available in the UI, but the members are not added. However, when I add the same member via the UI, it works fine. Any suggestions?
salmon-area-51650
04/21/2022, 6:35 AM
❯ datahub ingest rollback --run-id snowflake-2022_03_19-01_00_27
This will permanently delete data from DataHub. Do you want to continue? [y/N]: y
Failed to execute operation
java.lang.UnsupportedOperationException: Failed to find Typeref schema associated with Config-based Entity
Any idea?
creamy-van-28626
04/21/2022, 7:25 AM
witty-butcher-82399
04/21/2022, 9:59 AM
We are running 0.8.33 and have found this exception quite recurrent across different connectors:
[2022-04-21 09:40:45,039] ERROR {datahub.ingestion.run.pipeline:210} - Failed to extract some records due to: 'NoneType' object has no attribute 'group'
Any idea what it could be?
microscopic-mechanic-13766
04/21/2022, 10:47 AM
[2022-04-21 10:41:04,600] INFO {datahub.cli.ingest_cli:86} - Starting metadata ingestion
[2022-04-21 10:41:05,303] ERROR {datahub.entrypoints:119} - File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/datahub/entrypoints.py", line 105, in main
 102 def main(**kwargs):
 103 # This wrapper prevents click from suppressing errors.
 104 try:
--> 105 sys.exit(datahub(standalone_mode=False, **kwargs))
 106 except click.exceptions.Abort:
 ..................................................
 kwargs = {}
 datahub = <Group datahub>
 click.exceptions.Abort = <class 'click.exceptions.Abort'>
 ..................................................

File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
 1128 def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:
 (...)
--> 1130 return self.main(*args, **kwargs)
 ..................................................
 self = <Group datahub>
 args = ()
 t.Any = typing.Any
 kwargs = {'standalone_mode': False, 'prog_name': 'python3 -m datahub'}
 ..................................................

File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 1055, in main
 rv = self.invoke(ctx)
File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
 return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
 return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
 return ctx.invoke(self.callback, **ctx.params)
File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/click/core.py", line 760, in invoke
 return __callback(*args, **kwargs)
File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 194, in wrapper
 181 def wrapper(*args: Any, **kwargs: Any) -> Any:
 (...)
 190 return res
 191 # Catch general exceptions
 192 except Exception as e:
 193 telemetry_instance.ping(category, action, f"error:{get_full_class_name(e)}")
--> 194 raise e
 195 # System exits (used in ingestion and Docker commands) are not caught by the exception handler,
 ..................................................
 args = ()
 Any = typing.Any
 kwargs = {'config': '/tmp/datahub/ingest/36f9165c-d27e-44aa-b49a-b08a77157764.yml', 'dry_run': False, 'preview': False, 'strict_warnings': False}
 telemetry_instance.ping = <method 'Telemetry.ping' of <datahub.telemetry.telemetry.Telemetry object at 0x7faf304431c0> telemetry.py:110>
 category = 'datahub.cli.ingest_cli'
 action = 'run'
 ..................................................

File "/tmp/datahub/ingest/venv-36f9165c-d27e-44aa-b49a-b08a77157764/lib/python3.9/site-packages/datahub/telemetry/telemetry.py", line 188, in wrapper
 181 def wrapper(*args: Any, **kwargs: Any) -> Any:
 (...)
 184 action = func.__name__
 185
 186 telemetry_instance.ping(category, action, "started")
 187 try:
--> 188 res = func(*args, **kwargs)
 189 telemetry_instance.ping(category, action, "completed")
 ..................................................
 args = ()
 Any = typing.Any
 kwargs = {'config': '/tmp/datahub/ingest/36f9165c-d27e-44aa-b49a-b08a77157764.yml', 'dry_run': False, 'preview': False, 'strict_warnings': False}
 action = 'run'
 func.__name__ = 'run'
 telemetry_instance.ping = <method 'Telemetry.ping' of <datahub.telemetry.telemetry.Telemetry object at 0x7faf304431c0>
Could someone help me understand what the problem is? Thanks in advance!
busy-waiter-6669
04/21/2022, 11:06 AM
red-window-75368
04/21/2022, 11:45 AM
I am running two recipes, each with the same domain section:
domain:
  'urn:li:domain:xxx':
    allow:
      - '.*'
I only change the xxx in each of the recipes. All goes well when running the first recipe, but after running the second one there seems to be some kind of problem: the second domain shows 0 entities in the Domains tab (the first domain shows all the existing entities), yet on the front page, under Domains, ONLY the second domain is shown, with all existing entities (the first domain disappears from the front page). Could it be the "allow" segment of the recipe? I thought allowing everything only applied to the data coming from the recipe's source.
square-solstice-69079
04/21/2022, 1:04 PM