melodic-dress-7431
01/03/2023, 4:41 PM
hallowed-spring-18709
01/03/2023, 5:22 PM
miniature-librarian-48611
01/03/2023, 9:22 PM
damp-greece-27806
01/03/2023, 10:00 PM
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
ERROR LoggingFailureAnalysisReporter
***************************
APPLICATION FAILED TO START
***************************
Description:
Field kafkaHealthChecker in com.linkedin.gms.factory.kafka.DataHubKafkaEventProducerFactory required a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' that could not be found.
The injection point has the following annotations:
- @javax.inject.Inject()
- @javax.inject.Named(value="noCodeUpgrade")
Action:
Consider defining a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' in your configuration.
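For anyone hitting the same failure: the "Action" above is Spring's generic hint. A minimal sketch of what defining such a bean looks like is below; it assumes com.linkedin.metadata.dao.producer.KafkaHealthChecker has a no-argument constructor and that the configuration class sits on GMS's component-scan path, which may not match the real class, so treat it as illustration only. In practice this error often means the component that normally supplies the bean simply was not picked up (for example a version mismatch between GMS and its factory classes), which is worth ruling out before adding configuration by hand.

import com.linkedin.metadata.dao.producer.KafkaHealthChecker;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class KafkaHealthCheckerConfig {
    // Supplies the bean that the injection point in DataHubKafkaEventProducerFactory asks for.
    // Assumes a no-arg constructor; adjust if the real class needs dependencies.
    @Bean
    public KafkaHealthChecker kafkaHealthChecker() {
        return new KafkaHealthChecker();
    }
}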
lemon-lock-92370
01/04/2023, 8:45 AM
--hard option. When I check the MySQL database (we are using AWS RDS), all rows have been deleted.
But! In the UI, I can still see Glossary 1.8K, and when I click it, there's no glossary data.
The same thing happens when I search for a word.. 😢
How can I clean up all this metadata? (I couldn't try datahub docker nuke, since we are using k8s.)
Please help.. 🙏 Thanks in advance 🙇
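On the clean-up question: the counts still shown in the UI come from the search indices in Elasticsearch, which are not touched when rows are removed from MySQL. A hedged sketch of inspecting (and, if you accept losing that index data, removing) the stale indices directly; the host is a placeholder and the exact DataHub index names depend on your deployment, so list them before deleting anything:

# List all indices to see which ones still hold DataHub documents
curl -s 'http://<elasticsearch-host>:9200/_cat/indices?v'
# Delete one stale index at a time (destructive)
curl -X DELETE 'http://<elasticsearch-host>:9200/<index-name>'

After removing indices, the mappings need to be recreated (in the helm deployment that is what the elasticsearch-setup job does) before re-ingesting.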
late-bear-87552
01/04/2023, 9:09 AM
09:07:31.964 [qtp447981768-3254] INFO c.l.m.r.entity.AspectResource:143 - INGEST PROPOSAL proposal: {aspectName=dataProcessInstanceRunEvent, entityUrn=urn:li:dataProcessInstance:7fb643f7ece7111076633e3681a20133, entityType=dataProcessInstance, aspect={contentType=application/json, value=ByteString(length=146,bytes=7b227469...3a20327d)}, changeType=UPSERT}
09:07:31.967 [pool-14-thread-1] INFO c.l.m.filter.RestliLoggingFilter:55 - POST /aspects?action=ingestProposal - ingestProposal - 200 - 3ms
09:07:31.983 [qtp447981768-19] INFO c.l.m.r.entity.AspectResource:143 - INGEST PROPOSAL proposal: {aspectName=dataProcessInstanceRunEvent, entityUrn=urn:li:dataProcessInstance:7fb643f7ece7111076633e3681a20133, entityType=dataProcessInstance, aspect={contentType=application/json, value=ByteString(length=195,bytes=7b227469...77227d7d)}, changeType=UPSERT}
09:07:31.986 [pool-14-thread-1] INFO c.l.m.filter.RestliLoggingFilter:55 - POST /aspects?action=ingestProposal - ingestProposal - 200 - 3ms
09:07:32.945 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
java.io.IOException: Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=http://****:9200, response=HTTP/1.1 200 OK}
at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1764)
at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onSuccess(RestClient.java:609)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:352)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:346)
at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException: null
at java.base/java.util.Objects.requireNonNull(Objects.java:221)
at org.elasticsearch.action.DocWriteResponse.<init>(DocWriteResponse.java:127)
at org.elasticsearch.action.update.UpdateResponse.<init>(UpdateResponse.java:65)
at org.elasticsearch.action.update.UpdateResponse$Builder.build(UpdateResponse.java:172)
at org.elasticsearch.action.update.UpdateResponse$Builder.build(UpdateResponse.java:160)
at org.elasticsearch.action.bulk.BulkItemResponse.fromXContent(BulkItemResponse.java:159)
at org.elasticsearch.action.bulk.BulkResponse.fromXContent(BulkResponse.java:196)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
at org.elasticsearch.client.RestHighLevelClient.lambda$performRequestAsyncAndParseEntity$10(RestHighLevelClient.java:1680)
at org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1762)
... 18 common frames omitted
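A NullPointerException inside DocWriteResponse while parsing an otherwise successful bulk response is the kind of failure the 7.x REST high-level client produces when the search backend's bulk responses omit fields the client requires (newer OpenSearch releases dropping _type is one known trigger). That is only a possible cause; a quick way to confirm what GMS is actually talking to is to ask the cluster for its version (the host is a placeholder for the masked host in the log above):

curl -s 'http://<elasticsearch-host>:9200'
# The response includes the cluster name, distribution, and version number.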
best-midnight-2857
01/04/2023, 1:17 PM
failed authentication due to: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: Exception while evaluating challenge [Caused by javax.security.auth.callback.UnsupportedCallbackException: Unrecognized SASL ClientCallback]) occurred when evaluating SASL token received from the Kafka Broker. Kafka Client will go to AUTHENTICATION_FAILED state.
I configured the helm chart via springKafkaConfigurationOverrides for AWS MSK and I don't see similar errors in the backend logs - everything related to MSK works fine for the backend. So I assume the configuration itself is fine.
The environment variables in the frontend pod/container also seem to be fine. All SASL/MSK related settings that we configured get routed to the envs correctly.
However, in the frontend logs I also find
sasl.client.callback.handler.class = null
in the part where the Kafka producer config is logged. All other SASL/MSK related settings are set correctly in the producer config, though.
So it seems that the application is not correctly reading / passing to the producer what is stated in the environment variable KAFKA_PROPERTIES_SASL_CLIENT_CALLBACK_HANDLER_CLASS=software.amazon.msk.auth.iam.IAMClientCallbackHandler.
The only interesting thing I noticed compared to the backend is that the relevant Kafka environment variables there are prefixed with SPRING_, whereas the envs in the frontend start with KAFKA_. Not sure whether that's an issue / related here.
Both frontend and backend are on v0.9.5, helm-chart version is 0.2.119
Thanks in advance for any pointers!
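For comparison while debugging: the client properties AWS documents for MSK IAM auth, written as the KAFKA_PROPERTIES_* variables the frontend is given. This assumes the frontend maps each KAFKA_PROPERTIES_FOO_BAR variable onto the foo.bar producer property, which is exactly the mapping that appears to be skipped for the callback handler here:

KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_SSL
KAFKA_PROPERTIES_SASL_MECHANISM=AWS_MSK_IAM
KAFKA_PROPERTIES_SASL_JAAS_CONFIG=software.amazon.msk.auth.iam.IAMLoginModule required;
KAFKA_PROPERTIES_SASL_CLIENT_CALLBACK_HANDLER_CLASS=software.amazon.msk.auth.iam.IAMClientCallbackHandler

If everything except sasl.client.callback.handler.class shows up in the logged producer config, that points at the frontend's env-to-property handling for that one key rather than at the MSK setup itself.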
icy-smartphone-27162
01/04/2023, 3:15 PM
handsome-football-66174
01/04/2023, 5:24 PM
bright-egg-51769
01/04/2023, 6:14 PM
gorgeous-apartment-73441
01/05/2023, 11:59 AM
rich-policeman-92383
01/05/2023, 1:13 PM
KafkaMessageListenerContainer$ListenerConsumer Consumer exception
java.lang.IllegalStateException: This error handler cannot process 'org.apache.kafka.common.errors.SaslAuthenticationException's; no record information is available
at org.springframework.kafka.listener.SeekUtils.seekOrRecover(SeekUtils.java:200)
at org.springframework.kafka.listener.SeekToCurrentErrorHandler.handle(SeekToCurrentErrorHandler.java:112)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.handleConsumerException(KafkaMessageListenerContainer.java:1604)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1212)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.common.errors.SaslAuthenticationException: Authentication failed during authentication due to invalid credentials with SASL mechanism GSSAPI
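The root cause here is the SaslAuthenticationException, not the SeekToCurrentErrorHandler complaint that wraps it. A hedged checklist of the standard Kafka client properties involved in GSSAPI auth, with placeholder values to compare against your own consumer configuration:

security.protocol=SASL_PLAINTEXT        # or SASL_SSL, depending on the cluster
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka        # must match the broker's service principal
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required \
  useKeyTab=true \
  keyTab="/path/to/client.keytab" \
  principal="client@EXAMPLE.COM";

An expired or mismatched keytab/principal is a common reason for "invalid credentials" with GSSAPI.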
hallowed-dog-79615
01/05/2023, 1:28 PM
generate_database_name, so the name is dynamically generated according to some env parameters. I have used it with my models with no problem. Basically, in my models config I specify some key in the database field of the YAML, and that key is translated by the generate_database_name macro. I don't have to call the macro explicitly, as it is called by default by dbt (i.e. if I specify 'shop' in the database field of a model, my macro automatically translates it to 'shop_pre' or 'shop_prod' depending on the environment).
This is not happening with sources. If I specify 'shop' as the database of a source, then the connector fails to find the 'shop' database in my warehouse. This is expected, as 'shop' does not exist; it is just the key I use to generate the real database names (in the example, the real databases are again shop_pre and shop_prod). From this, I infer that sources are not calling the generate_database_name macro by default, but are just taking the value from the corresponding YAML field, or calling a different macro. Is there a macro I can tweak to change this? I know that I can put simple Jinja in the YAML, so I should be able to add a simple 'shop_pre' if target.name == 'pre' else 'shop_prod', but I think it's less clean than having a macro, and it is also less maintainable in the long term. Also, I cannot call custom macros inside the YAML, as I get a "not defined" error.
So, basically, can I change some dbt default macro that generates the database name for sources? Maybe the source macro itself? If so, where can I find its original code, just to be sure I don't break it?
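On the sources part: a sketch of the inline-Jinja approach mentioned above. Source YAML values are rendered with the standard context (target, env_var), while project-defined macros are not available there, which matches the "not defined" error. The file path and table name are placeholders; the database names are the ones from the example:

# models/staging/shop/_shop__sources.yml (hypothetical path)
version: 2

sources:
  - name: shop
    # Rendered at parse time; target.name is available, custom macros are not.
    database: "{{ 'shop_pre' if target.name == 'pre' else 'shop_prod' }}"
    tables:
      - name: orders   # placeholder

It keeps the environment switch in one place per source, though it does duplicate the logic that generate_database_name already encodes for models.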
handsome-football-66174
01/05/2023, 2:27 PM
blue-crowd-84759
01/05/2023, 3:15 PM
Unable to run quickstart - the following issues were detected:
- kafka-setup is still running
- schema-registry is not running
- broker is not running
- datahub-gms is still starting
- zookeeper is not running
I used datahub docker quickstart --version=v0.9.5 --quickstart-compose-file docker-compose.yml, but the usual datahub docker quickstart also fails, though with this message:
Unable to run quickstart - the following issues were detected:
- datahub-gms is still starting
- mysql-setup is still running
- mysql is not running
I can attach the respective log files, but I'm wondering if this already tells someone something.
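When quickstart containers end up half-started like this, a common first step is to wipe the quickstart state and retry before digging into individual container logs. Note that nuke is destructive and removes the local quickstart containers and their data:

datahub docker nuke
datahub docker quickstart --version=v0.9.5

If the same services fail again on a clean start, the container logs (especially datahub-gms and the broker) are the next place to look.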
average-dinner-25106
01/05/2023, 4:42 AM
bland-lighter-26751
01/05/2023, 7:16 PM
rhythmic-stone-77840
01/05/2023, 8:20 PM
system, system.user, system.user.name
where system can be seen as the parent term of the other two, and system.user is the parent of the last one.
I was going to make system and system.user into Term Groups - but it doesn't look like you can assign a term group as an actual Term. Is there something I'm missing here, or a better way to model this in the glossary (other than making everything flat)?
abundant-airport-72599
01/05/2023, 8:43 PM
UPGRADE_DEFAULT_BROWSE_PATHS_ENABLED=true and restarting the GMS pod, but the upgrade job just seems to silently fail. Some details added in 🧵
microscopic-machine-90437
01/06/2023, 5:33 AM
fierce-electrician-85924
01/06/2023, 12:13 PM
isPartitioningKey info for a particular field using graphql?
modern-answer-65441
01/06/2023, 4:44 PM
import { AnalyticsBrowser } from '@segment/analytics-next';
const isEnabled = true;
const key: string = process.env.SEGMENT_KEY as string;
let analytics = AnalyticsBrowser.load({ writeKey: key });
analytics = Object.assign(analytics, { name: 'segment', loaded: () => true });
export default { isEnabled, plugin: analytics };
After building the image and running it, the events are not captured in Segment.
It works when I directly put the key in, something like this:
const key: string = 'YDFGDfsdfEW';
I entered the container and ran the command 'printenv', and I can see the env variable in the container.
Can someone tell me why 'process.env.SEGMENT_KEY' is not able to pull the value?
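One thing to check (an assumption about the build, since the bundler config isn't shown): in browser bundles process.env.X is normally replaced with a literal string when the bundle is built, so a variable that only exists in the running container never reaches the shipped JavaScript. That would explain why the hard-coded key works while printenv inside the container does not help. A sketch of that build-time substitution using webpack's DefinePlugin; the file and the SEGMENT_KEY wiring are illustrative, not the actual datahub-frontend build:

// webpack.config.ts (illustrative sketch)
import webpack from 'webpack';
import type { Configuration } from 'webpack';

const config: Configuration = {
  plugins: [
    new webpack.DefinePlugin({
      // Inlined when the bundle is built: the value must be present in the
      // environment of the build (e.g. docker build), not only in the container.
      'process.env.SEGMENT_KEY': JSON.stringify(process.env.SEGMENT_KEY),
    }),
  ],
};

export default config;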
rhythmic-stone-77840
01/06/2023, 6:48 PM
rhythmic-stone-77840
01/06/2023, 8:31 PM
docker exec --privileged datahub-gms ls -la /tmp/datahub/logs/gms
returns "cannot access '/tmp/datahub/logs/gms': No such file or directory" - I've rebuilt my docker setup from scratch and it's still having issues. The /tmp directory has no datahub folder in it at all.
melodic-dress-7431
01/09/2023, 2:35 AM
powerful-cat-68806
01/08/2023, 10:47 AM
include_tables: true
include_views: true
profiling:
  enabled: true
So all dataset data will be available.
Any idea?
Cc: @modern-garden-35830
refined-tent-35319
01/09/2023, 9:23 AM
cool-kitchen-48091
01/09/2023, 1:42 PM
datahub delete --env PROD --entity_type dataset
gives me an assertion error:
File "/usr/local/lib/python3.10/site-packages/datahub/cli/cli_utils.py", line 403, in get_urns_by_filter
329 def get_urns_by_filter(
330 platform: Optional[str],
331 env: Optional[str] = None,
332 entity_type: str = "dataset",
333 search_query: str = "*",
334 include_removed: bool = False,
335 only_soft_deleted: Optional[bool] = None,
336 ) -> Iterable[str]:
(...)
399 for x in results["value"]["entities"]:
400 entities_yielded += 1
401 log.debug(f"yielding {x['entity']}")
402 yield x["entity"]
--> 403 assert (
404 entities_yielded == num_entities
..................................................
platform = 'mysql'
Optional = typing.Optional
env = None
entity_type = 'dataset'
search_query = '*'
include_removed = False
only_soft_deleted = None
Iterable = typing.Iterable
results = {'value': {'numEntities': 1978,
'pageSize': 10000,
'from': 0,
'metadata': {...},
'entities': [...]}}
entities_yielded = 0
log.debug = <method 'Logger.debug' of <Logger datahub.cli.cli_utils (DEBUG)> __init__.py:1455>
num_entities = 1978
..................................................
---- (full traceback above) ----
File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 149, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 386, in async_wrapper
loop.run_until_complete(run_func_check_upgrade())
File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 373, in run_func_check_upgrade
ret = await the_one_future
File "/usr/local/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 366, in run_inner_func
return await loop.run_in_executor(
File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 343, in wrapper
raise e
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 295, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/cli/delete_cli.py", line 192, in delete
deletion_result = delete_with_filters(
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 343, in wrapper
raise e
File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 295, in wrapper
res = func(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/datahub/cli/delete_cli.py", line 245, in delete_with_filters
urns = list(
File "/usr/local/lib/python3.10/site-packages/datahub/cli/cli_utils.py", line 403, in get_urns_by_filter
assert (
AssertionError: Did not delete all entities, try running this command again!
bumpy-manchester-97826
01/09/2023, 5:57 PM
docker build -t data-catalog-frontend:0.1 -f docker/datahub-frontend/Dockerfile .
However I'm getting (in the thread)
rich-policeman-92383
01/06/2023, 10:27 PM