best-eve-12546
03/02/2023, 4:20 PM
OneOf
"OneOfSchemaMetadataPlatformSchema":
{
"required":
[
"__type"
],
"type": "object",
"properties":
{
"__type":
{
"type": "string"
}
},
"description": "The native schema in the dataset's platform.",
"discriminator":
{
"propertyName": "__type"
}
},
It seems like this is missing a OneOf — I see in the Golden Test Data that this is actually a nested structure, with no "__type" field. I see the same for a bunch of other OneOf types like OneOfSchemaFieldDataTypeType. Checked out the YAML schema and it's the same.
Am I supposed to be instantiating these somehow, or should I be sending raw JSON in "__type"?
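(A minimal sketch of one reading of this: in the Python SDK, each oneOf variant is a concrete generated class, so no "__type" field is written by hand. The dataset, platform, and field names below are illustrative, not from this thread.)
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    OtherSchemaClass,
    SchemaFieldClass,
    SchemaFieldDataTypeClass,
    SchemaMetadataClass,
    StringTypeClass,
)

# OtherSchemaClass is one concrete variant of the platformSchema oneOf;
# StringTypeClass is one concrete variant of the field-type oneOf.
schema = SchemaMetadataClass(
    schemaName="customer",  # illustrative name
    platform="urn:li:dataPlatform:bigquery",
    version=0,
    hash="",
    platformSchema=OtherSchemaClass(rawSchema="-- native DDL could go here"),
    fields=[
        SchemaFieldClass(
            fieldPath="email",
            type=SchemaFieldDataTypeClass(type=StringTypeClass()),
            nativeDataType="STRING",
        )
    ],
)

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # placeholder
emitter.emit(
    MetadataChangeProposalWrapper(
        entityUrn="urn:li:dataset:(urn:li:dataPlatform:bigquery,proj.ds.customer,PROD)",
        aspect=schema,
    )
)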
nice-match-35259
03/02/2023, 4:23 PM
yaml_config = f"""
name: {my_checkpoint_name}
config_version: 1.0
class_name: SimpleCheckpoint
run_name_template: "%Y%m%d-%H%M%S-my-run-name-template"
validations:
  - batch_request:
      datasource_name: bigquery_datasource
      data_connector_name: default_inferred_data_connector_name
      data_asset_name: raw_prod.applicative_database_deposit
      data_connector_query:
        index: -1
    expectation_suite_name: suite_deposit_test2
    action_list:
      - name: datahub_action
        action:
          module_name: datahub.integrations.great_expectations.action
          class_name: DataHubValidationAction
          server_url: {server_url}
          token: {gms_token}
          extra_headers:
            - Proxy-Authorization: Bearer {iap_token}
"""
The checkpoint runs correctly, but the metadata is not sent to DataHub GMS. The error I got is in the attached .png. Did anybody face the same problem?
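(One way to narrow this down, sketched with the SDK's REST emitter rather than the GE action: exercise the same server URL, token, and proxy header directly. All values below are placeholders; note that the emitter takes extra_headers as a mapping.)
from datahub.emitter.rest_emitter import DatahubRestEmitter

emitter = DatahubRestEmitter(
    gms_server="https://datahub-gms.example.com",  # placeholder for {server_url}
    token="<gms_token>",                           # placeholder for {gms_token}
    extra_headers={"Proxy-Authorization": "Bearer <iap_token>"},  # mapping form
)

# Raises if GMS is unreachable or rejects the credentials.
emitter.test_connection()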
nutritious-bird-77396
03/02/2023, 6:02 PM
cuddly-butcher-39945
03/02/2023, 10:01 PM
query ListCorpGroups {
  search(input: { type: corpgroup, query: "*" }) {
    total
    count
    searchResults {
      entity {
        urn
        type
        ... on corpgroup {
          properties {
            name
          }
        }
      }
    }
  }
}
Getting the following error:
{
  "errors": [
    {
      "message": "Validation error (WrongType@[search]) : argument 'input.type' with value 'EnumValue{name='corpgroup'}' is not a valid 'EntityType' - Expected enum literal value not in allowable values - 'EnumValue{name='corpgroup'}'.",
      "locations": [
        {
          "line": 2,
          "column": 10
        }
      ],
      "extensions": {
        "classification": "ValidationError"
      }
    },
    {
      "message": "Validation error (UnknownType@[search/searchResults/entity]) : Unknown type 'corpgroup'",
      "locations": [
        {
          "line": 9,
          "column": 16
        }
      ],
      "extensions": {
        "classification": "ValidationError"
      }
    }
  ],
  "data": null,
  "extensions": {}
}
Thanks in advance!
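(Both validation errors are casing problems: the EntityType enum literal is CORP_GROUP, and the inline-fragment type is CorpGroup. A sketch of the corrected query sent from Python; the endpoint and token are placeholders, and displayName is assumed to exist on CorpGroupProperties.)
import requests

query = """
query ListCorpGroups {
  search(input: { type: CORP_GROUP, query: "*" }) {
    total
    count
    searchResults {
      entity {
        urn
        type
        ... on CorpGroup {
          properties {
            displayName
          }
        }
      }
    }
  }
}
"""

resp = requests.post(
    "https://datahub.example.com/api/graphql",           # placeholder endpoint
    headers={"Authorization": "Bearer <access-token>"},  # placeholder token
    json={"query": query},
)
resp.raise_for_status()
print(resp.json())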
numerous-account-62719
03/03/2023, 4:32 AM
numerous-account-62719
03/03/2023, 4:33 AM
microscopic-room-90690
03/03/2023, 8:49 AM
Failed to classify table columns
cli_version and gms_version are both '0.9.6.1'
It seems everything is OK except classification. What should I do?
Any help will be appreciated. Thank you!
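(For reference, a sketch of where the classification block sits in a recipe, assuming a Snowflake-style source that supports it, since the source type is not shown here; credentials and the server address are placeholders.)
from datahub.ingestion.run.pipeline import Pipeline

# Only the 'classification' block is the point of this example.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "snowflake",  # assumption: a source that supports classification
            "config": {
                "account_id": "<account>",  # placeholder
                "username": "<user>",       # placeholder
                "password": "<password>",   # placeholder
                "classification": {
                    "enabled": True,
                    "classifiers": [{"type": "datahub"}],
                },
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},  # placeholder
        },
    }
)
pipeline.run()
pipeline.raise_from_status()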
shy-dog-84302
03/03/2023, 2:02 PM
best-umbrella-88325
03/06/2023, 7:41 AM
gifted-diamond-19544
03/06/2023, 12:13 PM
future-dog-77968
03/06/2023, 6:17 PM
handsome-football-66174
03/06/2023, 9:25 PM
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
ERROR SpringApplication Application run failed
org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'upgradeCli': Unsatisfied dependency expressed through field 'noCodeUpgrade'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in class path resource [com/linkedin/gms/factory/entity/EbeanServerFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
proud-soccer-58887
03/07/2023, 6:08 AM
rich-pager-68736
03/07/2023, 6:39 AM
2023-02-27 15:06:41.598 ERROR 1 --- [ool-10-thread-1] c.l.m.dao.producer.KafkaHealthChecker : Failed to emit MCL for entity urn:li:dataHubExecutionRequest:snowflake-2023_02_20-09_43_34
org.apache.kafka.common.errors.RecordTooLargeException: The message is 1633361 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.
I've already increased the allowed message size for the topic (max.message.bytes) and the Kafka cluster (replica.fetch.max.bytes). However, I cannot find any config parameter to adjust the producer's max.request.size for datahub-upgrade. Same on the consumer side: how do I increase max.partition.fetch.bytes for the MCL consumer? Any help here?
great-branch-515
03/07/2023, 7:36 AM
2023-03-07 07:27:35,738 [ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2775 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
2023-03-07 07:27:35,839 [ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2776 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
2023-03-07 07:27:35,940 [ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient:1077 - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 2777 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}
When we try to log in to the frontend, we get the error
Failed to perform post authentication steps. Error message: Failed to provision user with urn
which is caused by
java.lang.RuntimeException: Failed to provision user with urn urn:li:corpuser:atul.atri@chegg.com.
Any ideas?
busy-analyst-35820
03/07/2023, 10:33 AM
elegant-salesmen-99143
03/07/2023, 11:34 AM
delightful-sugar-63810
03/07/2023, 1:47 PM
hallowed-shampoo-52722
03/07/2023, 2:53 PM
microscopic-application-63745
03/07/2023, 2:56 PM
I'm trying to use verify_ssl, but whenever I add it I get the following error:
[2023-03-07 15:10:23,049] ERROR {logger:26} - Please set env variable SPARK_VERSION
[2023-03-07 15:10:23,543] ERROR {datahub.ingestion.run.pipeline:127} - 1 validation error for DataLakeSourceConfig
verify_ssl
extra fields not permitted (type=value_error.extra)
Please note that without verify_ssl the recipe ingests just fine.
green-hamburger-3800
03/07/2023, 4:32 PM
MLMODEL
This was my query:
mutation CreatePolicy($input: PolicyUpdateInput!) {
  createPolicy(input: $input)
}
This was my payload:
{
  "input": {
    "type": "METADATA",
    "name": "MLP - Service Account",
    "state": "ACTIVE",
    "description": "Test",
    "privileges": [
      "EDIT_ENTITY_TAGS",
      "EDIT_ENTITY_GLOSSARY_TERMS",
      "EDIT_ENTITY_OWNERS",
      "EDIT_ENTITY_DOCS",
      "EDIT_ENTITY_DOC_LINKS",
      "EDIT_ENTITY_STATUS",
      "EDIT_DOMAINS_PRIVILEGE",
      "EDIT_DEPRECATION_PRIVILEGE",
      "EDIT_ENTITY",
      "EDIT_DATASET_COL_DESCRIPTION",
      "EDIT_DATASET_COL_TAGS",
      "EDIT_DATASET_COL_GLOSSARY_TERMS",
      "EDIT_ENTITY_ASSERTIONS",
      "EDIT_LINEAGE",
      "EDIT_ENTITY_EMBED",
      "EDIT_TAG_COLOR"
    ],
    "actors": {
      "users": [
        "urn:li:corpuser:mlp_user"
      ],
      "allUsers": false,
      "allGroups": false,
      "resourceOwners": false
    },
    "resources": {
      "type": "MLMODEL",
      "allResources": true
    }
  }
}
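(A sketch of submitting this mutation with the payload above as GraphQL variables; the endpoint, token, and the policy_input.json file holding the payload are all placeholders.)
import json
import requests

mutation = """
mutation CreatePolicy($input: PolicyUpdateInput!) {
  createPolicy(input: $input)
}
"""

# Hypothetical file containing the JSON payload shown above.
with open("policy_input.json") as f:
    variables = json.load(f)

resp = requests.post(
    "https://datahub.example.com/api/graphql",           # placeholder endpoint
    headers={"Authorization": "Bearer <access-token>"},  # placeholder token
    json={"query": mutation, "variables": variables},
)
resp.raise_for_status()
print(resp.json())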
important-processor-44077
03/07/2023, 11:37 PM
best-wire-59738
03/08/2023, 11:44 AM
generic-mae-consumer-job-client) reads all the partitions from the topic, as the change made via the UI is also somewhere in the queue in the Kafka topic.
1. Can we use a separate topic for all the changes we make using the UI, so that our UI is free from the freezing issue?
2. Also, how can we get out of the consumer-group re-balancing issue and speed up our ingestion? As Kafka is asynchronous, the MCE consumers are slow in reading the offsets. We have yet to create standalone MCE and MAE consumers; we hope that increases ingestion speed, but we have yet to find a solution for the re-balancing issue.
gentle-camera-33498
03/08/2023, 12:47 PM
[I/O dispatcher 2] ERROR c.l.m.s.e.update.BulkListener:44 - Failed to feed bulk request. Number of events: 5 Took time ms: -1 Message: failure in bulk execution:
[1]: index [datasetindex_v2_1678278613797], type [_doc], id [urn...], message [[datasetindex_v2_1678278613797/YrBRraPeT6OLr7JvUNdy6A][[datasetindex_v2_1678278613797][0]] ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn...]: document missing]]]
I'm unsure if it is the cause, but I do not see any datasets in the UI.
NOTE: Yes, I tried to run the restoreIndices job, but nothing changed.
nice-river-27843
03/08/2023, 1:47 PM
2023-03-08 13:37:28,465 [application-akka.actor.default-dispatcher-13] ERROR o.p.core.engine.DefaultCallbackLogic - Unable to renew the session. The session store may not support this feature
acceptable-evening-60358
03/08/2023, 2:58 PM
acceptable-evening-60358
03/08/2023, 2:59 PM
able-city-76673
03/09/2023, 6:24 AM
agreeable-belgium-70840
03/09/2023, 9:35 AM
2023-03-09 09:29:44,122 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 1 Took time ms: -1
2023-03-09 09:30:23,729 [R2 Nio Event Loop-1-1] WARN c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:829)
Any ideas?
freezing-architect-85960
03/09/2023, 9:58 AM
%4|1678351593.679|TERMINATE|rdkafka#producer-2| [thrd:app]: Producer terminating with 23 messages (9368 bytes) still in queue or transit: use flush() to wait for outstanding message delivery
Any ideas about this? I didn't find a flush action in the datahub-airflow-plugin emit function.
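(For what it's worth, the SDK's Kafka emitter, which the Airflow plugin builds on, does expose flush(); a minimal sketch with placeholder broker and schema-registry addresses.)
from datahub.emitter.kafka_emitter import DatahubKafkaEmitter, KafkaEmitterConfig

config = KafkaEmitterConfig.parse_obj(
    {
        "connection": {
            "bootstrap": "broker:9092",                     # placeholder
            "schema_registry_url": "http://registry:8081",  # placeholder
        }
    }
)
emitter = DatahubKafkaEmitter(config)

# ... emitter.emit_mcp(...) calls would happen here ...

# Without this, messages still queued in librdkafka are dropped at exit,
# producing exactly the TERMINATE warning shown above.
emitter.flush()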