mysterious-portugal-30527
03/02/2022, 9:34 PM
permission denied for table someschema.blahblahblah
…but the user connecting to the instance can query said tables, validated via the Postgres CLI. It fails on every table even though the user has SELECT on all tables via role membership. Help!
Ingestion script follows:
source:
  type: postgres
  config:
    host_port: 'somepostgreshost:port'
    database: someuserdatabase
    username: avaliduser
    password: '${userpassword}'
    schema_pattern:
      allow:
        - public
    include_tables: true
    include_views: true
    profiling:
      enabled: true
sink:
  type: datahub-rest
  config:
    server: 'http://datahub-gms:8080'
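One way to narrow down a "permission denied" during profiling is to ask Postgres, over a connection opened as the same user the recipe uses, which relations that role can actually read. The sketch below is only an illustration: the function name is hypothetical, the cursor is assumed to be any DB-API cursor (for example one from psycopg2), and the catalog query uses the built-in has_schema_privilege and has_table_privilege functions. It covers tables, partitioned tables, views, and materialized views in a single schema.

```python
# Sketch: list relations in one schema that the connected role cannot read.
# Assumption: `cursor` is a DB-API cursor (e.g. from psycopg2) opened as the
# same user as in the ingestion recipe. `tables_missing_select` is a
# hypothetical helper name, not part of DataHub.

PRIVILEGE_CHECK_SQL = """
SELECT n.nspname,
       c.relname,
       has_schema_privilege(n.oid, 'USAGE') AS schema_usage,
       has_table_privilege(c.oid, 'SELECT') AS can_select
FROM pg_catalog.pg_class AS c
JOIN pg_catalog.pg_namespace AS n ON n.oid = c.relnamespace
WHERE n.nspname = %(schema)s
  AND c.relkind IN ('r', 'p', 'v', 'm')
ORDER BY c.relname
"""


def tables_missing_select(cursor, schema):
    """Return (schema, relation) pairs the current role cannot SELECT from."""
    cursor.execute(PRIVILEGE_CHECK_SQL, {"schema": schema})
    return [
        (schema_name, relation_name)
        for schema_name, relation_name, schema_usage, can_select in cursor.fetchall()
        if not (schema_usage and can_select)
    ]
```

Running it once for public and once for the schema named in the error message would show whether the failing tables are readable through that exact connection.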
Error text snippet:
['Profiling exception (psycopg2.errors.InsufficientPrivilege) permission denied for "
"table '\n"
adamant-kilobyte-90981
03/02/2022, 10:19 PM
cool-painting-92220
03/03/2022, 1:49 AM
Secrets has been added that could be related to what I'm looking for, but I'm not too familiar with its limitations/capabilities. Is there a best practice out there for this situation?
red-zebra-92204
03/03/2022, 2:47 AM
DataHubGraph.get_aspect(aspect="dataJobInfo"), I'm getting this error: avro.schema.AvroException: ('Datum union type not in schema: %s', None)
The problem lies in the type field, which should be either AzkabanJobTypeClass or string, but it has the value {"string": "SPARK"}.
I don't know how to resolve this. Is this related to the deprecation of the AzkabanJobType class?
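Code to reproduce this error:
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import DataJobInfoClass

# `cookie` must already hold a valid session cookie for the demo instance.
graph = DataHubGraph(config=DatahubClientConfig(
    server='https://demo.datahubproject.io/api/gms',
    extra_headers={'cookie': f'{cookie}'}
))
job_info = graph.get_aspect(
    entity_urn='urn:li:dataJob:(urn:li:dataFlow:(spark,orders_cleanup_flow,PROD),orders_dedupe_job)',
    aspect='dataJobInfo',
    aspect_type=DataJobInfoClass,
)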
GitHub issue: https://github.com/linkedin/datahub/issues/4289
adorable-flower-19656
03/03/2022, 3:01 AM
red-napkin-59945
03/03/2022, 6:26 AM
BrowsableEntityType and noticed that other existing types have a variable called FACET_FIELDS, like in DashboardType:
private static final Set<String> FACET_FIELDS = ImmutableSet.of("access", "tool");
square-solstice-69079
03/03/2022, 8:29 AM
adamant-furniture-37835
03/03/2022, 8:35 AM
average-vr-64604
03/03/2022, 11:41 AM
DATAHUB_TELEMETRY_ENABLED to false. But any call to the datahub CLI fails with this error:
mixpanel.MixpanelException: HTTPSConnectionPool(host='api.mixpanel.com', port=443): Max retries exceeded with url: /engage (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fafa0073b80>, 'Connection to corp-proxy timed out. (connect timeout=10)'))
Are there any other options to disable telemetry collection?
adventurous-dream-16099
03/03/2022, 5:30 PM
numerous-camera-74294
03/03/2022, 5:52 PM
mysterious-portugal-30527
03/03/2022, 6:23 PM
'HINT: No operator matches the given name and argument types. You might need to add explicit type casts.\n'
The associated SQL Statement:
'(SELECT count(*) AS element_count, sum(CASE WHEN (offer_set IN (NULL) OR offer_set IS NULL) THEN %(param_11)s ELSE %(param_12)s END) '
'AS null_count \n'
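A note on the predicate in the statement above: under SQL's three-valued logic, offer_set IN (NULL) compares with = and therefore evaluates to NULL for every row, so it never matches anything, while offer_set IS NULL is the test that actually finds missing values. The sketch below illustrates this with Python's built-in sqlite3 module standing in for Postgres (an assumption made only to keep the example self-contained). The offers table and its rows are invented, and the 1 and 0 literals stand in for the param_11 and param_12 placeholders, whose real values are not shown in the log.

```python
import sqlite3

# SQLite stands in for Postgres here; both treat comparisons with NULL as
# unknown. The table name and sample rows are made up for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offers (offer_set TEXT)")
conn.executemany(
    "INSERT INTO offers VALUES (?)", [("a",), (None,), ("b",), (None,)]
)

# Rows matched by each test on its own.
in_null_matches = conn.execute(
    "SELECT count(*) FROM offers WHERE offer_set IN (NULL)"
).fetchone()[0]
is_null_matches = conn.execute(
    "SELECT count(*) FROM offers WHERE offer_set IS NULL"
).fetchone()[0]

# Same shape as the profiler's statement, with 1 and 0 assumed for the
# two bind parameters.
element_count, null_count = conn.execute(
    "SELECT count(*) AS element_count, "
    "sum(CASE WHEN (offer_set IN (NULL) OR offer_set IS NULL) THEN 1 ELSE 0 END) "
    "AS null_count FROM offers"
).fetchone()

print(in_null_matches, is_null_matches, element_count, null_count)
```

In this sketch the IN (NULL) branch contributes nothing to null_count; the IS NULL branch does all the work, so the combined condition counts the same rows as IS NULL alone.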
Also, this statement seems a bit nonsensical: it checks both offer_set IN (NULL) and offer_set IS NULL. Both of these tests do the same thing, right? Wouldn’t we want to pick one test? Why do both? Am I missing something?
billowy-rocket-47022
03/03/2022, 9:58 PM
red-napkin-59945
03/03/2022, 10:58 PM
domains which indicates that one dashboard could belong to multiple domains. However, in the GraphQL schema definition, the Dashboard entity has only one domain. Is this on purpose?
mysterious-portugal-30527
03/03/2022, 11:46 PM
datahub docker check
The following issues were detected:
- datahub-gms is running but not healthy
What would be the next steps?
red-napkin-59945
03/04/2022, 1:15 AM
{errors=[The object type 'DataDoc' [@3117:1] does not have a field 'relationships' required via interface 'Entity' [@297:1], There is no type resolver defined for interface / union 'DataDocCell' type]}
Any suggestions about how to fix it?
most-pillow-90882
03/04/2022, 5:52 AM
datahub-gms | 2022/03/04 05:44:18 Problem with dial: dial tcp: lookup broker on 127.0.0.11:53: server misbehaving. Sleeping 1s
datahub-actions_1 | 2022/03/04 05:44:18 Problem with request: Get "<http://datahub-gms:8080/health>": dial tcp 172.20.0.10:8080: connect: connection refused. Sleeping 1s
able-rain-74449
03/04/2022, 8:54 AM
salmon-area-51650
03/04/2022, 9:38 AM
JSON columns are not compatible with SQL Profile for PostgreSQL.
Is there a way to skip all JSON columns, or do I need to include all columns one by one in profile_pattern?
Thanks!!
few-air-56117
03/04/2022, 1:45 PM
2022-03-04 15:44:28.049 EET InvalidURL: Failed to parse: http://${GMS_HOST:-localhost}:${GMS_PORT:-8080}/config
😞
damp-greece-27806
03/04/2022, 2:01 PM
some-pizza-26257
03/04/2022, 6:52 PM
elegant-traffic-96321
03/04/2022, 9:58 PM
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "datahub_usage_event",
"node": "3d1IH_U4T1OXbZJkCbNWtw",
"reason": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
],
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
},
"status": 400
}
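The error above says that browserId is mapped as a text field and that aggregations or sorting on it need a keyword field instead. As a rough illustration only, the sketch below builds a mapping fragment in which browserId is declared as keyword. Whether the datahub_usage_event index was created without its intended mapping is an assumption here, not something the log confirms. Elasticsearch does not allow changing the type of an existing field in place, so a mapping like this would apply to a newly created index, with the old data reindexed into it.

```python
import json

# Hypothetical mapping fragment: declares browserId as `keyword` so that
# aggregations and sorting on it do not require fielddata.
usage_event_mapping = {
    "mappings": {
        "properties": {
            "browserId": {"type": "keyword"},
        }
    }
}

# Print the body as JSON, e.g. for use in an index-creation request.
print(json.dumps(usage_event_mapping, indent=2))
```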
acoustic-wolf-70583
03/05/2022, 1:27 AM
mysterious-portugal-30527
03/07/2022, 6:44 PM
datahub docker nuke --keep-data and then datahub docker quickstart
After everything restarts, the ingestion page shows an ingestion run that was in progress before my reboot as still running.
I believe this is incorrect. I think this is a state artifact left over from before the restart, right?
melodic-helmet-78607
03/08/2022, 2:09 AM
02:05:17 [application-akka.actor.default-dispatcher-16] ERROR controllers.AuthenticationController - Authentication error
javax.naming.AuthenticationException: javax.security.auth.login.FailedLoginException: Cannot connect to LDAP server
adorable-tomato-97942
03/08/2022, 6:51 AM
sparse-account-96468
03/08/2022, 10:34 AM
full-dentist-68591
03/08/2022, 10:46 AM
numerous-camera-74294
03/08/2022, 3:15 PM
ValueError: com.linkedin.pegasus2avro.schema.Schemaless contains extra fields: {'com.linkedin.schema.MySqlDDL'}
when using DataHubGraph.get_aspect like:
current_schema_metadata = graph.get_aspect(
    entity_urn=dataset_urn,
    aspect="schemaMetadata",
    aspect_type=SchemaMetadataClass,
)