square-solstice-69079
04/04/2022, 12:58 PM
icy-piano-35127
04/04/2022, 1:10 PM
datahub ingest rollback --run-id <run_id>
but I got an exception. I want to remove all my data from a specific datasource. What do you suggest?
red-napkin-59945
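A sketch of the relevant CLI options (assumes the datahub CLI is installed and configured; the platform name is hypothetical and the `<run_id>` placeholder stays as-is — the actual commands are shown as comments):

```shell
PLATFORM=mysql  # hypothetical platform whose metadata should be removed
# Undo everything written by a single ingestion run:
#   datahub ingest rollback --run-id <run_id>
# Or remove every entity ingested from one platform (destructive; --hard
# deletes rather than soft-deletes):
#   datahub delete --platform "$PLATFORM" --hard
echo "would delete all ${PLATFORM} entities"
```

Running a rollback first is the gentler option; the platform-wide hard delete cannot be undone.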
04/04/2022, 5:57 PM
red-napkin-59945
04/04/2022, 6:37 PM
If there is a "," in the search string, the UI gets stuck there rather than returning an empty result.
cool-painting-92220
04/04/2022, 10:57 PM
breezy-controller-54597
04/05/2022, 6:38 AM
http://{datahub-frontend_url}/browse
but not in http://{datahub-frontend_url}/search.
curved-truck-53235
04/05/2022, 1:11 PM
bumpy-activity-74405
04/05/2022, 1:29 PM
I updated from 0.8.26 -> 0.8.32.1 and the ingestion library from 0.8.22.1 -> 0.8.32.1, and now I am getting errors when I try to ingest data from Looker:
TypeError: You should use `typing_extensions.TypedDict` instead of `typing.TypedDict` with Python < 3.9.2. Without it, there is no way to differentiate required and optional fields when subclassed.
full stack trace in thread. Any ideas?
chilly-oil-22683
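That error is raised because, on Python older than 3.9.2, `typing.TypedDict` cannot distinguish required from optional fields when subclassed, so the library insists on `typing_extensions.TypedDict`. Upgrading `typing_extensions` (or Python) is the usual fix; a minimal sketch of the version guard involved (the `LookerViewId` class is hypothetical, for illustration only):

```python
import sys

# On Python < 3.9.2, typing.TypedDict cannot differentiate required and
# optional fields when subclassed, so typing_extensions must be used instead.
if sys.version_info >= (3, 9, 2):
    from typing import TypedDict
else:
    try:
        from typing_extensions import TypedDict
    except ImportError:
        # Fallback so this sketch still runs; real code should require
        # typing_extensions here (pip install typing_extensions).
        from typing import TypedDict

class LookerViewId(TypedDict, total=False):  # hypothetical, for illustration
    project: str
    view: str

print(TypedDict is not None)
```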
04/05/2022, 1:32 PM
Brainbee
Database: dh-test-ing-brainbee
is associated with "Term" Brainbee
When I approach it through the search service, ie URL = https://my-host/search?filter_glossaryTerms=urn%3Ali%3AglossaryTerm%3ASource.Brainbee
I do see the associated database
When I approach the term from the glossary, URL = https://my-host/glossary/urn:li:glossaryTerm:Sources.Brainbee/Related%20Entities
I do not see any related entities. Weird, I would have expected the associated database here.
When I do the same for another Term, it does work. Why does the glossary page sometimes not show all related entities?
Do you recognize this behaviour? How can I repair those links?
salmon-area-51650
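One detail worth checking: the two URLs above reference different term URNs ("Source.Brainbee" in the search filter versus "Sources.Brainbee" on the glossary page), which could by itself explain why the related entities differ. A small stdlib sketch for building the encoded search-filter URL for either term, so the URNs can be compared side by side (the host is a placeholder):

```python
from urllib.parse import quote

def search_filter_url(host: str, term_urn: str) -> str:
    """Build the search URL that filters results by a glossary term URN."""
    return f"https://{host}/search?filter_glossaryTerms={quote(term_urn, safe='')}"

# Note the differing URNs used in the two URLs above ("Source." vs "Sources."):
print(search_filter_url("my-host", "urn:li:glossaryTerm:Source.Brainbee"))
print(search_filter_url("my-host", "urn:li:glossaryTerm:Sources.Brainbee"))
```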
04/05/2022, 2:07 PM
Error: UPGRADE FAILED: template: datahub/templates/_helpers.tpl:69:45: executing "datahub-ingestion-cron.cronjob.apiVersion" at <.Capabilities.KubeVersion.Version>: nil pointer evaluating interface {}.KubeVersion
quick-student-61408
04/05/2022, 3:37 PM
swift-breakfast-25077
04/05/2022, 4:26 PM
red-napkin-59945
04/05/2022, 4:53 PM
You should use `typing_extensions.TypedDict` instead of `typing.TypedDict` with Python < 3.9.2. Without it, there is no way to differentiate required and optional fields when subclassed.
red-napkin-59945
04/05/2022, 4:54 PM
red-napkin-59945
04/05/2022, 8:40 PM
plain-napkin-77279
04/06/2022, 2:34 AM
ambitious-cartoon-15344
04/06/2022, 5:32 AM
Hello everyone, is this related to the use of the following? I have not found it used in the official documentation.
RBAC: Fine-grained Access Controls in DataHub
brave-businessperson-3969
04/06/2022, 10:40 AM
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import (
SchemaMetadataClass,
EditableSchemaMetadataClass
)
gms_endpoint = "https://<url_of_the_datahub_installation>/api/gms"
dataset_urn = "<some existing table urn in datahub>"  # e.g. "urn:li:dataset:(urn:li:dataPlatform:postgres,pagila.public.actor,PROD)"
datahub_uplink = DataHubGraph(config=DatahubClientConfig(server=gms_endpoint, token=datahub_token))  # datahub_token defined elsewhere
# The following call throws:
# ValueError: com.linkedin.pegasus2avro.schema.Schemaless contains extra fields:
# {'com.linkedin.schema.MySqlDDL'}
schema_metadata = datahub_uplink.get_aspect_v2(
entity_urn=dataset_urn,
aspect="schemaMetadata",
aspect_type=SchemaMetadataClass,
)
# This works (querying a different aspect):
# schema_metadata = datahub_uplink.get_aspect_v2(
# entity_urn=dataset_urn,
# aspect="editableSchemaMetadata",
# aspect_type=EditableSchemaMetadataClass,
# )
Full error:
Traceback (most recent call last):
File "error.py", line 20, in <module>
schema_metadata = datahub_uplink.get_aspect_v2(
File "/home/uwest/venv/lib/python3.8/site-packages/datahub/ingestion/graph/client.py", line 148, in get_aspect_v2
return aspect_type.from_obj(aspect_json, tuples=True)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/dict_wrapper.py", line 41, in from_obj
return conv.from_json_object(obj, cls.RECORD_SCHEMA)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 104, in from_json_object
return self._generic_from_json(json_obj, writers_schema, readers_schema)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 257, in _generic_from_json
result = self._record_from_json(json_obj, writers_schema, readers_schema)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 345, in _record_from_json
field_value = self._generic_from_json(json_obj[field.name], writers_field.type, field.type)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 255, in _generic_from_json
result = self._union_from_json(json_obj, writers_schema, readers_schema)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 313, in _union_from_json
return self._generic_from_json(json_obj, s, readers_schema)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 238, in _generic_from_json
return self._generic_from_json(json_obj, writers_schema, s)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 257, in _generic_from_json
result = self._record_from_json(json_obj, writers_schema, readers_schema)
File "/home/uwest/venv/lib/python3.8/site-packages/avrogen/avrojson.py", line 358, in _record_from_json
raise ValueError(f'{readers_schema.fullname} contains extra fields: {input_keys}')
ValueError: com.linkedin.pegasus2avro.schema.Schemaless contains extra fields: {'com.linkedin.schema.MySqlDDL'}
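While the typed conversion fails on this aspect, one workaround is to fetch the raw aspect JSON from the GMS REST endpoint and inspect it directly, bypassing the avro-based deserialization that raises the ValueError. A stdlib-only sketch (the local GMS host is an assumption; the actual HTTP call is left commented out):

```python
from urllib.parse import quote
# from urllib.request import Request, urlopen  # uncomment to actually call GMS
# import json

gms = "http://localhost:8080"  # assumed default GMS host
dataset_urn = "urn:li:dataset:(urn:li:dataPlatform:postgres,pagila.public.actor,PROD)"

# GET /aspects/{urn}?aspect=schemaMetadata&version=0 returns the stored aspect
# as raw JSON, which can be inspected for the unexpected MySqlDDL field.
url = f"{gms}/aspects/{quote(dataset_urn, safe='')}?aspect=schemaMetadata&version=0"
print(url)
# with urlopen(Request(url, headers={"Authorization": f"Bearer {token}"})) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```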
brave-secretary-27487
04/06/2022, 11:27 AM
b'{"data":{"updateDeprecation":true}}'
but the deprecation isn't visible in the DataHub UI.
I guess the deprecation should be visible in the UI. Am I missing something?
bumpy-activity-74405
04/06/2022, 12:10 PM/.datahub
when running datahub ingest -c...
or maybe configure this path via some env variable? My use case is that I am running an ingestion recipe in an environment where I don’t have access to the root folder. Up until now (was using 0.8.16.4
) it worked fine, but after updating to 0.8.32.1
I am getting PermissionError: [Errno 13] Permission denied: '/.datahub'
. I am a bit confused as to why that would be required in the first place since gms host is being passed via the recipe in the sink configuration 🤔breezy-portugal-43538
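The path `/.datahub` suggests the CLI is expanding `~/.datahub` in an environment where HOME is unset, so the expansion falls back to the filesystem root. A sketch of one workaround, pointing HOME at a writable directory before invoking the CLI (the directory is hypothetical):

```python
import os

# With HOME unset, os.path.expanduser("~/.datahub") can resolve relative to
# the root, producing the unwritable "/.datahub" seen in the error above.
os.environ["HOME"] = "/tmp/datahub-home"  # hypothetical writable directory
print(os.path.expanduser("~/.datahub"))
```

The same effect can be had by exporting HOME in the shell or container spec that runs `datahub ingest`.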
04/06/2022, 12:20 PMbroker | [2022-04-06 12:11:25,098] INFO [Controller id=1] Processing automatic preferred replica leader election (kafka.controller.KafkaController)
broker | [2022-04-06 12:11:25,099] TRACE [Controller id=1] Checking need to trigger auto leader balancing (kafka.controller.KafkaController)
broker | [2022-04-06 12:11:25,101] DEBUG [Controller id=1] Preferred replicas by broker Map(1 -> Map(__consumer_offsets-22 -> Vector(1), __consumer_offsets-30 -> Vector(1), __consumer_offsets-8 -> Vector(1), __consumer_offsets-21 -> Vector(1), __consumer_offsets-4 -> Vector(1), __consumer_offsets-27 -> Vector(1), __consumer_offsets-7 -> Vector(1), __consumer_offsets-9 -> Vector(1), __consumer_offsets-46 -> Vector(1), __consumer_offsets-25 -> Vector(1), __consumer_offsets-35 -> Vector(1), __consumer_offsets-41 -> Vector(1), __consumer_offsets-33 -> Vector(1), __consumer_offsets-23 -> Vector(1), __consumer_offsets-49 -> Vector(1), _schemas-0 -> Vector(1), MetadataChangeEvent_v4-0 -> Vector(1), __consumer_offsets-47 -> Vector(1), MetadataChangeLog_Timeseries_v1-0 -> Vector(1), __consumer_offsets-16 -> Vector(1), __consumer_offsets-28 -> Vector(1), __consumer_offsets-31 -> Vector(1), __consumer_offsets-36 -> Vector(1), __consumer_offsets-42 -> Vector(1), __consumer_offsets-3 -> Vector(1), __consumer_offsets-18 -> Vector(1), __consumer_offsets-37 -> Vector(1), __consumer_offsets-15 -> Vector(1), __consumer_offsets-24 -> Vector(1), MetadataChangeProposal_v1-0 -> Vector(1), DataHubUsageEvent_v1-0 -> Vector(1), __consumer_offsets-38 -> Vector(1), __consumer_offsets-17 -> Vector(1), __consumer_offsets-48 -> Vector(1), __confluent.support.metrics-0 -> Vector(1), __consumer_offsets-19 -> Vector(1), __consumer_offsets-11 -> Vector(1), FailedMetadataChangeEvent_v4-0 -> Vector(1), __consumer_offsets-13 -> Vector(1), __consumer_offsets-2 -> Vector(1), __consumer_offsets-43 -> Vector(1), __consumer_offsets-6 -> Vector(1), __consumer_offsets-14 -> Vector(1), MetadataAuditEvent_v4-0 -> Vector(1), FailedMetadataChangeProposal_v1-0 -> Vector(1), MetadataChangeLog_Versioned_v1-0 -> Vector(1), __consumer_offsets-20 -> Vector(1), __consumer_offsets-0 -> Vector(1), __consumer_offsets-44 -> Vector(1), __consumer_offsets-39 -> Vector(1), __consumer_offsets-12 -> Vector(1), 
__consumer_offsets-45 -> Vector(1), __consumer_offsets-1 -> Vector(1), __consumer_offsets-5 -> Vector(1), __consumer_offsets-26 -> Vector(1), __consumer_offsets-29 -> Vector(1), __consumer_offsets-34 -> Vector(1), __consumer_offsets-10 -> Vector(1), __consumer_offsets-32 -> Vector(1), __consumer_offsets-40 -> Vector(1))) (kafka.controller.KafkaController)
broker | [2022-04-06 12:11:25,102] DEBUG [Controller id=1] Topics not in preferred replica for broker 1 Map() (kafka.controller.KafkaController)
broker | [2022-04-06 12:11:25,102] TRACE [Controller id=1] Leader imbalance ratio for broker 1 is 0.0 (kafka.controller.KafkaController)
mysql | 2022-04-06T12:13:06.251636Z 33 [Note] Bad handshake
mysql | 2022-04-06T12:15:06.251404Z 35 [Note] Bad handshake
ambitious-cartoon-15344
04/07/2022, 1:52 AM
I followed https://datahubproject.io/docs/how/auth/add-users and used the static credentials method to add a user, but the Users & Groups tab in the UI does not show the user.
My steps:
[root@90037datahub ~]# echo "luoweiying:123456Test" >> ${HOME}/.datahub/plugins/frontend/auth/user.props
[root@90037datahub ~]# cat ${HOME}/.datahub/plugins/frontend/auth/user.props
luoweiying:123456Test
[root@90037datahub ~]# docker restart datahub-frontend-react
datahub-frontend-react
And this user cannot be used in Policies.
many-guitar-67205
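For reference, the frontend expects user.props to contain one `username:password` pair per line; a tiny illustrative parser of that format (this is not DataHub code, just a sketch of the expected file shape):

```python
def parse_user_props(text: str) -> dict:
    """Parse user.props-style lines of the form username:password."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        user, sep, password = line.partition(":")
        if sep:  # only lines that actually contain a colon are credentials
            users[user] = password
    return users

print(parse_user_props("luoweiying:123456Test"))
```

Note that users added via user.props only gain login credentials; a matching corpuser entity may still need to be ingested for the user to appear in Users & Groups and be referenced in Policies.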
04/07/2022, 9:19 AM
ERR_CONNECTION_RESET
errors in createHttpLink.ts:145 that occur when calling the GraphQL API. The query is:
{
"operationName": "createDomain",
"variables": {
"input": {
"id": "foobar-domain-id",
"name": "FOOBAR",
"description": "FOOBAR Description"
}
},
"query": "mutation createDomain($input: CreateDomainInput!) {\n createDomain(input: $input)\n}\n"
}
There is nothing in the datahub-gms logs.
Is this a bug? How can I use my own ID for a Domain?
breezy-portugal-43538
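The mutation body itself looks well-formed, and ERR_CONNECTION_RESET with nothing in the gms logs usually points at a proxy or ingress in front of the frontend rather than at the query. A stdlib sketch that builds and sanity-checks the same payload before sending it (the POST itself is left commented; endpoint and auth are assumptions):

```python
import json

payload = {
    "operationName": "createDomain",
    "variables": {
        "input": {
            "id": "foobar-domain-id",  # custom id that becomes part of the Domain urn
            "name": "FOOBAR",
            "description": "FOOBAR Description",
        }
    },
    "query": "mutation createDomain($input: CreateDomainInput!) {\n  createDomain(input: $input)\n}\n",
}

body = json.dumps(payload)
# POST `body` to https://<host>/api/v2/graphql with a session cookie or bearer token.
print(json.loads(body)["variables"]["input"]["id"])
```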
04/07/2022, 10:57 AM
docker-compose-without-neo4j-m1.quickstart.yml
and I had modified the following settings:
postgres:
command: --character-set-server=utf8mb4 --collation-server=utf8mb4_bin
container_name: postgres
environment:
- POSTGRES_DB=datahub
- POSTGRES_USER=datahub
- POSTGRES_PASSWORD=datahub
hostname: postgres
image: postgres:13
ports:
- 5432:5432
volumes:
- postgres-db-volume:/var/lib/postgresql/data
postgres-setup:
container_name: postgres-setup
depends_on:
- postgres
environment:
- POSTGRES_HOST=postgres
- POSTGRES_PORT=5432
- POSTGRES_USER=datahub
- POSTGRES_PASSWORD=datahub
- POSTGRES_DB=datahub
hostname: postgres-setup
image: acryldata/datahub-postgres-setup:head
Is this the correct way of addressing the change? Could you help by suggesting how to change the volume? What should it point to?
curved-truck-53235
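One thing to double-check against the snippet above: the named volume must also be declared at the top level of the compose file, and its mount target should remain postgres's data directory (names taken from the snippet; this is a sketch, not the full file):

```yaml
# Top-level declaration required for the named volume referenced by the
# postgres service; for postgres:13 the data directory inside the container
# is /var/lib/postgresql/data, so the mapping in the snippet is correct.
volumes:
  postgres-db-volume:
```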
04/07/2022, 11:20 AM
rich-machine-24265
04/07/2022, 12:49 PM
When I run ./gradlew build with JDK 1.8.0 I get the error:
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':metadata-integration:java:datahub-protobuf:compileJava'.
> error: release version 11 not supported
When run with OpenJDK 11, I get a lot of `cannot find symbol` errors for @javax.annotation.processing.Generated:
Task :datahub-graphql-core:compileJava FAILED
/home/akravtsov/Documents/IdeaProjects/datahub-latest/datahub-graphql-core/src/mainGeneratedGraphQL/java/com/linkedin/datahub/graphql/generated/Entity.java:7: error: cannot find symbol
@javax.annotation.processing.Generated(
^
symbol: class Generated
location: package javax.annotation.processing
/home/akravtsov/Documents/IdeaProjects/datahub-latest/datahub-graphql-core/src/mainGeneratedGraphQL/java/com/linkedin/datahub/graphql/generated/EntityWithRelationships.java:7: error: cannot find symbol
@javax.annotation.processing.Generated(
^
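The first failure means the Gradle toolchain is JDK 8 while the build targets release 11; the `cannot find symbol` errors under OpenJDK 11 often come from sources generated during the earlier JDK 8 attempt. A sketch of the usual recovery (the JDK path is an assumption for your machine; the gradle commands are shown as comments):

```shell
# Point the build at a JDK 11 installation (path is hypothetical; adjust to your system):
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk
export PATH="$JAVA_HOME/bin:$PATH"
# Then clean out sources generated under JDK 8 and rebuild:
#   ./gradlew clean build
echo "$JAVA_HOME"
```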
kind-psychiatrist-76973
04/07/2022, 1:05 PM
12:54:31.973 [pool-17-thread-1] ERROR c.d.m.a.AuthorizationManager - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://localhost:8080/entities
icy-piano-35127
04/07/2022, 7:20 PM
lively-action-8308
04/07/2022, 8:12 PM
package filters;
import play.mvc.EssentialAction;
import play.mvc.EssentialFilter;
import javax.inject.Inject;
import javax.inject.Singleton;
import java.util.concurrent.Executor;
@Singleton
public class CSPFilter extends EssentialFilter {
private final Executor exec;
/**
* @param exec This class is needed to execute code asynchronously.
*/
@Inject
public CSPFilter(Executor exec) {
this.exec = exec;
}
@Override
public EssentialAction apply(EssentialAction next) {
return EssentialAction.of(request ->
next.apply(request).map(result ->
result.withHeader(
"Content-Security-Policy",
"frame-ancestors 'self' http://localhost:3000")
, exec)
);
}
}
After deployment the OIDC configuration works fine. When we access it using our virtual service url, the login via Keycloak is successfully done but when we try to access Datahub embedded we receive the following logs:
19:28:10 [application-akka.actor.default-dispatcher-34] ERROR application -
! @7n94niein - Internal server error, for (GET) [/callback/oidc?state=ZGIDs6vDT7b3nE3bY5fti5p99UJKQozIGmVbGImckj0&session_state=a6a687e8-a210-4f7b-b2e2-998ce2c8656c&code=017576af-b373-48fe-9e38-a64510d81c05.a6a687e8-a210-4f7b-b2e2-998ce2c8656c.54b62100-14ad-4c8c-93d0-754f98ddfe4a] ->
play.api.UnexpectedException: Unexpected exception[CompletionException: org.pac4j.core.exception.TechnicalException: State parameter is different from the one sent in authentication request. Session expired or possible threat of cross-site request forgery]
at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:247)
at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:176)
at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:363)
at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:361)
at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:92)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.CompletionException: org.pac4j.core.exception.TechnicalException: State parameter is different from the one sent in authentication request. Session expired or possible threat of cross-site request forgery
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:56)
... 6 common frames omitted
Caused by: org.pac4j.core.exception.TechnicalException: State parameter is different from the one sent in authentication request. Session expired or possible threat of cross-site request forgery
at org.pac4j.oidc.credentials.extractor.OidcExtractor.extract(OidcExtractor.java:74)
at org.pac4j.oidc.credentials.extractor.OidcExtractor.extract(OidcExtractor.java:32)
at org.pac4j.core.client.BaseClient.retrieveCredentials(BaseClient.java:65)
at org.pac4j.core.client.IndirectClient.getCredentials(IndirectClient.java:140)
at org.pac4j.core.engine.DefaultCallbackLogic.perform(DefaultCallbackLogic.java:89)
at auth.sso.oidc.OidcCallbackLogic.perform(OidcCallbackLogic.java:87)
at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:62)
at controllers.SsoCallbackController$SsoCallbackLogic.perform(SsoCallbackController.java:49)
at org.pac4j.play.CallbackController.lambda$callback$0(CallbackController.java:56)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
... 7 common frames omitted
@bumpy-umbrella-67147
swift-breakfast-25077
04/07/2022, 8:23 PM