kind-dawn-17532
05/25/2022, 7:34 PM
cool-painting-92220
05/25/2022, 10:32 PM
rich-policeman-92383
05/26/2022, 10:08 AM
```
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to retrieve entities of type UsageType
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: java.lang.RuntimeException: Failed to retrieve entities of type UsageType
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: java.lang.RuntimeException: Failed to batch load Usage Stats
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: java.lang.RuntimeException: Failed to load Usage Stats for resource urn:li:dataset:(urn:li:dataPlatform:hive,edw_base.clickstream_base_dly,PROD)
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://localhost:8080/usageStats
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 at com.linkedin.restli.internal.client.ExceptionUtil.wrapThrowable(ExceptionUtil.java:135)
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://localhost:8080/usageStats
May 26, 2022 @ 15:18:59.000 /datahub_datahub-gms.3.uh37zjgealu56t32nr2ct4836 Caused by: java.util.concurrent.TimeoutException: Exceeded request timeout of 10000ms
```
great-cpu-72376
05/31/2022, 1:41 PM
```python
test_pull_task = PythonOperator(
    task_id="test_pull_task",
    python_callable=test,
    op_kwargs=test_operator_task.output,
    inlets={"dataset": [Dataset(platform="file", name="/test/inlet/input.txt")]},
    outlets={"dataset": [Dataset(platform="file", name="/test/outlet/output.txt")]},
)
```
In the task log I see this:
```
[2022-05-31, 13:34:40 UTC] {_lineage_core.py:80} INFO - Emitted from Lineage: DataJob(id='test_pull_task', urn=<datahub.utilities.urns.data_job_urn.DataJobUrn object at 0x7efdb7ac1e50>, flow_urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7efdbb025790>, name=None, description=None, properties={'task_id': "'test_pull_task'", '_outlets': '[]', 'label': "'test_pull_task'", '_downstream_task_ids': '[]', '_inlets': '[]', 'email': "['***.it.sgn@u-blox.com']", '_task_type': "'PythonOperator'", '_task_module': "'***.operators.python'", 'execution_timeout': 'None', 'depends_on_past': 'False', 'wait_for_downstream': 'False', 'sla': 'None', 'trigger_rule': "'all_success'"}, url='http://localhost:8080/taskinstance/list/?flt1_dag_id_equals=test_lineage_drop_partition&_flt_3_task_id=test_pull_task', tags={'drop', 'maya', 'postgresql', 'dba', 'partition'}, owners={'it-app-svc'}, inlets=[], outlets=[], upstream_urns=[<datahub.utilities.urns.data_job_urn.DataJobUrn object at 0x7efdb7ad5b50>])
```
The inlets and outlets are empty; why? I copied these lines from the example https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub_provider/example_dags/lineage_backend_demo.py
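Since the inlet/outlet shapes differ between variants of the example DAGs, here is a self-contained sketch of both forms (the `Dataset` class below is only a stand-in for `datahub_provider.entities.Dataset`, and the Airflow-version mapping of list vs. dict shapes is an assumption to verify against your installed version):

```python
from dataclasses import dataclass

# Stand-in for datahub_provider.entities.Dataset so the sketch runs
# without Airflow/DataHub installed; in a real DAG import the real class.
@dataclass(frozen=True)
class Dataset:
    platform: str
    name: str

def build_lineage_datasets(platform: str, paths: list[str]) -> list[Dataset]:
    """Build the Dataset list dynamically from a list of paths."""
    return [Dataset(platform=platform, name=p) for p in paths]

# Airflow 2.x examples pass plain lists to inlets/outlets:
inlets = build_lineage_datasets("file", ["/test/inlet/input.txt"])

# The Airflow 1.10.x variant wraps them in a dict; note the plural
# "datasets" key (an assumption worth checking for your version):
inlets_legacy = {"datasets": inlets}
```

If the key or shape does not match what your version of the lineage backend expects, the values may be dropped silently, which might explain the empty `inlets=[]` in the log above.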
Another question: if I want to pass the dataset arrays dynamically, what should I do?
calm-dinner-63735
06/03/2022, 12:59 PM
high-hospital-85984
06/06/2022, 5:16 PM
abundant-painter-6
06/07/2022, 2:33 PM
nutritious-bird-77396
06/10/2022, 7:59 PM
`clientId` and `telemetryClientId` have been added to the database.
I am assuming the name of the aspect was changed from one release to the next, but I am not sure; I don't see any evidence in Git for this assumption.
@early-lamp-41924 Would it be safer to update the `clientId` aspect to `telemetryClientId`?
plain-napkin-77279
06/13/2022, 6:13 AM
hallowed-machine-2603
06/15/2022, 1:37 AM
millions-notebook-72121
06/23/2022, 9:40 AM
quiet-arm-91745
06/28/2022, 4:12 PM
I ran `lineage_backend_demo` and got this log:
```
[2022-06-28, 16:05:30 UTC] {_lineage_core.py:67} INFO - Emitted from Lineage: DataFlow(urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7f2a4526f550>, id='datahub_lineage_backend_demo', orchestrator='airflow', cluster='prod', name=None, description="An example DAG demonstrating the usage of DataHub's Airflow lineage backend.\n\n", properties={'timezone': "'UTC'", 'start_date': '1656201600.0', 'fileloc': "'/home/airflow/gcs/dags/lineage_backend_demo.py'", 'tags': "['example_tag']", 'catchup': 'False', 'is_paused_upon_creation': 'None', '_default_view': "'tree'", '_access_control': 'None'}, url='https://xxxxxx-dot-asia-southeast1.composer.googleusercontent.com/tree?dag_id=datahub_lineage_backend_demo', tags={'example_tag'}, owners={'airflow'})
[2022-06-28, 16:05:30 UTC] {_lineage_core.py:80} INFO - Emitted from Lineage: DataJob(id='run_data_task', urn=<datahub.utilities.urns.data_job_urn.DataJobUrn object at 0x7f2a45210e80>, flow_urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7f2a4526ff70>, name=None, description=None, properties={'_downstream_task_ids': '[]', 'label': "'run_data_task'", '_inlets': '["Dataset(platform=\'snowflake\', name=\'mydb.schema.tableA\', env=\'PROD\')", "Dataset(platform=\'snowflake\', name=\'mydb.schema.tableB\', env=\'PROD\')"]', 'task_id': "'run_data_task'", 'execution_timeout': '300.0', 'email': "['jdoe@example.com']", '_outlets': '["Dataset(platform=\'snowflake\', name=\'mydb.schema.tableC\', env=\'PROD\')"]', '_task_type': "'BashOperator'", '_task_module': "'airflow.operators.bash'", 'depends_on_past': 'False', 'wait_for_downstream': 'False', 'trigger_rule': "'all_success'", 'sla': 'None'}, url='https://xxxxxxxx-dot-asia-southeast1.composer.googleusercontent.com/taskinstance/list/?flt1_dag_id_equals=datahub_lineage_backend_demo&_flt_3_task_id=run_data_task', tags={'example_tag'}, owners={'airflow'}, inlets=[<datahub.utilities.urns.dataset_urn.DatasetUrn object at 0x7f2a4522d5b0>, <datahub.utilities.urns.dataset_urn.DatasetUrn object at 0x7f2a4522d580>], outlets=[<datahub.utilities.urns.dataset_urn.DatasetUrn object at 0x7f2a4522d550>], upstream_urns=[])
```
Are there any steps that I missed? Thanks in advance.
nutritious-bird-77396
06/30/2022, 9:57 PM
With `0.8.39.1rc8`, still the same error.
steep-midnight-37232
07/05/2022, 2:15 PM
steep-soccer-91284
07/20/2022, 9:22 AM
witty-butcher-82399
07/22/2022, 12:50 PM
faint-translator-23365
08/01/2022, 7:51 PM
ancient-apartment-23316
08/02/2022, 7:01 PM
```yaml
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance
    alb.ingress.kubernetes.io/subnets: subnet-1, subnet-2
    # alb.ingress.kubernetes.io/certificate-arn: <<certificate-arn>>
    alb.ingress.kubernetes.io/inbound-cidrs: 0.0.0.0/0
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
    # alb.ingress.kubernetes.io/actions.ssl-redirect: '{"Type": "redirect", "RedirectConfig": { "Protocol": "HTTPS", "Port": "443", "StatusCode": "HTTP_301"}}'
  hosts:
    - host: "dev-datahub.qwerty.com"
      # redirectPaths:
      #   - path: /*
      #     name: ssl-redirect
      #     port: use-annotation
      paths:
        - /*
```
and DataHub is still inaccessible to me from the internet.
I think maybe this option could help me? https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/subcharts/datahub-frontend/values.yaml#L48
quick-pizza-8906
08/03/2022, 3:24 PM
```graphql
query count($urn: String!) {
  corpGroup(urn: $urn) {
    relationships(input: {
      types: ["OwnedBy"]
      direction: INCOMING
    }) {
      total
    }
  }
  searchAcrossEntities(input: {
    types: [],
    query: "*",
    filters: [
      {
        field: "owners",
        value: $urn
      }
    ]
  }) {
    total
  }
}
```
Basically I want to count datasets owned by the group; unfortunately, the counts from `corpGroup` and `searchAcrossEntities` do not match. It seems `corpGroup` also counts soft-deleted datasets. Is that intended? Can it somehow be avoided? `searchAcrossEntities` seems to give the correct count, but I need `corpGroup` to give the correct count as well. I noticed the problem by running this query:
```graphql
query getGroupsCount {
  search(
    input: {type: CORP_GROUP, query: "*", filters: [<some filters here>]}
  ) {
    searchResults {
      entity {
        urn
        ... on CorpGroup {
          name
          properties {
            displayName
          }
          relationships(input: {
            types: ["OwnedBy"]
            direction: INCOMING
          }) {
            total
          }
        }
      }
    }
  }
}
```
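To cross-check the two totals outside the UI, the count query can also be run programmatically. A sketch, assuming GMS's `/api/graphql` endpoint; the URL, token, and exact response shape are assumptions to verify:

```python
import json
import urllib.request

# Placeholders: point at your GMS host and a valid personal access token.
GMS_GRAPHQL_URL = "http://localhost:8080/api/graphql"
TOKEN = "<personal-access-token>"

COUNT_QUERY = """
query count($urn: String!) {
  corpGroup(urn: $urn) {
    relationships(input: {types: ["OwnedBy"], direction: INCOMING}) { total }
  }
  searchAcrossEntities(input: {types: [], query: "*",
                               filters: [{field: "owners", value: $urn}]}) { total }
}
"""

def build_payload(urn: str) -> bytes:
    # Standard GraphQL POST body: {"query": ..., "variables": ...}.
    return json.dumps({"query": COUNT_QUERY, "variables": {"urn": urn}}).encode()

def fetch_totals(urn: str) -> tuple:
    req = urllib.request.Request(
        GMS_GRAPHQL_URL,
        data=build_payload(urn),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)["data"]
    # Assumed response shape: the two totals live under these paths.
    return (data["corpGroup"]["relationships"]["total"],
            data["searchAcrossEntities"]["total"])
```

If `corpGroup` consistently reports the higher total, diffing the two result sets entity-by-entity (rather than just comparing totals) should reveal whether the extras are soft-deleted datasets.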
And it is impossible to run such an aggregation with `searchAcrossEntities`, as I understand it.
To add to that, I ran the first query against the demo instance and got an even more absurd result (see below). Any hints/tips on what I am doing wrong here?
faint-translator-23365
08/04/2022, 4:58 PM
famous-florist-7218
08/05/2022, 2:37 AM
adamant-van-21355
08/11/2022, 1:28 PM
With the `v0.8.43` release announced, the helm chart version (and repository) still points to `v0.8.42`.
1. Why is that? Is the helm chart planned to be updated to `v0.8.43` soon?
2. If not, is the `v0.8.42` helm release safe (bug-free) to upgrade to?
Thank you!
aloof-leather-92383
08/11/2022, 6:55 PM
ambitious-cartoon-15344
08/12/2022, 9:42 AM
busy-petabyte-37287
08/12/2022, 2:12 PM
busy-petabyte-37287
08/12/2022, 2:13 PM
great-motherboard-71467
08/16/2022, 1:00 PM
```
WHZ-Authentication {
  com.sun.security.auth.module.LdapLoginModule sufficient
  userProvider="ldaps://ldaps.some.server.eu:636/cn=users,cn=accounts,dc=some,dc=domain,dc=com"
  authzIdentity="{USERNAME}"
  userFilter="(&(objectClass=person)(uid={USERNAME}))"
  java.naming.security.authentication="simple"
  debug="true"
  useSSL="true";
};
```
As you can see, there is a change which works in my case: I replaced `authIdentity` with `authzIdentity`:
authzIdentity="{USERNAME}"
According to the documentation:
`authzIdentity=authz_id`
This option specifies an authorization identity for the user. `authz_id` is any string name. If it comprises a single special token with curly braces then that token is treated as an attribute name and will be replaced with a single value of that attribute from the user's LDAP entry. If the attribute cannot be found then the option is ignored. When this option is supplied and the user has been successfully authenticated then an additional `UserPrincipal` is created using the authorization identity and it is associated with the current `Subject`.
nutritious-bird-77396
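The substitution rule quoted above can be mimicked as a toy model (this is not the actual `LdapLoginModule` code; the entries and attribute names are illustrative):

```python
import re

def resolve_authz_identity(authz_id, ldap_entry):
    # If authz_id is exactly one {token}, treat the token as an attribute
    # name and substitute a single value of that attribute from the user's
    # LDAP entry; if the attribute is absent, the option is ignored
    # (modeled here as returning None).
    m = re.fullmatch(r"\{([^{}]+)\}", authz_id)
    if not m:
        return authz_id  # any plain string name is used as-is
    values = ldap_entry.get(m.group(1))
    return values[0] if values else None
```

For example, `authzIdentity="{uid}"` against an entry `{"uid": ["jdoe"]}` would yield `jdoe`, while a plain string such as `someName` passes through unchanged.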
08/17/2022, 4:07 PM
bland-barista-59197
09/02/2022, 6:01 PM
`upstream connect error or disconnect/reset before headers. reset reason: connection termination`
breezy-shoe-41523
09/06/2022, 9:55 AM