plain-analyst-30186
02/05/2023, 2:26 PM
best-umbrella-88325
02/06/2023, 10:44 AM
strong-easter-55319
02/06/2023, 11:19 AM
./gradlew build
I get the following error: no such file or directory: ./gradlew
In fact, this file does not exist in the root of the project. Is there a missing step in the documentation that I should look into?
strong-easter-55319
02/06/2023, 12:34 PM
./gradlew build
FAILURE: Build failed with an exception.
* Where:
Build file '/Users/gamboad/Sites/datahub/metadata-service/restli-servlet-impl/build.gradle' line: 80
* What went wrong:
A problem occurred evaluating project ':metadata-service:restli-servlet-impl'.
> Could not resolve all dependencies for configuration ':metadata-service:restli-servlet-impl:dataModel'.
> Failed to calculate the value of task ':metadata-models:compileJava' property 'javaCompiler'.
> Unable to download toolchain matching these requirements: {languageVersion=8, vendor=any, implementation=vendor-specific}
> Unable to download toolchain. This might indicate that the combination (version, architecture, release/early access, ...) for the requested JDK is not available.
> Could not read '<https://api.adoptopenjdk.net/v3/binary/latest/8/ga/mac/aarch64/jdk/hotspot/normal/adoptopenjdk>' as it does not exist.
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at <https://help.gradle.org>
Deprecated Gradle features were used in this build, making it incompatible with Gradle 7.0.
Use '--warning-mode all' to show the individual deprecation warnings.
See <https://docs.gradle.org/6.9.2/userguide/command_line_interface.html#sec:command_line_warnings>
BUILD FAILED in 1m 58s
My java version is 11
$ java --version
openjdk 11.0.18 2023-01-17
OpenJDK Runtime Environment Homebrew (build 11.0.18+0)
OpenJDK 64-Bit Server VM Homebrew (build 11.0.18+0, mixed mode)
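A hedged aside on the toolchain error above: the URL that could not be read is a JDK 8 build for mac/aarch64, which AdoptOpenJDK does not publish, so Gradle's toolchain auto-download fails. A minimal workaround sketch for gradle.properties, assuming a JDK 8 is already installed locally (the path below is a placeholder):
# gradle.properties (sketch): stop Gradle from trying to download a toolchain
# and point it at a locally installed JDK 8 instead.
org.gradle.java.installations.auto-download=false
org.gradle.java.installations.paths=/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home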
The issue seems to be related to the Gradle version. Which version should I use?
powerful-cat-68806
02/06/2023, 3:04 PM
0 0 * * *
)
Does it make sense that if they run simultaneously, they'll fail?
Also - can I kill a running process?
sparse-memory-36759
02/06/2023, 3:56 PM
handsome-football-66174
02/06/2023, 5:26 PM
salmon-spring-51500
02/06/2023, 7:12 PM
jolly-gpu-90313
02/07/2023, 6:07 AM
datahub docker quickstart
or sudo datahub docker quickstart
> sudo datahub docker quickstart
Docker doesn't seem to be running. Did you start it?
> datahub docker quickstart
Docker doesn't seem to be running. Did you start it?
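A hedged sketch of a quick cross-check: datahub docker quickstart reaches Docker through the standard Docker environment, so the daemon must be reachable from the same shell (and DOCKER_HOST, if set, must point at it):
> docker info          # should print server details if the daemon is reachable
> echo $DOCKER_HOST    # if set, the CLI may be pointed at a different daemon/socket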
Can someone help me figure this out? My docker
setup is running.
powerful-cat-68806
02/07/2023, 9:03 AM
gray-ocean-32209
02/07/2023, 10:19 AM
average-dinner-25106
02/07/2023, 10:39 AM
billowy-flag-4217
02/07/2023, 1:37 PM
include_view_lineage
for the postgres ingestion library. I have noticed that it doesn't always emit correct lineage between my entities. It mostly seems to be view-to-view lineage, where one view has an upstream dependency on another, that gets ignored.
Is this by design, or should we expect include_view_lineage
to include lineage between views too?
incalculable-manchester-41314
02/07/2023, 2:47 PM
melodic-ability-49840
02/07/2023, 4:32 PM
from datetime import timedelta
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago
from datahub_provider.entities import Dataset

default_args = {
    "owner": "airflow",
    "depends_on_past": False
}

with DAG(
    "datahub_lineage_backend_demo",
    default_args=default_args,
    description="An example DAG demonstrating the usage of DataHub's Airflow lineage backend.",
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    tags=["example_tag"],
    catchup=False,
) as dag:
    task1 = BashOperator(
        task_id="run_data_task",
        dag=dag,
        bash_command="echo 'This is where you might run your data tooling.'",
        inlets=[
            Dataset("snowflake", "mydb.schema.tableA"),
            Dataset("snowflake", "mydb.schema.tableB", "DEV")
        ],
        outlets=[Dataset("snowflake", "mydb.schema.tableD")],
    )
and also:
from datetime import timedelta

from airflow.operators.bash import BashOperator
from airflow import DAG
from airflow.utils.dates import days_ago

import datahub.emitter.mce_builder as builder
from datahub_provider.operators.datahub import DatahubEmitterOperator

default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=120),
}

with DAG(
    "datahub_lineage_emission_test",
    default_args=default_args,
    description="An example DAG demonstrating lineage emission within an Airflow DAG.",
    schedule_interval=timedelta(days=1),
    start_date=days_ago(2),
    catchup=False,
) as dag:
    # This example shows a BashOperator followed by a lineage emission. However, the
    # same DatahubEmitterOperator can be used to emit lineage in any context.
    lineage_dag_start = BashOperator(
        task_id="LINEAGE_START",
        dag=dag,
        bash_command="echo 'This is lineage test Start DAG'"
    )

    emit_lineage_task = DatahubEmitterOperator(
        task_id="emit_lineage",
        datahub_conn_id="datahub_rest_default",
        mces=[
            builder.make_lineage_mce(
                upstream_urns=[
                    builder.make_dataset_urn("s3", "mydb.schema.tableA"),
                    builder.make_dataset_urn("s3", "mydb.schema.tableB"),
                ],
                downstream_urn=builder.make_dataset_urn(
                    "s3", "mydb.schema.tableC"
                ),
            )
        ],
        dag=dag
    )

    lineage_finish = BashOperator(
        task_id="LINEAGE_FINISH",
        dag=dag,
        bash_command="echo 'This is lineage test Finish DAG'"
    )

    lineage_dag_start >> emit_lineage_task >> lineage_finish
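For reference, a minimal sketch of emitting the same lineage outside Airflow with the DataHub Python REST emitter, assuming the acryl-datahub package is installed and GMS is reachable at http://localhost:8080 (a placeholder):
import datahub.emitter.mce_builder as builder
from datahub.emitter.rest_emitter import DatahubRestEmitter

# Build the same tableA/tableB -> tableC lineage MCE as in the DAG above.
lineage_mce = builder.make_lineage_mce(
    upstream_urns=[
        builder.make_dataset_urn("s3", "mydb.schema.tableA"),
        builder.make_dataset_urn("s3", "mydb.schema.tableB"),
    ],
    downstream_urn=builder.make_dataset_urn("s3", "mydb.schema.tableC"),
)

# Placeholder GMS endpoint; adjust for your deployment.
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
emitter.emit_mce(lineage_mce)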
rhythmic-quill-75064
02/07/2023, 4:33 PM
datahub-datahub-gms
:
[R2 Nio Event Loop-1-1] WARN c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
Caused by: java.net.ConnectException: Connection refused
[...]
[pool-16-thread-1] ERROR c.d.m.ingestion.IngestionScheduler:244 - Failed to retrieve ingestion sources! Skipping updating schedule cache until next refresh. start: 0, count: 30
com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI <http://localhost:8080/entities>
[...]
Caused by: com.linkedin.r2.RetriableRequestException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
gray-ghost-82678
02/07/2023, 5:25 PM
salmon-jordan-53958
02/07/2023, 5:30 PM
green-hamburger-3800
02/07/2023, 5:31 PM
bland-orange-13353
02/07/2023, 5:39 PM
salmon-jordan-53958
02/07/2023, 11:17 PM
quaint-barista-82836
02/08/2023, 3:46 AM
Calculating Metrics:   0%|          | 0/15 [00:00<?, ?it/s]
Calculating Metrics:  13%|█▎        | 2/15 [00:00<00:01, 12.15it/s]
Calculating Metrics:  27%|██▋       | 4/15 [00:02<00:08, 1.34it/s]
Calculating Metrics:  47%|████▋     | 7/15 [00:02<00:05, 1.34it/s]
Calculating Metrics:  80%|████████  | 12/15 [00:05<00:01, 2.14it/s]
Calculating Metrics: 100%|██████████| 15/15 [00:07<00:00, 1.93it/s]
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Finding datasets being validated
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Datasource my_bigquery_datasource is not present in platform_instance_map
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - GE expectation_suite_name - demo, expectation_type - expect_column_values_to_not_be_null, Assertion URN - urn:li:assertion:6f56acc887e38af0561eaeb8d41b0bdb
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - GE expectation_suite_name - demo, expectation_type - expect_column_values_to_be_between, Assertion URN - urn:li:assertion:aa04dc0fc98f145d01ae9fcd5f7f4ee3
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Sending metadata to datahub ...
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Dataset URN - urn:li:dataset:(urn:li:dataPlatform:bigquery,project.dataset.fbt_diff,PROD)
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Assertion URN - urn:li:assertion:6f56acc887e38af0561eaeb8d41b0bdb
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Assertion URN - urn:li:assertion:aa04dc0fc98f145d01ae9fcd5f7f4ee3
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Metadata sent to datahub.
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Validation succeeded!
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO -
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - Suite Name Status Expectations met
[2023-02-08, 01:07:57 EST] {subprocess.py:89} INFO - - demo ✔ Passed 2 of 2 (100.0 %)
[2023-02-08, 01:07:59 EST] {subprocess.py:93} INFO - Command exited with return code 0
I'm getting "Datasource my_bigquery_datasource is not present in platform_instance_map" even though I'm passing the value as:
action_list:
  - name: store_validation_result
    action:
      class_name: StoreValidationResultAction
  - name: store_evaluation_params
    action:
      class_name: StoreEvaluationParametersAction
  - name: update_data_docs
    action:
      class_name: UpdateDataDocsAction
      site_names: []
  - name: datahub_action
    action:
      module_name: datahub.integrations.great_expectations.action
      class_name: DataHubValidationAction
      server_url: <http://ip_address:8080>
      platform_instance_map:
        datasource_name: my_bigquery_datasource
      parse_table_names_from_sql: true
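One hedged observation: DataHubValidationAction's platform_instance_map appears to be keyed by the datasource name itself (datasource name -> platform instance), so the warning may come from using the literal key datasource_name above. A minimal sketch of the likely intended shape, with my_platform_instance as a hypothetical value:
platform_instance_map:
  my_bigquery_datasource: my_platform_instance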
magnificent-lock-58916
02/08/2023, 9:31 AM
shy-keyboard-55519
02/08/2023, 10:16 AM
datahub-gms
ends up crashlooping. I also tried increasing datahub-gms
versions by patch versions, and this behavior starts between v0.9.6.3 and v0.9.6.4.
I can provide any necessary logs upon request.
fierce-garage-74290
02/08/2023, 11:55 AM
url
to corresponding Confluence pages?
What would be your best bet here (or some good practices based on experience)? Thanks!
polite-honey-55441
02/08/2023, 12:26 PM
lively-spring-5482
02/08/2023, 3:04 PM
Failed to perform post authentication steps. Error message: Failed to provision user with urn
Frontend throws the following exception:
Caused by: com.linkedin.r2.message.rest.RestException: Received error 500 from server for URI <http://datahub-datahub-gms:8080/entities>
The gms application complains:
Caused by: java.sql.SQLException: Incorrect string value: '\xC5\x82aw F...' for column 'metadata' at row 1
Obviously, a UTF-8 "ł" character handling issue.
What can be done in this situation? And no…, I'm not that much into changing my name, Jaros*ł*aw is not that bad ;)
Thanks in advance for your suggestions!
quaint-barista-82836
02/08/2023, 5:48 PM
able-evening-90828
02/08/2023, 11:31 PM
query searchDataPlatformInstance {
  searchAcrossEntities(
    input: {types: [DATA_PLATFORM_INSTANCE], query: "", start: 0, count: 1000}
  ) {
    start
    count
    total
    searchResults {
      entity {
        urn
        type
      }
    }
  }
}
able-evening-90828
02/09/2023, 1:17 AM
urn:li:tag: user identifier
, but I want to match either urn:li:tag:user identifier
or urn:li:tag:email address
.
query getSearchResultsForMultiple {
  searchAcrossEntities(input: {
    types: [DATASET],
    query: "",
    start: 0,
    count: 1000,
    orFilters: [
      {
        and: [
          {
            field: "fieldTags",
            values: ["urn:li:tag:user identifier", "urn:li:tag:email address"],
            condition: EQUAL
          }
        ]
      }
    ]
  }) {
    start
    count
    total
    searchResults {
      entity {
        urn
        type
      }
    }
  }
}
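A minimal sketch of running this query programmatically, assuming a GMS reachable at http://localhost:8080 and a personal access token (both placeholders):
import requests

GRAPHQL_URL = "http://localhost:8080/api/graphql"  # placeholder GMS endpoint
TOKEN = "<personal-access-token>"  # hypothetical token generated in the DataHub UI

query = """
query getSearchResultsForMultiple {
  searchAcrossEntities(input: {
    types: [DATASET], query: "", start: 0, count: 1000,
    orFilters: [{and: [{
      field: "fieldTags",
      values: ["urn:li:tag:user identifier", "urn:li:tag:email address"],
      condition: EQUAL
    }]}]
  }) {
    total
    searchResults { entity { urn type } }
  }
}
"""

# POST the query as a JSON body with an Authorization bearer token.
response = requests.post(
    GRAPHQL_URL,
    json={"query": query},
    headers={"Authorization": f"Bearer {TOKEN}"},
)
response.raise_for_status()
print(response.json())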