early-hydrogen-27542
02/23/2023, 8:37 PM
search input that retrieves dataset entities by platform and query text that looks for a specific schema name:
search(
input: {type: DATASET, query: "schema_1.", orFilters: [{and: [{field: "platform", values: ["urn:li:dataPlatform:redshift"]}]}], start: 0, count: 10}
)
This works well for one schema and returns nearly the same number of datasets I see through the UI. However, when I try a second schema name (e.g. schema_2), it returns 10k+ datasets, which is far more than actually exist.
Is there a better way to look for tables in specific schemas?
colossal-autumn-78301
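One hedged way to cross-check the counts is to post-filter the returned URNs client-side by their schema segment. The sketch below assumes Redshift dataset names follow a `database.schema.table` pattern inside the URN (an assumption about this deployment). `schema_of` and `filter_by_schema` are hypothetical helpers written for this illustration, not part of any DataHub client, and the sample URNs are made up.

```python
import re
from typing import Iterable, List, Optional

# Dataset URNs have the shape:
#   urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
_DATASET_URN = re.compile(
    r"^urn:li:dataset:\(urn:li:dataPlatform:([^,]+),(.+),([^,]+)\)$"
)

def schema_of(urn: str) -> Optional[str]:
    """Return the schema segment of a dataset URN, assuming db.schema.table naming."""
    match = _DATASET_URN.match(urn)
    if match is None:
        return None
    parts = match.group(2).split(".")
    # Assumption: the second-to-last dotted segment is the schema name.
    return parts[-2] if len(parts) >= 2 else None

def filter_by_schema(urns: Iterable[str], schema: str) -> List[str]:
    """Keep only URNs whose schema segment equals the requested schema exactly."""
    return [urn for urn in urns if schema_of(urn) == schema]

# Made-up URNs for illustration only.
sample_urns = [
    "urn:li:dataset:(urn:li:dataPlatform:redshift,dev.schema_1.orders,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:redshift,dev.schema_2.orders,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:redshift,dev.other.schema_2_backup,PROD)",
]
matches = filter_by_schema(sample_urns, "schema_2")
print(matches)
```

An exact comparison on the schema segment avoids counting tables that merely contain the schema name somewhere in their own name.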
02/24/2023, 10:51 AM
rough-lamp-22858
02/26/2023, 12:35 PM
salmon-angle-92685
02/27/2023, 10:38 AM
best-umbrella-88325
02/27/2023, 12:21 PM
Error connecting to node prerequisites-kafka-0.prerequisites-kafka-headless.default.svc.cluster.local:9092 (id: 0 rack: null)
It looks like it is trying to connect to the internal service name of the Kafka pod. However, I couldn't find this name anywhere in the code or in the properties files. The application.yml has the value localhost:9092 (I've port-forwarded the Kafka pod, so this should work).
Can someone point me to the place where I should change the URL?
Thanks in advance.
bitter-translator-92563
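A likely explanation, hedged because the full setup is not shown: a Kafka client uses the bootstrap address only for its first connection, then connects to the host names the broker advertises in its metadata. That would explain why localhost:9092 in application.yml still leads to the in-cluster name. A minimal sketch for checking whether the advertised name resolves from the local machine; `parse_node` and `resolves` are hypothetical helpers written for this illustration.

```python
import re
import socket
from typing import Optional, Tuple

def parse_node(error_line: str) -> Optional[Tuple[str, int]]:
    """Extract (host, port) from an 'Error connecting to node host:port' message."""
    match = re.search(r"Error connecting to node (\S+):(\d+)", error_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))

def resolves(host: str) -> bool:
    """Return True if the host name resolves from this machine."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False

line = (
    "Error connecting to node prerequisites-kafka-0.prerequisites-kafka-headless"
    ".default.svc.cluster.local:9092 (id: 0 rack: null)"
)
node = parse_node(line)
print(node)
```

Calling `resolves(node[0])` on the development machine shows whether the advertised name is reachable at all. If it is not, possible directions include mapping that name to 127.0.0.1 in the local hosts file or adjusting the broker's `advertised.listeners`; which one fits depends on the deployment.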
02/28/2023, 2:47 PM
incalculable-needle-41145
03/01/2023, 1:08 AM
colossal-autumn-78301
03/02/2023, 12:12 PM
datahub-gma (https://github.com/linkedin/datahub-gma) used/referenced (or used in deployment) by any of the code in the main datahub repository at https://github.com/datahub-project/datahub? I could not find any references to it in the main repo. Any hints will be appreciated.
rough-journalist-49506
03/02/2023, 1:33 PM
green-activity-32141
03/02/2023, 2:19 PM
green-activity-32141
03/02/2023, 2:28 PM
adorable-computer-92026
03/02/2023, 2:39 PM
rich-salesmen-77587
03/02/2023, 3:55 PM
busy-action-2524
03/02/2023, 6:34 PM
green-activity-32141
03/02/2023, 7:59 PM
straight-policeman-77814
03/06/2023, 6:34 AM
purple-terabyte-64712
03/06/2023, 12:29 PM
bland-orange-13353
03/06/2023, 6:22 PM
many-nest-43191
03/06/2023, 6:56 PM
Could not find method compile() for arguments [io.acryljson schema avro0.1.5, build_3txm9qv85o1lfzkn2hmnfzpka$_run_closure1$_closure2@31ca483a] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler.
Can someone help me solve this?
fierce-forest-92066
03/06/2023, 10:03 PM
brave-judge-32701
03/07/2023, 7:37 AM
create table test.testtable4 as select * from test.testtable3, but the upstream of table testtable4 is sql at <console>:23, not testtable3. Is this a compatibility issue?
Also, Spark runs on Hive, and a Hive table created by Spark does not show up in DataHub immediately; I need to run a batch ingestion task to ingest the Hive metastore data.
big-postman-38407
03/07/2023, 10:45 AM
Sub Type right under Type, because they are connected, and I do not understand why the sorting works differently.
early-airline-85277
03/07/2023, 5:11 PM
Kafka. It looks like the filter option does not support this in "Edit View".
refined-football-89019
03/07/2023, 9:15 PM
<dependency>
  <groupId>io.acryl</groupId>
  <artifactId>datahub-client</artifactId>
  <version>0.10.0-4</version>
</dependency>
...but most of the code examples are in Python. For example: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/library/lineage_job_dataflow_new_api_simple.py. This sample Python code uses the classes DataJob, DataFlow, etc. These don't appear to have counterparts in the Java client library. Take DataFlow: the Java client library does contain DataFlowInfo, DataFlowKey, etc., but not DataFlow (with a 3-parameter constructor). Can you point me to sample Java code corresponding to the Python example above? Thanks.
handsome-flag-16272
03/07/2023, 10:26 PM
./gradlew quickstartDebug
I plan to change the Elasticsearch port 9200 to something else, like 39200, to avoid security scan failures on port 9200 for the HTTP GET and DELETE methods.
Could anybody tell me in which files I should make this change?
So far, I have made changes in the following files:
• docker/elasticsearch-setup/env/docker.env
• docker/quickstart/docker-compose.quickstart.yml
• docker/docker-compose.yml
tall-eye-41335
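To find other places where the port may be hard-coded, one option is to scan the checkout for the literal 9200. The sketch below is a generic text search, not a DataHub tool. `find_port_references` is a hypothetical helper, and the suffix list is an assumption about which file types matter.

```python
from pathlib import Path
from typing import Iterable, List, Tuple

def find_port_references(
    root: str,
    port: str = "9200",
    suffixes: Iterable[str] = (".yml", ".yaml", ".env", ".properties", ".sh", ".gradle"),
) -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each line mentioning the port."""
    allowed = tuple(suffixes)
    root_path = Path(root)
    hits: List[Tuple[str, int, str]] = []
    for path in sorted(root_path.rglob("*")):
        # Skip directories and files whose suffix is not in the allow list.
        if not path.is_file() or not path.name.endswith(allowed):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if port in line:
                hits.append((path.relative_to(root_path).as_posix(), number, line.strip()))
    return hits

# Assumes the script is run from the root of a DataHub checkout.
if Path("docker").is_dir():
    for rel_path, number, line in find_port_references("docker"):
        print(f"{rel_path}:{number}: {line}")
```

Each hit still needs a manual look, since a plain substring match also reports unrelated values that happen to contain 9200.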
03/07/2023, 11:44 PM
bland-appointment-45659
03/08/2023, 3:03 AM
tall-butcher-30509
03/08/2023, 5:54 AM
adorable-computer-92026
03/08/2023, 10:56 AM
polite-tent-71027
03/08/2023, 12:52 PM
s3 and dbt and can't make them work together. Maybe someone can point me in the right direction. So:
• I have a delta table in s3 at the path s3a://core-data/Features/ssns__lagermetrics
• There is an external table in Hive created as
CREATE EXTERNAL TABLE `features`.`ssns__lagermetrics` USING DELTA LOCATION 's3a://core-data/Features/ssns__lagermetrics';
• I use the external table as a source in dbt:
sources:
  - name: features
    description: features schema
    tables:
      - name: ssns__lagermetrics
        ...
• After ingestion there are 2 separate, disconnected ssns__lagermetrics entities which I can't connect =( I tried to use transformers, but that seems to be the wrong direction... I would very much appreciate any help.
-----
PS. I just scanned through the history and found the dbt-labs/dbt_external_tables package. I'll try it and update this message afterwards.
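One thing that may be worth checking (a guess, not a confirmed diagnosis): DataHub identifies each dataset by a URN built from platform, name, and environment, so the same physical table ingested under different platforms or names ends up as separate entities. A minimal sketch of that comparison follows. `dataset_urn` is a local helper that only mirrors the URN shape, and the platform and name values are illustrative assumptions, not what the ingestion actually emitted.

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string with the platform, name, and environment parts."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"

# Assumed values: the real ones depend on each ingestion recipe.
from_storage = dataset_urn("s3", "core-data/Features/ssns__lagermetrics")
from_hive = dataset_urn("hive", "features.ssns__lagermetrics")

print(from_storage)
print(from_hive)
print("same entity:", from_storage == from_hive)
```

Comparing the URNs shown in the UI for the two entities with the platform the dbt source is configured to target may show where the mismatch comes from.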