thankful-jackal-96705
06/23/2023, 9:46 AM
rich-restaurant-61261
06/23/2023, 8:30 PM
PipelineInitError: Failed to configure the source (superset): Exceeded 30 redirects.
The username and password I used here are correct. Does anyone know what this error means, and how can I solve the issue?
source:
  type: superset
  config:
    connect_uri: 'https://di-superset.aoc.xxx.com'
    username: xxx
    password: xxx
    provider: db
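"Exceeded 30 redirects" means every request to connect_uri bounced to another URL until the client gave up; common causes are an http-to-https hop, a proxy or load balancer, or a redirect to a login page when session auth fails. Also make sure the angle brackets Slack wraps around URLs never end up in the YAML. A stdlib-only probe to see where the first hop goes (no DataHub required):

```python
import urllib.error
import urllib.request

class StopRedirects(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the first 3xx surfaces as an HTTPError."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None aborts redirect handling

def first_hop(url: str) -> tuple:
    """Return (status_code, Location header or None) of the first response."""
    opener = urllib.request.build_opener(StopRedirects)
    try:
        with opener.open(url, timeout=10) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")

# e.g. first_hop("https://di-superset.aoc.xxx.com/api/v1/security/login")
# A 302 whose Location points back at a login page suggests the login
# flow is failing, not the network.
```

If the Location header keeps pointing at the same login URL, the superset source's session authentication is the likely culprit rather than the credentials themselves.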
microscopic-room-90690
06/26/2023, 5:19 AM
acceptable-morning-73148
06/26/2023, 7:12 AM
query ($urn: String!) {
  dataset(urn: $urn) {
    viewProperties {
      logic
    }
  }
}
This query retrieves the viewProperties and the logic attribute:
{
  "data": {
    "dataset": {
      "viewProperties": {
        "logic": "let source = Sql.Database ........"
      }
    }
  },
  "extensions": {}
}
How can I query its contents? For example, this query doesn't produce any results:
query ($source: String!) {
  searchAcrossEntities(input: {
    start: 0,
    count: 100,
    query: "",
    orFilters: [{
      and: [
        {field: "logic", condition: CONTAIN, negated: false, values: ["Sql.Database"]}
      ]
    }]
  }) {
    searchResults {
      entity {
        urn,
        __typename
      }
    }
  }
}
Note how I'm trying to match the logic attribute to a value it might contain.
fast-judge-41877
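The logic attribute on viewProperties is likely not an indexed search field, which would explain why the searchAcrossEntities filter matches nothing. A workaround is to fetch viewProperties per dataset (the first query above already does this) and filter client-side. A stdlib sketch; the endpoint URL and token are placeholders:

```python
import json
import urllib.request

DATAHUB_GRAPHQL = "http://localhost:8080/api/graphql"  # placeholder endpoint
TOKEN = "<personal-access-token>"                      # placeholder

LOGIC_QUERY = """query ($urn: String!) {
  dataset(urn: $urn) { viewProperties { logic } }
}"""

def post_graphql(query: str, variables: dict) -> dict:
    """POST a GraphQL query to DataHub and decode the JSON response."""
    req = urllib.request.Request(
        DATAHUB_GRAPHQL,
        data=json.dumps({"query": query, "variables": variables}).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def logic_contains(payload: dict, needle: str) -> bool:
    """Client-side filter over a dataset(urn){viewProperties{logic}} payload."""
    dataset = payload.get("data", {}).get("dataset") or {}
    props = dataset.get("viewProperties") or {}
    return needle in (props.get("logic") or "")

# Usage sketch: collect candidate URNs with a plain searchAcrossEntities
# query (no orFilters), then keep the ones whose view logic matches:
# hits = [u for u in urns
#         if logic_contains(post_graphql(LOGIC_QUERY, {"urn": u}), "Sql.Database")]
```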
06/26/2023, 8:27 AM
billions-rose-75566
06/26/2023, 8:50 AM
microscopic-room-90690
06/26/2023, 9:29 AM
millions-addition-50023
06/26/2023, 9:52 AM
millions-addition-50023
06/26/2023, 9:53 AM
fierce-restaurant-41034
06/26/2023, 3:06 PM
rich-restaurant-61261
06/26/2023, 7:19 PM
ripe-stone-30144
06/26/2023, 8:28 PM
ambitious-bird-91607
06/26/2023, 9:27 PM
I have a question about schemaMetadata and editableSchemaMetadata. After making changes directly in my ClickHouse source (schemaMetadata), I've observed that the DataHub user interface still displays the metadata stored in editableSchemaMetadata, without reflecting the changes made.
I would like to better understand how this situation is handled and whether there is any mechanism to automatically synchronize the metadata between both sources. Should I manually update editableSchemaMetadata to reflect the changes made in schemaMetadata? Does DataHub always give priority to editableSchemaMetadata, regardless of any recent updates in schemaMetadata?
billions-journalist-13819
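On the priority question: the UI generally overlays user edits stored in editableSchemaMetadata on top of the ingested schemaMetadata, so a field that was edited in the UI keeps showing the edit even after re-ingestion. A rough illustration of that per-field precedence (a sketch of the observed behavior, not DataHub's actual merge code):

```python
def merge_field_descriptions(ingested: dict, editable: dict) -> dict:
    """Per-field description merge: fieldPath -> description.

    `ingested` comes from schemaMetadata (written by ingestion);
    `editable` comes from editableSchemaMetadata (written via the UI).
    A non-empty UI edit shadows whatever ingestion wrote for that field.
    """
    merged = dict(ingested)
    for field_path, description in editable.items():
        if description:
            merged[field_path] = description
    return merged
```

Under this behavior the answer would be yes: to surface a source-side change for a field that was edited in the UI, the editableSchemaMetadata entry has to be updated (or cleared) as well.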
06/27/2023, 6:49 AM
quiet-scientist-40341
06/27/2023, 8:20 AM
public void testDataProcessInstance() throws IOException, ExecutionException, InterruptedException, URISyntaxException {
    KafkaEmitterConfig.KafkaEmitterConfigBuilder builder = KafkaEmitterConfig.builder();
    KafkaEmitterConfig config = builder.build();
    KafkaEmitter emitter = new KafkaEmitter(config);
    if (emitter.testConnection()) {
        DataFlowUrn dataFlowUrn = new DataFlowUrn("test_01", "mannual_001", "PROD");
        String flowId = DigestUtils.md5DigestAsHex(dataFlowUrn.toString().getBytes());
        Urn jobFlowRunUrn = Urn.createFromTuple("dataProcessInstance", flowId);
        DataJobUrn dataJobUrn = new DataJobUrn(dataFlowUrn, "mannual_job_001");
        String jobId = DigestUtils.md5DigestAsHex(dataJobUrn.toString().getBytes());
        Urn jobRunUrn = Urn.createFromTuple("dataProcessInstance", jobId);

        DataProcessInstanceProperties dataProcessInstanceProperties = new DataProcessInstanceProperties();
        dataProcessInstanceProperties.setName(jobId);
        AuditStamp auditStamp = new AuditStamp();
        auditStamp.setTime(System.currentTimeMillis());
        auditStamp.setActor(Urn.createFromString("urn:li:corpuser:datahub"));
        dataProcessInstanceProperties.setCreated(auditStamp);
        dataProcessInstanceProperties.setType(DataProcessType.BATCH_SCHEDULED);
        MetadataChangeProposalWrapper mcpw = MetadataChangeProposalWrapper.builder()
                .entityType("dataProcessInstance")
                .entityUrn(jobRunUrn)
                .upsert()
                .aspect(dataProcessInstanceProperties)
                .build();
        Future<MetadataWriteResponse> requestFuture = emitter.emit(mcpw, null);
        MetadataWriteResponse metadataWriteResponse = requestFuture.get();
        System.out.println(metadataWriteResponse.getResponseContent());
        System.out.println("===========dataProcessInstanceProperties=====");

        DataProcessInstanceRelationships dataProcessInstanceRelationships = new DataProcessInstanceRelationships();
        System.out.println("====jobFlowRunUrn=====" + jobFlowRunUrn.toString());
        System.out.println("====dataFlowUrn=====" + dataFlowUrn.toString());
        dataProcessInstanceRelationships.setParentInstance(jobFlowRunUrn);
        dataProcessInstanceRelationships.setParentTemplate(dataJobUrn);
        mcpw = MetadataChangeProposalWrapper.builder()
                .entityType("dataProcessInstance")
                .entityUrn(jobRunUrn)
                .upsert()
                .aspect(dataProcessInstanceRelationships)
                .build();
        requestFuture = emitter.emit(mcpw, null);
        metadataWriteResponse = requestFuture.get();
        System.out.println(metadataWriteResponse.getResponseContent());
        System.out.println("===========dataProcessInstanceRelationships=====");

        DataProcessInstanceRunEvent dataProcessInstanceRunEvent = new DataProcessInstanceRunEvent();
        dataProcessInstanceRunEvent.setStatus(DataProcessRunStatus.STARTED);
        dataProcessInstanceRunEvent.setTimestampMillis(System.currentTimeMillis());
        dataProcessInstanceRunEvent.setAttempt(1);
        mcpw = MetadataChangeProposalWrapper.builder()
                .entityType("dataProcessInstance")
                .entityUrn(jobRunUrn)
                .upsert()
                .aspect(dataProcessInstanceRunEvent)
                .build();
        requestFuture = emitter.emit(mcpw, null);
        metadataWriteResponse = requestFuture.get();
        System.out.println(metadataWriteResponse.getResponseContent());
        System.out.println("===========dataProcessInstanceRunEvent=====");
    }
    System.out.println("====================ab");
}
microscopic-room-90690
06/27/2023, 9:54 AM
I'm using make_dataset_urn in a Python script to ingest metadata from an S3 source. I tried make_dataset_urn(platform="s3", name="bdp/ingest/test/test/account") and got the result below. It shows the path I need, but the dataset name is expected to be account. Can anyone help?
ancient-queen-15575
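make_dataset_urn embeds the whole name string in the URN, and the UI derives the displayed name from it, which is why the full path shows up. One option is to keep the path in the URN but emit a datasetProperties aspect whose name is just the last path segment. A sketch (the helper names are mine; the URN layout mirrors what make_dataset_urn produces for the default PROD env):

```python
def s3_dataset_urn(path: str, env: str = "PROD") -> str:
    """Same URN shape as make_dataset_urn(platform="s3", name=path)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:s3,{path},{env})"

def leaf_name(path: str) -> str:
    """Last path segment, to use as the display name (e.g. 'account')."""
    return path.rstrip("/").rsplit("/", 1)[-1]
```

Emitting a DatasetProperties aspect with name=leaf_name(path) against s3_dataset_urn(path) should make the UI show account while the URN keeps the full path; worth verifying against your DataHub version.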
06/27/2023, 12:20 PM
• bucket_name/db_name/2022/10/03/customers_20221003.csv
• bucket_name/db_name/2022/10/03/customers_20221003_modified.csv
• bucket_name/db_name/2022/10/03/account_renewals_03102022.csv
So the format is something like bucket_name/db_name/yyyy/mm/dd/table_name_[yyyymmdd|ddmmyyyy]_[modified|].csv. The problem I have is that the {table} name is in the final filename, rather than a subdirectory in the path, and that it has variable text after the table name.
Is it possible for DataHub to read the right table name in these circumstances? If not, could I create a transformer to do this? I'm unclear on where I'd start if I do have to create a transformer.
numerous-address-22061
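Whether a path_spec {table} placeholder can match inside the filename depends on the source version, but the extraction itself is small. If a transformer (or a pre-processing step) ends up being the route, it could start from a regex fitted to the three examples above (a sketch, not a built-in transformer):

```python
import re

# table name, then _yyyymmdd or _ddmmyyyy (8 digits either way),
# then an optional _modified suffix, then .csv
FILENAME_RE = re.compile(r"^(?P<table>.+?)_(?:\d{8})(?:_modified)?\.csv$")

def table_from_key(key: str):
    """Extract the table name from an S3 key like
    bucket_name/db_name/2022/10/03/customers_20221003_modified.csv;
    returns None when the filename doesn't fit the pattern."""
    filename = key.rsplit("/", 1)[-1]
    m = FILENAME_RE.match(filename)
    return m.group("table") if m else None
```

The lazy `.+?` plus backtracking lets underscores survive inside the table name (account_renewals) while still stopping before the date block.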
06/27/2023, 7:14 PM
dazzling-rainbow-96194
06/27/2023, 8:37 PM
blue-honey-61652
06/28/2023, 6:39 AM
shy-dog-84302
06/28/2023, 12:27 PM
An externalUrl is added to the containers and datasets, pointing to the GCP console.
Similarly, I would like to add an external link to all my Kafka topics ingested from the Kafka plugin. That gives my users more info about a topic.
Is there an out-of-the-box config or solution available for that?
I tried exploring the GraphQL API to see if there is a mutation I can use to update it offline, but could not find one.
I only found the updateDataset mutation, which adds a link in the institutionalMemory. Not sure if this is the best way to go.
Any experience/suggestions on this would be greatly appreciated 🙂 Thanks.
blue-holiday-20644
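An alternative to institutionalMemory links: the datasetProperties aspect has an externalUrl field, presumably the same field the GCP-console links above come from, so emitting that aspect per topic should render the same kind of link. A sketch of the aspect body only; the /topics/... deep-link pattern is a placeholder for whatever console is in use:

```python
def kafka_topic_properties_aspect(topic: str, console_base: str) -> dict:
    """datasetProperties aspect body setting externalUrl for one topic.

    console_base: base URL of your Kafka console (placeholder); the
    /topics/<name> path is an assumed deep-link pattern, adjust as needed.
    """
    return {
        "name": topic,
        "externalUrl": f"{console_base}/topics/{topic}",
    }
```

The body could be sent as a metadata change proposal for the datasetProperties aspect on each topic's dataset URN (e.g. via the Python emitter), rather than through a GraphQL mutation.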
06/29/2023, 9:26 AM
acryl-datahub==0.10.4
acryl-datahub-airflow-plugin==0.10.4
dbt-core==1.4.0
dbt-redshift==1.4.0
We get:
ERROR: Cannot install acryl-datahub-airflow-plugin and dbt-core because these package versions have conflicting dependencies.
This error makes it look like a straight clash between the datahub plugin and dbt-core, but it's really a three-way clash. When we install those two from a requirements file locally, they are compatible. Even adding apache-airflow==2.5.1 to a local requirements.txt works. It is the combination of the datahub plugin, dbt, and MWAA that clashes.
We're removing the 0.10.4 version locks next, but if anyone's encountered this or has any ideas, that would be great.
dazzling-rainbow-96194
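pip's error message hides which transitive pin actually clashes; printing the requirements each distribution declares, side by side, usually exposes it. On MWAA the apache-airflow constraint comes from the environment itself, which would explain why the clash only appears there. A stdlib helper for a local environment where the packages are installed:

```python
from importlib.metadata import PackageNotFoundError, requires

def declared_requirements(dist: str) -> list:
    """Requirement specifiers a distribution declares; [] if not installed."""
    try:
        return requires(dist) or []
    except PackageNotFoundError:
        return []

# Compare the pins each side places on shared dependencies:
for pkg in ("acryl-datahub-airflow-plugin", "dbt-core", "apache-airflow"):
    print(pkg, "->", declared_requirements(pkg))
```

Lines mentioning the same dependency with disjoint version ranges are the real conflict pip is complaining about.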
06/29/2023, 12:11 PM
quiet-exabyte-77821
06/29/2023, 8:10 PM
steep-vr-39297
06/30/2023, 1:55 AM
When I run with the master branch, it runs fine, but when I run with the datahub-protobuf-0.10.5-SNAPSHOT.jar file from my build, I get an error.
I need your help.
jolly-airline-17196
06/30/2023, 7:23 AM
jolly-airline-17196
06/30/2023, 7:26 AM
cool-gpu-21169
06/30/2023, 3:18 PM
ancient-kitchen-28586
07/01/2023, 11:25 AM
average-nail-72662
07/02/2023, 11:29 PM