mammoth-bear-12532
handsome-football-66174
06/08/2022, 5:34 PM
upstream = Upstream(dataset=datasetUrn("bar2"), type=DatasetLineageType.TRANSFORMED)
fieldLineages = UpstreamLineage(
    upstreams=[upstream], fineGrainedLineages=fineGrainedLineages
)
lineageMcp = MetadataChangeProposalWrapper(
    entityType="dataset",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=datasetUrn("bar"),
    aspectName="upstreamLineage",
    aspect=fieldLineages,
)
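The snippet above assumes imports and a fineGrainedLineages list defined earlier; a minimal sketch of those pieces, following DataHub's fine-grained lineage example (the "postgres" platform and the field names are placeholders, not from the thread):

import datahub.emitter.mce_builder as builder
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.metadata.schema_classes import ChangeTypeClass
from datahub.metadata.com.linkedin.pegasus2avro.dataset import (
    DatasetLineageType,
    FineGrainedLineage,
    FineGrainedLineageDownstreamType,
    FineGrainedLineageUpstreamType,
    Upstream,
    UpstreamLineage,
)

def datasetUrn(tbl):
    # Placeholder platform; the thread does not say which one is in use.
    return builder.make_dataset_urn("postgres", tbl)

def fldUrn(tbl, fld):
    return builder.make_schema_field_urn(datasetUrn(tbl), fld)

# One field-level edge: bar2.c1 feeds bar.c1 (illustrative names).
fineGrainedLineages = [
    FineGrainedLineage(
        upstreamType=FineGrainedLineageUpstreamType.FIELD_SET,
        upstreams=[fldUrn("bar2", "c1")],
        downstreamType=FineGrainedLineageDownstreamType.FIELD,
        downstreams=[fldUrn("bar", "c1")],
    )
]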
bright-cpu-56427
06/09/2022, 7:32 AM
pydantic.error_wrappers.ValidationError: 1 validation error for PipelineConfig
source -> filename:
  extra fields not permitted (type=value_error.extra)
I'm getting this error and can't tell which part of the config is wrong.
My Python code is:
def add_quicksight_platform():
    pipeline = Pipeline.create(
        # This configuration is analogous to a recipe configuration.
        {
            "source": {
                "type": "file",
                "filename:": "/opt/airflow/dags/datahub/quicksight.json"
            },
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "{datahub-gms-ip}:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()
quicksight.json
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.DataPlatformSnapshot": {
      "urn": "urn:li:dataPlatform:quicksight",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.dataplatform.DataPlatformInfo": {
            "datasetNameDelimiter": "/",
            "name": "quicksight",
            "type": "OTHERS",
            "logoUrl": "https://play-lh.googleusercontent.com/dbiOAXowepd9qC69dUnCJWEk8gg8dsQburLUyC1sux9ovnyoyH5MsoLf0OQcBqRZILB0=w240-h480-rw"
          }
        }
      ]
    }
  },
  "proposedDelta": null
}
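A hedged guess at the fix, based only on the error path source -> filename: the key name itself carries a stray trailing colon, and the file source expects its options nested under "config". A sketch that should validate (the GMS address is kept as the placeholder from the original):

from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "file",
            # No trailing colon in the key, and nested under "config".
            "config": {"filename": "/opt/airflow/dags/datahub/quicksight.json"},
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://datahub-gms-ip:8080"},  # placeholder host
        },
    }
)
pipeline.run()
pipeline.raise_from_status()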
bumpy-activity-74405
06/09/2022, 8:13 AM
brave-pencil-21289
06/09/2022, 12:11 PM
crooked-lunch-27985
06/09/2022, 5:48 PM
rich-policeman-92383
06/09/2022, 6:11 PM
stocky-midnight-78204
06/10/2022, 2:49 AM
rich-policeman-92383
06/10/2022, 9:25 AM
brave-pencil-21289
06/10/2022, 11:41 AM
fresh-garage-83780
06/10/2022, 12:59 PM
syntax = "proto3";
package atech.proto.shared;

message Uuid {
  string value = 1;
}
What I'm seeing is that it pulls in the first topic, then errors on the second. I think it's telling me this is because of duplicated shared proto objects:
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "shared/uuid.proto":
atech.proto.shared.Uuid.value: "atech.proto.shared.Uuid.value" is already defined in file "atech_app_journey_ape_completed-key.proto".
atech.proto.shared.Uuid: "atech.proto.shared.Uuid" is already defined in file "atech_app_journey_ape_completed-key.proto".
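For reference, this collision can be reproduced directly against a protobuf descriptor pool: registering the same fully qualified message under two different .proto file names is what triggers this exact TypeError. A minimal sketch, assuming the C++/upb protobuf backend and reusing the file names from the trace above:

from google.protobuf import descriptor_pb2, descriptor_pool

def uuid_file(file_name):
    # Build a FileDescriptorProto declaring atech.proto.shared.Uuid.
    fdp = descriptor_pb2.FileDescriptorProto()
    fdp.name = file_name
    fdp.package = "atech.proto.shared"
    fdp.syntax = "proto3"
    msg = fdp.message_type.add()
    msg.name = "Uuid"
    field = msg.field.add()
    field.name = "value"
    field.number = 1
    field.type = descriptor_pb2.FieldDescriptorProto.TYPE_STRING
    field.label = descriptor_pb2.FieldDescriptorProto.LABEL_OPTIONAL
    return fdp

pool = descriptor_pool.DescriptorPool()
pool.Add(uuid_file("atech_app_journey_ape_completed-key.proto"))
# Same symbol, different file name -> "already defined in file ..." TypeError.
pool.Add(uuid_file("shared/uuid.proto"))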
Seems that I might have hit this limitation, but I can't think of a workaround, so I'm not quite sure where to go from here. Any clues?
some-lighter-85578
06/10/2022, 1:36 PM
adorable-guitar-54244
06/10/2022, 3:22 PM
bland-orange-13353
06/10/2022, 3:31 PM
brainy-shampoo-83265
06/10/2022, 5:25 PM
stocky-midnight-78204
06/13/2022, 1:52 AM
stocky-midnight-78204
06/13/2022, 3:42 AM
presto-on-hive
lemon-zoo-63387
06/13/2022, 6:39 AM
lemon-zoo-63387
06/13/2022, 6:43 AM
bright-cpu-56427
06/13/2022, 9:48 AM
emit_lineage_task = DatahubEmitterOperator(
    task_id=f"emit_lineage_{table}",
    datahub_conn_id="datahub_rest_default",
    mces=[
        builder.make_lineage_mce(
            upstream_urns=upstreams,
            downstream_urn=builder.make_dataset_urn(downstream[0], downstream[1]),
        )
    ],
)
emit_lineage_task.execute(kwargs)
Log file:
[2022-06-13 09:27:43,912] {_lineage_core.py:67} INFO - Emitted from Lineage:
DataFlow(urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7f9bcf23eee0>, ... url='http://localhost:8080/tree?dag_id=dag', ...)
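For context, the operator call above assumes names defined elsewhere in the DAG; a sketch with placeholder values (the platform and table names are assumptions, not from the thread):

import datahub.emitter.mce_builder as builder

table = "my_table"                                # hypothetical table name
downstream = ("snowflake", "db.schema.my_table")  # (platform, dataset name)
upstreams = [
    builder.make_dataset_urn("snowflake", "db.schema.upstream_table"),
]
# kwargs is the Airflow task context handed to execute().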
rich-rocket-77152
06/13/2022, 10:25 AM
fresh-napkin-5247
06/13/2022, 10:56 AM
bumpy-daybreak-85714
06/13/2022, 12:26 PM
INFO - ERROR:root:Error: Unable to locate credentials
From the code perspective, I guess that for this to work, this line needs to be modified:
https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/aws/aws_common.py#L114
Am I right?
Using version acryl-datahub[dbt] == 0.8.36
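For context, that error is boto3's standard credential chain coming up empty (environment variables, the shared credentials file, then an instance/task role); a generic sketch, not DataHub-specific, to check what the ingestion process actually sees:

import boto3

# Resolves credentials the same way the source's S3 client would.
session = boto3.Session()
print(session.get_credentials())  # None reproduces "Unable to locate credentials"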
salmon-angle-92685
06/13/2022, 1:18 PM
1 validation error for SnowflakeConfig
top_n_queries
  extra fields not permitted (type=value_error.extra)
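One plausible reading of this error (an assumption, not confirmed in the thread): SnowflakeConfig belongs to the plain snowflake source, while top_n_queries is an option of the snowflake-usage source, so the plain source rejects it as an extra field. A sketch of where the option validates (connection values are placeholders):

from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "snowflake-usage",  # not "snowflake"
            "config": {
                "host_port": "xxxx.snowflakecomputing.com",
                "username": "user",
                "password": "pass",
                "top_n_queries": 10,
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }
)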
Thank you in advance!
stocky-energy-24880
06/13/2022, 1:39 PM
busy-airport-23391
06/13/2022, 9:31 PM
stats. At the moment, all I'm trying to upsert is the timestampMillis. When I try to set the timestampMillis, I get the following error despite passing a long value:
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
at com.linkedin.metadata.timeseries.transformer.TimeseriesAspectTransformer.getCommonDocument(TimeseriesAspectTransformer.java:80)
at com.linkedin.metadata.timeseries.transformer.TimeseriesAspectTransformer.transform(TimeseriesAspectTransformer.java:47)
at com.linkedin.metadata.kafka.hook.UpdateIndicesHook.updateTimeseriesFields(UpdateIndicesHook.java:217)
at com.linkedin.metadata.kafka.hook.UpdateIndicesHook.invoke(UpdateIndicesHook.java:115)
at com.linkedin.metadata.kafka.MetadataChangeLogProcessor.consume(MetadataChangeLogProcessor.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:169)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:119)
at org.springframework.kafka.listener.adapter.HandlerAdapter.invoke(HandlerAdapter.java:56)
at org.springframework.kafka.listener.adapter.MessagingMessageListenerAdapter.invokeHandler(MessagingMessageListenerAdapter.java:347)
at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:92)
at org.springframework.kafka.listener.adapter.RecordMessagingMessageListenerAdapter.onMessage(RecordMessagingMessageListenerAdapter.java:53)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeOnMessage(KafkaMessageListenerContainer.java:2334)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeOnMessage(KafkaMessageListenerContainer.java:2315)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeRecordListener(KafkaMessageListenerContainer.java:2237)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doInvokeWithRecords(KafkaMessageListenerContainer.java:2150)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:2032)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:1705)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeIfHaveRecords(KafkaMessageListenerContainer.java:1276)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1268)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1163)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
So I had two questions:
1. Is there some more documentation on how to update a dataset's profile aspect?
2. Is there a reason the long value is getting converted to an int? Or am I trying to update this aspect incorrectly?
Here's my Java function:
MetadataChangeProposalWrapper mcpw = MetadataChangeProposalWrapper.builder()
    .entityType("dataset")
    .entityUrn("urn:li:dataset:(<urn>,TEST)")
    .upsert()
    .aspect(new DatasetProfile()
        //.setColumnCount(11)
        .setTimestampMillis(0L)
    )
    .build();
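For comparison, the same upsert through the Python SDK, mirroring the MCP snippet earlier in this log (a sketch; the URN and server address are placeholders):

from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import ChangeTypeClass, DatasetProfileClass

profile = DatasetProfileClass(timestampMillis=0)  # long-valued in the PDL model

mcp = MetadataChangeProposalWrapper(
    entityType="dataset",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn="urn:li:dataset:(urn:li:dataPlatform:hive,TEST,PROD)",  # placeholder
    aspectName="datasetProfile",
    aspect=profile,
)
DatahubRestEmitter("http://localhost:8080").emit_mcp(mcp)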
Thanks in advance for your help!
tall-fall-45442
06/13/2022, 9:59 PM
my-postgres-password through the UI. Then, when I create an ingestion recipe for Postgres and set the password value to '${my-postgres-password}', it says that it can't authenticate with the database. However, when I put the password in plain text, the ingestion succeeds. Is there something I am doing wrong, or must secrets follow a certain naming convention?
best-umbrella-24804
06/14/2022, 12:25 AM
source:
  type: snowflake-usage
  config:
    env: DEV
    host_port: xxxx.snowflakecomputing.com
    warehouse: DEVELOPER_X_SMALL
    username: DATAHUB_DEV_USER
    password: '${SNOWFLAKE_DEV_PASSWORD}'
    role: DATAHUB_DEV_ACCESS
    top_n_queries: 10
sink:
  type: datahub-rest
  config:
    server: 'http://xxxxxx'
We are finding that there are 1600 instances of the following error in our logs:
2022-06-13 23:52:41.644712 [exec_id=66448a5e-1f15-406f-8ee7-700ed32f427b] INFO: stdout=[2022-06-13 23:52:37,635] WARNING
{datahub.ingestion.source.usage.snowflake_usage:93} - usage => Failed to parse usage line {'query_start_time': datetime.datetime(2022, 6, 12, 4, 33, 7, 869000, tzinfo=datetime.timezone.utc), 'query_text': "INSERT INTO xxxxx( ..... )
validation error for SnowflakeJoinedAccessEvent
email
  none is not an allowed value (type=type_error.none.not_allowed)
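One assumption worth checking (not confirmed in the thread): the usage source models every access event with a user email, so Snowflake users whose profile has no email produce exactly this "none is not an allowed value" error; the snowflake-usage source exposes an email_domain option for deriving an address from the username. A sketch of the config fragment:

# Hypothetical fragment of the snowflake-usage config above: with
# email_domain set, a user with no email on their Snowflake profile
# is attributed as <username>@<email_domain>.
config = {
    "host_port": "xxxx.snowflakecomputing.com",
    "username": "DATAHUB_DEV_USER",
    "password": "${SNOWFLAKE_DEV_PASSWORD}",
    "top_n_queries": 10,
    "email_domain": "example.com",  # assumption: your organisation's domain
}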
Any idea what is going on here?
lemon-zoo-63387
06/14/2022, 4:05 AM
many-morning-40345
06/14/2022, 5:44 AM