acceptable-architect-70237
06/03/2020, 4:16 AM
status, but removed:false just showed up
bumpy-keyboard-50565
06/03/2020, 4:30 PM
Ownerships was ingested correctly but not DataProcessInfo, and Status was magically set?
acceptable-architect-70237
06/03/2020, 5:37 PM
DataProcessInfo
bumpy-keyboard-50565
06/03/2020, 5:41 PM
mce-consumer after updating the PDL models?
acceptable-architect-70237
06/03/2020, 6:05 PM
mce-consumer
acceptable-architect-70237
06/03/2020, 6:10 PM
bumpy-keyboard-50565
06/03/2020, 7:09 PM
avro-python3 lib does something weird (i.e. randomly picks a type from the union). Simply dropping the extra dataProcessInfo in your MCE should fix the issue.
{"auditHeader": None, "proposedSnapshot": ("com.linkedin.pegasus2avro.metadata.snapshot.DataProcessSnapshot", {"urn": "urn:li:dataprocess:(sqoop2,4DEMO2,PROD)", "aspects": [{"owners": [{"owner": "urn:li:corpuser:datahub", "type": "DATAOWNER"}], "lastModified": {"time": 0, "actor": "urn:li:corpuser:datahub"}}, { "outputs": [ "urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)", "urn:li:dataset:(urn:li:dataPlatform:cassandra,barMars,DEV)" ], "inputs": [ "urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)", "urn:li:dataset:(urn:li:dataPlatform:hbase,barOcean,PROD)" ] } ]}), "proposedDelta": None}
acceptable-architect-70237
06/03/2020, 7:34 PM
mce-cli.py is running with python 3, right?
bumpy-keyboard-50565
06/03/2020, 7:39 PM
DataProcessInfo in a tuple with the type info as the first element and the value as the second, like this:
("com.linkedin.pegasus2avro.dataprocess.DataProcessInfo", {"inputs":[...], "outputs": [...]})
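As a rough illustration of the tuple form described above, the sketch below wraps an aspect value in a (type name, value) pair. The helper name tag_union_member is hypothetical and is not part of mce_cli.py or any Avro library; the URNs are taken from the sample MCE in this thread. That avro-python3 accepts such a pair for a union member is as stated in the message above, not verified here.

```python
# Hypothetical helper: pair a union member's value with its fully qualified
# Avro record name, so the writer does not have to guess the union branch.
DATA_PROCESS_INFO = "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo"


def tag_union_member(type_name, value):
    """Return the (type name, value) tuple described in the thread."""
    return (type_name, value)


info = {
    "inputs": ["urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)"],
    "outputs": ["urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)"],
}

aspect = tag_union_member(DATA_PROCESS_INFO, info)
print(aspect[0])  # the type name comes first, the value second
```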
acceptable-architect-70237
06/03/2020, 8:07 PM
bumpy-keyboard-50565
06/03/2020, 8:25 PM
acceptable-architect-70237
06/03/2020, 8:36 PM
dataprocessinfo aspect doesn't show. Since you think the Python avro package is causing this problem, I think I can use other tools, maybe a Java client or an approach like this https://stackoverflow.com/questions/51664191/pushing-avro-file-to-kafka, to generate a valid message?
acceptable-architect-70237
06/03/2020, 8:36 PM
bumpy-keyboard-50565
06/03/2020, 8:37 PM
acceptable-architect-70237
06/03/2020, 8:38 PM
{"auditHeader": None, "proposedSnapshot": ("com.linkedin.pegasus2avro.metadata.snapshot.DataProcessSnapshot", {"urn": "urn:li:dataprocess:(21sqoop121,4DEMO3,PROD)", "aspects": [{"owners": [{"owner": "urn:li:corpuser:datahub", "type": "DATAOWNER"}], "lastModified": {"time": 0, "actor": "urn:li:corpuser:datahub"}}, { "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo": { "outputs": [ "urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)", "urn:li:dataset:(urn:li:dataPlatform:cassandra,barMars,DEV)" ], "inputs": [ "urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)", "urn:li:dataset:(urn:li:dataPlatform:hbase,barOcean,PROD)" ] } }]}), "proposedDelta": None}
bumpy-keyboard-50565
06/03/2020, 8:40 PM
(...) instead of {...}. This is how you should do it:
{"auditHeader": None, "proposedSnapshot": ("com.linkedin.pegasus2avro.metadata.snapshot.DataProcessSnapshot", {"urn": "urn:li:dataprocess:(21sqoop121,4DEMO3,PROD)", "aspects": [{"owners": [{"owner": "urn:li:corpuser:datahub", "type": "DATAOWNER"}], "lastModified": {"time": 0, "actor": "urn:li:corpuser:datahub"}}, ( "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo": { "outputs": [ "urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)", "urn:li:dataset:(urn:li:dataPlatform:cassandra,barMars,DEV)" ], "inputs": [ "urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)", "urn:li:dataset:(urn:li:dataPlatform:hbase,barOcean,PROD)" ] } )]}), "proposedDelta": None}
acceptable-architect-70237
06/03/2020, 8:40 PM
bumpy-keyboard-50565
06/03/2020, 8:41 PM
acceptable-architect-70237
06/03/2020, 8:43 PM
Traceback (most recent call last):
File "mce_cli.py", line 108, in <module>
main(parser.parse_args())
File "mce_cli.py", line 88, in main
produce(conf, args.data_file, args.schema_record)
File "mce_cli.py", line 31, in produce
content = ast.literal_eval(sample.strip())
File "/Users/liajiang/opt/anaconda3/lib/python3.7/ast.py", line 46, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "/Users/liajiang/opt/anaconda3/lib/python3.7/ast.py", line 35, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 1
{"auditHeader": None, "proposedSnapshot": ("com.linkedin.pegasus2avro.metadata.snapshot.DataProcessSnapshot", {"urn": "urn:li:dataprocess:(21sqoop121,4DEMO3,PROD)", "aspects": [{"owners": [{"owner": "urn:li:corpuser:datahub", "type": "DATAOWNER"}], "lastModified": {"time": 0, "actor": "urn:li:corpuser:datahub"}}, ( "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo": { "outputs": [ "urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)", "urn:li:dataset:(urn:li:dataPlatform:cassandra,barMars,DEV)" ], "inputs": [ "urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)", "urn:li:dataset:(urn:li:dataPlatform:hbase,barOcean,PROD)" ] } )]}), "proposedDelta": None}
^
SyntaxError: invalid syntax
bumpy-keyboard-50565
06/03/2020, 8:44 PM
"com.linkedin.pegasus2avro.dataprocess.DataProcessInfo": to "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo",
{"auditHeader": None, "proposedSnapshot": ("com.linkedin.pegasus2avro.metadata.snapshot.DataProcessSnapshot", {"urn": "urn:li:dataprocess:(21sqoop121,4DEMO3,PROD)", "aspects": [{"owners": [{"owner": "urn:li:corpuser:datahub", "type": "DATAOWNER"}], "lastModified": {"time": 0, "actor": "urn:li:corpuser:datahub"}}, ( "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo", { "outputs": [ "urn:li:dataset:(urn:li:dataPlatform:cassandra,barEarth,DEV)", "urn:li:dataset:(urn:li:dataPlatform:cassandra,barMars,DEV)" ], "inputs": [ "urn:li:dataset:(urn:li:dataPlatform:hbase,barSky,PROD)", "urn:li:dataset:(urn:li:dataPlatform:hbase,barOcean,PROD)" ] } )]}), "proposedDelta": None}
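A small check of the difference between the last two payloads, using the same ast.literal_eval call that appears in the traceback above. The payloads here are shortened stand-ins rather than the full MCE, and the variable names are made up for this sketch.

```python
import ast

TYPE_NAME = "com.linkedin.pegasus2avro.dataprocess.DataProcessInfo"

# Shortened stand-ins for the aspect entry in the two payloads above.
with_colon = '[("%s": {"inputs": [], "outputs": []})]' % TYPE_NAME
with_comma = '[("%s", {"inputs": [], "outputs": []})]' % TYPE_NAME

# "key": value is only valid inside {...}; inside (...) it is a syntax error.
try:
    ast.literal_eval(with_colon)
    colon_form_parses = True
except SyntaxError:
    colon_form_parses = False

# With a comma, the parentheses form a two-element tuple: (type name, value).
aspects = ast.literal_eval(with_comma)
type_name, value = aspects[0]
```

The colon form never reaches the Avro writer because it fails while the file is being parsed; only the comma form produces the tuple that the earlier message asks for.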
acceptable-architect-70237
06/03/2020, 8:47 PM
bumpy-keyboard-50565
06/03/2020, 8:49 PM
acceptable-architect-70237
06/03/2020, 8:49 PM
acceptable-architect-70237
06/03/2020, 8:53 PM
com.linkedin.pegasus2avro.dataprocess.DataProcessInfo is also needed. I couldn't just use dataProcessInfo, in case you had any doubt about it.
bumpy-keyboard-50565
06/03/2020, 8:54 PM