# troubleshoot
m
hi all - i hope this is the right place to post issues. i'm trying to work with assertions/"validation" on the table UI. my end goal is to populate the "validation" tab on the table UI with some external checks i'm doing on the table, very similar to the great expectations integration. i reviewed the python emitter and also found the data_quality_mcpw_rest.py example. however, the example doesn't work as-is (after changing the data source, table name, etc., obviously). is it supposed to? after i run the script and open the "validation" tab in the UI, this error appears:
The field at path '/dataset/assertions/assertions[0]' was declared as a non null type, but the code involved in retrieving data has wrongly returned a null value. The graphql specification requires that the parent field be set to null, or if that is non nullable that it bubble up null to its parent and so on. The non-nullable type is 'Assertion' within parent type '[Assertion!]' (code undefined)
cc @big-carpet-38439
l
@hundreds-photographer-13496
b
@most-room-32003 Hey there! Please make sure you're emitting a data platform instance aspect for the assertion. If you send some code samples I can also look them over!
m
sorry, i'm not sure what you mean. the code i'm running is basically the same as data_quality_mcpw_rest.py. what should i be changing in that file?
h
Hi @most-room-32003, you need to emit the dataPlatformInstance aspect for the assertion, as shown here:
# Construct an assertion platform object.
assertion_dataPlatformInstance = DataPlatformInstance(
    platform=builder.make_data_platform_urn("great-expectations")
)

# Construct a MetadataChangeProposalWrapper object for assertion platform
assertion_dataPlatformInstance_mcp = MetadataChangeProposalWrapper(
    entityType="assertion",
    changeType=ChangeType.UPSERT,
    entityUrn=assertionUrn(assertion_maxVal),
    aspectName="dataPlatformInstance",
    aspect=assertion_dataPlatformInstance,
)
# Emit Assertion entity platform aspect!
emitter.emit(assertion_dataPlatformInstance_mcp)
That got missed in the example file earlier. Setting the platform for an assertion is mandatory in the current version. I have raised a PR to take care of this: https://github.com/datahub-project/datahub/pull/4507
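For home-made checks, the same aspect would presumably point at your own platform name instead of great-expectations. Here is a rough sketch of the payload shape only, in plain Python with no DataHub imports. `platform_urn`, `platform_instance_aspect`, and the `my-checks` platform name are hypothetical stand-ins for illustration; real code should keep using builder.make_data_platform_urn and DataPlatformInstance as in the snippet above.

```python
import json


def platform_urn(platform_name: str) -> str:
    """Hypothetical stand-in that mirrors the data platform URN format."""
    return f"urn:li:dataPlatform:{platform_name}"


def platform_instance_aspect(platform_name: str) -> dict:
    """Dict shape of a dataPlatformInstance aspect with only the platform set."""
    return {"platform": platform_urn(platform_name)}


# "my-checks" is an assumed name for a home-made checks platform.
aspect = platform_instance_aspect("my-checks")
print(json.dumps(aspect))
```

The point is only that each assertion needs its own platform aspect emitted alongside the assertion info; how a custom platform name is rendered in the UI is not covered here.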
m
thanks, i pulled this updated commit and ran it as-is, and it seems to work
b
This is incredible!
@most-room-32003 Where are you sending in assertion results from?
m
right now i'm just testing in a python notebook, nothing fancy, but i want to try to integrate this into my airflow. we have our own checks we do on tables so i want to get that visible
1. in the code, where is all the metadata from AssertionRunEvent stored in the webapp, if at all? for example, partitionSpec, runId, result, etc. i don't see any in the webapp.
2. when initially defining DatasetAssertionInfo, it seems to pull checks from your enum types (AssertionStdOperator, AssertionStdAggregation, AssertionStdParameterType). my checks are home-made and a lot of what you have is not applicable to me. thoughts?
b
1. It is stored in the DB, but not currently displayed on the UI
2. In such cases you can fall back to using "_NATIVE_" (e.g. for AssertionStdOperator, AssertionStdAggregation)
Then simply provide the nativeType field, which holds your custom assertion type:
/**
* Native assertion type
*/
nativeType: optional string // filled with the platform specific native type string
You can also provide nativeParameters which will be displayed inside the UI on hover:
/**
* Native parameters required for the assertion.
*/
nativeParameters: optional map[string, string]
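Putting the two fields together, a home-made check might look roughly like the sketch below. It is plain Python that only mirrors the field names of DatasetAssertionInfo as a dict, not the SDK classes. The helper `native_dataset_assertion`, the dataset URN, the `DATASET_ROWS` scope value, and the check name and parameters are all assumptions made for illustration.

```python
from typing import Dict, Optional


def native_dataset_assertion(
    dataset_urn: str,
    native_type: str,
    native_parameters: Optional[Dict[str, object]] = None,
) -> dict:
    """Build a dict shaped like DatasetAssertionInfo for a custom check."""
    info = {
        "dataset": dataset_urn,
        "scope": "DATASET_ROWS",  # assumed scope; pick what fits your check
        "operator": "_NATIVE_",  # no standard operator applies
        "aggregation": "_NATIVE_",  # no standard aggregation applies
        "nativeType": native_type,  # your own name for the check
    }
    if native_parameters:
        # nativeParameters is map[string, string], so coerce values to str.
        info["nativeParameters"] = {
            str(key): str(value) for key, value in native_parameters.items()
        }
    return info


# Hypothetical freshness check on a hypothetical table.
info = native_dataset_assertion(
    "urn:li:dataset:(urn:li:dataPlatform:hive,db.orders,PROD)",
    "orders_freshness_check",
    {"max_lag_hours": 6},
)
print(info)
```

Coercing parameter values to strings up front matches the map[string, string] type, so numeric thresholds should not get rejected when the aspect is serialized.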