Tanmay Movva
03/22/2021, 2:46 AM
{
  "event_name": "abcd",
  "event_type": "",
  "version": "v1",
  "write_key": "",
  "properties": {
    "status_code": "some_code",
    "status": "some_status",
    "mode": "some_mode"
  },
  "event_timestamp": 1616157914,
  "mode": "live"
}
And my schema looks like this
"mode": "STRING",
"request_failure": "INT"
I want to define the mode column as $.properties.mode, which is just simple JSON flattening. From the docs, I was not able to understand the correct syntax for jsonPathString to use in the tableConfig. The request_failure column is a derived column based on $.properties.status. I got to know after reading the docs that chaining transformations isn't supported in Pinot, so I can't define a column status = $.properties.status_code and then use it to define the other column as request_failure = if(status == 'created', 1, 0).
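For reference, a minimal sketch of the jsonPathString syntax, based on the Pinot docs. This assumes the JSON decoder exposes the top-level properties object as a field named properties; if the whole payload instead lands in a single JSON column, the path would be '$.properties.mode' against that column. The entry goes under ingestionConfig in the tableConfig:
"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "mode",
      "transformFunction": "jsonPathString(properties, '$.mode')"
    }
  ]
}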
So I think I need to write a Groovy script to extract the value from the nested JSON and apply the if/else logic. But to extract a value from nested JSON, I would have to import JsonSlurper in Groovy (not so familiar with Groovy, but this is what I found on SO/the internet to parse JSON in Groovy). So my question here is, does Pinot support import statements in the Groovy script?
If not, how can I achieve this transformation in Pinot? I am using the latest tag.
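One point worth noting: Groovy accepts fully-qualified class names inline, so JsonSlurper can be used without any import statement. A sketch, assuming a hypothetical string column propertiesJson holding the raw properties object:
"transformFunction": "Groovy({new groovy.json.JsonSlurper().parseText(propertiesJson).status_code == 'created' ? 1 : 0}, propertiesJson)"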
Neha Pawar
Tanmay Movva
03/22/2021, 4:42 AM
I am trying to filter events during ingestion using filterConfig, but it isn't working. Here is my tableConfig
{
  "tableName": "x_fts_merchant_events_REALTIME",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "x_fts_merchant_events_dimensions",
    "timeColumnName": "event_timestamp",
    "timeType": "MILLISECONDS",
    "replicasPerPartition": "1",
    "retentionTimeValue": "1",
    "retentionTimeUnit": "DAYS",
    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableIndexConfig": {
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "LowLevel",
      "stream.kafka.topic.name": "x-fts-events-kafka",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.broker.list": "kafka-kafka-bootstrap.kafka.svc.cluster.local:9092",
      "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
    },
    "loadMode": "MMAP"
  },
  "metadata": {},
  "ingestionConfig": {
    "filterConfig": {
      "filterFunction": "Groovy({event_name == 'abc'}, event_name)"
    }
  }
}
Neha Pawar
Tanmay Movva
03/22/2021, 4:49 AM
> are you still seeing event_name ‘abc’ in the ingested data?
Not able to see this. Ideally, I should be able to.
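For context from the Pinot docs: filterFunction drops the rows for which it evaluates to true, so the config above filters out the 'abc' events rather than keeping them, which matches the behavior described here. To keep only 'abc' events, the check would be negated:
"filterConfig": {
  "filterFunction": "Groovy({event_name != 'abc'}, event_name)"
}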
Neha Pawar
Tanmay Movva
03/22/2021, 4:51 AM
> Chaining is supported now
Just to confirm: I first extract the field from JSON using an inbuilt function, then I can use Groovy to derive a field from the extracted column, correct?
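A sketch of what that chained config might look like, using the column names from the earlier messages and again assuming properties is available as a top-level field:
"transformConfigs": [
  {
    "columnName": "status",
    "transformFunction": "jsonPathString(properties, '$.status_code')"
  },
  {
    "columnName": "request_failure",
    "transformFunction": "Groovy({status == 'created' ? 1 : 0}, status)"
  }
]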
Neha Pawar
Tanmay Movva
03/22/2021, 4:55 AM
Neha Pawar
Tanmay Movva
03/22/2021, 5:20 AM
Neha Pawar
Kishore G
Tanmay Movva
03/22/2021, 6:22 AM
Neha Pawar