Diogo Baeder
04/25/2022, 12:25 AM
Cannot read single-value from Collection. More on this thread.
Diogo Baeder
04/25/2022, 12:26 AM
Caused by: java.lang.IllegalStateException: Cannot read single-value from Collection: [1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 9, 1, 1] for column: brands_responses
at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:721) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardizeCollection(DataTypeTransformer.java:176) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardize(DataTypeTransformer.java:119) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.transform(DataTypeTransformer.java:63) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
... 13 more
2022/04/25 00:19:39.735 ERROR [SegmentGenerationJobRunner] [pool-2-thread-1] Failed to generate Pinot segment for file - file:/sensitive-data/outputs/cases/br/20150501.json
Diogo Baeder
04/25/2022, 12:26 AM
[
  {
    "brands_responses": {
      "first_1000226": 2,
      "second_1000226": 1,
      "third_1000226": 1,
      "fourth_1000226": 1,
      "fifth_1000226": 9,
      "sixth_1000226": 2,
      "seventh_1000226": 1,
      "eighth_1000226": 1,
      "ninth_1000226": 1,
      "tenth_1000226": 1,
      "eleventh_1000226": 1,
      "twelfth_1000226": 1,
      "thirteenth_1000226": 1
    },
    "caseid": 251214750,
    "date_": 20150501,
    "pmxid": 52735743,
    "region": "br",
    "sector_id": 1010,
    "uuid": "6702e33a-e961-4f62-b9df-2d65e4fe3fd5",
    "weight": 0.935066
  }
]
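One possible workaround for this kind of error is to serialize the nested object to a JSON string before ingestion, so that the column arrives as a single string value instead of a map. A minimal sketch in Python, assuming the input files are JSON arrays of records shaped like the one above; the helper name `stringify_json_column` is hypothetical and not part of Pinot:

```python
import json


def stringify_json_column(records, column="brands_responses"):
    """Return copies of the records with the nested `column` serialized to a JSON string."""
    converted = []
    for record in records:
        fixed = dict(record)  # shallow copy, the input records are left untouched
        value = fixed.get(column)
        if isinstance(value, (dict, list)):
            # Compact separators keep the stored string small.
            fixed[column] = json.dumps(value, separators=(",", ":"))
        converted.append(fixed)
    return converted


# Abbreviated version of the record above, for illustration only.
sample = [
    {
        "brands_responses": {"first_1000226": 2, "second_1000226": 1},
        "caseid": 251214750,
        "region": "br",
    }
]
converted = stringify_json_column(sample)
print(converted[0]["brands_responses"])
```

Whether this is preferable to an ingestion-time transform depends on the pipeline; it only moves the serialization step out of Pinot.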
Diogo Baeder
04/25/2022, 12:27 AM
{
  "schemaName": "cases_schema",
  "dimensionFieldSpecs": [
    {
      "name": "brands_responses",
      "dataType": "JSON",
      "maxLength": 2147483647
    },
    {
      "name": "caseid",
      "dataType": "INT"
    },
    {
      "name": "pmxid",
      "dataType": "INT"
    },
    {
      "name": "region",
      "dataType": "STRING"
    },
    {
      "name": "sector_id",
      "dataType": "INT"
    },
    {
      "name": "uuid",
      "dataType": "STRING"
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "weight",
      "dataType": "FLOAT"
    }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "date_",
      "dataType": "INT",
      "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd",
      "granularity": "1:DAYS"
    }
  ]
}
Diogo Baeder
04/25/2022, 12:40 AM
noDictionaryColumns, but even with that added it still errors out
Diogo Baeder
04/25/2022, 12:41 AM
{
  "tableName": "cases",
  "tableType": "OFFLINE",
  "segmentsConfig": {
    "schemaName": "cases_schema",
    "timeColumnName": "date_",
    "timeType": "DAYS",
    "replicasPerPartition": "1",
    "replication": "1"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "noDictionaryColumns": [
      "brands_responses"
    ],
    "jsonIndexColumns": [],
    "invertedIndexColumns": [],
    "nullHandlingEnabled": true,
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "region": {
          "functionName": "Murmur",
          "numPartitions": 400
        }
      }
    }
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "metadata": {
    "customConfigs": {}
  },
  "routing": {
    "instanceSelectorType": "balanced",
    "segmentPrunerTypes": [
      "partition",
      "time"
    ]
  },
  "transformConfigs": [
    {
      "columnName": "brands_responses",
      "transformFunction": "jsonFormat(\"brands_responses\")"
    }
  ]
}
Diogo Baeder
04/25/2022, 12:42 AM
transformConfigs is missing from the table definition when looking at how the table got created
Diogo Baeder
04/25/2022, 12:44 AM
{
  "OFFLINE": {
    "tableName": "cases_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeType": "DAYS",
      "schemaName": "cases_schema",
      "replication": "1",
      "timeColumnName": "date_",
      "allowNullTimeValue": false,
      "replicasPerPartition": "1"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": [],
      "noDictionaryColumns": [
        "brands_responses"
      ],
      "segmentPartitionConfig": {
        "columnPartitionMap": {
          "region": {
            "functionName": "Murmur",
            "numPartitions": 400
          }
        }
      },
      "rangeIndexVersion": 2,
      "jsonIndexColumns": [],
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": true
    },
    "metadata": {
      "customConfigs": {}
    },
    "routing": {
      "segmentPrunerTypes": [
        "partition",
        "time"
      ],
      "instanceSelectorType": "balanced"
    },
    "isDimTable": false
  }
}
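A quick way to spot keys that were silently dropped like this is to compare the top-level keys of the submitted table config against the stored one. A minimal sketch in Python, assuming both configs have been loaded as dicts and that the stored config is wrapped under its table type as in the response above; `dropped_top_level_keys` is a hypothetical helper, and the configs below are abbreviated to their top-level keys:

```python
def dropped_top_level_keys(submitted, stored_response, table_type="OFFLINE"):
    """List top-level keys present in the submitted config but absent from the stored one."""
    stored = stored_response.get(table_type, {})
    return sorted(set(submitted) - set(stored))


# Abbreviated: only the top-level keys of the two configs matter here.
submitted = {
    "tableName": "cases",
    "tableType": "OFFLINE",
    "segmentsConfig": {},
    "tableIndexConfig": {},
    "tenants": {},
    "metadata": {},
    "routing": {},
    "transformConfigs": [],
}
stored_response = {
    "OFFLINE": {
        "tableName": "cases_OFFLINE",
        "tableType": "OFFLINE",
        "segmentsConfig": {},
        "tenants": {},
        "tableIndexConfig": {},
        "metadata": {},
        "routing": {},
        "isDimTable": False,
    }
}
missing = dropped_top_level_keys(submitted, stored_response)
print(missing)  # ['transformConfigs']
```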
Diogo Baeder
04/25/2022, 1:00 AM
ingestionConfig as part of the table config - the documentation about JSON indexing is wrong, it doesn't mention this field. But even using this field, it doesn't work; if I send the correct payload I get:
{
  "code": 400,
  "error": "Arguments of a transform function '[brands_responses]' cannot contain the destination column 'brands_responses'"
}
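The error message suggests that the destination column of a transform has to differ from the fields it reads. A minimal sketch in Python of building an ingestionConfig fragment that satisfies that constraint; the destination name `brands_responses_json` and the helper `json_format_transform` are hypothetical, and the schema would presumably need to declare the renamed column as well:

```python
import json


def json_format_transform(source_field, destination_column):
    """Build one transformConfigs entry that writes jsonFormat(source) into another column."""
    if source_field == destination_column:
        # Mirrors the 400 error above: the destination cannot be an argument.
        raise ValueError("destination column must differ from the source field")
    return {
        "columnName": destination_column,
        "transformFunction": f'jsonFormat("{source_field}")',
    }


# Hypothetical destination column name, chosen only for illustration.
ingestion_config = {
    "transformConfigs": [
        json_format_transform("brands_responses", "brands_responses_json")
    ]
}
print(json.dumps({"ingestionConfig": ingestion_config}, indent=2))
```

The resulting fragment would sit under the "ingestionConfig" key of the table config rather than at its top level.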