# troubleshooting
k
Hi everyone, I was trying to ingest JSON data through Kafka. One of the columns is an array of nested JSON, and I have marked it with the JSON data type in the schema. When I publish the data to the topic, I get an error in the Pinot server: "Caused by: java.lang.IllegalStateException: Cannot read single-value from Collection". A sample record would be:
"fieldToValueMap" : { "Agent_phone_number" : 2807536641, "Call_end_time" : "2021-09-20 194141", "Calling_number" : "4025165405", "Call_start_time" : "2021-09-20 193819", "Account_number" : "4T1QUDSKPI", "Customer_name" : "Dan", "Queue" : { "qdetails" : [ { "queue_duration" : 229, "qname" : "q2" }, { "queue_duration" : 90, "qname" : "q3" } ] }, "Agent_id" : "K3GDP9" }, "nullValueFields" : [ ]
Where am I going wrong? I have attached the schema and configuration files in the thread.
These are my schema and configuration files.
k
Hi, in the attached schema there is no field declared as JSON. Also, the sample record you shared doesn't appear to be valid JSON. Can you send the full JSON for it?
k
Sorry, I had shared a different schema. Here's a record from my JSON file:
{"Calling_number":9486855381,"Customer_name":"Changed","Account_number":"8H4GORV05Q","Agent_id":"FHG5Z1","Agent_phone_number":4470000588,"Call_start_time":"2021-07-31 014515","Call_end_time":"2021-07-31 014802","Queue": {"qdetails":[{"qname":"q3","queue_duration":150},{"qname":"q2", "queue_duration":157}]}}
k
Thanks. Can you also paste the complete stack trace for "Caused by: java.lang.IllegalStateException: Cannot read single-value from Collection"? That line should contain the column name as well.
k
It's for column 'Queue'
java.lang.RuntimeException: Caught exception while transforming data type for column: Queue
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.transform(DataTypeTransformer.java:95) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.CompositeTransformer.transform(CompositeTransformer.java:83) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:532) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:420) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:598) [pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.lang.IllegalStateException: Cannot read single-value from Collection: [111, q2] for column: Queue
	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:721) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardizeCollection(DataTypeTransformer.java:176) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardize(DataTypeTransformer.java:119) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardize(DataTypeTransformer.java:132) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardizeCollection(DataTypeTransformer.java:159) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.standardize(DataTypeTransformer.java:119) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
	at org.apache.pinot.segment.local.recordtransformer.DataTypeTransformer.transform(DataTypeTransformer.java:63) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
k
Hi, can you try setting the dataType of the Queue column to STRING in the schema? Also, change noDictionaryColumns to jsonIndexColumns in the table config.
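Something like this, as a sketch (assuming the rest of your schema and table config stays the same and Queue is a dimension field):
Copy code
"dimensionFieldSpecs": [
  {
    "name": "Queue",
    "dataType": "STRING"
  }
]
and in the table config:
Copy code
"tableIndexConfig": {
  "jsonIndexColumns": ["Queue"]
}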
k
It still throws an error: java.lang.RuntimeException: Caught exception while transforming data type for column: Queue
k
Hi, it works with the JSON datatype in the master branch. This seems to be a bug. If you are stuck with the 0.10 release, I suggest exploring https://docs.pinot.apache.org/basics/data-import/complex-type
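With the complex-type approach, the nested array gets unnested in the ingestion config, roughly like this (a sketch; check the linked docs for the exact options, and note the unnested child columns such as Queue.qdetails.qname would also need to be declared in the schema):
Copy code
"ingestionConfig": {
  "complexTypeConfig": {
    "fieldsToUnnest": ["Queue.qdetails"],
    "delimiter": ".",
    "collectionNotUnnestedToJson": "NON_PRIMITIVE"
  }
}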
n
Another, easier option if you have to use 0.10.0 is to keep everything as STRING:
Copy code
{
  "name": "QueueName",
  "dataType": "STRING"
},
{
  "name": "QueueJson",
  "dataType": "STRING"
}
and then use JSONFORMAT for QueueJson and JSONPATHSTRING for the other extractions
Copy code
"ingestionConfig": {
        "transformConfigs": [{
          "columnName": "QueueName",
          "transformFunction": "JSONPATHSTRING(Queue,'$.qdetails[0].qname','null')"
        },{
          "columnName": "QueueJson",
          "transformFunction": "JSONFORMAT(Queue)"
        }]
    }
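Numeric fields can be pulled out the same way with JSONPATHLONG, for example into a hypothetical QueueDuration LONG column (a sketch; the column would also need to be added to the schema):
Copy code
{
  "columnName": "QueueDuration",
  "transformFunction": "JSONPATHLONG(Queue,'$.qdetails[0].queue_duration',0)"
}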