Ravikumar Maddi
04/23/2021, 4:32 PM
troywinter
04/26/2021, 3:01 PM
Caught exception while transforming the record:java.lang.ClassCastException: null
Mohamed Sultan
04/27/2021, 11:03 AM
Vengatesh Babu
04/27/2021, 2:10 PM
<https://stackoverflow.com/questions/65886253/pinot-nested-json-ingestion>
{
  "metricFieldSpecs": [],
  "dimensionFieldSpecs": [
    {
      "dataType": "STRING",
      "name": "name"
    },
    {
      "dataType": "LONG",
      "name": "age"
    },
    {
      "dataType": "STRING",
      "name": "subjects_str"
    },
    {
      "dataType": "STRING",
      "name": "subjects_name",
      "singleValueField": false
    },
    {
      "dataType": "STRING",
      "name": "subjects_grade",
      "singleValueField": false
    }
  ],
  "dateTimeFieldSpecs": [],
  "schemaName": "myTable"
}
{
    "tableName": "myTable",
    "tableType": "OFFLINE",
    "segmentsConfig": {
        "segmentPushType": "APPEND",
        "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
        "schemaName": "myTable",
        "replication": "1"
    },
    "tenants": {},
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "invertedIndexColumns": [],
        "noDictionaryColumns": [
            "subjects_str"
        ],
        "jsonIndexColumns": [
            "subjects_str"
        ]
    },
    "metadata": {
        "customConfigs": {}
    },
    "ingestionConfig": {
        "batchIngestionConfig": {
            "segmentIngestionType": "APPEND",
            "segmentIngestionFrequency": "DAILY",
            "batchConfigMaps": [],
            "segmentNameSpec": {},
            "pushSpec": {}
        },
        "transformConfigs": [
            {
                "columnName": "subjects_str",
                "transformFunction": "jsonFormat(subjects)"
            },
            {
                "columnName": "subjects_name",
                "transformFunction": "jsonPathArray(subjects, '$.[*].name')"
            },
            {
                "columnName": "subjects_grade",
                "transformFunction": "jsonPathArray(subjects, '$.[*].grade')"
            }
        ]
    }
}
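A minimal sketch, in plain Python rather than Pinot code, of what the three transformConfigs above are expected to produce for one record shaped like the first Data.json row. The helper name `emulate_transforms` is hypothetical, and the exact string that `jsonFormat` emits is an assumption; only the shape of the two multi-value columns is the point.

```python
import json

# One record shaped like the first row of Data.json.
record = json.loads(
    '{"name":"Pete","age":24,"subjects":'
    '[{"name":"maths","grade":"A"},{"name":"maths","grade":"B--"}]}'
)


def emulate_transforms(row):
    """Approximate the three transformConfigs in plain Python (not Pinot code)."""
    subjects = row["subjects"]
    return {
        # jsonFormat(subjects): serialize the nested array to a JSON string
        # (compact separators are an assumption about the output format).
        "subjects_str": json.dumps(subjects, separators=(",", ":")),
        # jsonPathArray(subjects, '$.[*].name'): one value per array element.
        "subjects_name": [s["name"] for s in subjects],
        # jsonPathArray(subjects, '$.[*].grade'): one value per array element.
        "subjects_grade": [s["grade"] for s in subjects],
    }


result = emulate_transforms(record)
print(result["subjects_name"])
print(result["subjects_grade"])
```

For this row the two multi-value columns come out as `['maths', 'maths']` and `['A', 'B--']`.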
Data.json
{"name":"Pete","age":24,"subjects":[{"name":"maths","grade":"A"},{"name":"maths","grade":"B--"}]}
{"name":"Pete1","age":23,"subjects":[{"name":"maths","grade":"A+"},{"name":"maths","grade":"B--"}]}
{"name":"Pete2","age":25,"subjects":[{"name":"maths","grade":"A++"},{"name":"maths","grade":"B--"}]}
{"name":"Pete3","age":26,"subjects":[{"name":"maths","grade":"A+++"},{"name":"maths","grade":"B--"}]}
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /home/sas/apache-pinot-incubating-0.7.1-bin/examples/batch/jsontype/ingestionJobSpec.yaml
SegmentGenerationJobSpec: 
!!org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
cleanUpOutputDir: false
excludeFileNamePattern: null
executionFrameworkSpec: {extraConfigs: null, name: standalone, segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner,
  segmentMetadataPushJobRunnerClassName: null, segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner,
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner}
includeFileNamePattern: glob:**/*.json
inputDirURI: examples/batch/jsontype/rawdata
jobType: SegmentCreationAndTarPush
outputDirURI: examples/batch/jsontype/segments
overwriteOutput: true
pinotClusterSpecs:
- {controllerURI: '<http://localhost:9000>'}
pinotFSSpecs:
- {className: org.apache.pinot.spi.filesystem.LocalPinotFS, configs: null, scheme: file}
pushJobSpec: {pushAttempts: 2, pushParallelism: 1, pushRetryIntervalMillis: 1000,
  segmentUriPrefix: null, segmentUriSuffix: null}
recordReaderSpec: {className: org.apache.pinot.plugin.inputformat.json.JSONRecordReader,
  configClassName: null, configs: null, dataFormat: json}
segmentCreationJobParallelism: 0
segmentNameGeneratorSpec: null
tableSpec: {schemaURI: '<http://localhost:9000/tables/myTable/schema>', tableConfigURI: '<http://localhost:9000/tables/myTable>',
  tableName: myTable}
tlsSpec: null
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Creating an executor service with 1 threads(Job parallelism: 0, available cores: 40.)
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Submitting one Segment Generation Task for file:/home/sas/apache-pinot-incubating-0.7.1-bin/examples/batch/jsontype/rawdata/test.json
Initialized FunctionRegistry with 119 functions: [fromepochminutesbucket, arrayunionint, codepoint, mod, sha256, year, yearofweek, upper, arraycontainsstring, arraydistinctstring, bytestohex, tojsonmapstr, trim, timezoneminute, sqrt, togeometry, normalize, fromepochdays, arraydistinctint, exp, jsonpathlong, yow, toepochhoursrounded, lower, toutf8, concat, ceil, todatetime, jsonpathstring, substr, dayofyear, contains, jsonpatharray, arrayindexofint, fromepochhoursbucket, arrayindexofstring, minus, arrayunionstring, toepochhours, toepochdaysrounded, millisecond, fromepochhours, arrayreversestring, dow, doy, min, toepochsecondsrounded, strpos, jsonpath, tosphericalgeography, fromepochsecondsbucket, max, reverse, hammingdistance, stpoint, abs, timezonehour, toepochseconds, arrayconcatint, quarter, md5, ln, toepochminutes, arraysortstring, replace, strrpos, jsonpathdouble, stastext, second, arraysortint, split, fromepochdaysbucket, lpad, day, toepochminutesrounded, fromdatetime, fromepochseconds, arrayconcatstring, base64encode, ltrim, arraysliceint, chr, sha, plus, base64decode, month, arraycontainsint, toepochminutesbucket, startswith, week, jsonformat, sha512, arrayslicestring, fromepochminutes, remove, dayofmonth, times, hour, rpad, arrayremovestring, now, divide, bigdecimaltobytes, floor, toepochsecondsbucket, toepochdaysbucket, hextobytes, rtrim, length, toepochhoursbucket, bytestobigdecimal, toepochdays, arrayreverseint, datetrunc, minute, round, dayofweek, arrayremoveint, weekofyear] in 942ms
Using class: org.apache.pinot.plugin.inputformat.json.JSONRecordReader to read segment, ignoring configured file format: AVRO
Finished building StatsCollector!
Collected stats for 4 documents
Using fixed length dictionary for column: subjects_grade, size: 20
Created dictionary for STRING column: subjects_grade with cardinality: 5, max length in bytes: 4, range: A to B--
Using fixed length dictionary for column: subjects_name, size: 5
Created dictionary for STRING column: subjects_name with cardinality: 1, max length in bytes: 5, range: maths to maths
Using fixed length dictionary for column: name, size: 20
Created dictionary for STRING column: name with cardinality: 4, max length in bytes: 5, range: Pete to Pete3
Created dictionary for LONG column: age with cardinality: 4, range: 23 to 26
Start building IndexCreator!
Finished records indexing in IndexCreator!
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@4e31276e] for table myTable
Mayank
Jay Desai
04/27/2021, 9:37 PM
apache:master
Phúc Huỳnh
04/28/2021, 2:54 AM
segmentUriPush
segmentMetadataPush
Alon Burg
04/28/2021, 8:34 AM
Unrecognized field at: whitelistDatasets
Mohamed Sultan
04/28/2021, 8:49 AM
Alon Burg
04/28/2021, 12:30 PM
docker container exec -it pinot-quickstart bin/generator.sh complexWebsite
Syed Akram
04/29/2021, 8:50 AM
Pedro Silva
04/30/2021, 4:53 PM
Mayank
Pedro Silva
05/03/2021, 5:15 PM
Akash
05/05/2021, 10:25 PM
Jay Desai
05/05/2021, 10:31 PM
Pedro Silva
05/06/2021, 2:42 PM
code: 500
error: "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory"
ayush sharma
05/06/2021, 5:07 PM
helm -n my-pinot-kube install pinot-zookeeper incubator/zookeeper --set replicaCount=1
_helpers.tpl
zookeeper.url
configurationOverride
Error: template: pinot/templates/server/statefulset.yml:63:27: executing "pinot/templates/server/statefulset.yml" at <include "zookeeper.url" .>: error calling include: template: pinot/templates/_helpers.tpl:79:33: executing "zookeeper.url" at <index .Values "configurationOverrides" "zookeeper.connect">: error calling index: index of nil pointer
Arun Vasudevan
05/06/2021, 10:00 PM
Upload the schema and Table Config
Sending request: <http://pinot-quickstart:9000/schemas> to controller: ea8d7bfc16ea, version: Unknown
{"code":500,"error":"org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata"}
bash-4.4# bin/kafka-topics.sh --bootstrap-server kafka:9092 --topic transcript-topic --describe
Topic: transcript-topic	PartitionCount: 1	ReplicationFactor: 1	Configs: segment.bytes=1073741824
	Topic: transcript-topic	Partition: 0	Leader: 0	Replicas: 0	Isr: 0
Ambika
05/07/2021, 2:42 PM
Pedro Silva
05/07/2021, 3:20 PM
{
  "columnName": "audioLength",
  "transformFunction": "JSONPATH(result,'$.audioLength')"
}
{
  "name": "result",
  "dataType": "STRING"
}
{
  "metadata": {
    "isMatch": "Y"
  }
}
{
  "AudioCreated": "2021-05-06T23: 40: 28.6629486",
  "AudioLength": "00: 04: 02.1800000",
  "BlobPath": "068fd3f0-e5d6-499a-bfb0-94491499aba6/9db5efb9-4a72-44ae-a570-8647e1ac896a/33d3c59d-b8e1-4818-be60-124e637fb02b.wav",
  "isValid": true,
  "feedback": "",
  "otherFeedback": "",
  "result": 1,
  "crowdMemberId": "90c97d94-91c3-4587-8c91-26f6e971d52c",
  "tags": null,
  "scriptToExecute": null
}
Aaron Wishnick
05/07/2021, 8:29 PM
RK
05/09/2021, 4:18 PM
Pedro Silva
05/10/2021, 10:32 AM
2021/05/10 10:29:48.876 ERROR [ServerSegmentCompletionProtocolHandler] [HitExecutionView__13__6__20210510T1029Z] Could not send request <http://pinot-controller-0.pinot-controller-headless.dc-pinot.svc.cluster.local:9000/segmentConsumed?name=HitExecutionView__13__6__20210510T1029Z&offset=952660&instance=Server_pinot-server-1.pinot-server-headless.dc-pinot.svc.cluster.local_8098&reason=rowLimit&memoryUsedBytes=7330344&rowCount=12500&streamPartitionMsgOffset=952660>
java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_282]
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_282]
	at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_282]
	at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_282]
	at shaded.org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.common.utils.FileUploadDownloadClient.sendRequest(FileUploadDownloadClient.java:383) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.common.utils.FileUploadDownloadClient.sendSegmentCompletionProtocolRequest(FileUploadDownloadClient.java:675) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.sendRequest(ServerSegmentCompletionProtocolHandler.java:207) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.server.realtime.ServerSegmentCompletionProtocolHandler.segmentConsumed(ServerSegmentCompletionProtocolHandler.java:174) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.postSegmentConsumedMsg(LLRealtimeSegmentDataManager.java:949) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:559) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
2021/05/10 10:29:48.877 ERROR [LLRealtimeSegmentDataManager_HitExecutionView__13__6__20210510T1029Z] [HitExecutionView__13__6__20210510T1029Z] Holding after response from Controller: {"streamPartitionMsgOffset":null,"buildTimeSec":-1,"isSplitCommitType":false,"status":"NOT_SENT","offset":-1}
Tamás Nádudvari
05/10/2021, 8:21 PM
RealtimeToOfflineSegmentsTask
Aaron Wishnick
05/10/2021, 9:26 PM
Assembled and processed 7990100 records from 17 columns in 121766 ms: 65.618484 rec/ms, 1115.5142 cell/ms
time spent so far 0% reading (237 ms) and 99% processing (121766 ms)
at row 7990100. reading next block
block read in memory in 44 ms. row count = 1418384
Finished building StatsCollector!
Collected stats for 9408484 documents
Created dictionary for INT column: ...
...
RecordReader initialized will read a total of 9408484 records.
at row 0. reading next block
Got brand-new decompressor [.gz]
block read in memory in 133 ms. row count = 7990100
Start building IndexCreator!
Assembled and processed 7990100 records from 17 columns in 127060 ms: 62.884464 rec/ms, 1069.0359 cell/ms
time spent so far 0% reading (133 ms) and 99% processing (127060 ms)
at row 7990100. reading next block
block read in memory in 26 ms. row count = 1418384
Finished records indexing in IndexCreator!
Finished segment seal!
...
Generated 25884 star-tree records from 9408484 segment records
Finished constructing star-tree, got 1228 tree nodes and 2058 records under star-node
Finished creating aggregated documents, got 1227 aggregated records
Finished building star-tree in 276631ms
Starting building star-tree with config: StarTreeV2BuilderConfig[...]
Generated 9408484 star-tree records from 9408484 segment records
Arun Vasudevan
05/10/2021, 11:26 PM
Avro Schema:
{
  "type": "record",
  "name": "Clickstream",
  "namespace": "com.acme.event.clickstream.business",
  "fields": [
    {
      "name": "event_header",
      "type": {
        "type": "record",
        "name": "EventHeader",
        "namespace": "com.acme.event",
        "fields": [
          {
            "name": "event_uuid",
            "type": {
              "type": "string",
              "avro.java.string": "String",
              "logicalType": "uuid"
            },
            "doc": "Universally Unique Identifier for this event "
          },
          {
            "name": "published_timestamp",
            "type": {
              "type": "long",
              "logicalType": "timestamp-millis"
            },
            "doc": "Timestamp in milliseconds since the epoch that the event occurred on its producing device. e.g. <code>System.currentTimeMillis()</code>."
          }
        ]
      }
    }
  ]
}
{
  "schemaName": "user_clickstream_v1",
  "dimensionFieldSpecs": [
    {
      "name": "event_header.event_uuid",
      "dataType": "STRING"
    }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "event_header.published_timestamp",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
Jonathan Meyer
05/11/2021, 3:14 PM
Hello
What is the recommended approach to getting the "last non-null value"?
Use a UDF?
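The question does not spell out the intended semantics. One possible reading is "the value at the greatest timestamp among rows where the column is not null". The sketch below illustrates only that reading in plain Python; it is not a Pinot UDF or built-in, and the function name and the `(timestamp, value)` row shape are hypothetical.

```python
from typing import Any, Iterable, Optional, Tuple


def last_non_null(rows: Iterable[Tuple[int, Optional[Any]]]) -> Optional[Any]:
    """Return the non-null value with the greatest timestamp, or None if all are null."""
    best_ts: Optional[int] = None
    best_val: Optional[Any] = None
    for ts, val in rows:
        if val is None:
            continue  # null values never win, whatever their timestamp
        if best_ts is None or ts >= best_ts:
            best_ts, best_val = ts, val
    return best_val


# Hypothetical (timestamp, value) rows; input order does not matter.
rows = [(3, "b"), (1, "a"), (4, None), (2, None)]
print(last_non_null(rows))
```

With the sample rows this prints `b`: the row at timestamp 4 is skipped because its value is null.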
Charles
05/12/2021, 6:20 AM
Grpc port is not set for instance: Controller_10.252.125.84_9000
RK
05/12/2021, 7:55 AM