Fabrício Dutra

03/04/2021, 3:03 PM
Hi all, I'm trying to ingest data from Kafka using a topic that doesn't have a datetime column, and I'm receiving this error:
{
  "code": 400,
  "error": "Schema should not be null for REALTIME table"
}
I'm using this spec:
curl -X POST "http://localhost:9000/tables" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"tableName\": \"realtime_strimzi_dev_acks\", \"tableType\": \"REALTIME\", \"segmentsConfig\": {  \"segmentPushType\": \"REFRESH\", \"schemaName\": \"sch_strimzi_acks\", \"replication\": \"1\", \"replicasPerPartition\": \"1\" }, \"tenants\": {}, \"tableIndexConfig\": { \"loadMode\": \"MMAP\", \"invertedIndexColumns\": [ \"column1\" ], \"streamConfigs\": { \"streamType\": \"kafka\", \"stream.kafka.consumer.type\": \"lowlevel\", \"stream.kafka.topic.name\": \"producer-test-strimzi-dev-acks-0\", \"stream.kafka.decoder.class.name\": \"org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder\", \"stream.kafka.consumer.factory.class.name\": \"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory\", \"stream.kafka.broker.list\": \"edh-kafka-brokers.ingestion.svc.Cluster.local:9092\", \"realtime.segment.flush.threshold.time\": \"3600000\", \"realtime.segment.flush.threshold.size\": \"50000\", \"stream.kafka.consumer.prop.auto.offset.reset\": \"smallest\" } }, \"metadata\": { \"customConfigs\": {} }}"
Is there a way to create a realtime table that autofills/creates a datetime column?

Kishore G

03/04/2021, 3:12 PM
did you upload the schema first?

Fabrício Dutra

03/04/2021, 3:39 PM
yes, but I had the same error message

Neha Pawar

03/04/2021, 3:48 PM
Can you paste the schema here?

Fabrício Dutra

03/04/2021, 3:54 PM
I'm not including a timeFieldSpec as I don't have one on my Kafka topic. So it would be nice if there were a way to autofill a datetime column in Pinot. Here's the spec:
{
  "schemaName": "sch_strimzi_ack",
  "dimensionFieldSpecs": [
    {
      "name": "column1",
      "dataType": "STRING"
    }
  ]
}

Chinmay Soman

03/04/2021, 4:00 PM
Auto-creating a timestamp column is not supported as of now. Do you have any column in Kafka that we can derive a timestamp from?

Kishore G

03/04/2021, 4:06 PM
You can probably use the now() UDF
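A minimal sketch of what the now() suggestion could look like, assuming the Pinot version in use supports `ingestionConfig.transformConfigs` and a built-in `now()` function that returns epoch milliseconds at ingestion time. The column name `ingestion_ts` is hypothetical and does not appear in this thread. The Python below only builds and prints the two JSON payloads; it does not call the controller.

```python
import json

# Hypothetical time column name (an assumption, not from the thread).
TIME_COLUMN = "ingestion_ts"

# Schema: the original dimension plus a dateTimeFieldSpec for the derived column.
# The schema name is chosen to match the one referenced by the table config.
schema = {
    "schemaName": "sch_strimzi_acks",
    "dimensionFieldSpecs": [{"name": "column1", "dataType": "STRING"}],
    "dateTimeFieldSpecs": [
        {
            "name": TIME_COLUMN,
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
}

# Fragment to merge into the REALTIME table config: it names the time column
# and fills it with now() as each record is ingested (assumed to be supported).
table_config_fragment = {
    "segmentsConfig": {
        "schemaName": schema["schemaName"],
        "timeColumnName": TIME_COLUMN,
    },
    "ingestionConfig": {
        "transformConfigs": [
            {"columnName": TIME_COLUMN, "transformFunction": "now()"}
        ]
    },
}

print(json.dumps(schema, indent=2))
print(json.dumps(table_config_fragment, indent=2))
```

Note that a column filled this way would record when Pinot consumed the message, not when the event happened, so it is only a stand-in for a real event timestamp.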

Fabrício Dutra

03/04/2021, 4:09 PM
Hmm, OK. We'll try to implement the workaround by including a datetime column in that topic. Thanks guys!!

Neha Pawar

03/04/2021, 4:31 PM
Also, it's failing in the first place because the schema name doesn't match what you've put in the table config:
sch_strimzi_ack
vs
"schemaName\": \"sch_strimzi_acks\
plural
hence the schema not found exception
We can make that exception clearer. Do you mind creating an issue on GitHub?
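Until the message is improved, a small local check can catch this kind of mismatch before the table config is posted. This is only a sketch: the helper name `find_schema_name_mismatch` is hypothetical, and it assumes the schema name is set under `segmentsConfig.schemaName` as in the request above.

```python
def find_schema_name_mismatch(table_config, schema):
    """Return (name_in_table_config, name_in_schema) if they differ, else None."""
    expected = table_config.get("segmentsConfig", {}).get("schemaName")
    actual = schema.get("schemaName")
    if expected != actual:
        return (expected, actual)
    return None


# Trimmed-down versions of the two payloads from this thread.
table_config = {
    "tableName": "realtime_strimzi_dev_acks",
    "tableType": "REALTIME",
    "segmentsConfig": {"schemaName": "sch_strimzi_acks"},
}
schema = {
    "schemaName": "sch_strimzi_ack",
    "dimensionFieldSpecs": [{"name": "column1", "dataType": "STRING"}],
}

mismatch = find_schema_name_mismatch(table_config, schema)
if mismatch:
    print("table config references %r but schema is named %r" % mismatch)
```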

Fabrício Dutra

03/04/2021, 6:58 PM
thanks Neha, the error was clearer when I fixed the name:
{
  "code": 400,
  "error": "'timeColumnName' cannot be null in REALTIME table config"
}