Alvin
10/11/2022, 9:24 PM
Caused by: org.apache.avro.AvroRuntimeException: Not a valid schema field: $ts$WEEK
    at org.apache.avro.generic.GenericData$Record.get(GenericData.java:256)
    at org.apache.pinot.plugin.inputformat.avro.AvroRecordExtractor.extract(AvroRecordExtractor.java:76)
    at org.apache.pinot.plugin.inputformat.avro.AvroRecordReader.next(AvroRecordReader.java:74)
    at org.apache.pinot.segment.local.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:66)
    at org.apache.pinot.segment.local.segment.creator.RecordReaderSegmentCreationDataSource.gatherStats(RecordReaderSegmentCreationDataSource.java:37)
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:178)
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.init(SegmentIndexCreationDriverImpl.java:152)
Rong R
10/11/2022, 10:25 PM
apache-pinot-0.11.0-bin you are using.
  ◦ is this a pre-built binary? a Docker image?
• how are you deploying it?
  ◦ do you have the complete shell script you use to launch the cluster and ingestion?
Alvin
10/11/2022, 10:28 PM
apache-pinot-0.11.0-bin of Apache Pinot I downloaded
Alvin
10/11/2022, 10:38 PM
extra:
  configs: |-
    pinot.set.instance.id.to.hostname=true
    controller.task.scheduler.enabled=true
    controller.data.dir=s3://${data_bucket_name}/controller-data
    controller.local.temp.dir=/tmp/pinot-tmp-data/
    pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
    pinot.controller.storage.factory.s3.region=us-east-1
    pinot.controller.storage.factory.s3.disableAcl=false
    pinot.controller.segment.fetcher.protocols=file,http,s3
    pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
Alvin
10/11/2022, 10:44 PM
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: airline-stats-ingest-testing
  namespace: dev
spec:
  type: Java
  mode: cluster
  image: "datamechanics/spark:3.2.1-hadoop-3.3.1-java-11-scala-2.12-python-3.8-latest"
  imagePullPolicy: Always
  sparkVersion: 3.2.1
  mainClass: org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand
  mainApplicationFile: s3://dev-xxx-testing/spark-jars/pinot-all-0.11.0-jar-with-dependencies.jar
  arguments:
    - "-jobSpecFile"
    - "/mnt/config/sparkAirlineStatIngestionJobSpec.yaml"
  deps:
    jars:
      - s3://dev-xxx-testing/spark-jars/pinot-all-0.11.0-jar-with-dependencies.jar
      - s3://dev-xxx-testing/spark-jars/pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar
      - s3://dev-xxx-testing/spark-jars/pinot-avro-0.11.0-shaded.jar
      - s3://dev-xxx-testing/spark-jars/pinot-csv-0.11.0-shaded.jar
      - s3://dev-xxx-testing/spark-jars/pinot-parquet-0.11.0-shaded.jar
      - s3://dev-xxx-testing/spark-jars/pinot-s3-0.11.0-shaded.jar
  hadoopConf:
    com.amazonaws.services.s3.enableV4: "true"
    fs.s3.impl: "org.apache.hadoop.fs.s3a.S3AFileSystem"
    fs.AbstractFileSystem.s3.impl: "org.apache.hadoop.fs.s3a.S3A"
    fs.s3.aws.credentials.provider: "com.amazonaws.auth.InstanceProfileCredentialsProvider,com.amazonaws.auth.DefaultAWSCredentialsProviderChain"
  sparkConf:
    spark.kubernetes.namespace: dev
    spark.driver.extraJavaOptions: "-Dplugins.dir=${CLASSPATH} -Dlog4j2.configurationFile=/mnt/config/pinot-ingestion-job-log4j2.xml"
    spark.driver.extraClassPath: "pinot-all-0.11.0-jar-with-dependencies.jar:pinot-avro-0.11.0-shaded.jar:pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar:pinot-csv-0.11.0-shaded.jar:pinot-parquet-0.11.0-shaded.jar:pinot-s3-0.11.0-shaded.jar"
    spark.executor.extraClassPath: "pinot-all-0.11.0-jar-with-dependencies.jar:pinot-avro-0.11.0-shaded.jar:pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar:pinot-csv-0.11.0-shaded.jar:pinot-parquet-0.11.0-shaded.jar:pinot-s3-0.11.0-shaded.jar"
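A quick sanity check for a spec like this is that every jar listed under deps.jars also appears by file name in the extraClassPath values, and that all jars carry the same Pinot version. Below is a minimal Python sketch of that check. The jar list and classpath are copied from the spec above; the helper names classpath_gaps and pinot_versions are made up for illustration and are not part of Pinot or Spark.

```python
import re
from posixpath import basename

# Values copied from the SparkApplication spec in this thread.
JAR_URIS = [
    "s3://dev-xxx-testing/spark-jars/pinot-all-0.11.0-jar-with-dependencies.jar",
    "s3://dev-xxx-testing/spark-jars/pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar",
    "s3://dev-xxx-testing/spark-jars/pinot-avro-0.11.0-shaded.jar",
    "s3://dev-xxx-testing/spark-jars/pinot-csv-0.11.0-shaded.jar",
    "s3://dev-xxx-testing/spark-jars/pinot-parquet-0.11.0-shaded.jar",
    "s3://dev-xxx-testing/spark-jars/pinot-s3-0.11.0-shaded.jar",
]
CLASS_PATH = (
    "pinot-all-0.11.0-jar-with-dependencies.jar:pinot-avro-0.11.0-shaded.jar:"
    "pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar:pinot-csv-0.11.0-shaded.jar:"
    "pinot-parquet-0.11.0-shaded.jar:pinot-s3-0.11.0-shaded.jar"
)


def classpath_gaps(jar_uris, class_path):
    """Return (jars shipped but not on the classpath, classpath entries not shipped)."""
    shipped = {basename(uri) for uri in jar_uris}
    on_path = set(class_path.split(":"))
    return sorted(shipped - on_path), sorted(on_path - shipped)


def pinot_versions(jar_uris):
    """Collect the x.y.z version embedded in each jar file name."""
    pattern = re.compile(r"-(\d+\.\d+\.\d+)-")
    versions = set()
    for uri in jar_uris:
        match = pattern.search(basename(uri))
        if match:
            versions.add(match.group(1))
    return versions


print(classpath_gaps(JAR_URIS, CLASS_PATH))
print(pinot_versions(JAR_URIS))
```

Two empty lists and a single version mean the jar list and classpath agree with each other; it says nothing about whether they match the version running on the cluster.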
Rong R
10/11/2022, 10:47 PM
apache-pinot-0.11.0-bin and your helm is latest (e.g. current master, which is even newer than apache-pinot-0.12.0-bin)
Alvin
10/11/2022, 10:49 PM
release-0.11.0
Alvin
10/12/2022, 4:42 PM
22/10/12 16:42:00 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) (10.20.72.20 executor 1): org.apache.avro.AvroRuntimeException: Not a valid schema field: $ts$WEEK
    at org.apache.avro.generic.GenericData$Record.get(GenericData.java:256)
    at org.apache.pinot.plugin.inputformat.avro.AvroRecordExtractor.extract(AvroRecordExtractor.java:76)
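The trace shows the extractor asking the Avro record for a field named $ts$WEEK that the record's schema does not define. One way to see which requested fields are absent before running the job is to compare the field names against the Avro schema file. Below is a minimal Python sketch under stated assumptions: the schema text and the requested field list are hypothetical stand-ins, not taken from the actual data, and missing_fields is an illustrative helper rather than a Pinot or Avro API.

```python
import json

# Hypothetical Avro schema (.avsc content) with a single timestamp field.
AVRO_SCHEMA = """
{
  "type": "record",
  "name": "Example",
  "fields": [
    {"name": "ts", "type": "long"}
  ]
}
"""


def missing_fields(avro_schema_json, requested):
    """Return the requested field names that the Avro record schema does not define."""
    schema = json.loads(avro_schema_json)
    available = {field["name"] for field in schema.get("fields", [])}
    return [name for name in requested if name not in available]


# "ts" exists in the schema; "$ts$WEEK" does not, so it is reported as missing.
print(missing_fields(AVRO_SCHEMA, ["ts", "$ts$WEEK"]))
```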
Alvin
10/12/2022, 4:43 PM
curl -X GET "https://pinot.dev.zzzz.io/version" -H "accept: application/json"
{"pinot-protobuf":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-kafka-2.0":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-avro":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-distribution":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-csv":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-s3":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-yammer":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-segment-uploader-default":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-batch-ingestion-standalone":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-confluent-avro":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-thrift":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-orc":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-azure":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-gcs":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-dropwizard":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-hdfs":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-adls":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-kinesis":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-json":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-minion-builtin-tasks":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-parquet":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033","pinot-segment-writer-file-based":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033"}%
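The response above is easier to compare against the jar versions once it is grouped by release. Below is a minimal Python sketch that parses a /version response and groups component names by the release prefix of their version string. The sample keeps only three of the entries shown above; the helper names are illustrative, and stripping the trailing % assumes it is a shell end-of-output marker rather than part of the JSON.

```python
import json

# Three entries copied from the response above, including the trailing "%".
RAW = (
    '{"pinot-avro":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033",'
    '"pinot-s3":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033",'
    '"pinot-parquet":"0.11.0-1b4d6b6b0a27422c1552ea1a936ad145056f7033"}%'
)


def parse_version_response(raw):
    """Parse the JSON body, dropping surrounding whitespace and a trailing '%'."""
    cleaned = raw.strip().rstrip("%").strip()
    return json.loads(cleaned)


def releases(components):
    """Group component names by the part of the version string before the first '-'."""
    grouped = {}
    for name, version in components.items():
        release = version.split("-", 1)[0]
        grouped.setdefault(release, []).append(name)
    return grouped


print(releases(parse_version_response(RAW)))
```

A single key in the result means every listed component reports the same release; more than one key would point at mixed versions on the cluster.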
Alvin
10/12/2022, 4:44 PM
deps:
  jars:
    - s3://dev-xxx-testing/spark-jars/pinot-all-0.11.0-jar-with-dependencies.jar
    - s3://dev-xxx-testing/spark-jars/pinot-batch-ingestion-spark-3.2-0.11.0-shaded.jar
    - s3://dev-xxx-testing/spark-jars/pinot-avro-0.11.0-shaded.jar
    - s3://dev-xxx-testing/spark-jars/pinot-csv-0.11.0-shaded.jar
    - s3://dev-xxx-testing/spark-jars/pinot-parquet-0.11.0-shaded.jar
    - s3://dev-xxx-testing/spark-jars/pinot-s3-0.11.0-shaded.jar
Alvin
10/12/2022, 4:49 PM
$ts$WEEK, and I am assuming it is computed dynamically at table creation time.
"fieldConfigList": [
  {
    "name": "ts",
    "encodingType": "DICTIONARY",
    "indexType": "TIMESTAMP",
    "indexTypes": [
      "TIMESTAMP"
    ],
    "timestampConfig": {
      "granularities": [
        "DAY",
        "WEEK",
        "MONTH"
      ]
    }
  }
]
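The $ts$WEEK name in the error lines up with the ts column and the WEEK granularity in this config, which suggests derived column names of the form $<column>$<GRANULARITY>. Below is a minimal Python sketch that lists the names such a config would imply, assuming that pattern holds for every granularity. The function name is illustrative and this is not Pinot's own code.

```python
# fieldConfigList copied from the table config above, trimmed to the relevant keys.
TABLE_CONFIG = {
    "fieldConfigList": [
        {
            "name": "ts",
            "encodingType": "DICTIONARY",
            "indexTypes": ["TIMESTAMP"],
            "timestampConfig": {"granularities": ["DAY", "WEEK", "MONTH"]},
        }
    ]
}


def timestamp_index_columns(table_config):
    """List the derived column names implied by each timestampConfig entry.

    Assumes the naming pattern $<column>$<GRANULARITY> seen in the error message.
    """
    names = []
    for field in table_config.get("fieldConfigList", []):
        timestamp_config = field.get("timestampConfig") or {}
        for granularity in timestamp_config.get("granularities", []):
            names.append("$" + field["name"] + "$" + granularity)
    return names


print(timestamp_index_columns(TABLE_CONFIG))
```

None of these names would be expected to exist in the source Avro files, since they are derived from ts rather than ingested.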