ayush sharma
03/05/2021, 7:36 PM
apiVersion: batch/v1
kind: Job
metadata:
  name: pinot-case-offline-ingestion
  namespace: my-pinot-kube
spec:
  template:
    spec:
      containers:
        - name: pinot-load-case-offline
          image: apachepinot/pinot:0.3.0-SNAPSHOT
          args: ["LaunchDataIngestionJob", "-jobSpecFile", "/opt/data/table-configs/case_history/job-spec.yml"]
          volumeMounts:
            - name: mount-data
              mountPath: /opt/data
      restartPolicy: OnFailure
      volumes:
        - name: mount-data
          hostPath:
            path: /opt/data
  backoffLimit: 100
After I apply this Job, nothing happens. This is the log of the pod:
SegmentGenerationJobSpec:
!!org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
excludeFileNamePattern: null
executionFrameworkSpec: {extraConfigs: null, name: standalone, segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner,
  segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner,
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner}
includeFileNamePattern: glob:**/*.csv
inputDirURI: /opt/data/csv_data/case_prod_data
jobType: SegmentCreationAndTarPush
outputDirURI: /pinot-segments/case_history
overwriteOutput: true
pinotClusterSpecs:
- {controllerURI: 'http://192.168.49.2:30892/'}
pinotFSSpecs:
- {className: org.apache.pinot.spi.filesystem.LocalPinotFS, configs: null, scheme: file}
pushJobSpec: null
recordReaderSpec:
  className: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader
  configClassName: org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig
  configs: {delimiter: '|', multiValueDelimiter: ''}
  dataFormat: csv
segmentNameGeneratorSpec:
  configs: {segment.name.prefix: case_history, exclude.sequence.id: 'true'}
  type: normalizedDate
tableSpec: {schemaURI: null, tableConfigURI: null, tableName: case_history}
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Am I ingesting the data incorrectly?
Xiang Fu
Xiang Fu
pushJobSpec: null
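Editor's sketch: the line quoted above comes from the dumped spec, which shows pushJobSpec: null while jobType is SegmentCreationAndTarPush. A minimal Python version of that check, run against an already-parsed job spec, might look like the following. The helper name find_spec_problems is hypothetical and not part of Pinot, and whether Pinot itself rejects such a spec is not confirmed in this thread.

```python
from typing import List


def find_spec_problems(spec: dict) -> List[str]:
    """Hypothetical helper: return warnings for a parsed ingestion job spec."""
    problems = []
    job_type = spec.get("jobType") or ""
    # Assumption: any job type with "Push" in its name includes a push step.
    includes_push = "Push" in job_type
    if includes_push and spec.get("pushJobSpec") is None:
        problems.append(f"jobType {job_type} includes a push step but pushJobSpec is null")
    if includes_push and not spec.get("pinotClusterSpecs"):
        problems.append(f"jobType {job_type} includes a push step but pinotClusterSpecs is empty")
    return problems


# Values copied from the pod log above.
spec = {
    "jobType": "SegmentCreationAndTarPush",
    "pushJobSpec": None,
    "pinotClusterSpecs": [{"controllerURI": "http://192.168.49.2:30892/"}],
}
for problem in find_spec_problems(spec):
    print(problem)
```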
ayush sharma
03/05/2021, 9:19 PM
pushJobSpec:
  pushParallelism: 2
  pushAttempts: 2
  pushRetryIntervalMillis: 1000
But the job completes with no errors, and the pod log is:
SegmentGenerationJobSpec:
!!org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
excludeFileNamePattern: null
executionFrameworkSpec: {extraConfigs: null, name: standalone, segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner,
  segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner,
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner}
includeFileNamePattern: glob:**/*.csv
inputDirURI: /opt/data/csv_data/case_prod_data
jobType: SegmentCreationAndTarPush
outputDirURI: /pinot-segments/case_history
overwriteOutput: true
pinotClusterSpecs:
- {controllerURI: 'http://192.168.49.2:30892/'}
pinotFSSpecs:
- {className: org.apache.pinot.spi.filesystem.LocalPinotFS, configs: null, scheme: file}
pushJobSpec: {pushAttempts: 2, pushParallelism: 2, pushRetryIntervalMillis: 1000,
  segmentUriPrefix: null, segmentUriSuffix: null}
recordReaderSpec:
  className: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader
  configClassName: org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig
  configs: {delimiter: '|', multiValueDelimiter: ''}
  dataFormat: csv
segmentNameGeneratorSpec:
  configs: {segment.name.prefix: case_history, exclude.sequence.id: 'true'}
  type: normalizedDate
tableSpec: {schemaURI: null, tableConfigURI: null, tableName: case_history}
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 9:27 PM
$ kubectl -n my-pinot-kube describe jobs.batch pinot-case-offline-ingestion
Name:           pinot-case-offline-ingestion
Namespace:      my-pinot-kube
Selector:       controller-uid=25b4e843-b600-4de2-a2ad-584ac8ce17b5
Labels:         controller-uid=25b4e843-b600-4de2-a2ad-584ac8ce17b5
                job-name=pinot-case-offline-ingestion
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Fri, 05 Mar 2021 16:26:41 -0500
Completed At:   Fri, 05 Mar 2021 16:26:44 -0500
Duration:       3s
Pods Statuses:  0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=25b4e843-b600-4de2-a2ad-584ac8ce17b5
           job-name=pinot-case-offline-ingestion
  Containers:
   pinot-load-case-offline:
    Image:      apachepinot/pinot:0.3.0-SNAPSHOT
    Port:       <none>
    Host Port:  <none>
    Args:
      LaunchDataIngestionJob
      -jobSpecFile
      /opt/data/table-configs/case_history/job-spec.yml
    Environment:  <none>
    Mounts:
      /opt/data from mount-data (rw)
  Volumes:
   mount-data:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/data
    HostPathType:
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  27s   job-controller  Created pod: pinot-case-offline-ingestion-mfvrx
  Normal  Completed         24s   job-controller  Job completed
What should the value of pinotClusterSpecs.controllerURI be? I tried changing it to gibberish and got the same logs, so I think my value of pinotClusterSpecs.controllerURI is incorrect.
The job spec file, for reference:
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/opt/data/csv_data/case_prod_data'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/pinot-segments/case_history'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    delimiter: '|'
    multiValueDelimiter: ''
tableSpec:
  tableName: 'case_history'
pinotClusterSpecs:
#  - controllerURI: 'pinot-controller:9000'
  - controllerURI: 'http://192.168.49.2:30892/'
segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: 'case_history'
    exclude.sequence.id: true
pushJobSpec:
  pushParallelism: 2
  pushAttempts: 2
  pushRetryIntervalMillis: 1000
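Editor's sketch: one way to test a candidate controllerURI independently of the ingestion job is to request the controller's health endpoint from inside the pod. The Python below assumes the controller answers GET /health with HTTP 200, which is not confirmed in this thread. Both function names are hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def controller_health_url(controller_uri: str) -> str:
    """Hypothetical helper: build the health URL (assumes a /health endpoint)."""
    return controller_uri.rstrip("/") + "/health"


def controller_is_reachable(controller_uri: str, timeout: float = 5.0) -> bool:
    """Return True only if the assumed health endpoint answers with HTTP 200."""
    try:
        url = controller_health_url(controller_uri)
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and URIs
        # that lack a usable scheme such as http://.
        return False


# Builds the URL only; no network request is made here.
print(controller_health_url("http://192.168.49.2:30892/"))
```

Calling controller_is_reachable("http://192.168.49.2:30892/") from the environment where the job runs would then show whether that address is usable from there. A value without a scheme, such as the commented-out 'pinot-controller:9000', returns False in this sketch.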
Xiang Fu
/opt/data/csv_data/case_prod_data
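Editor's sketch: the path quoted above is the job's inputDirURI. A quick way to confirm that the directory is visible from where the job runs, and that it actually contains CSV files, is to list them. In the Python below, matching_input_files is a hypothetical helper, and rglob only approximates the spec's glob:**/*.csv pattern. Pinot's own matching may differ in edge cases.

```python
from pathlib import Path
from typing import List


def matching_input_files(input_dir: str, suffix: str = ".csv") -> List[Path]:
    """Hypothetical helper: recursively list files under input_dir ending in suffix."""
    root = Path(input_dir)
    if not root.is_dir():
        # A missing or unmounted directory yields no input files at all.
        return []
    return sorted(p for p in root.rglob("*" + suffix) if p.is_file())


files = matching_input_files("/opt/data/csv_data/case_prod_data")
print(len(files), "input file(s) found")
```

A count of zero would mean there is nothing for the job to turn into segments.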
ayush sharma
03/05/2021, 9:29 PM
Xiang Fu
Xiang Fu
apachepinot/pinot:0.6.0
Xiang Fu
ayush sharma
03/05/2021, 9:40 PM
Response for pushing table case_history segment case_history to location http://192.168.49.2:30892 - 200: {"status":"Successfully uploaded segment: case_history of table: case_history"}
ayush sharma
03/05/2021, 9:40 PM
ayush sharma
03/05/2021, 9:41 PM
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 9:42 PM
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 9:44 PM
2021/03/05 20:45:00.943 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Starting Pinot server
2021/03/05 20:45:00.944 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Initializing Helix manager with zkAddress: pinot-zookeeper:2181, clusterName: pinot-quickstart, instanceId: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098
2021/03/05 20:45:02.560 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Initializing server instance and registering state model factory
2021/03/05 20:45:51.252 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Connecting Helix manager
2021/03/05 20:46:42.537 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(pinot-zookeeper:2181)] Client session timed out, have not heard from server in 31084ms for sessionid 0x0
2021/03/05 20:46:44.353 WARN [ParticipantHealthReportTask] [Start a Pinot [SERVER]] ParticipantHealthReportTimerTask already stopped
2021/03/05 20:47:10.343 WARN [CallbackHandler] [Start a Pinot [SERVER]] Callback handler received event in wrong order. Listener: org.apache.helix.messaging.handling.HelixTaskExecutor@2767bcd8, path: /pinot-quickstart/INSTANCES/Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098/MESSAGES, expected types: [CALLBACK, FINALIZE] but was INIT
2021/03/05 20:47:11.245 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Instance config for instance: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098 has instance tags: [DefaultTenant_OFFLINE, DefaultTenant_REALTIME], host: pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local, port: 8098, no need to update
2021/03/05 20:47:11.249 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Using class: org.apache.pinot.server.api.access.AllowAllAccessFactory as the AccessControlFactory
2021/03/05 20:47:11.455 INFO [HelixServerStarter] [Start a Pinot [SERVER]] Starting server admin application on: http://0.0.0.0:8097
2021/03/05 20:47:13.650 WARN [ClientCnxn] [Start a Pinot [SERVER]-SendThread(pinot-zookeeper:2181)] Session 0x10001285ff10004 for server pinot-zookeeper/10.107.87.233:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_282]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_282]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_282]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_282]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[?:1.8.0_282]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:75) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
2021/03/05 20:47:46.344 WARN [ZKHelixManager] [ZkClient-EventThread-16-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10001285ff10004, instance: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098, type: PARTICIPANT
Mar 05, 2021 8:48:39 PM org.glassfish.grizzly.http.server.NetworkListener start
INFO: Started listener bound to [0.0.0.0:8097]
Mar 05, 2021 8:48:40 PM org.glassfish.grizzly.http.server.HttpServer start
INFO: [HttpServer] Started.
2021/03/05 20:48:41.841 WARN [ZKHelixManager] [ZkClient-EventThread-16-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10001285ff10004, instance: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098, type: PARTICIPANT
2021/03/05 20:50:17.063 WARN [ZKHelixManager] [ZkClient-EventThread-16-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10001285ff10004, instance: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098, type: PARTICIPANT
2021/03/05 20:51:06.653 ERROR [StartServiceManagerCommand] [Start a Pinot [SERVER]] Failed to start a Pinot [SERVER] at 368.2 since launch
org.apache.helix.HelixException: fail to set config. cluster: pinot-quickstart is NOT setup.
at org.apache.helix.ConfigAccessor.set(ConfigAccessor.java:300) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.helix.manager.zk.ZKHelixAdmin.setConfig(ZKHelixAdmin.java:1092) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.server.starter.helix.HelixServerStarter.start(HelixServerStarter.java:361) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.service.PinotServiceManager.startServer(PinotServiceManager.java:150) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:95) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.lambda$run$0(StartServiceManagerCommand.java:260) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:286) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.access$000(StartServiceManagerCommand.java:57) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
at org.apache.pinot.tools.admin.command.StartServiceManagerCommand$1.run(StartServiceManagerCommand.java:260) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b2d716d9c465eaf69685f8e284015de5cd7b038e]
2021/03/05 21:37:47.170 WARN [ConfigAccessor] [ZkClient-EventThread-16-pinot-zookeeper:2181] No config found at /pinot-quickstart/CONFIGS/RESOURCE/case_history_OFFLINE
ayush sharma
03/05/2021, 9:44 PM
ayush sharma
03/05/2021, 9:44 PM
Xiang Fu
ayush sharma
03/05/2021, 9:46 PM
kubectl create ns my-pinot-kube
helm install pinot /home/ayush/spyne/incubator-pinot/kubernetes/helm/pinot -n my-pinot-kube --set replicas=1
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 9:53 PM
WARN [PinotInstanceRestletResource] [grizzly-http-server-1] Admin port is not set for instance: Server_pinot-server-0.pinot-server-headless.my-pinot-kube.svc.cluster.local_8098
...
...
WARN [PinotInstanceRestletResource] [grizzly-http-server-1] Grpc port is not set for instance: Controller_pinot-controller-0.pinot-controller-headless.my-pinot-kube.svc.cluster.local_9000
...
...
ayush sharma
03/05/2021, 9:57 PM
WARN [SegmentStatusChecker] [pool-7-thread-2] Table case_history_OFFLINE has 1 segments with no online replicas
WARN [SegmentStatusChecker] [pool-7-thread-2] Table case_history_OFFLINE has 0 replicas, below replication threshold :1
Xiang Fu
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 9:59 PM
ayush sharma
03/05/2021, 10:05 PM
Xiang Fu
Xiang Fu
Xiang Fu
ayush sharma
03/05/2021, 10:16 PM