Harold Lim
01/30/2021, 7:24 PM
Neer Shay
02/01/2021, 8:54 AM
Harold Lim
02/02/2021, 6:07 PM
Matt
02/02/2021, 8:57 PM
Elon
02/02/2021, 10:54 PM
Kha
02/03/2021, 5:08 PM
baseballStats offline table with some modifications to the files. When I am uploading my own batch data, I get the following error:
400 (Bad Request) with reason: "Cannot add invalid schema: rows_10m. Reason: null"
I currently have a CSV that's formatted like this
# /DIRECTORIES/rawdata/rows_10m.csv
id, hash_one, text_one
0, (large integer), a
1, (large integer), b
...
A schema.json that has this
# /DIRECTORIES/rows_10m_schema.json
{
  "schemaName": "rows_10m",
  "dimensionFieldSpecs": [
    {
      "datatype": "STRING",
      "name": "text_one"
    }
  ],
  "metricFieldSpecs": [
    {
      "datatype": "INT",
      "name": "id"
    },
    {
      "datatype": "INT",
      "name": "hash_one"
    }
  ]
}
and a table config that has this
# /DIRECTORIES/rows_10m_offline_table_config.json
{
  "tableName": "rows_10m",
  "tableTypes": "OFFLINE",
  "segmentsConfig": {
    "segmentPushType": "APPEND",
    "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
    "schemaName": "rows_10m",
    "replication": "1"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "HEAP",
    "invertedIndexColumns": [
      "id",
      "hash_one"
    ]
  },
  "metadata": {
    "customConfigs": {
    }
  }
}
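One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: rows_10m_schema.json above spells the field-spec key as "datatype", while the dateTimeFieldSpecs snippet later in this log spells it "dataType". The sketch below scans a schema for keys that differ from an expected spelling only by letter case. The function name find_miscased_keys and the EXPECTED_KEYS set are hypothetical helpers written for this illustration, not part of Pinot.

```python
import json
from typing import List, Tuple

# Assumption: expected spellings, taken from the dateTimeFieldSpecs snippet in this log.
EXPECTED_KEYS = {"name", "dataType"}

def find_miscased_keys(schema: dict) -> List[Tuple[str, str, str]]:
    """Return (section, found_key, expected_key) for keys that match an
    expected key only when letter case is ignored."""
    by_lower = {key.lower(): key for key in EXPECTED_KEYS}
    problems = []
    for section, specs in schema.items():
        # Only look at the "...FieldSpecs" lists of the schema.
        if not section.endswith("FieldSpecs") or not isinstance(specs, list):
            continue
        for spec in specs:
            for key in spec:
                expected = by_lower.get(key.lower())
                if expected is not None and key != expected:
                    problems.append((section, key, expected))
    return problems

# The schema exactly as posted in the thread.
schema_text = """
{
  "schemaName": "rows_10m",
  "dimensionFieldSpecs": [
    {"datatype": "STRING", "name": "text_one"}
  ],
  "metricFieldSpecs": [
    {"datatype": "INT", "name": "id"},
    {"datatype": "INT", "name": "hash_one"}
  ]
}
"""

problems = find_miscased_keys(json.loads(schema_text))
for section, found, expected in problems:
    print(f"{section}: found {found!r}, expected {expected!r}")
```

If the spelling turns out to be the cause, renaming the key in all three field specs would be the change to try first.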
This is very similar to what I used when I manually added the default baseballStats. Am I missing anything in my schema.json file?
troywinter
02/04/2021, 4:13 AM
FixedSegmentNameGenerator supported? According to the doc (https://docs.pinot.apache.org/configuration-reference/job-specification#segment-name-generator-spec), only the simple and normalizedDate name generators are supported.
Kha
02/05/2021, 9:41 PM
offline_table_config.json and a schema.json file to Pinot, however creating a segment doesn't appear to be working. A SEGMENT-NAME.tar.gz file isn't being created.
My current docker-job-spec.yml looks like this:
# docker-job-spec.yml
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/tmp/pinot-manual-test/rawdata/100k'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/tmp/pinot-manual-test/segments/100k'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'rows_100k'
  schemaURI: 'http://pinot-controller-test:9000/tables/rows_100k/schema'
  tableConfigURI: 'http://pinot-controller-test:9000/tables/rows_100k'
pinotClusterSpecs:
  - controllerURI: 'http://pinot-controller-test:9000'
Some of the error messages I'm getting are
Failed to generate Pinot segment for file - file:/tmp/pinot-manual-test/rawdata/100k/rows_100k.csv
Caught exception while gathering stats
java.lang.NumberFormatException: For input string: "5842432235322161941"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_282]
at java.lang.Integer.parseInt(Integer.java:583) ~[?:1.8.0_282]
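The stack trace shows Integer.parseInt rejecting "5842432235322161941", which is larger than a signed 32-bit integer can hold. The thread does not show the rows_100k schema, but the earlier rows_10m schema declares hash_one as INT, so a column type that is too narrow is a plausible cause. The sketch below only checks which integer width a raw value needs; the function name smallest_integer_type is a hypothetical helper, and the returned labels borrow the INT and LONG type names as an assumption about what the schema would use.

```python
# Bounds of signed 32-bit and 64-bit integers.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def smallest_integer_type(value: str) -> str:
    """Return the narrowest integer type label that can hold the value."""
    number = int(value.strip())
    if INT32_MIN <= number <= INT32_MAX:
        return "INT"
    if INT64_MIN <= number <= INT64_MAX:
        return "LONG"
    return "out of 64-bit range"

# The value from the NumberFormatException in the thread.
print(smallest_integer_type("5842432235322161941"))
```

Running the same check over a sample of the CSV column would show whether any value needs more than 32 bits before a segment generation job is started.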
Any leads on this would be appreciated. Thanks
Ashish
02/07/2021, 12:24 AM
Neer Shay
02/08/2021, 9:57 AM
"dateTimeFieldSpecs": [
  {
    "name": "ts",
    "dataType": "STRING",
    "format": "1:SECONDS:SIMPLE_DATE_FORMAT:\"yyyy-MM-dd HH:mm:ss\"",
    "granularity": "1:MINUTES"
  }
]
In Superset, I must define the string format in the Python way for it to parse correctly:
%Y-%m-%d %H:%M:%S
When I try creating a chart, I get this error:
Apache Pinot Error
unsupported format character 'Y' (0x59) at index 58
This may be triggered by:
Issue 1002 - The database returned an unexpected error.
Because the query gets translated to this (note that if I remove the "DATETIMECONVERT" and simply use "ts" column it works fine):
SELECT DATETIMECONVERT(ts, '1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', '1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', '1:DAYS'),
AVG(metric) AS "AVG_1"
FROM schema.table
WHERE ts >= '2021-02-01 00:00:00'
AND ts < '2021-02-08 00:00:00'
GROUP BY DATETIMECONVERT(ts, '1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', '1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', '1:DAYS')
LIMIT 50000;
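The index in the error message (58) coincides with the position of the first "Y" in the generated query. That suggests, although the thread does not confirm it, that the query string is passed through Python's %-style string formatting somewhere in the client stack, where "%Y" is not a valid conversion. The sketch below reproduces that behaviour on the start of the query and shows that doubling the percent signs survives the formatting step. Whether Superset or the Pinot driver accepts an escaped pattern is an assumption to verify; format_error is a hypothetical helper for this illustration.

```python
# The SELECT expression from the thread; the rest of the query is omitted for brevity.
query = (
    "SELECT DATETIMECONVERT(ts, "
    "'1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', "
    "'1:SECONDS:SIMPLE_DATE_FORMAT:%Y-%m-%d %H:%M:%S', '1:DAYS')"
)

def format_error(sql):
    """Apply %-style substitution with an empty parameter mapping and
    return the error text, or None if the formatting succeeds."""
    try:
        sql % {}
    except ValueError as exc:
        return str(exc)
    return None

# Unescaped: Python treats "%Y" as a (nonexistent) conversion type.
error = format_error(query)
print(error)

# Escaped: "%%" is rendered as a literal "%", so the text comes out unchanged.
escaped = query.replace("%", "%%")
print(format_error(escaped))
print(escaped % {} == query)
```

Independently of the escaping, the table's own format string uses the Java-style pattern yyyy-MM-dd HH:mm:ss, so the strftime-style pattern in the generated DATETIMECONVERT call may need attention as well.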
Has anyone encountered something similar? What is the solution?
Tanmay Movva
02/08/2021, 12:50 PM
vmarchaud
02/08/2021, 4:26 PM
Grace Walkuski
02/08/2021, 6:56 PM
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:1.1:npm (npm install) on project pinot-controller: Failed to run task: 'npm install' failed. (error code 1) -> [Help 1]
The pinot-controller package seems to be a Java project, so why is it trying to run npm install? How do I get around this? Thanks!
Varun Srivastava
02/09/2021, 7:07 AM
Devashish Gupta
02/09/2021, 8:06 AM
Pradeep
02/09/2021, 7:07 PM
SegmentMessageHandlerFactory is not getting registered for some reason.
When I restart the server, I don't see logs from this function beyond this point
(https://sourcegraph.com/github.com/apache/incubator-pinot/-/blob/pinot-server/src/ma[…]org/apache/pinot/server/starter/helix/HelixServerStarter.java):
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:0
21 START:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK
Resubscribe change listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, for listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff, watchChild: false
Subscribing changes listener to path: /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES, type: CALLBACK, listener: org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff
Subscribing child change listener to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES
Subscribing to path:/PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES took:1
The latency of message dbb39d17-cece-4f3c-bd88-29fb61a93863 is 922470 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: dbb39d17-cece-4f3c-bd88-29fb61a93863
The latency of message 4c8dffc3-68cf-417e-8cd2-a572ce4cdcb4 is 42 ms
Fail to find message handler factory for type: USER_DEFINE_MSG msgId: 4c8dffc3-68cf-417e-8cd2-a572ce4cdcb4
21 END:INVOKE /PinotCluster/INSTANCES/Server_10.0.101.11_8069/MESSAGES listener:org.apache.helix.messaging.handling.HelixTaskExecutor@4b9419ff type: CALLBACK Took: 7ms
Elon
02/09/2021, 9:44 PM
Elon
02/10/2021, 1:45 AM
Devashish Gupta
02/10/2021, 11:06 AM
apiVersion: batch/v1
kind: Job
metadata:
  name: request-realtime-table-creation
  namespace: data2
spec:
  template:
    spec:
      containers:
        - name: request-realtime-table-json
          image: apachepinot/pinot:latest
          args: [ "AddTable", "-schemaFile", "/var/pinot/examples/request_schema.json", "-tableConfigFile", "/var/pinot/examples/request_realtime_table_config.json", "-controllerHost", "pinot2-controller", "-controllerPort", "9000", "-exec" ]
          env:
            - name: JAVA_OPTS
              value: "-Xms4G -Xmx4G -Dpinot.admin.system.exit=true"
          volumeMounts:
            - name: examples
              mountPath: /var/pinot/examples
      restartPolicy: OnFailure
      volumes:
        - name: examples
          configMap:
            name: pinot-table
  backoffLimit: 100
Aaron Wishnick
02/10/2021, 5:29 PM
2021/02/10 12:22:25.506 ERROR [PluginManager] [main] Failed to load plugin [pinot-s3] from dir [/<redacted>/apache-pinot-incubating-0.6.0-bin/plugins/pinot-file-system/pinot-s3]
java.lang.IllegalArgumentException: object is not an instance of declaring class
Nick Bowles
02/10/2021, 10:31 PM
"""
def value = 'blah'
return value
"""
using this syntax:
select groovy('{"returnType":"STRING","isSingleValue":true}', <GROOVY MULTI LINE HERE>, my_variable) as new_variable from table
I have tried escaping the quotes, using single quotes, and escaping the newlines, and I cannot figure out how to get this to work. Any ideas?
Laxman Ch
02/11/2021, 5:52 AM
sagar
02/11/2021, 6:31 AM
2021/02/11 06:27:24.999 ERROR [SegmentGenerationJobRunner] [pool-2-thread-1] Failed to generate Pinot segment for file - s3://xxxx/xxxx/xxxx.0.parq
java.lang.IllegalArgumentException: INT96 not yet implemented.
Neha Pawar
sagar
02/11/2021, 8:29 AM
Failed to generate Pinot segment for file s3:xxx/xxx/1234.csv
Illegal character in scheme name at index 2: table_OFFLINE_2021-02-01 09:39:00.000_2021-02-01 11:59:00.000_2.tar.gz
at java.net.URI.create(URI.java:852) ~[?:1.8.0_282]
at java.net.URI.resolve(URI.java:1036) ~[?:1.8.0_282]
at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$run$0(SegmentGenerationJobRunner.java:212) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-162d0e61b6b1c3d51f915f7ad3e151a4fb24110a]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_282]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
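The segment name in the trace above embeds timestamps that contain spaces and colons, and the trace shows it being handed to java.net.URI.create, which treats the text before the first colon as a URI scheme. The sketch below is only an illustration in Python of which characters in such a name are problematic and what a percent-encoded form looks like. It is not the fix used in the thread, and the helper name uri_unfriendly_chars is hypothetical.

```python
from urllib.parse import quote, unquote

# Segment name copied from the error message in the thread.
segment_name = "table_OFFLINE_2021-02-01 09:39:00.000_2021-02-01 11:59:00.000_2.tar.gz"

def uri_unfriendly_chars(name: str) -> list:
    """Return the distinct characters in the name that are a space or a colon."""
    return sorted({ch for ch in name if ch in " :"})

print(uri_unfriendly_chars(segment_name))

# Percent-encode everything outside the unreserved set (letters, digits, "-", ".", "_", "~").
encoded = quote(segment_name, safe="")
print(encoded)
```

A time column formatted without spaces or colons (for example epoch values) would avoid the characters in the first place; that is an alternative to consider, not something the thread confirms.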
sagar
02/11/2021, 9:20 AM
Laxman Ch
02/11/2021, 9:53 AM
/pinot/<datasource-name>/PROPERTYSTORE/SEGMENTS/<table-name>_REALTIME/…
• Restarted servers, controllers
We don't see these segments in IDEAL_STATE.
Has anyone done this? What's the right way to restore deleted segments for a REALTIME table?
sagar
02/11/2021, 10:29 AM
Alexander Vivas
02/11/2021, 3:04 PM
ClassNotFoundException: org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory
Any suggestions?
Aaron Wishnick
02/11/2021, 7:32 PM