Mayank
Yupeng Fu
04/29/2021, 9:29 PM
kauts shukla
05/01/2021, 9:23 AM
kauts shukla
05/02/2021, 11:12 AM
select userid, eventlabel, sessionid,
       MIN(timestampist) as mint,
       MAX(timestampist) as maxt,
       (MAX(timestampist) - MIN(timestampist)) as diff_time
from default.click_stream
where eventlabel != 'null'
  and timestampist between 1615833000000 and 1616225312000
group by userid, eventlabel, sessionid
Vengatesh Babu
05/03/2021, 6:01 AM
Jonathan Meyer
05/03/2021, 9:55 AM
controller.local.temp.dir ?
Vengatesh Babu
05/03/2021, 1:10 PM
Pedro Silva
05/03/2021, 1:35 PM
Pedro Silva
05/03/2021, 5:28 PM
Pedro Silva
05/04/2021, 10:46 AM
Jonathan Meyer
05/04/2021, 2:41 PM
/ingestFromFile API endpoint but prod-compatible (where can segment creation be done in that case? Minion?)
Thanks!
Josh Highley
05/04/2021, 7:29 PM
select sum(a+b), * from my_table
Pinot query browser gives an error when I try this -- is there another way without specifically listing all the columns?
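(For reference, these nearby forms do parse -- a sketch that assumes a and b are the only relevant columns from the message; any other columns would still have to be listed explicitly:

select sum(a + b) from my_table
select a, b, sum(a + b) from my_table group by a, b

Standard SQL, and Pinot's parser with it, rejects mixing * with an aggregate because every non-aggregated output column has to be named in a group-by.)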
Mus
05/04/2021, 11:29 PM
Karin Wolok
Pedro Silva
05/05/2021, 10:31 AM
Pedro Silva
05/05/2021, 2:26 PM
controller.local.temp.dir
Ambika
05/05/2021, 2:29 PM
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme file, classname org.apache.pinot.spi.filesystem.LocalPinotFS
Creating an executor service with 1 threads(Job parallelism: 0, available cores: 6.)
Submitting one Segment Generation Task for file:/opt/pinot/ai/weather/global_weather100M.csv
Using class: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader to read segment, ignoring configured file format: AVRO
RecordReaderSegmentCreationDataSource is used
Finished building StatsCollector!
Collected stats for 100000000 documents
Created dictionary for INT column: date with cardinality: 30, range: 0 to 29
Using fixed length dictionary for column: country, size: 110
Created dictionary for STRING column: country with cardinality: 10, max length in bytes: 11, range: Australia to USA
Created dictionary for INT column: pincode with cardinality: 10, range: 12324 to 3243678
Created dictionary for INT column: week with cardinality: 53, range: 0 to 52
Using fixed length dictionary for column: city, size: 80
Created dictionary for STRING column: city with cardinality: 10, max length in bytes: 8, range: AMD to SRI
Created dictionary for INT column: year with cardinality: 50, range: 1970 to 2019
Created dictionary for INT column: temperature with cardinality: 50, range: 0 to 49
Using fixed length dictionary for column: state, size: 20
Created dictionary for STRING column: state with cardinality: 10, max length in bytes: 2, range: AS to WB
Using fixed length dictionary for column: day, size: 63
Created dictionary for STRING column: day with cardinality: 7, max length in bytes: 9, range: Friday to Wednesday
Created dictionary for LONG column: ts with cardinality: 530768, range: 1620214278776 to 1620214809690
Start building IndexCreator!
Finished records indexing in IndexCreator!
Finished segment seal!
Converting segment: /tmp/pinot-00edd913-441c-4958-8555-9b380f12991b/output/weather_1_OFFLINE_1620214278776_1620214809690_0 to v3 format
v3 segment location for segment: weather_1_OFFLINE_1620214278776_1620214809690_0 is /tmp/pinot-00edd913-441c-4958-8555-9b380f12991b/output/weather_1_OFFLINE_1620214278776_1620214809690_0/v3
Deleting files in v1 segment directory: /tmp/pinot-00edd913-441c-4958-8555-9b380f12991b/output/weather_1_OFFLINE_1620214278776_1620214809690_0
Skip creating default columns for segment: weather_1_OFFLINE_1620214278776_1620214809690_0 without schema
Successfully loaded segment weather_1_OFFLINE_1620214278776_1620214809690_0 with readMode: mmap
Starting building 1 star-trees with configs: [StarTreeV2BuilderConfig[splitOrder=[country, state, city, pincode, day, date, week],skipStarNodeCreation=[],functionColumnPairs=[max__temperature, minMaxRange__temperature, avg__temperature, min__temperature],maxLeafRecords=1000]] using OFF_HEAP builder
Starting building star-tree with config: StarTreeV2BuilderConfig[splitOrder=[country, state, city, pincode, day, date, week],skipStarNodeCreation=[],functionColumnPairs=[max__temperature, minMaxRange__temperature, avg__temperature, min__temperature],maxLeafRecords=1000]
Generated 65977917 star-tree records from 100000000 segment records
Srini Kadamati
05/05/2021, 3:21 PM
Arun Vasudevan
05/05/2021, 7:59 PM
Arun Vasudevan
05/05/2021, 9:46 PM
Pedro Silva
05/06/2021, 10:23 AM
RK
05/06/2021, 11:52 AM
RK
05/06/2021, 1:23 PM
Pedro Silva
05/07/2021, 4:10 PM
"fromDateTime(JSONPATHSTRING(result,'$.AudioLength','00:00:00.000'), 'HH:mm:ss.SSS')"
where the transformed field is of type Long?
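(For context, that expression would live in the table config under ingestionConfig.transformConfigs; a minimal sketch, where the destination column name audioLengthMillis is hypothetical:

"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "audioLengthMillis",
      "transformFunction": "fromDateTime(JSONPATHSTRING(result, '$.AudioLength', '00:00:00.000'), 'HH:mm:ss.SSS')"
    }
  ]
}

fromDateTime returns epoch millis, which is why the destination field comes out as a Long.)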
Grace Walkuski
05/07/2021, 5:06 PM
Is it better to use distinct over grouping by all the fields? For example, is there a difference in efficiency between these two?
select distinct species, name from dataSource
select species, name from dataSource group by species, name
Akash
05/07/2021, 9:14 PM
executionFrameworkSpec:
  name: 'spark'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentUriPushJobRunner'
  extraConfigs:
    stagingDir: 'hdfs://hadoop/tmp/pinot_staging/'
jobType: SegmentCreationAndTarPush
inputDirURI: 'hdfs://hadoop/hp/input/Event1/dateid=2020-12-30/'
outputDirURI: 'hdfs://hadoop/pinot/output/Event1/dateid=2020-12-30/'
Now this generates the segments under pinot/output/Event1/dateid=2020-12-30/.
I have Pinot deep storage on HDFS, where the controller data lives under /hp/pinot/data/controller/Event1/
Currently, AFAIU, the data is moved from HDFS => Pinot Controller => HDFS. Is there a way to short-circuit the whole network process?
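(One possibility, sketched against the spec above: switch the push mode so the job sends only the segment URI rather than streaming the tarball from the Spark job; the controller then fetches the segment from HDFS itself. This assumes the controller can resolve the same hdfs:// URIs.

jobType: SegmentCreationAndUriPush

Whether this fully removes the controller from the data path depends on the deep-store setup, so treat it as a starting point rather than a definitive answer.)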
I can see there is a configuration in the table config where we can specify batchIngestionConfig => segmentIngestionType as REFRESH. However, there is no example anywhere -- do we have any test in the codebase, or a blog/docs, etc.?
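(A minimal table-config sketch of that block, assuming the ingestionConfig layout used for offline tables; the segmentIngestionFrequency value is illustrative:

"ingestionConfig": {
  "batchIngestionConfig": {
    "segmentIngestionType": "REFRESH",
    "segmentIngestionFrequency": "DAILY"
  }
}

With REFRESH, each push is meant to replace the table's existing data rather than append to it.)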
Akash
05/07/2021, 10:23 PM
Ambika
05/08/2021, 12:54 AM
troywinter
05/08/2021, 4:55 PM
Is datetimeconvert an inbuilt function for ingestion transforms in Pinot? Are there any limitations when transforming time columns? I'm getting an error when adding a transform function to the table config, but no specific error msg is logged out.
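(Not certain datetimeconvert itself is available at ingestion time, but the inbuilt epoch helpers are; a sketch with hypothetical column names tsMillis and daysSinceEpoch:

"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "daysSinceEpoch",
      "transformFunction": "toEpochDays(tsMillis)"
    }
  ]
}

If the inbuilt functions don't cover a conversion, a Groovy transform is the usual fallback.)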
RK
05/09/2021, 11:33 AM