We tried to create segments through Spark for an offline table | Apache Pinot #general

AnishKanth

03/10/2020, 9:22 PM

@User We tried to create segments through Spark for an offline table. With both 'exclude.sequence.id=true' and 'segment.name.generator.type=normalizedDate' set, the segments were created without a sequence ID (TABLE_2020-03-11_2020-03-12.tar.gz). But with 'exclude.sequence.id=true' alone (without 'segment.name.generator.type=normalizedDate'), the segments were still created with a sequence ID (TABLE_18025_18027_155). We want segments without the sequence ID (TABLE_18025_18027). Could you please help? Also, can we set segment.name.generator.type to normalizedSeconds? We are dealing with hourly data, so we would like segment names like Table_2020-03-11-000000_2020-03-11-040000.

Xiang Fu

03/10/2020, 9:37 PM

I think there is a config to set the granularity. I will find it.

Xiang Fu

03/10/2020, 9:38 PM

What’s your table config for this?

Xiang Fu

03/10/2020, 9:42 PM

Currently exclude.sequence.id=true only works with normalizedDate.

Xiang Fu

03/10/2020, 9:44 PM

If you set the push frequency to hourly in your table config,

Xiang Fu

03/10/2020, 9:44 PM

then the segment name is generated with the hour.
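
To illustrate the behaviour described above, here is a minimal Python sketch, not Pinot's actual code. It assumes that an hourly push frequency switches the date pattern in the segment name from day precision to hour precision. The function name normalized_date_segment_name and both patterns are hypothetical and chosen only for this example.

```python
from datetime import datetime, timezone

# Assumed patterns: day precision by default, hour precision for hourly pushes.
DAILY_PATTERN = "%Y-%m-%d"
HOURLY_PATTERN = "%Y-%m-%d-%H"


def normalized_date_segment_name(table, start_ms, end_ms, push_frequency="DAILY"):
    """Build a segment name without a sequence ID from epoch-millisecond bounds."""
    pattern = HOURLY_PATTERN if push_frequency.upper() == "HOURLY" else DAILY_PATTERN

    def fmt(epoch_ms):
        # Interpret the timestamp in UTC so the result does not depend on the local zone.
        return datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc).strftime(pattern)

    return f"{table}_{fmt(start_ms)}_{fmt(end_ms)}"
```

With the daily default, a segment covering 2020-03-11 to 2020-03-12 would be named TABLE_2020-03-11_2020-03-12. With push_frequency="HOURLY", the hour is appended to each date, which is the kind of name the hourly use case above is asking for.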

Xiang Fu

03/10/2020, 9:45 PM

image.png

AnishKanth

03/11/2020, 4:19 AM

@Xiang Fu Okay, I will set the push frequency to hourly and create the segments.

Kishore G

03/11/2020, 4:33 PM

Can we always use millisecond format? It can be rounded to any granularity
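
A small Python sketch of the rounding idea, under the assumption that timestamps are kept as epoch milliseconds and truncated to whatever granularity a segment name needs. The names MS_PER_UNIT, round_down_ms and to_units are hypothetical helpers, not part of Pinot.

```python
# Milliseconds per granularity unit (illustrative set, not an exhaustive list).
MS_PER_UNIT = {
    "SECONDS": 1_000,
    "MINUTES": 60_000,
    "HOURS": 3_600_000,
    "DAYS": 86_400_000,
}


def round_down_ms(epoch_ms, granularity):
    """Truncate an epoch-millisecond timestamp to the start of its granularity bucket."""
    unit = MS_PER_UNIT[granularity]
    return epoch_ms - (epoch_ms % unit)


def to_units(epoch_ms, granularity):
    """Express an epoch-millisecond timestamp as a whole count of granularity units."""
    return epoch_ms // MS_PER_UNIT[granularity]
```

Because every coarser granularity can be derived from milliseconds this way, a single millisecond value is enough to produce hourly or daily segment names.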
