# general
d
Hi folks! Just to confirm something: I've been inserting test data in my Pinot tables, and now I want to clean them up, then start inserting production data. The simplest way to achieve this is to drop the tables and then recreate them, right?
I went ahead and deleted all the tables. I'm not sure, though, whether I should wait for the segments to be wiped out first, or whether it's safe to recreate them right away.
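For reference, dropping a table goes through the Pinot controller's REST API. A minimal sketch, assuming the controller listens on `localhost:9000` and a hypothetical table named `myTable`:

```bash
# Drop the realtime table; this also schedules its segments for deletion
curl -X DELETE "http://localhost:9000/tables/myTable?type=realtime"
```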
That didn't work: for some reason, after I recreated the tables the segments were still there.
Now I'm trying the other order: delete the segments first, then delete the table, then wait a bit.
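Segment deletion has its own endpoint on the controller. A sketch under the same assumptions (controller on `localhost:9000`, hypothetical table `myTable`):

```bash
# Delete all segments of the realtime table without dropping the table config
curl -X DELETE "http://localhost:9000/segments/myTable?type=realtime"
```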
Didn't work either... 😞
m
Oh, what's the data size?
d
One of the tables has 9,313,763 bytes (~9.3 MB); the other has 129,692,289 (~130 MB) but seems to be picking up more segments.
I'm not sure what's going on; perhaps they're consuming events from Kafka even though I haven't published any for some minutes now?
Nah, it stopped at 129,692,289. But what's the correct procedure to reliably flush a table so it's ready to receive data again, @User? I'm fine with deleting stuff if needed; this is not yet running in production.
I mean, the cluster is running, of course, but I still haven't exposed it to our internal users, so I can just destroy it if needed.
m
The correct procedure for realtime tables is to ensure both the ideal state and the external view in ZK are deleted (it may also take some time to delete segments from the deep store if you have huge amounts of data).
Once the IS/EV no longer exist, you can re-create the table.
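A sketch of that check with the stock ZooKeeper CLI, assuming a hypothetical cluster name `PinotCluster` and table `myTable` (the actual paths depend on your cluster name):

```bash
# Confirm the Helix ideal state and external view entries for the table are gone
zkCli.sh -server localhost:2181 ls /PinotCluster/IDEALSTATES
zkCli.sh -server localhost:2181 ls /PinotCluster/EXTERNALVIEW

# If a myTable_REALTIME entry lingers, remove it (deleteall needs ZooKeeper 3.5+)
zkCli.sh -server localhost:2181 deleteall /PinotCluster/IDEALSTATES/myTable_REALTIME
zkCli.sh -server localhost:2181 deleteall /PinotCluster/EXTERNALVIEW/myTable_REALTIME
```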
d
I'm not using deep store, only the segments in EBS
Let me take a look
Sorry for the delay, BTW, Slack didn't notify me about your messages 😞
Alright, deleted from both IDEALSTATES and EXTERNALVIEW in ZK; let me recreate the table now. I also checked the filesystem, and the segment files are not there anymore, although the indexes still exist.
m
Hmm, in production I recommend using a deep store.
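Pointing the controller at a deep store is a controller config change. A sketch for S3, following Pinot's documented S3 deep-storage setup, with hypothetical bucket and region values:

```properties
# controller.conf: use S3 as the deep store for completed segments
controller.data.dir=s3://my-pinot-bucket/pinot/controller-data
pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.controller.storage.factory.s3.region=us-east-1
pinot.controller.segment.fetcher.protocols=file,http,s3
pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```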
What do you mean, the indexes still exist?
d
I still see indexes in a directory for the table I deleted
m
On server?
d
Yep. OK, I just recreated the table, and there it is again with all the segments and the data 😞
And the solution, thanks to @User, is: use a `stream.kafka.consumer.prop.auto.offset.reset` setting of `largest` instead of `smallest`, to avoid picking up previous events from Kafka 🙂
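In the table config, that key lives under `streamConfigs`. A sketch of applying the change, assuming the updated config is saved in a hypothetical `myTable_realtime.json`:

```bash
# streamConfigs fragment inside myTable_realtime.json:
#   "stream.kafka.consumer.prop.auto.offset.reset": "largest"
# Upload the updated realtime table config via the controller
curl -X PUT -H "Content-Type: application/json" \
  -d @myTable_realtime.json \
  "http://localhost:9000/tables/myTable"
```

Note that this setting only controls where consumption starts when there is no prior checkpoint (e.g. a freshly created table); it won't rewind or replay data already ingested.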