# general
w
Sorry to be a never-ending fount of questions, folks… is it expected / necessary to create a rangeIndex on dateTime fields, or are those automatically indexed efficiently? Likewise, should I add dateTime fields to the noDictionaryColumns list?
m
What's your time granularity?
Typically, we don't need to set explicit indexing for dateTime fields, as we can still prune segments based on metadata.
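To make the metadata-based pruning idea concrete, here is a toy sketch of the general technique. It is not Pinot's actual implementation; `SegmentMeta`, `prune_by_time`, and the sample values are hypothetical. It assumes each segment records the minimum and maximum timestamp it contains, so segments that cannot overlap the query's time range are skipped without reading any index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentMeta:
    """Hypothetical per-segment metadata: name plus min/max timestamp (epoch ms)."""
    name: str
    min_time_ms: int
    max_time_ms: int


def prune_by_time(segments, query_start_ms, query_end_ms):
    """Keep only segments whose [min, max] range overlaps the query's inclusive range."""
    return [
        s for s in segments
        if s.max_time_ms >= query_start_ms and s.min_time_ms <= query_end_ms
    ]


segments = [
    SegmentMeta("seg_0", 0, 999),
    SegmentMeta("seg_1", 1_000, 1_999),
    SegmentMeta("seg_2", 2_000, 2_999),
]

# seg_0 ends before the query range starts, so it is never scanned.
kept = prune_by_time(segments, 1_500, 2_100)
print([s.name for s in kept])  # ['seg_1', 'seg_2']
```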
w
My base timestamp is epoch milliseconds. I am playing around with deriving an hourly-grain field for pre-aggregating in a star-tree index, but based on my current prototype, that seems like it might be premature optimization.
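For reference, the arithmetic behind an hourly-grain field derived from epoch milliseconds can be sketched as below. This only illustrates the truncation itself; in Pinot the derivation would normally happen at ingestion time, and the function name `truncate_to_hour` is hypothetical.

```python
MILLIS_PER_HOUR = 60 * 60 * 1000  # 3,600,000 ms


def truncate_to_hour(epoch_ms: int) -> int:
    """Round an epoch-millisecond timestamp down to the start of its hour."""
    return epoch_ms - (epoch_ms % MILLIS_PER_HOUR)


# All events inside the same hour collapse to one bucket value,
# which is what makes pre-aggregation on this field effective.
print(truncate_to_hour(1_600_000_000_000))  # 1599998400000
```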
m
A general recommendation is to sort on the primary key (or a dimension that appears in most queries), and to add only the minimal number of inverted indexes needed to get reasonable selectivity across your query set.
w
👍
Thank you. I am also looking to partition the incoming data based on a dimension that is almost always used selectively in the WHERE clause, and use broker-side partition pruning to minimize scanning unnecessary segments - I’m not sure if that will actually help, or if forcing data locality like that will bottleneck things.
I am using that same dimension as my sort key, but right now, it’s not particularly useful to sort on it, because it shows up in all segments
m
Oh yeah, partitioning is definitely a good idea.
In our use cases, we typically partition and sort on the same dimension.
w
Ok, that makes me feel better, as that was my plan.
m
Good plan, I'd say.
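A rough sketch of what the plan discussed above (partition and sort on the same dimension, with broker-side partition pruning) might look like in a table config, built here as a Python dict and printed as JSON. The column name `tenantId`, the partition count, and the `Murmur` function choice are assumptions for illustration, and the key names follow the Pinot table config layout as I recall it, so check them against the docs for your version.

```python
import json

# Hypothetical dimension that is almost always filtered on in the WHERE clause.
PARTITION_COLUMN = "tenantId"

table_config_fragment = {
    "tableIndexConfig": {
        # Sort each segment on the same column used for partitioning.
        "sortedColumn": [PARTITION_COLUMN],
        "segmentPartitionConfig": {
            "columnPartitionMap": {
                PARTITION_COLUMN: {
                    "functionName": "Murmur",  # assumed partition function
                    "numPartitions": 8,        # assumed; should match upstream partitioning
                }
            }
        },
    },
    "routing": {
        # Ask the broker to skip segments whose partition cannot match the filter.
        "segmentPrunerTypes": ["partition"],
    },
}

print(json.dumps(table_config_fragment, indent=2))
```

Note that no inverted index is listed for the sorted column in this fragment.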
w
Another stupid question - should I create an inverted index on the sort column, or is that unnecessary?
m
That is unnecessary; it won't be used. In fact, segment generation might just ignore the setting and not create the index.
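A toy illustration of why an inverted index adds nothing on a sorted column: when the values are stored in sorted order, all rows matching a value form one contiguous doc-ID range that a binary search can locate directly. This is a conceptual sketch, not Pinot's internal code, and `doc_id_range` is a hypothetical helper.

```python
from bisect import bisect_left, bisect_right


def doc_id_range(sorted_values, target):
    """Return the half-open [start, end) doc-ID range holding `target`.

    Assumes `sorted_values` is already sorted, as a sorted column would be
    within a segment. An empty range (start == end) means no match.
    """
    start = bisect_left(sorted_values, target)
    end = bisect_right(sorted_values, target)
    return start, end


column = ["a", "a", "b", "b", "b", "c"]
print(doc_id_range(column, "b"))  # (2, 5): rows 2, 3, and 4 match
```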
w
Perfect, thank you. It seemed like it would be unnecessary, but I’ve seen stranger things, and the docs, while great for an incubating project, were a little unclear - I would love to volunteer to keep notes while I’m doing this, and maybe propose some updates to the docs if that would be helpful.
👍 2
m
That would be really awesome, would really appreciate your help in improving our docs.
k
@Will Briggs you might find this video useful

https://www.youtube.com/watch?v=VdwVDiXOOVo

it talks about all the indexing techniques and when to use what.
w
That’s awesome, thank you