Apache Pinot

we found if a STRING column has tab characters in it (i.e., \t) and used as a group-by key. The query result will return two group-by keys for a result. anyone aware of this bug or walk around it?

the group key generator uses tabs as delimiter

i guess we’d have to change delimiter or configure it somehow, to support this

```@Override
public void dumpToGroupByResults(LinkedList&lt;GroupByResult&gt; dest) {
  GroupKeyResultPair groupKeyResultPair;
  while ((groupKeyResultPair = _heap.poll()) != null) {
    // Set limit to -1 to prevent removing trailing empty strings
    String[] groupKeys = groupKeyResultPair._groupKey.split(GroupKeyGenerator.DELIMITER, -1);```


I can file an issue first and happy to fix it. As for the fix, I need to check the code again. I do not fully understand why we need to do SPLIT in the first place.

look at an implementation of GroupKeyGenerator e.g. DictionaryBasedGroupKeyGenerator. In the getGroupKey method, we append the group key strings, using tab

and so in all places later on, we split with tab. If you trace usages of `
```GroupKeyGenerator.DELIMITER```
you’ll see them all

can we not concatenate the column values and then split them later? this is my main question.

afaik, that would require a significant amount of changes in the core group by functionality

changing delimiter will introduce another config and it is hard to predict what special character users put into their Pinot table.

Since delimiter is transient per query, what’s the issue with changing the static final? \t seems like a poor choice for delimiter 

In the end we walk around the issue by doing a trim ops in Presto.

Sorry for the late reply. We tried not to do a rebuild to change the delimiter. Also the table data may have other weird characters which make the new delimiter not working correctly.