Hi guys, so, in latest Pinot version, is `TEXT_MAT...
# general
b
Hi guys, so, in latest Pinot version, is
TEXT_MATCH
the only way of doing regex like searches on string columns? Are Lucene indexes a must for such functionality? cc @User
s
The other alternative is REGEXP_LIKE since it takes the non-exact search arguments
If you are planning to use WHERE col = "something*", it won't work since it will do a straight comparison.
So either TEXT_MATCH or REGEXP_LIKE should be used for non-exact or regex like matches
b
So
regexp_like
isn't removed with
TEXT_MATCH
right? I guess it's PQL only construct and may not be supported in SQL
s
Yeah we should probably consider removing/deprecating its use and going forward only keep text_match
b
What's the plan for PQL completely? are you planning to remove that or keep it?
t
So @Sidd should
regexp_like
work with SQL query format?
s
I think it does (it was already supported on the SQL path before I added text_match). I will remove it
In summary, I don't think regexp_like should be supported on SQL path.
t
Hey Sidd, the
REGEXP_LIKE
does work with SQL format.
regexp_like
does not. I guess I wan’t paying attention to the error message
Copy code
org.apache.pinot.client.PinotClientException: Query had processing exceptions:
[{"errorCode":150,"message":"PQLParsingError:\njava.lang.IllegalArgumentException: No enum constant org.apache.pinot.pql.parsers.pql2.ast.FilterKind.regexp_like\n\tat java.lang.Enum.valueOf(Enum.java:238)\n\tat org.apache.pinot.pql.parsers.pql2.ast.FilterKind.valueOf(FilterKind.java:21)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.traverseFilterExpression(PinotQuery2BrokerRequestConverter.java:310)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.traverseFilterExpression(PinotQuery2BrokerRequestConverter.java:318)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.traverseFilterExpression(PinotQuery2BrokerRequestConverter.java:318)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.traverseFilterExpression(PinotQuery2BrokerRequestConverter.java:318)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.convertFilter(PinotQuery2BrokerRequestConverter.java:180)\n\tat org.apache.pinot.pql.parsers.PinotQuery2BrokerRequestConverter.convert(PinotQuery2BrokerRequestConverter.java:61)\n\tat org.apache.pinot.sql.parsers.CalciteSqlCompiler.compileToBrokerRequest(CalciteSqlCompiler.java:34)\n\tat org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:170)\n\tat org.apache.pinot.broker.api.resources.PinotClientRequest.processSqlQueryPost(PinotClientRequest.java:155)\n\tat sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)"}]
s
@Buchi Reddy @Tim Mwangi so pure log/text search style queries should use text_match instead of regexp_like since these will be filtering purely on the basis of text filter and will be much faster using text_match.
However, if the query has an exact filter on one or more columns and a regex/non-exact filter on another column, then regexp_like can be used for the column with regex filter since the exact filter will be evaluated first and the regex filter operator only needs to evaluate the pattern for a limited set of docIds
k
summary - if regexp_like works and latency is acceptable use it
if not, add text index and use text_match, @Sidd it will be good to use text index automatically even for regexp_like if the index already exists
s
agreed
@Tim Mwangi the exception was probably due to case sensitivity. I thought it is handled. I will look into it.
k
@Sidd case sensitivity is ON by default.
s
I thought so. There is a bug in PinotQuery2BrokerRequestConverter.
Copy code
FilterKind filterKind = FilterKind.valueOf(operator);
works only upper case definitions as present in ENUM
Will put a PR
👍 1