# general
m
Hi, I'm trying to learn how to use dimension tables, but I'm doing something wrong and I'm not sure what. I have a regions dim table and a cases normal table. And then I run this query:
select areaName, lookUp('regions', 'Region', 'LTLAName', areaName)
from cases limit 10
But the error message says it doesn't find the lookup function:
[
  {
    "errorCode": 200,
    "message": "QueryExecutionError:\norg.apache.pinot.core.query.exception.BadQueryRequestException: Unsupported function: lookup with 4 parameters\n\tat org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:189)\n\tat org.apache.pinot.core.operator.transform.TransformOperator.<init>(TransformOperator.java:56)\n\tat org.apache.pinot.core.plan.TransformPlanNode.run(TransformPlanNode.java:52)\n\tat org.apache.pinot.core.plan.SelectionPlanNode.run(SelectionPlanNode.java:83)\n\tat org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:94)\n\tat org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33)\n\tat org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45)\n\tat org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:234)\n\tat org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155)\n\tat org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)"
  }
]
Any ideas?
j
Not an expert, but I remember making it work a few days ago, and your query looks okay to me. What version of Pinot are you using?
k
Yes, @User is right. @User, support for the Lookup UDF join was only added in version 0.7.1: https://github.com/apache/incubator-pinot/commit/d04785c83f5740a5cec0a2c30d570949304cb8ad The error message shows the server can't find the lookup function, which means you're running an older Pinot version.
m
Aha, cool! Yeah, I had it using the Docker 'latest' tag, but the first time I ran it, it picked up version 0.6.0.
pinned it to 0.7.1 now 🙂
k
Cool!
m
Presumably on this query it's doing the lookup for every single row and therefore repeating the same lookup lots of times? Is there a way I can get it to do the aggregation by area name first and then do the lookup afterwards, so there are fewer lookups to do?
(The reason I ask is that the query takes 10x longer with the lookup than without.)
j
@User Yes, you are right: the lookup is performed on a per-row basis because it is currently modeled as a transform. Can you please file an issue for the optimization of deferring the lookup? Please also tag the feature contributor: @User
👍 1
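To make the difference discussed above concrete, here is a minimal Python sketch of per-row lookup versus aggregating by area name first and looking up afterwards. It is only an illustration of the idea, not how Pinot implements lookUp: the REGIONS and CASES data, the CountingLookup helper, and both function names are hypothetical stand-ins invented for this example.

```python
from collections import Counter

# Hypothetical stand-in for the 'regions' dimension table: LTLAName -> Region.
REGIONS = {
    "Camden": "London",
    "Hackney": "London",
    "Leeds": "Yorkshire and The Humber",
}

# Hypothetical stand-in for the 'cases' fact table: one areaName per row.
CASES = ["Camden", "Leeds", "Camden", "Hackney", "Camden", "Leeds"]


class CountingLookup:
    """Wraps the dimension map and counts how many lookups are performed."""

    def __init__(self, table):
        self.table = table
        self.calls = 0

    def __call__(self, key):
        self.calls += 1
        return self.table.get(key)


def lookup_per_row(rows, lookup):
    # Transform-style: resolve the region for every row, then aggregate.
    return Counter(lookup(area) for area in rows)


def lookup_after_aggregation(rows, lookup):
    # Deferred: aggregate by areaName first, then resolve each distinct area once.
    per_area = Counter(rows)
    per_region = Counter()
    for area, count in per_area.items():
        per_region[lookup(area)] += count
    return per_region


per_row = CountingLookup(REGIONS)
deferred = CountingLookup(REGIONS)
result_a = lookup_per_row(CASES, per_row)
result_b = lookup_after_aggregation(CASES, deferred)

# Same per-region totals, but one lookup per row versus one per distinct area.
print(result_a == result_b, per_row.calls, deferred.calls)
```

With six rows and three distinct areas, the per-row version performs six lookups and the deferred version three, while both produce the same per-region counts. On a real table with many rows per area, the gap would be correspondingly larger.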