2nd question: if i plan to use lookup table for jo...
# general
x
2nd question: if i plan to use lookup table for joins, i can only use it for decorating the query results - if i use lookup joins in the WHERE clause, queries will be very slow because the lookup join cannot benefit from indexing. is my understanding correct here?
k
do you have sample query?
x
Copy code
SELECT
  f.uuid,
  lookUp('dimTable', 'name', 'uuid', f.uuid) as name,
  lookUp('dimTable', 'country', 'uuid', f.uuid) as country,
  abs(sum(m.metric)) as sum_metric
FROM
  factTable f 
WHERE 
  f.event_time > CAST(to_unixtime(date_trunc('day', now())) AS BIGINT)
  AND f.status = 'OPEN' 
  AND lookUp('dimTable', 'country', 'uuid', f.uuid) = 'SINGAPORE'
GROUP BY
  1,
  2,
  3
ORDER BY
  2
k
use subquery instead of lookup
x
How would that work?
k
Copy code
AND f.uuid IN (select uuid from dimTable where country='SINGAPORE')
x
Do dimension tables support indexes?
k
yeah
x
wow
the only limitation here is the size of dim tables since they will be replicated in every server right?
k
thats right
x
whats the difference internally if i use lookup vs subquery in the above query?
ok, so if say i have:
Copy code
# factTable
-----------------
userid | raw_cell

# dimTable1
-----------------
raw_cell | mapped_cell

# dimTable2
-----------------
userid | attr1 | attr2
i can do a query like
Copy code
SELECT 
  userid, 
  count(*) 
FROM 
  factTable f
WHERE 
  f.raw_cell IN (select raw_cell, mapped_cell FROM dimTable1 WHERE IN_ID_SET(raw_cell, '...') = 1) AND
  f.userid IN (select userid FROM dimTable2 WHERE attr1=val1 and attr2=val2)
GROUP BY
  userid
?
can i use multiple dimension tables in a single query?
btw, instead of
Copy code
AND f.uuid IN (select uuid from dimTable where country='SINGAPORE')
do you mean
Copy code
AND IN_SUBQUERY(uuid, 'select uuid from dimTable where country="SINGAPORE"')
?
k
That’s rt
x
sorry, which of my questions are you referring to?
k
I think you can use multiple dim tables in single query
x
I’ll test it later when i have access to my laptop
@User @User do you know if i can use dimension tables in subquery this way? I was under the impression that subquery can only be used with the same table in a query sorry checked https://docs.google.com/document/d/1s6DZ9eTPqH7vaKQlPjKiWb_OBC3hkkEGICIzcd5gozc/edit#heading=h.axhmfzc3hrwa:
Because the subquery is solved as a separate query, the subquery does not necessarily hit the same table as the main query. It can retrieve the IdSet from any table served by the broker.
@User Can I confirm that using a sub query with an dimension table with the right indexes will not incur any performance hit?
I’m unsure how it works internally
k
That’s right. You can try it yourself and compare the perf
x