Hey folks, can anyone confirm whether range querie...
# general
e
Hey folks, can anyone confirm whether range queries will work with
JSON_MATCH
? Ex. Modifying example query from json-index docs[1]:
Copy code
{
  "name": "adam",
  "age": 30,
  "country": "us",
  "addresses": [
    {
      "number": 112,
      "street": "main st",
      "country": "us"
    },
    {
      "number": 2,
      "street": "second st",
      "country": "us"
    },
    {
      "number": 3,
      "street": "third st",
      "country": "ca"
    }
  ]
}
Copy code
SELECT ... 
FROM mytable 
WHERE JSON_MATCH(person, '"$.addresses[*].number">2')
I would expect that this
adam
doc/row would match. But I’m not sure if range queries are supported. Thanks! https://docs.pinot.apache.org/basics/indexing/json-index#chained-key-lookup
r
the json index is a purely inverted index, doesn't support range queries
you could extract the attribute with a transform function during ingestion, and put a range index on the transformed field
e
Thanks for confirming @User. I had considered the approach you mentioned but from my understanding one important feature I would lose doing that would be the context maintained using the JSON_MATCH function. Ex. assuming that ranges did work, if the query was:
Copy code
SELECT ... 
FROM mytable 
WHERE JSON_MATCH(person, '"$.addresses[*].number">2 AND "$.addresses[*].country"=''us''')
this would not match. Using the approach of extracting the address numbers would remove the possibility of using the context for conjunctions. Do you happen to know if there have been attempts to implement effectively defining a schema for a json column, allowing definition of different data types/indexes within the JSON? Any idea off-hand if that’s completely infeasible, reasonable with effort, etc?
k
Hi Evan, you will have to use a combination of JSON_MATCH and json_extract
Copy code
JSON_MATCH(person, '"$.addresses[*].country"=''us''') AND JSON_EXTRACT_SCALAR(""$.addresses[*].number") > 2
something like that
the planner is optimized to only evaluate the json country = "us"
e
thanks @User 👍
the planner is optimized to only evaluate the json country = “us”
Could you elaborate on this a bit? My initial impression was that your recommendation might still have the limitation where JSON context is not maintained, but your input about the planner made me wonder. Keen to learn, thanks!
k
my bad, I see what you are asking for
my suggestion will return incorrect results when us and number>2 matches across two nested records
e
no problem, I still got some ideas from your input 🙂
r
yes, mine too, I missed the context
k
but you can build on that idea
e
what might be complexity level be, if even possible at all, of having different indexes for different properties of a JSON blob? I would think a fairly common use case would be structured, repeated (array) entities. Would be super powerful to have this capability and I’d be interested in helping to contribute if it’s feasible at all
k
I was just chatting with Richard about adding this. It is possible to support all types on indexes in json fields
🙇 1
🚀 1
lets file an issue and would love to get your inputs and contributions
❤️ 1
would you mind creating one
e
will do 👍
I also see mention occasionally of
MAP
and
STRUCT
data types. Where can I learn more about these data types and their usage?
And would our above discussion be best as its own issue, or amalgamated with https://github.com/apache/pinot/issues/7863 ?