json path expressions in query are not working. se...
# general
a
json path expressions in query are not working. select payload.push_id from githubEvents limit 10 throws unknown column name exception. But using json functions works. @User and @User - this is using json batch quickstart setup.
a
Thanks, will take a look.
k
I dont think that is tested well and can result in unexpected behavior when there is nested structure. Please use json functions if possible
a
ok, thanks
We are going to be using a flat structure - so if json path expressions work consistently for those, we are good.
a
Seems like this broke recently due to https://github.com/apache/pinot/pull/7590. (?) @User FYI.
x
got it, then we can loose the identifier check for json type expression with dot
a
brackets are also valid
[
]
currently along with dot
.
In future, we hope to expand it to take a full json path expression
SELECT tablename.columnname.jsonpath FROM myTable
provided that
tablename.columnname
is. column of type
JSON
.
1
x
hmm, why
tablename.columnname
not just
columnname
?
a
From what I recall,
tablename.columnname
was already supported in queries before we expanded identifiers to JSON notation.
x
I feel those column information should be standardized before the check?
it is
since we are doing multiple round of checks/rewrite, then maybe we should rewrite the idenfier by checking the table name and just keep the column name
a
Yes, after that
JsonStatementOptimizer
does a check to ensure that the the dot notation is applying to a JSON column. I can add a fix for this if you like? I think we would need to add a couple of integration tests as well to make sure this doesn't break again. The change should have gotten caught by a test case 🙂 otherwise its easy to skip this again.
1
j
We should add some json queries into the integration test. That should be able to catch such regressions
👍 1
1
a
Thanks
Should I open an issue for this or there is already one created?