Hey guys, what’s the best way - if any - to deal w...
# troubleshooting
c
Hey guys, what’s the best way - if any - to deal with JSON queries where 1) Some data points might be null 2) Not all data points have the same schema … here’s some sample data:
Copy code
Pinot schema column is called "properties" with a type of string

# First Record
{
    timestamp: '...',
    browser: '...'
}

# Second Record
null

# Third Record
{
    balance: '...'
}
This data is provided by the end user and can be whatever they’d like (this is by design) … so there’s really no “schema” for the data in the JSON column.
p
Depends on what you intend to do with the field. I had a similar case where a json field could be "anything", all I knew is that the max size of the field was 2147483647 chars (2GB per field). I defined the field as a string dimension field in the schema with a reasonable maxLength (by default pinot truncates to 512 chars). Then I added that column to
noDictionaryColumns
&
jsonIndexColumns
in the table configuration. To query the field I use json transformations making sure to always specify the default value (usually to
null
)
1
c
Thanks @Pedro Silva… I’d actually set up most of what you mentioned but was using the
JSONEXTRACTSCALAR
function wrong. One more issue I had, for records with
null
I had to convert it to an empty object for it work, otherwise I would get the exception below:
Copy code
QueryExecutionError:\njava.lang.IllegalArgumentException: json can not be null ...
Which makes sense 😅
m
Thanks @Pedro Silva. Would you mind adding this to the FAQ (#C023BNDT0N8 for details)?
c
Hey @Mayank sure thing. Just to clarify, is this about the null entries?
m
The FAQ question would be about your original question and you can add all details including null from here
c
Ah I see 👍