# troubleshooting
a
Hey, this is more of a question to validate: you cannot have a multi-value / array Metric field in Pinot, correct? I.e., for Dimensions you can specify
"singleValueField": false
Is there something analogous for Metrics? I ask because we would like to keep our records at a certain granularity with a couple of metrics that are arrays, but we want to keep them as metrics so we can run aggregates on them.
m
Metrics can be multi-valued as well, but you want to ensure that the semantics match your expectations when aggregating. How do you expect the MV metrics to be aggregated in your case?
a
I would like to aggregate them in a query, SUM, AVG, MAX etc.
m
But you want SUM to add all values in the array in the same row, and also across all rows?
a
Yes
m
That works then.
a
But how do I specify the metric is multi value in the schema?
m
Same way as dimensions
"singleValueField": false
a
Ok thanks, will try that
That doesn't seem to be a valid schema for a Metric. I get this error, and the mutable_events table fails to create: "{"code":400,"error":"Schema should not be null for REALTIME table"}failed to create the mutable_events table"
"metricFieldSpecs": [       
      {
        "name": "incident_threatscore",
        "dataType": "INT",
        "singleValueField": false
      },
      {
        "name": "match_count",
        "dataType": "INT"
      }
    ]
If I remove the singleValueField line, the schema gives no error.
m
Hmm, that seems odd.
Actually, this seems like an artificial limitation. You need to define this as a dimension and explicitly set defaultNullValue (so it is on par with metrics). It would be good to file an issue to support MV metrics in the schema (in aggregations, MV is supported for dimensions).
a
So creating a dimension with defaultNullValue will let you treat a dimension like a metric (i.e., you can SUM/AVG it)?
m
Yes. The code doesn’t stop you from calling aggregation on a dimension, as long as the type is aggregatable. But I’d still recommend you file a GH issue for your original issue, as this is just a work-around and we should address your original problem too.
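A minimal sketch of that workaround, reusing the column names from the schema snippet earlier in this thread. This is a fragment only: other schema keys are omitted, and the defaultNullValue of 0 is an assumption chosen to mirror the default that INT metrics normally get (an INT dimension would otherwise default to a very large negative sentinel value, which would skew SUM/AVG).

```json
{
  "dimensionFieldSpecs": [
    {
      "name": "incident_threatscore",
      "dataType": "INT",
      "singleValueField": false,
      "defaultNullValue": 0
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "match_count",
      "dataType": "INT"
    }
  ]
}
```

When querying, note that Pinot documents dedicated multi-value aggregation functions (SUMMV, AVGMV, MINMV, MAXMV, COUNTMV), so a query such as `SELECT SUMMV(incident_threatscore) FROM mutable_events` may be what is needed on an MV column. Whether the plain SUM/AVG forms also accept MV columns likely depends on the Pinot version, so treat this as something to verify.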
a
got it, I will try that out, thanks!
Hit another issue while working on this: are boolean array fields not allowed? Here's the schema for that column and the error below:
{
  "name": "user_is_active",
  "dataType": "BOOLEAN",
  "singleValueField": false,
  "defaultNullValue": false
}
m
This is while querying? What was the exact query?
a
No, while attempting to stream data from a kafka topic
m
Hmm, the stack trace is actually from query execution
The exception is from query path, not ingestion path, so I am confused.
a
I can check a few things, but this should work as I have it?
m
The stack trace (from the query path) seems legit, in that the query path does not seem to support BOOLEAN_ARRAY. Let's first figure out what the issue is during ingestion.
Also, what version of code are you using?
cc: @Jackie
a
I can't check right now, but maybe ingestion is working. I couldn't query the table at all, so I assumed the problem was ingestion. I will check to see whether data actually got ingested.
j
Can you try with the latest master? There is a fix for boolean and timestamp arrays, which might not be included in the version you are running.
a
Ok thanks for letting me know, I will respond back tomorrow