Hi all, I'm on 0.7.1 so can't take advantage of th...
# troubleshooting
k
Hi all, I'm on 0.7.1 so can't take advantage of the JSON data type but am working with a table that has a STRING column and it has
{"name":"value"}
I'm trying to get the "value" by passing in the "name". I tried the JSON transformations but keep getting errors. I didn't see any string-to-json transformations on the 'supported transformations' page. Any other documentation with relevant information or suggestions?
m
Not sure if I follow. Do you want to extract the value from input record and store just the value in Pinot? Or do you want to store it as JSON and then query value by name at read time?
For the first one, you can write a simple Groovy function to do so. For the second one, you can enable JSON index
k
Hi Mayank, I've loaded the data in Pinot and am trying to query it.
Thanks for pointing to JSON index (here). Since I've loaded the data already, can I alter the table to add the index or will it require a reload?
m
Yeah, you could update table config and then trigger reload api.
k
Great. Yeah, I saw the handy 'Reload All Segments' button. However when I click the 'Reload Status' it says 'No segment found in table' 🤔 . It shows me list of segments on the page and all in good status.
m
Hmm what does external view inside of zookeeper say?
k
shows everything as being online. Not sure what I should be looking for?
Copy code
{
  "id": "event_OFFLINE",
  "simpleFields": {
    "BATCH_MESSAGE_MODE": "false",
    "BUCKET_SIZE": "0",
    "IDEAL_STATE_MODE": "CUSTOMIZED",
    "INSTANCE_GROUP_TAG": "event_OFFLINE",
    "MAX_PARTITIONS_PER_INSTANCE": "1",
    "NUM_PARTITIONS": "120",
    "REBALANCE_MODE": "CUSTOMIZED",
    "REPLICAS": "1",
    "STATE_MODEL_DEF_REF": "SegmentOnlineOfflineStateModel",
    "STATE_MODEL_FACTORY_NAME": "DEFAULT"
  },
  "mapFields": {
    "event_OFFLINE_2020-10-20 12:01:06_2020-10-31 23:56:25_6": {
      "Server_172.19.0.5_8098": "ONLINE"
    },
    "event_OFFLINE_2020-10-20 12:01:06_2020-10-31 23:56:32_3": {
      "Server_172.19.0.5_8098": "ONLINE"
    },
.........
.........
.........
    "event_OFFLINE_2021-07-01 00:00:01_2021-07-29 16:59:58_10": {
      "Server_172.19.0.5_8098": "ONLINE"
    }
  },
  "listFields": {}
}
x
you may also try jsonExtractScalar to query the value, e.g. SELECT jsonExtractScalar(col, '$.name', 'STRING') FROM .... The col can be ordinary String type, just that the value should be in valid json format.
k
Thanks Xiaobing. I tried the jsonExtractScalar but getting error due to the null values so I've updated my schema to set a default value and triggered the reload api to reload the data. The api call is not working though. I restarted my containers and now all the segments for that one table are in
bad
state.
x
for BAD state, you may give it a try on reset (doc). Regarding to the error upon jsonExtractScalar, you may try filter out those with null values by adding some predicates in WHERE clause. Not sure about why reload didn't work, any errors you saw while doing reload?
k
Thanks for pointing that out @Xiaobing! I triggered the reset and got a 500 Error: Internal Server Error. I didn't pass in maxwaittimems so it used the default.
Copy code
{
  "code": 500,
  "error": "Failed to reset segments in table: event_OFFLINE. Timed out waiting for external view to stabilize after call to disable/reset segments. Disable/reset might complete in the background, but skipping enable of segments of table: event_OFFLINE"
}
I have 120 segments and some of them are turning
Good
. The reload segments trigger gave a 200 and I did not see errors on the
ran reset again and it worked 🤷🏽