#general

Ryan Clark

07/20/2021, 7:54 PM
🧵 Complex schema (un-nesting json) not showing up in table
Our data looks like this:
{
  "one": "one",
  "two": "two",
  "three": "three",
  "fourTimestamp": "1593549705711",
  "payload": {
    "context": {
      "one": "one",
      "two": "two"
    },
    "message": {
      "one": "one",
      "two": "two"
    }
  },
  "five": "five"
}
Previously, my schema correctly showed "one", "two", "three", "fourTimestamp" in the table.
I added a new table config to include ingestionConfig for the first time:
{
  "tableName": "tableName",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "timeColumnName": "fourTimestamp",
    "timeType": "MILLISECONDS",
    "schemaName": "schemaName",
    "replicasPerPartition": "1"
  },
  "tenants": {},
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kinesis",
      "stream.kinesis.topic.name": "stream-name",
      "region": "us-east-1",
      "shardIteratorType": "AFTER_SEQUENCE_NUMBER",
      "stream.kinesis.consumer.type": "lowlevel",
      "stream.kinesis.fetch.timeout.millis": "30000",
      "stream.kinesis.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kinesis.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kinesis.KinesisConsumerFactory",
      "realtime.segment.flush.threshold.size": "1000000",
      "realtime.segment.flush.threshold.time": "6h"
    }
  },
  "ingestionConfig": {
    "complexTypeConfig": {
      "delimiter": ".",
      "fieldsToUnnest": [
        "payload.connection",
        "payload.message"
      ],
      "collectionNotUnnestedToJson": "NON_PRIMITIVE"
    }
  },
  "metadata": {
    "customConfigs": {}
  }
}
Then I changed the schema to include the nested objects, as well as "five", which is not nested.
{
  "schemaName": "mobileEvent",
  "dimensionFieldSpecs": [
    {
      "name": "one",
      "dataType": "STRING"
    },
    {
      "name": "two",
      "dataType": "STRING"
    },
    {
      "name": "five",
      "dataType": "STRING"
    },
    {
      "name": "payload.context.one",
      "dataType": "STRING"
    },
    {
      "name": "payload.context.two",
      "dataType": "STRING"
    },
    {
      "name": "payload.message.one",
      "dataType": "STRING"
    },
    {
      "name": "payload.message.two",
      "dataType": "STRING"
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "three",
      "dataType": "INT"
    }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "fourTimestamp",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
The new schema shows up in the UI to the left of the table, but none of the new additions are showing up in the table, not even "five", which is not nested. Is there something invalid about my complexTypeConfig?
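For reference, a minimal Python sketch of how "."-delimited flattening would map the sample record above onto the column names in the schema. This only illustrates the naming convention and is not Pinot's implementation; flatten and path_exists are hypothetical helpers written for this example. It also compares the fieldsToUnnest entries against the sample record: "payload.connection" does not match any key there (the sample has "payload.context"). Whether that contributes to the null columns is not established in this thread.

```python
# Illustration only: mimics "."-delimited flattening of nested JSON objects.
# flatten() and path_exists() are hypothetical helpers, not Pinot APIs.

def flatten(record, delimiter=".", prefix=""):
    """Return a single-level dict whose keys are delimiter-joined paths."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{delimiter}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, delimiter, name))
        else:
            flat[name] = value
    return flat


def path_exists(record, path, delimiter="."):
    """Check whether a delimiter-joined path resolves inside the record."""
    node = record
    for part in path.split(delimiter):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


# Sample record copied from the message above.
sample = {
    "one": "one",
    "two": "two",
    "three": "three",
    "fourTimestamp": "1593549705711",
    "payload": {
        "context": {"one": "one", "two": "two"},
        "message": {"one": "one", "two": "two"},
    },
    "five": "five",
}

# Column names copied from the schema above.
schema_columns = [
    "one", "two", "five",
    "payload.context.one", "payload.context.two",
    "payload.message.one", "payload.message.two",
    "three", "fourTimestamp",
]

# fieldsToUnnest entries copied from the table config above.
fields_to_unnest = ["payload.connection", "payload.message"]

flat = flatten(sample)
missing_columns = [c for c in schema_columns if c not in flat]
unmatched_paths = [p for p in fields_to_unnest if not path_exists(sample, p)]

print("flattened keys:", sorted(flat))
print("schema columns with no flattened key:", missing_columns)
print("fieldsToUnnest paths not found in sample:", unmatched_paths)
```

Under these assumptions, every schema column has a matching flattened key, so the column names themselves line up with the data; the only mismatch the sketch reports is the "payload.connection" path.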
m

Mayank

07/20/2021, 8:05 PM
@User could you take a look?
j

Jackie

07/20/2021, 8:12 PM
@User Did you re-generate the segments?
r

Ryan Clark

07/20/2021, 8:14 PM
Nope. I just ran
./pinot-admin.sh AddTable -tableConfigFile
and
./pinot-admin.sh AddSchema -schemaFile
Jackie, now I see the new columns (I haven't done anything else yet), but they are all null.
y

Yupeng Fu

07/20/2021, 8:46 PM
Note that complex type handling is on master and will be released in 0.8, but it is not available in 0.7.1.
r

Ryan Clark

07/20/2021, 9:12 PM
OK. So I should expect it not to work until 0.8 is released? Any idea when that will be?
y

Yupeng Fu

07/20/2021, 9:30 PM
@User ^
m

Mayank

07/20/2021, 9:32 PM
We are about to start the 0.8 release work shortly, so it may be a few weeks. The delay is because we are also working on Apache graduation, and the release process for an incubating project is somewhat different from that of a top-level project.