# pinot-dev
  • m

    Mayank

    11/07/2020, 4:01 PM
    There are configs for all of them to start at a specific port
  • l

    luanmorenomaciel

    01/20/2021, 10:54 PM
    Hi folks, I'm trying to run an ingestion task from Kafka but I'm not getting any output message from bin/pinot-admin.sh. This is what I'm doing. Events coming from Kafka:
    {
      "user_id": 17611,
      "uuid": "469fe40e-84cf-482c-ba73-fe722596f7bc",
      "first_name": "Christina",
      "last_name": "Jones",
      "date_birth": "1954-03-09",
      "city": "Thomastown",
      "country": "Honduras",
      "company_name": "Nelson, Kline and Munoz",
      "job": "Drilling engineer",
      "phone_number": "331.851.7563",
      "last_access_time": "1994-04-08T07:32:19",
      "time_zone": "America/Montevideo",
      "dt_current_timestamp": "2021-01-20 12:05:53.219255"
    }
    schema definition
    {
      "schemaName": "sch_users_json",
      "dimensionFieldSpecs": [
        {
          "name": "user_id",
          "dataType": "INT"
        },
        {
          "name": "uuid",
          "dataType": "STRING",
          "singleValueField": false
        },
        {
          "name": "first_name",
          "dataType": "STRING"
        },
        {
          "name": "last_name",
          "dataType": "STRING"
        },
        {
          "name": "date_birth",
          "dataType": "STRING"
        },
        {
          "name": "city",
          "dataType": "STRING"
        },
        {
          "name": "country",
          "dataType": "STRING",
          "singleValueField": false
        },
        {
          "name": "phone_number",
          "dataType": "STRING",
          "singleValueField": false
        },
        {
          "name": "last_access_time",
          "dataType": "STRING",
          "singleValueField": false
        },
        {
          "name": "time_zone",
          "dataType": "STRING",
          "singleValueField": false
        }
      ],
      "timeFieldSpec": {
        "incomingGranularitySpec": {
          "timeType": "MILLISECONDS",
          "timeFormat": "EPOCH",
          "dataType": "LONG",
          "name": "dt_current_timestamp"
        }
      }
    }
    task creation
  • l

    luanmorenomaciel

    01/20/2021, 10:54 PM
    {
      "tableName": "realtime_users_json_events",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "timeColumnName": "mergedTimeMillis",
        "timeType": "MILLISECONDS",
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "60",
        "schemaName": "sch_users_json",
        "replication": "1",
        "replicasPerPartition": "1"
      },
      "tenants": {},
      "tableIndexConfig": {
        "loadMode": "MMAP",
        "invertedIndexColumns": [
          "city",
          "country"
        ],
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.consumer.type": "lowlevel",
          "stream.kafka.topic.name": "src-app-users-json",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.broker.list": "127.0.0.1:9094",
          "realtime.segment.flush.threshold.time": "3600000",
          "realtime.segment.flush.threshold.size": "50000",
          "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
        }
      },
      "metadata": {
        "customConfigs": {}
      }
    }
    I'm connecting to pinot-controller and executing the following command, but not getting any results
    root@pinot-controller-0:/opt/pinot# bin/pinot-admin.sh AddTable \
    > -schemaFile /opt/pinot/sch_users_json.json \
    > -tableConfigFile /opt/pinot/realtime_users_json_events.json \
    > -exec
    root@pinot-controller-0:/opt/pinot#
  • n

    Neha Pawar

    01/20/2021, 11:01 PM
    A few things could be the issue:
    1. Your table config says “mergedTimeMillis” is the time column, but I don’t see that column in the schema or the data.
    2. The time column fieldSpec in the schema looks incorrect. You’ve specified EPOCH millis, but dt_current_timestamp looks like it’s in a simple date time format.
    Also, we now recommend using dateTimeFieldSpec instead of timeFieldSpec. Refer to this to configure your dateTimeFieldSpec correctly: https://docs.pinot.apache.org/configuration-reference/schema#datetimefieldspec
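The first issue above (a time column named in the table config but missing from the schema) can be checked before running AddTable. A minimal sketch, assuming the schema and table config are loaded as plain dicts shaped like the JSON in this thread; `schema_field_names` and `check_time_column` are hypothetical helper names, not part of Pinot:

```python
def schema_field_names(schema):
    """Collect every column name declared in a Pinot-style schema dict."""
    names = set()
    # List-valued sections of the schema.
    for key in ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs"):
        for spec in schema.get(key, []):
            names.add(spec["name"])
    # The legacy timeFieldSpec keeps its name under incomingGranularitySpec.
    legacy = schema.get("timeFieldSpec", {}).get("incomingGranularitySpec", {})
    if "name" in legacy:
        names.add(legacy["name"])
    return names


def check_time_column(schema, table_config):
    """Return None if the table's time column is in the schema, else a message."""
    time_column = table_config["segmentsConfig"]["timeColumnName"]
    if time_column in schema_field_names(schema):
        return None
    return f"timeColumnName '{time_column}' is not defined in schema '{schema['schemaName']}'"


# Trimmed-down versions of the configs posted above.
schema = {
    "schemaName": "sch_users_json",
    "dimensionFieldSpecs": [{"name": "user_id", "dataType": "INT"}],
    "timeFieldSpec": {
        "incomingGranularitySpec": {"name": "dt_current_timestamp", "dataType": "LONG"}
    },
}
table_config = {"segmentsConfig": {"timeColumnName": "mergedTimeMillis"}}

print(check_time_column(schema, table_config))
```

This only compares names; it does not validate data types or time formats.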
  • n

    Neha Pawar

    01/20/2021, 11:02 PM
    I wonder why you don't see any messages after you run that command, though. Can you check pinotController.log?
  • l

    luanmorenomaciel

    01/20/2021, 11:02 PM
    doing that now @User thank you for stepping in, we're validating Pinot over Druid
  • l

    luanmorenomaciel

    01/20/2021, 11:02 PM
    let me check now
  • l

    luanmorenomaciel

    01/20/2021, 11:03 PM
    2021/01/20 22:51:03.534 INFO [PinotTableRestletResource] [grizzly-http-server-0] Cannot find valid fieldSpec for timeColumn: mergedTimeMillis from the table config: realtime_users_json_events_REALTIME, in the schema: sch_users_json exception: Cannot find valid fieldSpec for timeColumn: mergedTimeMillis from the table config: realtime_users_json_events_REALTIME, in the schema: sch_users_json
  • l

    luanmorenomaciel

    01/20/2021, 11:04 PM
    you hit the nail on the head, let me fix it here
  • l

    luanmorenomaciel

    01/20/2021, 11:08 PM
    @User this is the new schema definition
    {
      "schemaName": "sch_users_json",
      "dimensionFieldSpecs": [
        {
          "name": "user_id",
          "dataType": "LONG"
        },
        {
          "name": "uuid",
          "dataType": "STRING"
        },
        {
          "name": "first_name",
          "dataType": "STRING"
        },
        {
          "name": "last_name",
          "dataType": "STRING"
        },
        {
          "name": "date_birth",
          "dataType": "STRING"
        },
        {
          "name": "city",
          "dataType": "STRING"
        },
        {
          "name": "country",
          "dataType": "STRING"
        },
        {
          "name": "phone_number",
          "dataType": "STRING"
        },
        {
          "name": "last_access_time",
          "dataType": "STRING"
        },
        {
          "name": "time_zone",
          "dataType": "STRING"
        }
      ],
      "dateTimeFieldSpec": {
        "incomingGranularitySpec": {
          "name": "dt_current_timestamp",
          "dataType": "STRING",
          "format": "SIMPLE_DATE_FORMAT"
        }
      }
    }
  • n

    Neha Pawar

    01/20/2021, 11:10 PM
    the dateTimeFieldSpec looks incorrect
  • l

    luanmorenomaciel

    01/20/2021, 11:10 PM
    can you send me an example please?
  • n

    Neha Pawar

    01/20/2021, 11:10 PM
    "dateTimeFieldSpecs": [
        {
          "name": "millisSinceEpoch",
          "dataType": "LONG",
          "format": "1:MILLISECONDS:EPOCH",
          "granularity": "15:MINUTES"
        },
        {
          "name": "hoursSinceEpoch",
          "dataType": "INT",
          "format": "1:HOURS:EPOCH",
          "granularity": "1:HOURS"
        },
        {
          "name": "dateString",
          "dataType": "STRING",
          "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd",
          "granularity": "1:DAYS"
        }
      ]
  • n

    Neha Pawar

    01/20/2021, 11:10 PM
    in your case, you’ll use the 3rd one from this array.
  • n

    Neha Pawar

    01/20/2021, 11:11 PM
    but you’ll have to set your right simple date format
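The `format` strings in the example above follow a colon-separated layout: size, time unit, format kind, and (for SIMPLE_DATE_FORMAT) a pattern. A rough sketch of splitting such a string, assuming that layout; `parse_format` and `DateTimeFormat` are hypothetical names for illustration, and the rule that SIMPLE_DATE_FORMAT must carry a pattern is an assumption of this sketch rather than a statement of Pinot's own validation:

```python
from typing import NamedTuple, Optional


class DateTimeFormat(NamedTuple):
    size: int
    unit: str
    kind: str
    pattern: Optional[str]


def parse_format(fmt):
    """Split 'size:unit:kind[:pattern]' into its parts."""
    # maxsplit=3 keeps any colons inside the pattern (e.g. HH:mm:ss) intact.
    parts = fmt.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least size:unit:kind, got {fmt!r}")
    size, unit, kind = parts[0], parts[1], parts[2]
    pattern = parts[3] if len(parts) == 4 else None
    # Assumption of this sketch: a simple date format is useless without a pattern.
    if kind == "SIMPLE_DATE_FORMAT" and not pattern:
        raise ValueError("SIMPLE_DATE_FORMAT needs a pattern after the third colon")
    return DateTimeFormat(int(size), unit, kind, pattern)


print(parse_format("1:MILLISECONDS:EPOCH"))
print(parse_format("1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd"))
```

Limiting the split to three separators matters once the pattern itself contains colons, as a time-of-day pattern would.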
  • l

    luanmorenomaciel

    01/20/2021, 11:12 PM
    got it
    "dateTimeFieldSpecs": [{
        "name": "dt_current_timestamp",
        "dataType": "STRING",
        "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd",
        "granularity": "1:DAYS"
      }]
  • n

    Neha Pawar

    01/20/2021, 11:13 PM
    you’ll need some more things after yyyy-MM-dd, right?
  • n

    Neha Pawar

    01/20/2021, 11:13 PM
    2021-01-20 12:05:53.219255
  • n

    Neha Pawar

    01/20/2021, 11:13 PM
    one sec
  • n

    Neha Pawar

    01/20/2021, 11:13 PM
    https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
  • l

    luanmorenomaciel

    01/20/2021, 11:14 PM
    for now I think this is going to suffice! 🙂
  • l

    luanmorenomaciel

    01/20/2021, 11:14 PM
    just trying to run the task and I can adjust this later but I'll keep this in mind for sure
  • n

    Neha Pawar

    01/20/2021, 11:15 PM
    i think it will fail, because the input data will have
    2021-01-20 12:05:53.219255
    and Pinot will try to match it with just
    yyyy-MM-dd
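The mismatch described above can be illustrated outside Pinot. The sketch below uses Python's `datetime.strptime` purely as an analogy: its `%` directives are not the Java-style pattern letters that Pinot's config takes, and it is not Pinot's parser. It only shows that a date-only format does not consume a value that also carries a time. `matches` is a hypothetical helper.

```python
from datetime import datetime

# Value of dt_current_timestamp in the sample event.
sample = "2021-01-20 12:05:53.219255"


def matches(value, fmt):
    """True if the whole string parses under the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Date-only format: rejected because the time portion is left unparsed.
print(matches(sample, "%Y-%m-%d"))
# Format covering the date, the time, and the six fractional digits.
print(matches(sample, "%Y-%m-%d %H:%M:%S.%f"))
```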
  • l

    luanmorenomaciel

    01/20/2021, 11:16 PM
    hmmm, that means I need to adjust it. Great, let me check now
  • n

    Neha Pawar

    01/20/2021, 11:18 PM
    yyyy-MM-dd HH:mm:ss.SSSSSS
  • n

    Neha Pawar

    01/20/2021, 11:18 PM
    try this
  • l

    luanmorenomaciel

    01/20/2021, 11:18 PM
    thank you for that!! super appreciated
    21/01/20 23:17:37.817 WARN [PartitionCountFetcher] [grizzly-http-server-1] Could not get partition count for topic src-app-users-json
    org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
    2021/01/20 23:17:37.818 ERROR [PinotTableIdealStateBuilder] [grizzly-http-server-1] Could not get partition count for src-app-users-json
    org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
    2021/01/20 23:17:37.818 ERROR [PinotTableRestletResource] [grizzly-http-server-1] org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
    java.lang.RuntimeException: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
  • n

    Neha Pawar

    01/20/2021, 11:19 PM
    Pinot is not able to reach Kafka at 127.0.0.1:9094. Is that URL right?
  • n

    Neha Pawar

    01/20/2021, 11:20 PM
    Are you using Docker? You might have to change the host name.
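Inside a container or pod, 127.0.0.1 refers to that container or pod itself, so a loopback address in `stream.kafka.broker.list` only works if Kafka runs in the same one. A small sketch for spotting that, plus a plain TCP reachability check to run from wherever the Pinot controller runs. All function names here are hypothetical helpers, and a successful TCP connection says nothing about Kafka-level health:

```python
import ipaddress
import socket


def split_broker(entry):
    """Split 'host:port' into (host, port as int)."""
    host, _, port = entry.strip().rpartition(":")
    return host, int(port)


def is_loopback(host):
    """True for 'localhost' or a loopback IP such as 127.0.0.1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name; cannot tell without resolving it


def loopback_brokers(broker_list):
    """Return entries of a comma-separated broker list that point at loopback."""
    return [e.strip() for e in broker_list.split(",") if is_loopback(split_broker(e)[0])]


def can_connect(host, port, timeout=3.0):
    """Attempt a plain TCP connection to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(loopback_brokers("127.0.0.1:9094"))
```

Calling `can_connect(*split_broker("127.0.0.1:9094"))` from inside the controller pod would show whether the address is reachable from Pinot's point of view, which can differ from what other apps see.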
  • l

    luanmorenomaciel

    01/20/2021, 11:20 PM
    yeah, looking at that. I'm using minikube and I can reach it from other apps; let me check and get back to you