# troubleshooting
a
I have this table config that needs to ingest data from ORC files saved in S3, but it's not ingesting any data:
{
  "OFFLINE": {
    "tableName": "sales_by_order_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "schemaName": "sales_by_order",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "10000",
      "replication": "2",
      "segmentPushFrequency": "HOURLY",
      "segmentPushType": "REFRESH",
      "replicasPerPartition": "1"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": [],
      "noDictionaryColumns": [],
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": [],
      "bloomFilterColumns": [],
      "loadMode": "MMAP",
      "onHeapDictionaryColumns": [],
      "varLengthDictionaryColumns": [],
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": false,
      "rangeIndexColumns": []
    },
    "metadata": {},
    "quota": {},
    "task": {
      "taskTypeConfigsMap": {
        "SegmentGenerationAndPushTask": {
          "schedule": "0 * * * * ?",
          "tableMaxNumTasks": "28"
        }
      }
    },
    "routing": {},
    "query": {},
    "ingestionConfig": {
      "batchIngestionConfig": {
        "batchConfigMaps": [
          {
            "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
            "input.fs.prop.region": "ap-southeast-1",
            "inputDirURI": "s3 link",
            "includeFileNamePattern": "glob:**/*.orc",
            "excludeFileNamePattern": "glob:**/*.tmp",
            "inputFormat": "orc"
          }
        ],
        "segmentIngestionType": "REFRESH",
        "segmentIngestionFrequency": "HOURLY"
      }
    },
    "isDimTable": false
  }
}
Could this be related to this bucket not being public?
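If the bucket is private, I assume the minions would need credentials or an IAM role with read access. Would adding the access key and secret key to the batch config help? Something roughly like this (placeholder values; I'm going by the S3PinotFS property names in the docs):
{
  "batchConfigMaps": [
    {
      "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
      "input.fs.prop.region": "ap-southeast-1",
      "input.fs.prop.accessKey": "<access key>",
      "input.fs.prop.secretKey": "<secret key>",
      "inputDirURI": "s3 link",
      "includeFileNamePattern": "glob:**/*.orc",
      "excludeFileNamePattern": "glob:**/*.tmp",
      "inputFormat": "orc"
    }
  ]
}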
👀 1
n
Can you check the controller logs to see if there are any exceptions when scheduling the SegmentGenerationAndPushTask? If there are no errors, can you check for any logs showing that the tasks were successfully generated? If the controller looks good, can you check the minion logs for any exceptions?
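If it helps, you can also check task generation through the controller REST API instead of digging through logs, roughly like this (a sketch assuming the controller is reachable at localhost:9000; endpoint paths can differ slightly between Pinot versions):
# Sketch: list SegmentGenerationAndPushTask tasks and their states via the
# controller task API, and optionally trigger a scheduling run for this table.
# Assumes the controller runs at localhost:9000 -- adjust as needed.
import requests

CONTROLLER = "http://localhost:9000"
TASK_TYPE = "SegmentGenerationAndPushTask"

# Map of task name -> state (e.g. IN_PROGRESS, COMPLETED, FAILED)
resp = requests.get(f"{CONTROLLER}/tasks/{TASK_TYPE}/taskstates")
resp.raise_for_status()
for task_name, state in resp.json().items():
    print(task_name, state)

# Manually schedule the task for just this table to see whether one gets generated
resp = requests.post(
    f"{CONTROLLER}/tasks/schedule",
    params={"taskType": TASK_TYPE, "tableName": "sales_by_order_OFFLINE"},
)
print(resp.json())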
a
The minion pod seems to be in CrashLoopBackOff; in the logs I see the following exception:
caught exception while executing task: Task_SegmentGenerationAndPushTask_1657270810079_23
Caused by: software.amazon.awssdk.services.s3.model.NoSuchKeyException: The specified key does not exist. (Service: S3, Status Code: 404, Request ID: , Extended Request ID: )
Another table config has the wrong S3 key and causes the minion task to crash. My assumption is that because that task runs before the one for this new table, the exception shuts the minion down prematurely, so the task for this new table has not even run once.