# troubleshooting
s
Was it able to process any data? Is this an ingestion from S3? Seems like an S3 access problem. Maybe permissions on the S3 bucket. Can you share the load spec?
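If it does turn out to be bucket permissions: the role the ingestion tasks run under generally needs at least object-read and bucket-list access. A minimal IAM policy sketch (the bucket name below is only a placeholder):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-druid-bucket",
        "arn:aws:s3:::your-druid-bucket/*"
      ]
    }
  ]
}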
r
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "test",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "!!!_no_such_column_!!!",
            "missingValue": "2010-01-01T00:00:00Z"
          },
          "dimensionsSpec": {
            "dimensions": null
          }
        }
      },
      "metricsSpec": null,
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": {
          "type": "none"
        },
        "rollup": false,
        "intervals": null
      },
      "transformSpec": {
        "filter": {
          "type": "not",
          "field": {
            "type": "in",
            "dimension": "ID",
            "values": [
              "1",
              "2"
            ],
            "extractionFn": null
          }
        },
        "transforms": null
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "firehose": {
        "type": "ingestSegment",
        "dataSource": "test",
        "interval": "2023-04-25T20:00:00.000Z/2023-04-25T21:00:00.000Z",
        "segments": null,
        "filter": null,
        "dimensions": null,
        "metrics": null,
        "maxInputSegmentBytesPerTask": 555157286400
      },
      "appendToExisting": false
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxRowsPerSegment": null,
      "maxRowsInMemory": 1000000,
      "maxBytesInMemory": 0,
      "maxTotalRows": null,
      "numShards": null,
      "partitionDimensions": [],
      "indexSpec": {
        "bitmap": {
          "type": "concise"
        },
        "dimensionCompression": "lz4",
        "metricCompression": "lz4",
        "longEncoding": "longs"
      },
      "maxPendingPersists": 0,
      "maxNumConcurrentSubTasks": 5,
      "buildV9Directly": true,
      "forceGuaranteedRollup": false,
      "reportParseExceptions": false,
      "pushTimeout": 0,
      "segmentWriteOutMediumFactory": null,
      "logParseExceptions": false,
      "maxParseExceptions": 2147483647,
      "maxSavedParseExceptions": 0
    }
  },
  "context": {},
  "dataSource": "test"
}
The same approach works fine for a datasource with less data, so S3 is accessible, but could it be a timeout issue, since many segments are being ingested continuously over a long duration?
I tried with another datasource that has many segments for one day, and could see the error below in the logs. Any leads on how to fix this kind of issue?
2023-05-15T04:39:01,821 WARN [main] com.sun.jersey.spi.inject.Errors - The following warnings have been detected with resource and/or provider classes:
  WARNING: A HTTP GET method, public void org.apache.druid.server.http.SegmentListerResource.getSegments(long,long,long,javax.servlet.http.HttpServletRequest) throws java.io.IOException, MUST return a non-void type.
2023-05-15T04:39:02,318 ERROR [main] org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Exception when stopping method[public void org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.stop()] on object[org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner@6a2badb1]
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
	at org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.stop(Lifecycle.java:463) [druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.java.util.common.lifecycle.Lifecycle.stop(Lifecycle.java:366) [druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.cli.CliPeon.run(CliPeon.java:367) [druid-services-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.cli.Main.main(Main.java:118) [druid-services-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
Caused by: java.util.concurrent.RejectedExecutionException: Service not started.
	at org.apache.druid.java.util.emitter.core.LoggingEmitter.emit(LoggingEmitter.java:94) ~[druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.java.util.emitter.core.ComposingEmitter.emit(ComposingEmitter.java:57) ~[druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.java.util.emitter.service.ServiceEmitter.emit(ServiceEmitter.java:67) ~[druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.java.util.emitter.service.ServiceEmitter.emit(ServiceEmitter.java:72) ~[druid-core-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.stop(SingleTaskBackgroundRunner.java:226) ~[druid-indexing-service-0.15.0-incubating-iap10.jar:0.15.0-incubating-iap10]
	... 8 more
Finished peon task
s
I just noticed that you are on a very old release. For batch jobs such as this, Druid 25.0's SQL-based ingestion has proved to be more efficient at processing large loads. You can use the EXTERN table function to access S3 buckets for ingestion directly from SQL.
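A rough sketch of what that can look like (the bucket URI, column names, and types below are placeholders, not taken from your spec; the WHERE clause mirrors the "not in" filter from your transformSpec):
REPLACE INTO "test" OVERWRITE ALL
SELECT
  TIME_PARSE("timestamp") AS __time,
  "ID"
FROM TABLE(
  EXTERN(
    '{"type": "s3", "uris": ["s3://your-bucket/path/data.json"]}',
    '{"type": "json"}',
    '[{"name": "timestamp", "type": "string"}, {"name": "ID", "type": "string"}]'
  )
)
WHERE "ID" NOT IN ('1', '2')
PARTITIONED BY HOUR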
👍 1
g
Seconding the recommendation to upgrade and use SQL ingestion 🙂