Eyal Yurman
07/11/2025, 8:57 PM
kn3jox
07/15/2025, 9:31 AM
kn3jox
07/16/2025, 4:51 AM"spec": {
"ioConfig": {
"watcher": {
"type": "file",
"pollPeriod": "PT10M"
}
Ashi Bhardwaj
07/16/2025, 9:11 AM
9.37.2, which is not compatible with pac4j v4.
Doaa Deeb
07/17/2025, 11:06 PM
sezur work
07/21/2025, 5:13 PM
sezur work
07/21/2025, 5:14 PM
sezur work
07/21/2025, 5:14 PM
sezur work
07/21/2025, 5:15 PM
Eyal Yurman
07/24/2025, 10:13 PM
Julien Blondeau
08/01/2025, 6:31 AM
Aryan Mullick
08/01/2025, 9:31 AM
Hemanth Rao
08/04/2025, 9:31 AM
INNOCENT BOY
08/19/2025, 9:49 AM
Utkarsh Chaturvedi
08/19/2025, 10:17 AM
PARTITIONED BY granularity.
This, I figure, is because the date range is split between month-level segments and day-level segments. So I break the ingestion in two: before the month-level change and after it. I then run an ingestion for July 25 - July 31. This works, but only with DAY granularity, so I am uncertain whether the earlier ingestion was breaking because of the underlying segment granularity.
3. The ingestion for July 25 - July 31 now creates 7 day-level segments, but they are not getting compacted. Compaction reports 100% compacted except for the last 10 days, and it is not picking up these uncompacted segments. Shouldn't these segments be relevant for compaction?
If anybody who understands compaction well can help with this, it would be appreciated.
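For reference, the "compacted except for the most recent days" behaviour is typically governed by the auto-compaction setting skipOffsetFromLatest: intervals within that offset of the latest data are skipped on purpose. A minimal auto-compaction config sketch, submitted to the Coordinator via POST /druid/coordinator/v1/config/compaction; the datasource name, offset, and granularity below are placeholders, not values from this thread:

{
  "dataSource": "your_datasource",
  "skipOffsetFromLatest": "P10D",
  "granularitySpec": {
    "segmentGranularity": "DAY"
  }
}

Segments newer than the offset only become compaction candidates once they age past it, so day-level segments inside that window will not show up as needing compaction yet.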
Luke Foskey
08/21/2025, 3:52 AM
sarthak
08/21/2025, 6:23 AM
# Metadata storage configuration
druid.extensions.loadList=["postgresql-metadata-storage"]
druid.metadata.storage.type=postgresql
druid.metadata.storage.connector.connectURI=jdbc:postgresql://your-db-host:5432/your-db-name
druid.metadata.storage.connector.user=your-db-user
druid.metadata.storage.connector.password=your-db-password
# SSL-specific configuration
druid.metadata.postgres.ssl.useSSL=true
druid.metadata.postgres.ssl.sslMode=verify-full
druid.metadata.postgres.ssl.sslRootCert=/path/to/ca-cert.crt
It's not working; the errors are in the screenshots.
If I connect to Postgres with SSL disabled (all SSL-related configs commented out), then Druid runs fine.
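One thing that may help narrow this down: the PostgreSQL JDBC driver also accepts SSL options directly on the connection URI, independent of the extension's druid.metadata.postgres.ssl.* properties. A sketch with placeholder host, database, and certificate path (assuming the CA file is readable by every Druid process):

druid.metadata.storage.connector.connectURI=jdbc:postgresql://your-db-host:5432/your-db-name?ssl=true&sslmode=verify-full&sslrootcert=/path/to/ca-cert.crt

If this form connects while the ssl.* properties do not, the problem is likely in how the extension properties are being picked up rather than in the certificate itself.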
Lionel Mena
08/21/2025, 2:44 PM
Alex Niremov
08/22/2025, 7:08 AM
Siva praneeth Alli
08/24/2025, 12:57 AM
Tanay Maheshwari
08/24/2025, 6:38 AM
Sean Fulton
08/25/2025, 8:12 PM
Sean Fulton
08/25/2025, 8:12 PM
Shivam Choudhary
08/27/2025, 1:00 AM
Cristi Aldulea
08/28/2025, 4:39 AM
Suraj Goel
09/03/2025, 3:34 PM
transformSpec filtering operates after row deserialization during ingestion, so the feature described above can help save a lot of computation.
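For context, this is roughly what a transformSpec filter looks like inside the dataSchema of an ingestion spec; the dimension and value are made-up placeholders. Every row is still read and deserialized before this filter runs, which is the cost the feature above would avoid:

"transformSpec": {
  "filter": {
    "type": "selector",
    "dimension": "country",
    "value": "US"
  }
}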
Yotam Bagam
09/08/2025, 8:25 AM
Vivek M
09/08/2025, 10:12 AM
Overview
We are facing an issue while ingesting a large dataset from S3 into Apache Druid. The ingestion process fails during the segment building phase with a data length mismatch error.
Error Message
java.lang.IllegalStateException: java.io.IOException: com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=9404416; expectedLength=1020242891; includeSkipped=true; in.getClass()=class com.amazonaws.services.s3.AmazonS3Client$2; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0
at org.apache.commons.io.LineIterator.hasNext(LineIterator.java:108)
at org.apache.druid.data.input.TextReader$1.hasNext(TextReader.java:73)
at org.apache.druid.data.input.IntermediateRowParsingReader$1.hasNext(IntermediateRowParsingReader.java:60)
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIteratorIfNecessary(CloseableIterator.java:74)
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.next(CloseableIterator.java:108)
at org.apache.druid.java.util.common.parsers.CloseableIterator$1.next(CloseableIterator.java:52)
at org.apache.druid.indexing.common.task.FilteringCloseableInputRowIterator.hasNext(FilteringCloseableInputRowIterator.java:68)
at org.apache.druid.data.input.HandlingInputRowIterator.hasNext(HandlingInputRowIterator.java:63)
at org.apache.druid.indexing.common.task.InputSourceProcessor.process(InputSourceProcessor.java:95)
at org.apache.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:891)
at org.apache.druid.indexing.common.task.IndexTask.runTask(IndexTask.java:500)
Context
• The error occurs while ingesting a large JSON file from S3.
• Data read has a length of 9,404,416 bytes, while the expected length is 1,020,242,891 bytes.
• The error happens in the BUILD_SEGMENTS phase.
• The same large dataset is ingested successfully from our local Druid setup without any issues.
• Other datasets of smaller sizes are being ingested successfully.
Questions / Request for Support
We are looking for guidance and support on the following points:
1. Is this a known issue when ingesting large files from S3 into Druid?
2. Are there recommended configurations or best practices to handle such issues?
3. Should we consider splitting files, adjusting timeouts, or configuring retries to better handle large file ingestion? (A file-splitting sketch follows at the end of this message.)
4. Are there troubleshooting steps, patches, or workarounds that can help resolve this problem?
Additional Information
• Druid version, ingestion spec, and sample files can be provided upon request.
• We are happy to share more logs and configuration details as needed.
Thank you for your support!
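On question 3: the stack trace above comes from the plain IndexTask, and neither task type splits a single JSON file internally, so one common mitigation is to break the export into multiple smaller S3 objects under a prefix and ingest them with the parallel task. A minimal sketch; the bucket, prefix, and subtask count are placeholders, not values from this report:

"ioConfig": {
  "type": "index_parallel",
  "inputSource": {
    "type": "s3",
    "prefixes": ["s3://your-bucket/exports/split/"]
  },
  "inputFormat": {"type": "json"}
},
"tuningConfig": {
  "type": "index_parallel",
  "maxNumConcurrentSubTasks": 4
}

Smaller objects also shorten each individual S3 read, so a dropped or truncated stream (the dataLength vs. expectedLength mismatch above) becomes both less likely and cheaper to retry.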
Suraj Goel
09/13/2025, 2:07 PM