Chandu
06/21/2023, 5:30 AM
Ashok Kumar Ragupathi
06/21/2023, 7:25 AM
Ashok Kumar Ragupathi
06/21/2023, 7:26 AM
Ashok Kumar Ragupathi
06/21/2023, 7:26 AM
Ashok Kumar Ragupathi
06/21/2023, 7:27 AM
Ashok Kumar Ragupathi
06/21/2023, 7:27 AM
Ashok Kumar Ragupathi
06/21/2023, 7:28 AM
Jiaojiao Fu
06/21/2023, 7:53 AM
Dejan Zegarac
06/21/2023, 8:56 AM
Srinivas Narava
06/21/2023, 5:04 PM
dropBeforeByPeriod(P580D), loadByPeriod(P2Y+future)
loadByPeriod(P70D+future), dropBeforeByPeriod(P90D)
loadByPeriod(P2Y+future)
dropBeforeByPeriod(P580D), loadForever
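The shorthand in the rule chains above maps onto Druid's retention rule objects, which are evaluated top to bottom with the first matching rule applied to a segment. As a rough sketch only (the helpers `parse_rule` and `parse_chain` are made up for illustration, and the replica count of 2 on `_default_tier` is an assumption, since the thread gives no tiers or replica counts), the first chain could be expanded into rule JSON like this:

```python
import json
import re

# Assumed replication setting; the thread does not state tiers or replica counts.
ASSUMED_REPLICANTS = {"_default_tier": 2}

def parse_rule(shorthand):
    """Expand shorthand such as 'loadByPeriod(P2Y+future)' into a Druid rule dict."""
    match = re.fullmatch(r"(\w+)(?:\(([^)+]+)(\+future)?\))?", shorthand.strip())
    if match is None:
        raise ValueError(f"unrecognised rule: {shorthand!r}")
    rule_type, period, future = match.groups()
    rule = {"type": rule_type}
    if period is not None:
        rule["period"] = period
    if rule_type == "loadByPeriod":
        # '+future' in the shorthand corresponds to includeFuture=true.
        rule["includeFuture"] = future is not None
    if rule_type.startswith("load"):
        rule["tieredReplicants"] = dict(ASSUMED_REPLICANTS)
    return rule

def parse_chain(line):
    """Parse one comma-separated chain, preserving rule order."""
    return [parse_rule(part) for part in line.split(", ")]

rules = parse_chain("dropBeforeByPeriod(P580D), loadByPeriod(P2Y+future)")
print(json.dumps(rules, indent=2))
```

Because order matters, swapping the two rules in a chain can change which segments are kept, so the list order is preserved as written.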
Zac Harvey
06/21/2023, 7:20 PM
Viraj Raul
06/22/2023, 7:26 AM
Gururaj K.P
06/22/2023, 9:19 AM
JRob
06/22/2023, 9:29 PM
Vinod Pillai
06/23/2023, 1:36 AM
Hareesh Joshi
06/23/2023, 4:14 AM
윤혁준
06/23/2023, 5:43 AM
Ashok Kumar Ragupathi
06/23/2023, 7:22 AM
Ashok Kumar Ragupathi
06/23/2023, 7:22 AM
Ashok Kumar Ragupathi
06/23/2023, 7:23 AM
윤혁준
06/24/2023, 6:47 AM
sj Gao
06/25/2023, 6:59 AM
Dejan Zegarac
06/26/2023, 10:46 AM
I have 1 master server, 2 data servers (each running a historical and a middleManager), and a query server running the router and broker.
When replying to my questions, please keep my storage layout in mind. /deep/segments and /deep/indexing-logs are on a shared network partition and are accessible from every node. /druid/tasks, /druid/segment-cache and /druid/processing exist on every node but are node-specific (their data is not shared over the network). /druid/tasks contains indexing files such as index_kafka_session-events_de140c5bd9f15fa_jiaekmng. /deep/indexing-logs also contains indexing files, such as index_kafka_session-events_de140c5bd9f15fa_dokepdhc.report.json and index_kafka_session-events_de140c5bd9f15fa_dokepdhc.log.
Questions:
1. I get a new message from Kafka. Druid creates a new segment and writes the data to it. Where is that segment located at that point? In memory? On disk? In deep storage? In all 3 locations? For example, when I receive a new record, nothing appears in /deep/segments or /druid/segment-cache. Is that normal, and when do the files get written there? It is very important for me to understand where and when Druid stores data, on disk or in memory.
2. If 1 data node fails, what happens? I see 1 replica on each node, but when the failed node boots up again, how is data transferred to it? Where is the data stored, and where does the node pull it from? I'm guessing it pulls from /deep/indexing-logs if the segment is still active, and from /deep/segments if it is published and no longer active. Is that right?
3. If deep storage gets disconnected, what happens? Where is data stored in that case, and is it written back to deep storage once it becomes available again?
I hope someone can answer these questions. I need to understand exactly how Druid works and where it stores data in these specific cases in order to build a durable cluster.
Anant Sharma
06/26/2023, 2:05 PM
"maxRowsInMemory": 5000000 -> "maxRowsInMemory": 10000000
"maxBytesInMemory": 0 -> "maxBytesInMemory": 1073741824 (1 GiB)
"taskCount": 1 -> "taskCount": 3
"taskDuration": "PT604800S" -> "taskDuration": "PT3600S" (1 hour)
The error that I'm receiving:
{
  "dataSource": "eber_vehicles_gps",
  "stream": "eber.vehicle.gps.qc",
  "partitions": 3,
  "replicas": 1,
  "durationSeconds": 604800,
  "activeTasks": [
    {
      "id": "index_kafka_eber_vehicles_gps_918d828da3a87a0_loeichek",
      "startingOffsets": {
        "0": 267502
      },
      "startTime": "2023-06-26T13:38:44.469Z",
      "remainingSeconds": 603302,
      "type": "ACTIVE",
      "currentOffsets": {
        "0": 419986
      },
      "lag": {
        "0": 0
      }
    },
    {
      "id": "index_kafka_eber_vehicles_gps_bf5f718a043f069_mhpbdeep",
      "startingOffsets": {
        "1": 287146
      },
      "startTime": "2023-06-26T13:38:44.334Z",
      "remainingSeconds": 603302,
      "type": "ACTIVE",
      "currentOffsets": {
        "1": 420187
      },
      "lag": {
        "1": 0
      }
    }
  ],
  "publishingTasks": [],
  "latestOffsets": {
    "0": 419986,
    "1": 420187,
    "2": 421028
  },
  "minimumLag": {
    "0": 0,
    "1": 0,
    "2": 166
  },
  "aggregateLag": 166,
  "offsetsLastUpdated": "2023-06-26T14:03:16.362Z",
  "suspended": false,
  "healthy": false,
  "state": "UNHEALTHY_TASKS",
  "detailedState": "UNHEALTHY_TASKS",
  "recentErrors": []
}
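One detail visible in the payload above: it reports 3 partitions but only two active tasks, reading partitions 0 and 1, while partition 2 shows a lag of 166. A minimal sketch for spotting that from a status response follows; the helper name `unassigned_partitions` is made up here, and the payload is trimmed to the fields the check uses.

```python
import json

# Trimmed copy of the supervisor status payload quoted above.
status_json = """
{
  "partitions": 3,
  "activeTasks": [
    {"id": "index_kafka_eber_vehicles_gps_918d828da3a87a0_loeichek",
     "currentOffsets": {"0": 419986}},
    {"id": "index_kafka_eber_vehicles_gps_bf5f718a043f069_mhpbdeep",
     "currentOffsets": {"1": 420187}}
  ],
  "latestOffsets": {"0": 419986, "1": 420187, "2": 421028},
  "minimumLag": {"0": 0, "1": 0, "2": 166},
  "state": "UNHEALTHY_TASKS"
}
"""

def unassigned_partitions(status):
    """Return partitions listed in latestOffsets that no active task reports."""
    covered = set()
    for task in status["activeTasks"]:
        covered.update(task["currentOffsets"])
    return sorted(set(status["latestOffsets"]) - covered)

status = json.loads(status_json)
missing = unassigned_partitions(status)
print("partitions without an active task:", missing)
for partition in missing:
    print(f"partition {partition} lag: {status['minimumLag'][partition]}")
```

This only reports which partitions lack a reader in the snapshot; it does not explain why a third task is not running, which would need the task logs.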
jp
06/27/2023, 12:25 AM
Lokesh Mandhare
06/27/2023, 7:09 AM
Siddharth Gautam
06/28/2023, 7:44 AM
Julian Reyes
06/28/2023, 9:48 AM
druid.emitter.statsd.dimensionMapPath to make use of custom metrics. However, I was wondering whether anyone has managed to use it by uploading the JSON file to a k8s cluster via Helm and then setting dimensionMapPath to the location of the uploaded file.
Siddharth Gautam
06/28/2023, 12:30 PM
simon mikolajek
06/28/2023, 2:14 PM
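On Julian's dimensionMapPath question, one possible shape of the approach is to ship the JSON file as a Kubernetes ConfigMap and point the property at the mounted path. The sketch below is illustrative only: the ConfigMap name, file name, and mount directory are hypothetical placeholders, and how a given Helm chart lets you add the ConfigMap and mount it into the Druid pods is chart-specific and not covered here.

```python
import json

# Hypothetical names: none of these come from the thread or from a specific Helm chart.
CONFIGMAP_NAME = "druid-statsd-dimension-map"
FILE_NAME = "statsd-dimension-map.json"
MOUNT_DIR = "/opt/druid/conf/statsd"

# Example entry: metric name -> dimensions to attach and the StatsD metric type.
dimension_map = {
    "query/time": {"dimensions": ["dataSource", "type"], "type": "timer"},
}

def build_configmap(name, file_name, mapping):
    """Build a Kubernetes ConfigMap manifest that carries the JSON file as one key."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {file_name: json.dumps(mapping, indent=2)},
    }

manifest = build_configmap(CONFIGMAP_NAME, FILE_NAME, dimension_map)

# The runtime property then points at wherever the ConfigMap is mounted in the pod.
runtime_property = f"druid.emitter.statsd.dimensionMapPath={MOUNT_DIR}/{FILE_NAME}"

print(json.dumps(manifest, indent=2))
print(runtime_property)
```

The ConfigMap still has to be mounted as a volume at the assumed directory for the path in the property to resolve inside the container.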