# troubleshooting
r
Hi folks, I am seeing unexpected behaviour while testing the MergeRollup task. I see that the table originally had 3 segments, and after the MergeRollup task ran, it now has 5. The query results before and after the merge are correct, but I was expecting the segment count to go down automatically. Is that the right assumption? I see the following exception in the minion logs:
Caught exception while executing task: Task_MergeRollupTask_3735cc4d-aea0-4bb3-915d-aaeee735b4f1_1666896120043_0
org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 500 (Internal Server Error) with reason: "Any segments from 'segmentsTo' should not be available in the table at this point. (tableName = 'videocollection_OFFLINE', segmentsFrom = '[merged_1d_1666896060010_0_videocollection_2022-09-19_2022-09-19_0]', segmentsTo = '[merged_1d_1666896120028_0_videocollection_2022-09-19_2022-09-19_0]', segmentsFromTable = '[merged_1d_1666896120028_0_videocollection_2022-09-19_2022-09-19_0, merged_1d_1666896060010_0_videocollection_2022-09-19_2022-09-19_0]')" while sending request: http://pinot-controller-0.pinot-controller-headless.pinot-quickstart.svc.cluster.local:9000/segments/videocollection/startReplaceSegments?type=OFFLINE&forceCleanup=true to controller: pinot-controller-0.pinot-controller-headless.pinot-quickstart.svc.cluster.local, version: Unknown
  at org.apache.pinot.common.utils.http.HttpClient.wrapAndThrowHttpException(HttpClient.java:442) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.common.utils.FileUploadDownloadClient.startReplaceSegments(FileUploadDownloadClient.java:945) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.plugin.minion.tasks.SegmentConversionUtils.startSegmentReplace(SegmentConversionUtils.java:144) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.plugin.minion.tasks.SegmentConversionUtils.startSegmentReplace(SegmentConversionUtils.java:130) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.plugin.minion.tasks.BaseMultipleSegmentsConversionExecutor.preUploadSegments(BaseMultipleSegmentsConversionExecutor.java:117) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.plugin.minion.tasks.BaseMultipleSegmentsConversionExecutor.executeTask(BaseMultipleSegmentsConversionExecutor.java:228) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.plugin.minion.tasks.BaseMultipleSegmentsConversionExecutor.executeTask(BaseMultipleSegmentsConversionExecutor.java:65) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.minion.taskfactory.TaskFactoryRegistry$1.runInternal(TaskFactoryRegistry.java:121) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.pinot.minion.taskfactory.TaskFactoryRegistry$1.run(TaskFactoryRegistry.java:95) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at org.apache.helix.task.TaskRunner.run(TaskRunner.java:75) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-6ffb9718e40dcc65a50b1c8854904a04a0f241b8]
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
  at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  at java.lang.Thread.run(Thread.java:829) [?:?]
m
@Seunghyun @Haitao Zhang
h
ack
s
@reallyonthemove tous Can you provide the mergeRollupTask config?
also can you run GET /segments/<tableName>?excludeReplacedSegments=true from the swagger UI?
1. MergeRollupTask tries to align data on time boundaries. If your input segments' data spans multiple days, there are cases where the number of segments can actually increase.
2. If your maxNumRecordsPerSegment config is too low, we can also end up with more segments (a sketch of where that config lives follows below).
3. We delete the original segments asynchronously. To see the active segments, check the list of segments with excludeReplacedSegments=true. The segments that appear with that filter are the ones that remain; the rest will get cleaned up eventually.
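For reference, the record-count cap is set per merge level in the MergeRollupTask section of the table config. A minimal sketch with purely illustrative values (the "1d." prefix is the merge level):
"task": {
  "taskTypeConfigsMap": {
    "MergeRollupTask": {
      "1d.bucketTimePeriod": "1d",
      "1d.bufferTimePeriod": "1d",
      "1d.maxNumRecordsPerSegment": "1000000"
    }
  }
}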
r
>>Can you provide the mergeRollupTask config?
"task": { "taskTypeConfigsMap": { "MergeRollupTask": { "1d.bucketTimePeriod": "1d", "1d.bufferTimePeriod": "1d", "schedule": "0 * * * * ?" } } },
>>can you run GET /segments/<tableName>?excludeReplacedSegments=true from the swagger UI?
[
  {
    "OFFLINE": [
      "merged_1d_1666896120028_0_videocollection_2022-09-19_2022-09-19_0",
      "merged_1d_1666896060010_0_videocollection_2022-09-19_2022-09-19_0"
    ]
  }
]
Each original segment contained just 1 record. I did not explicitly set maxNumRecordsPerSegment, but given the low number of records I am assuming it should not be an issue.
>The segments that appear with that filter are the ones that remain; the rest will get cleaned up eventually
The output seems to indicate that only the merged segments are valid, so should I assume the exception is benign? When should I expect the invalid segments to be cleaned up?
s
it should be cleaned up in the next round of the schedule
r
how does pinot determine the target segment count? in this case i just had 3 records across 3 segments so i expected it to be able to merge down to a single segment. But I see two merged segments..
h
> it should be cleaned up in the next round of the schedule
To correct it a little bit: the replaced segments will be kept for at least 24h, and then they will be deleted by the next run of the retention manager, which is a periodic task.
this 24h is hard-coded for now
@reallyonthemove tous can you go to the swagger UI and GET /tasks/{taskType}/{tableNameWithType}/debug with verbosity = 1? It will return all the minion task details. The exception you posted here shows that the minion did some duplicated work; I want more details to help you debug it.
It will also help us understand why the original segments were merged into 2 segments
r
see attached
response_1666906881625.json
👆 this has the exception log
h
what is the current response of GET /segments/<tableName>?excludeReplacedSegments=true and GET /segments/<tableName>?excludeReplacedSegments=false?
r
TRUE:
[
  {
    "OFFLINE": [
      "merged_1d_1666906920054_0_videocollection_2022-09-19_2022-09-19_0",
      "merged_1d_1666906920055_0_videocollection_2022-09-19_2022-09-19_0"
    ]
  }
]

FALSE:

[
  {
    "OFFLINE": [
      "home_ankshah_Oakgrove_pinotclustersetup_scripts____inputdata_objects_object1_json",
      "home_ankshah_Oakgrove_pinotclustersetup_scripts____inputdata_objects_object2_json",
      "home_ankshah_Oakgrove_pinotclustersetup_scripts____inputdata_objects_object3_json",
      "merged_1d_1666906860069_0_videocollection_2022-09-19_2022-09-19_0",
      "merged_1d_1666906920054_0_videocollection_2022-09-19_2022-09-19_0",
      "merged_1d_1666906920055_0_videocollection_2022-09-19_2022-09-19_0"
    ]
  }
]
h
and can you also GET /tasks/{taskType}/{tableNameWithType}/metadata?
r
{
  "id": "videocollection_OFFLINE",
  "simpleFields": {
    "watermarkMs_1d": "1663545600000"
  },
  "mapFields": {},
  "listFields": {}
}
h
just to double check: your table has a time column, right?
r
yes, table config -->
"segmentsConfig": {
  "timeColumnName": "created_at_timestamp",
  "segmentPushType": "APPEND",
  "schemaName": "videocollection",
  "replication": "1"
},
schema -->
"dateTimeFieldSpecs": [
  {
    "name": "created_at",
    "dataType": "STRING",
    "format": "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HHmmssZ",
    "granularity": "1:HOURS"
  },
  {
    "name": "created_at_timestamp",
    "dataType": "STRING",
    "format": "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HHmmssZ",
    "granularity": "1:HOURS"
  }
]
and the transform config for that column -->
{
  "columnName": "created_at_timestamp",
  "transformFunction": "toDateTime(_ts, 'yyyy-MM-dd''T''HHmmssZ')"
}
h
thanks! this task behaves unexpectedly, let me check and see if this is a bug
r
thanks Haitao, jfyi, i have been deleting the table and recreating it with the same name. not sure if that can cause an issue with this task.
h
ah, I see, this is the problem here: because you delete and recreate the table again and again, the same mergeRollup is performed again and again
if you create a table and don't delete its data, it should work as expected
you can play with a new table and check the segments; do let us know if anything is unexpected, and we will help explain the logic and fix bugs if there are any
r
unfortunately, deleting the table and recreating it with the same name could happen in the scenario i am testing. The problem i see here is that the segments belonging to the new table are not cleaned up after the merge. currently the table has 6 segments, for example. should i expect them to be cleaned up after 24 hours? if so then we are good
h
they should be cleaned up
r
also, could you explain why it's left with 2 active segments after the merge?
h
> also, could you explain why it's left with 2 segments?
I need to check the logic
👍 1
the bigger concern to me is the exception, it might cause problems if you keep deleting tables and recreating them
r
hm, is table delete leaking some state behind which is causing the problem?
h
I haven't figured out the root cause
is the table partitioned?
r
yes
"segmentPartitionConfig": { "columnPartitionMap": { "content.name": { "functionName": "Murmur", "numPartitions": 2 } } }
h
okay, that explains why it results in two segments
because segments belonging to different partitions cannot be merged, with numPartitions = 2 you get at most one merged segment per partition per time bucket, i.e. two here
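just to illustrate the relationship (not a recommendation, since the partition config also drives segment pruning for queries): with a single partition the whole bucket could merge into one segment, e.g.
"segmentPartitionConfig": {
  "columnPartitionMap": {
    "content.name": {
      "functionName": "Murmur",
      "numPartitions": 1
    }
  }
}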
r
ah.. makes sense
h
but there is indeed a bug we need to fix, which caused the exception
the good news is it will not cause real problems, the task will retry and fix itself
👍 1
r
i confirmed the older segments were cleaned up today.
👍 1
h
good to know it works as expected!