# troubleshooting
a
Hello everyone, I know it's a little technical question, but would changing the MAX_DOC_PER_CALL variable from 10000 to 100000 in the DocIdSetPlanNode class cause any problem you could foresee? I am trying to write a custom function that does smoothing on a numeric column in order to remove records that are unnecessary for me. I have realized that the record blocks my function can access are limited by the MAX_DOC_PER_CALL variable. I am asking because my smoothing function performs better with more data. My query almost always has a "limit 1000000" clause, and bandwidth is an important resource for me. I would appreciate it if you could share your thoughts with me šŸ™
k
Hi, do you mean you are not able to get more than 10K records in the result even with a higher limit? I am not sure running such a large scan query is a good idea. Also, are you using the APIs or one of our connectors to run the query?
Also, can you provide a sample query so that we can understand the use case better?
a
I am trying to implement a transform function that extends the BaseTransformFunction class and overrides the transformToIntValuesSV method, and I want to use this function in my WHERE clause. For example:
select colA, colB from myTable where messageTime >= 1651064199000 and messageTime <= 1651068260000 and myReductionFunction(colA, colB, messageTime) = '1' order by messageTime limit 1000000
transformToIntValuesSV takes a ProjectionBlock, which contains up to MAX_DOC_PER_CALL records per call. I just want to know whether it would cause any side effects if I changed MAX_DOC_PER_CALL from 10_000 to 100_000. Processing 100_000 records (more data) per call achieves better results for the function I want to implement.
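To make the block-size concern concrete, here is a minimal standalone sketch (not Pinot code; `smoothInBlocks`, its window size, and the data are illustrative assumptions) of a trailing moving average that is computed independently per block, the way a transform function only sees one ProjectionBlock of at most MAX_DOC_PER_CALL docs at a time. The smoothing window resets at every block boundary, so smaller blocks produce more boundary artifacts:

```java
import java.util.Arrays;

public class BlockSmoothing {
    // Trailing moving average computed independently per block: the window
    // cannot look back past the start of the current block, mirroring how a
    // transform function only sees MAX_DOC_PER_CALL docs per invocation.
    static double[] smoothInBlocks(double[] values, int blockSize, int window) {
        double[] out = new double[values.length];
        for (int start = 0; start < values.length; start += blockSize) {
            int end = Math.min(start + blockSize, values.length);
            for (int i = start; i < end; i++) {
                // Clip the window at the block start, not at index 0.
                int from = Math.max(start, i - window + 1);
                double sum = 0;
                for (int j = from; j <= i; j++) {
                    sum += values[j];
                }
                out[i] = sum / (i - from + 1);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] data = new double[20];
        Arrays.fill(data, 1.0);
        data[4] = 100.0; // a spike near a block boundary
        // With blockSize 5, the doc right after the boundary (index 5) cannot
        // see the spike; with one big block it can.
        System.out.println(Arrays.toString(smoothInBlocks(data, 5, 4)));
        System.out.println(Arrays.toString(smoothInBlocks(data, 20, 4)));
    }
}
```

This is why a larger block size changes the result of block-local smoothing: docs near a boundary lose part of their history. An alternative to raising MAX_DOC_PER_CALL would be carrying state across calls inside the function, at the cost of extra complexity.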
k
Yeah, that is not advisable IMO. It will lead to significantly lower performance for Pinot.