# troubleshooting

  • Diogo Baeder
    04/17/2022, 9:34 PM
    Hi guys, I'm having an issue running Pinot 0.10 locally: sometimes after I stop my local docker-compose services and start them again, my segments go missing and get marked as "bad". It doesn't happen every time; just now it took until the 3rd restart of the services for them to end up in a "bad" state. I compared the directories before and after and couldn't figure out what caused them to spoil. Any clues why this might be happening? More details in the thread here.

  • Saumya Upadhyay
    04/18/2022, 3:54 AM
    Hi All, what is the way to retain data in a realtime table with no time limit? I configured the tables with no retention time limit, but after 3-4 weeks I can see from the query console that totalDocs is very low. Which config property should we set if we don't want any data purged from the table?
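    For reference, retention on a Pinot table lives in segmentsConfig, via retentionTimeUnit and retentionTimeValue; when both are unset, the retention manager logs the "Invalid retention time: null null ... skip" message seen in the next question and leaves the table's segments alone, so vanishing docs usually point to something other than retention. A minimal sketch, with illustrative table name and values:
    Copy code
    {
      "tableName": "myTable",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "3650"
      }
    }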

  • Saumya Upadhyay
    04/18/2022, 6:35 AM
    Hi All, we are frequently facing an issue where data is lost while segments keep increasing. How does Pinot decide to create new segments? From the console I can see all segments are in a Good state. How can we identify why data is getting lost intermittently, which is very frequent now? Which logs should we look at: controller, server, or broker? We are getting this message in the controller logs: Invalid retention time: null null for table: table1_REALTIME, skip

  • KISHORE B R
    04/18/2022, 8:46 AM
    Hi everyone, I was trying to ingest JSON data through Kafka. One of the columns is an array of nested JSON, and I have marked it as the JSON data type in the schema. When I publish the data to the topic, I get this error on the Pinot server: "Caused by: java.lang.IllegalStateException: Cannot read single-value from Collection". A sample record:
    Copy code
    {
      "fieldToValueMap": {
        "Agent_phone_number": 2807536641,
        "Call_end_time": "2021-09-20 194141",
        "Calling_number": "4025165405",
        "Call_start_time": "2021-09-20 193819",
        "Account_number": "4T1QUDSKPI",
        "Customer_name": "Dan",
        "Queue": {
          "qdetails": [
            { "queue_duration": 229, "qname": "q2" },
            { "queue_duration": 90, "qname": "q3" }
          ]
        },
        "Agent_id": "K3GDP9"
      },
      "nullValueFields": []
    }
    Where am I going wrong? I have attached the schema and configuration files in the thread.
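    A common suggestion for "Cannot read single-value from Collection" is to stringify the nested collection at ingestion time, so the JSON column receives a string rather than a raw Collection. A sketch using Pinot's jsonFormat transform function; the destination column queue_json is hypothetical, and whether the decoder exposes the nested object simply as Queue depends on the attached schema and stream config:
    Copy code
    "ingestionConfig": {
      "transformConfigs": [
        { "columnName": "queue_json", "transformFunction": "jsonFormat(Queue)" }
      ]
    }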

  • Luis Fernandez
    04/18/2022, 8:09 PM
    hey friends, I'm seeing our ZooKeeper disk space almost reach its limit; we have 5GB of disk, and we currently have it set up with
    Copy code
    - name: ZK_PURGE_INTERVAL
      value: "1"
    - name: ZK_SNAP_RETAIN_COUNT
      value: "3"
    in the logs I can see these getting set:
    Copy code
    2022-04-15 16:14:35,914 [myid:1] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
    2022-04-15 16:14:35,915 [myid:1] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 1
    2022-04-15 16:14:35,979 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
    2022-04-15 16:14:35,980 [myid:1] - INFO  [main:ManagedUtil@46] - Log4j found with jmx enabled.
    2022-04-15 16:14:35,988 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
    however, I don't see any more cleanup logs after that; is there a reason for this? Also, I can see that the space is being chewed up by the logs. Does anyone know why things may not be getting cleaned up? Thank you, I'd appreciate your help.
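    For reference, those env vars map to ZooKeeper's autopurge settings, which only rotate snapshots and transaction logs under dataDir/dataLogDir, on an interval measured in hours; if the disk is actually being filled by application log files, autopurge will never touch them. The zoo.cfg equivalents, as a sketch:
    Copy code
    # keep the 3 most recent snapshots (and matching txn logs); run the purge task every hour
    autopurge.snapRetainCount=3
    autopurge.purgeInterval=1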

  • Prashant Pandey
    04/19/2022, 10:49 AM
    Hi team. I have a column with very high cardinality. It is defined as an MV column in my schema, and we also have an inverted index defined on it. We want to disable the dictionary on this column because of its high cardinality; the dictionary is almost 50% of the total segment size. However, Pinot does not let me disable it and errors out with the following message: "Cannot create an Inverted index on column tags__KEYS specified in the noDictionaryColumns config". Can I disable the dictionary on this column and still have an inverted index defined on it? Also, when should we use a forward index vs. an inverted index?
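    For reference, Pinot's inverted index maps dictionary IDs to document IDs, so it requires a dictionary-encoded column; that makes the two settings below mutually exclusive, which is what the error is pointing at (column name copied from the message above):
    Copy code
    "tableIndexConfig": {
      "noDictionaryColumns": ["tags__KEYS"],
      "invertedIndexColumns": ["tags__KEYS"]
    }
    Separately, every column always has a forward index (it stores the values themselves); the inverted index is an optional extra structure used to answer filters without scanning.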

  • Srini Kadamati
    04/19/2022, 5:13 PM
    hey y'all! @User and I were having some trouble getting Pinot connected to Superset and would love any help here. We tried both the URIs in the Superset docs (https://superset.apache.org/docs/databases/pinot) and the Pinot docs (https://docs.pinot.apache.org/integrations/superset) and neither seemed to work for us. I specifically tested with the Pinot quickstart docker image and kept getting generic SQLAlchemy errors, which makes it hard to debug! Hugh's trying to test this PR (https://github.com/apache/superset/pull/19724) in the Superset repo, which improves the Pinot capabilities in Superset ⚡ cc @User @User @User
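    For reference, the SQLAlchemy URI shape the linked docs describe, with placeholder hosts and ports; the /query/sql path assumes a reasonably recent pinotdb driver:
    Copy code
    pinot+http://<broker-host>:8099/query/sql?controller=http://<controller-host>:9000/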

  • Yeongju Kang
    04/20/2022, 1:08 PM
    Hello folks, I have some questions about the memory consumption of server instances. • How much RAM will be taken (roughly) by a segment if I ingest a 687MB CSV, which generates a 109MB gzipped segment tar file? • I have 5 server nodes with 4272/3884/4438/3493/3661 segments each, and they take 27/8/6/3/18 GiB of RAM respectively. What makes them differ from each other? Some kind of raw-data cache? Thanks in advance.

  • Diogo Baeder
    04/20/2022, 2:57 PM
    Hey guys, sorry to ask something about Trino here, but it's related to Pinot and unfortunately I couldn't get help from anyone in their community. I'm doing an experiment with Trino and Pinot, and I noticed that a query I run against different tables in Pinot ends up with Trino querying all of the existing segments in Pinot, completely bypassing any partitioning I defined for my tables. Is this expected? Has somebody here experienced this as well and solved it? I'm thinking about using dynamic tables in Trino to work around the issue, but it just feels dirty to have to do that...

  • Abhijeet Kushe
    04/20/2022, 7:24 PM
    I am using Pinot 0.9.1 and wanted to know how ORDER BY works. The query below returns a response:
    Copy code
    select taskName, taskResult, distinctcount(personId)  from events where accountId = 1100609261882 AND workId = '40d9652e-c543-4bd2-aa4d-a11c7b23a6df' group by taskName, taskResult order by mode(createdOn) asc limit 10000
    but this one throws an exception:
    Copy code
    select taskName, taskResult, distinctcount(personId)  from events where accountId = 1100609261882 AND workId = '40d9652e-c543-4bd2-aa4d-a11c7b23a6df' group by taskName, taskResult order by createdOn asc limit 10000
    Copy code
    [
      {
        "message": "QueryExecutionError:\nProcessingException(errorCode:450, message:InternalError:\njava.lang.NullPointerException\n\tat org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.mergeResults(GroupByOrderByCombineOperator.java:230)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:120)\n\tat org.apache.pinot.core.operator.combine.BaseCombineOperator.getNextBlock(BaseCombineOperator.java:50)",
        "errorCode": 200
      }
    ]
    So ordering by mode(createdOn) asc instead of the bare column is the difference that makes it work. Is this a bug?
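    For reference, in a GROUP BY query the ORDER BY has to reference group-by expressions or aggregates; createdOn is neither, which is why only the aggregated form works (the NullPointerException instead of a clear error message is the arguably buggy part). A sketch of the usual rewrite, with max() standing in for whatever ordering is intended:
    Copy code
    select taskName, taskResult, distinctcount(personId)
    from events
    where accountId = 1100609261882 AND workId = '40d9652e-c543-4bd2-aa4d-a11c7b23a6df'
    group by taskName, taskResult
    order by max(createdOn) asc
    limit 10000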

  • kaushal aggarwal
    04/21/2022, 10:27 AM
    [ { "message": "UnknownColumnError\norg.apache.pinot.spi.exception.BadQueryRequestException Unknown columnName 'Geanixx' found in the query\n\tat org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.getActualColumnName(BaseBrokerRequestHandler.java:1762)\n\tat org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.fixColumnName(BaseBrokerRequestHandler.java:1696)\n\tat org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.fixColumnName(BaseBrokerRequestHandler.java:1717)\n\tat org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.fixColumnName(BaseBrokerRequestHandler.java:1717)", "errorCode": 710 } ]

  • Grace Lu
    04/21/2022, 3:52 PM
    Hi team, I am seeing inaccurate results from a Pinot aggregation query 👀. For a table with 7B records, I run a group-by on uuid to see how many records each uuid has; most uuids should have around 100 records. But when I run something like
    select uuid, count(*) from table group by 1
    I get a very inaccurate aggregation result: for example, uuid 'a' will show only 3 records in count(*), but if I query specifically for this uuid, like select uuid, count(*) from table where uuid='a' group by 1, it shows the correct result, which is 100. Can someone help me understand what is going on here? 🙏
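    For reference, Pinot caps the number of group-by groups it tracks (the server-side default is controlled by pinot.server.query.executor.num.groups.limit), and when a high-cardinality group-by crosses that cap, a given uuid can be kept in some segments' intermediate results and trimmed in others, yielding partial counts. A sketch of the usual remedy, assuming the numGroupsLimit query option is available in your version:
    Copy code
    select uuid, count(*) from table group by 1
    option(numGroupsLimit=10000000)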

  • Xiang Fu
    04/21/2022, 7:19 PM
    What's the JDBC connection string when the controller and brokers are using HTTPS? @User

  • Saumya Upadhyay
    04/22/2022, 7:54 AM
    Is it normal for segments to go OFFLINE when the server is not even that busy? We're facing this issue in almost every dev and QA environment, and data is getting lost because of it: only 2 or 3 segments are ONLINE, and sometimes data stops arriving in Pinot altogether. Checking the logs for errors, if any.

  • Luis Fernandez
    04/22/2022, 3:56 PM
    hey friends, we are using Loki to ingest Pinot logs, but with the default setup we get from the helm chart the log search capabilities are not great. Do you have a log4j configuration that works better for log lookups in tools like Loki or Google's log search?

  • Diogo Baeder
    04/22/2022, 9:28 PM
    Hey folks, does anybody know how to forcefully push Pinot queries down to the broker when using Trino? I really need a solution for multi-table queries and have been trying to use Trino for that, but I'm absolutely disappointed at how hard it's been to make this work: if I just run normal queries in Trino, it tries to query ALL of the segments, completely bypassing any partitioning (including on the time column), which is insane.
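    For reference, the Trino Pinot connector's documented escape hatch for this is a dynamic table: the entire double-quoted "table name" is sent to the Pinot broker as a single query, so filtering and aggregation happen inside Pinot. A sketch with placeholder catalog, table, and column names:
    Copy code
    SELECT *
    FROM pinot.default."SELECT city, COUNT(*) FROM my_table WHERE ts_millis > 1650000000000 GROUP BY city LIMIT 1000"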

  • Diogo Baeder
    04/22/2022, 9:29 PM
    I'm close to dropping Trino and trying PrestoDB instead. If that doesn't work either, I'll have to brute-force this thing and implement the joins myself.

  • Diogo Baeder
    04/24/2022, 2:56 PM
    Quick question: is it possible to run a query on a certain table, but with an IN_SUBQUERY where the subquery is run against another table?
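    For reference, a sketch of how Pinot's IN_SUBQUERY is documented to work: the quoted inner query runs first as its own broker query (so it can target a different table) and must produce an ID_SET; all table and column names below are placeholders:
    Copy code
    SELECT col1, COUNT(*)
    FROM tableA
    WHERE IN_SUBQUERY(userId, 'SELECT ID_SET(userId) FROM tableB WHERE country = ''US''') = 1
    GROUP BY col1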

  • Diogo Baeder
    04/25/2022, 12:25 AM
    Me again. New issue, and I'm not sure if it's me or a genuine bug: I'm trying to ingest JSON data for one of my columns, but I keep getting an error for that column: "Cannot read single-value from Collection". More on this thread.

  • yelim yu
    04/25/2022, 1:32 AM
    Hi, does schema evolution only work for batch tables? We wanted to add a new column to a hybrid table (offline table + realtime streaming table) which includes upsert columns. When we added the new column to the schema config, we also needed to change the table config, since the new column should have been overwritten as well. That meant we had to delete the original realtime table and re-create it. Is that the right way to add new columns to a streaming upsert table? The example here only shows a batch table.
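    For reference, the flow the linked example describes for batch tables is: add the column to the schema, then trigger a segment reload through the controller API; whether the realtime upsert side can pick this up without recreating the table is exactly the open question here. A sketch with placeholder host and table names:
    Copy code
    curl -X POST "http://<controller-host>:9000/segments/<tableName>_REALTIME/reload"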

  • Lars-Kristian Svenøy
    04/25/2022, 11:00 AM
    Hello everyone 👋 I'm seeing a problem where Pinot is not able to ingest a JSON object; it just shows up as null in the table... Will post more details in the thread.

  • Harish Bohara
    04/25/2022, 11:48 AM
    Any idea why the Pinot server's JVM usage grows as the record count grows? It does grow slowly, but this way it will eventually hit high heap usage. (I have 5-6 inverted and sorted indexes in my table, on very low cardinality columns.) • using off-heap and MMAP for segments in my setup • have ~100-150 segments • 500-600M rows, continuing to grow by over 150M per day • 6 server nodes; 8GB is given to the JVM and the rest is available off-heap. What I expected: usage grows, and as segments go to disk it comes back down to the lower bound, with this cycle continuing and the lower bound of memory remaining constant. Am I missing some setting?

  • Saumya Upadhyay
    04/26/2022, 9:31 AM
    I am setting up Prometheus to monitor Pinot, and some metrics are not coming up. I'm guessing it might be an issue with the scrape configs; please tell me where this config needs to be added, and in which file:
    Copy code
    controller:
      ...
      service:
        annotations:
          "prometheus.io/scrape": "true"
          "prometheus.io/port": "8008"
      ...
      podAnnotations:
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "8008"

  • Lars-Kristian Svenøy
    04/26/2022, 2:29 PM
    Hello everyone 👋 For upsert tables it's specified that you need to partition your input stream (Kafka) by the primary key. How is this achieved when working with composite keys? I'm also curious how this works together with the column partition map; I thought it had to correspond to the partitioning in Kafka?
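    For reference, a composite key is declared as multiple entries in the schema's primaryKeyColumns; the producer then has to route on the same combination (for example, a concatenation of both values as the Kafka message key) so that all records sharing a key land in one partition. A sketch with illustrative names:
    Copy code
    {
      "schemaName": "orders",
      "primaryKeyColumns": ["customerId", "orderId"]
    }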

  • Ali Atıl
    04/27/2022, 7:23 AM
    Hello everyone, I know it's a rather low-level question, but would changing the MAX_DOC_PER_CALL variable in the DocIdSetPlanNode class from 10000 to 100000 cause any problem you could foresee? I am trying to write a custom function which does smoothing on a numeric column in order to remove records that are unnecessary for me, and I realized the record blocks I can access are limited by MAX_DOC_PER_CALL. I am asking because my smoothing function performs better with more data, my queries almost always have "limit 1000000", and bandwidth is an important resource for me. I would appreciate it if you could share your thoughts 🙏

  • Rohit Sivakumar
    04/27/2022, 1:55 PM
    Hello Pinot team, I'm learning to work with Pinot and have hit a couple of edge cases that I couldn't find answers to in the docs. I'll post them here as two separate threads. There's some weirdness around using _id as a column name. I'm trying to ingest data into Pinot from an OLTP data store, and I wanted the primary key to be a column named "_id". During ingestion, I found that our 32-digit hexadecimal string is converted into a much longer string if the column is named "_id"; renaming the column to "id" works just fine. Is _id a reserved name in Pinot? Will attach screenshots with both _id and id as column names in this thread.

  • Pavel Stejskal
    04/27/2022, 3:18 PM
    Hello! Do you have any working tutorial for Spark batch loading with the latest version of Pinot, i.e. after the migration of the jars to plugins-external? I cannot make it work at all.
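    Not a verified recipe, but for orientation: the pattern the docs have described launches the standalone ingestion job class via spark-submit and points plugins.dir at the unpacked plugin directory. Every path, version, and the job-spec file below is a placeholder:
    Copy code
    spark-submit \
      --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
      --master "local[2]" \
      --conf "spark.driver.extraJavaOptions=-Dplugins.dir=/opt/pinot/plugins" \
      /opt/pinot/lib/pinot-all-<version>-jar-with-dependencies.jar \
      -jobSpecFile /path/to/sparkIngestionJobSpec.yaml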

  • Lars-Kristian Svenøy
    04/28/2022, 8:49 AM
    Hello everyone. Thanks again for all your help so far. I have a question regarding upserts and how to deal with deduping in a certain scenario... details in thread.

  • troywinter
    04/28/2022, 5:20 PM
    @here I have some realtime tables whose consuming segments are in an error state after an upgrade and restart; resetting the segments and restarting did not fix the problem. Any suggestions?