Apache Pinot #troubleshooting

Elon

03/16/2022, 5:18 PM

I can't see the full schema, is it an mv column?

Facundo Bianco

03/16/2022, 8:18 PM

Hi all 👋, I'm trying to configure a date format like "2020-12-31T19:59:21.522-0400" and created table-schema.json as follows:

"dateTimeFieldSpecs": [{
    "name": "timestampCustom",
    "dataType": "STRING",
    "format" : "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSZZ",
    "granularity": "1:MILLISECONDS"
  }]

The table is created successfully, but the POST command returns:

{
  "code": 500,
  "error": "Caught exception when ingesting file into table: foo_OFFLINE. null"
}

I discovered it is related to the date format; could you kindly indicate how it should be? I used this site to generate the custom format. Thanks in advance!
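
A quick way to sanity-check which fields the pattern has to cover is to parse the sample value outside Pinot. The sketch below is only an illustration in Python, assuming the value carries colons in its time part (as the schema's pattern implies), three fractional digits, and an offset without a colon. Python's `%` directives are not the pattern letters used in the schema, and `to_epoch_millis` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumed form of the sample value from the message above.
SAMPLE = "2020-12-31T19:59:21.522-0400"

# Python directives, NOT the pattern letters used in the Pinot schema:
# %f reads the fractional seconds, %z reads an offset such as -0400.
PY_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def to_epoch_millis(text, fmt=PY_FORMAT):
    """Parse an offset-aware timestamp and return milliseconds since the epoch."""
    parsed = datetime.strptime(text, fmt)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer division of two timedeltas avoids floating-point rounding.
    return (parsed - epoch) // timedelta(milliseconds=1)


print(to_epoch_millis(SAMPLE))  # 1609459161522
```

If the value parses here with three fractional digits, a pattern that declares only two (`.SS`) is one place worth checking first.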

Grace Lu

03/16/2022, 11:26 PM

Hi team, we ran into lots of issues when setting up a Spark ingestion job with YARN. The latest issue is that the application master reported the following error after the job was submitted to the cluster, and no resources could be assigned to the job:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(Lorg/apache/hadoop/yarn/api/records/ExecutionType;)Lorg/apache/hadoop/yarn/proto/YarnProtos$ExecutionTypeProto;
	at org.apache.hadoop.yarn.api.records.impl.pb.ExecutionTypeRequestPBImpl.setExecutionType(ExecutionTypeRequestPBImpl.java:73)

We wonder whether Pinot also brings this class in through its dependencies and whether it conflicts with the library in our Hadoop cluster. We are on Spark 2.4.6, Hadoop 2.9.1, and Pinot 0.9.2, and it seems Pinot 0.9.2 is built with Hadoop 2.7.0 and Spark 2.4.0. Has a compatible Spark/Hadoop version been tested for running ingestion jobs?

Jonathan Meyer

03/17/2022, 4:18 PM

Hello Pinot community, long time no see 🙂 Is there any way for `SUM` to not return 0 when there are actually no values to aggregate, i.e. return `null` in such a case?

Tony Requist

03/17/2022, 7:22 PM

I have a realtime table with:

"realtime.segment.flush.threshold.rows": "10000000",
"realtime.segment.flush.threshold.time": "6h",
"realtime.segment.flush.threshold.segment.size": "400M",

I changed these values two days ago; previously the "rows" limit was 0. Pinot is generating segments with 3,333,333 rows every ~90 minutes, at 95-100MB each, which is significantly below any of the limits. Server logs show

Starting consumption on realtime consuming segment ... maxRowCount 33333

and

Stopping consumption due to row limit nRows=3333333

I am trying to figure out where that limit is coming from.
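
One possible reading of the number, offered as an assumption rather than a confirmed diagnosis: if the configured row threshold is split evenly across the consuming partitions hosted on each server, then three partitions per server would give exactly the observed limit. The partition count of 3 and the helper name below are assumptions, not values taken from the message.

```python
def per_partition_row_limit(table_row_threshold: int, partitions_per_server: int) -> int:
    """Split a table-level row threshold evenly across consuming partitions."""
    return table_row_threshold // partitions_per_server


# 10,000,000 is the configured flush.threshold.rows from the message;
# 3 consuming partitions per server is an assumed value.
limit = per_partition_row_limit(10_000_000, 3)
print(limit)  # 3333333
```

Under the same assumption, a non-zero row threshold would take effect before the 6h and 400M limits are reached, which would match segments being cut at 95-100MB.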

Luis Fernandez

03/17/2022, 9:13 PM

Hey friends, I want to run something by you. I have been doing some chaos exercises in Pinot to see how it reacts. This is my current scenario:

Chaos exercise in Pinot:
System config: 1 minion, 2 servers, 2 brokers, 3 controllers, 3 zookeepers, data replication 2, backup GCS, environment GKE
Scenario: downsize to 1 server, remove the server PVC, see the impact, try to go back to normal (2 servers).

Steps:
1. Downsize server to 1 with kubectl scale
2. Remove PVC in server 1 with kubectl delete pvc
3. Observation: p99 response time in the system is still strong, no noticeable changes
4. Upsize back to 2 with kubectl scale
5. Observation: things don't kick in automatically. It seems like there are some manual steps I have to do: I don't see the new server consuming and having data pulled from GCS, and I still see the old server in the servers UI in the pinot-controller. It seems like I need to run a rebalance at this point
6. Update offline and online tags from the old server with the endpoint in pinot-controller
7. Seems like we can issue a rebalance now
8. Issuing with the following: dryRun=false, reassignInstances=true, includeConsuming=false, bootstrap=true, downtime=false, minAvailableReplicas=true, bestEfforts=false
9. Observation: not seeing noticeable changes in p99 response time

At this point the second instance is still not in a great state and not consuming; however, the system is okay, still performing at ms for p99s.

I'm wondering the following:
• What to look for in the pinot-controller logs when a rebalance is done?
• When to delete the old server tag? Do I also need to issue an updateBrokerResource? I tried to delete it, but it says that Instance Server_10.12.64.88_8098 exists in ideal state for table and it doesn't let me drop it; at this point I cannot see the tables in the UI
• Is there anything else I should have done while rebalancing?
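
The rebalance call in step 8 can be sketched as a URL builder, which makes it easy to try `dryRun=true` first and compare the proposed assignment before applying it. This is a sketch under assumptions: the controller address and table name are hypothetical, the `/tables/{table}/rebalance` path and parameter names should be checked against the controller's Swagger UI for your version, and `minAvailableReplicas` is passed as an integer here rather than `true`.

```python
from urllib.parse import urlencode


def rebalance_url(controller: str, table: str, table_type: str, **options) -> str:
    """Build the rebalance request URL; booleans are rendered as true/false."""
    params = {"type": table_type}
    for key, value in options.items():
        params[key] = str(value).lower() if isinstance(value, bool) else str(value)
    return f"{controller}/tables/{table}/rebalance?{urlencode(params)}"


# Hypothetical controller address and table name.
url = rebalance_url(
    "http://pinot-controller:9000",
    "myTable",
    "OFFLINE",
    dryRun=True,  # inspect the proposed assignment before applying it
    reassignInstances=True,
    includeConsuming=False,
    bootstrap=True,
    downtime=False,
    minAvailableReplicas=1,
    bestEfforts=False,
)
print("POST", url)
```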

Sandeep R

03/18/2022, 1:19 AM

Hi Team, I have a problem with this timestamp. We have this date format in the table: "LOG_TS": "2022-03-09T16:47:42.995+00:00", and I am adding the date format below; I am not sure if this is the correct format:

"name": "LOG_TS",
"dataType": "LONG",
"format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-mm-ddThh:mm:ss.sssZ",
"granularity": "1:MILLISECONDS"
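
SimpleDateFormat-style pattern letters are case-sensitive, which is easy to trip over: `MM` is month while `mm` is minutes, `HH` is the 24-hour clock while `hh` is the 12-hour clock, `SSS` is the fraction of a second while `ss` is seconds, and a literal `T` has to be quoted. The small checker below is a hypothetical helper written for illustration, not part of Pinot; it only flags these four common mix-ups and does not validate a pattern fully.

```python
import re
from typing import List


def lint_date_pattern(pattern: str) -> List[str]:
    """Return warnings for commonly confused SimpleDateFormat-style letters."""
    # Drop quoted literals such as 'T' so only pattern letters remain.
    letters = re.sub(r"'[^']*'", "", pattern)
    warnings = []
    if "T" in letters:
        warnings.append("literal T should be quoted as 'T'")
    if re.search(r"y+[-/]mm", letters):
        warnings.append("'mm' after the year means minutes; months are 'MM'")
    if "hh" in letters and "a" not in letters:
        warnings.append("'hh' is the 12-hour clock; 24-hour values need 'HH'")
    if re.search(r"\.s+", letters):
        warnings.append("fractional seconds are 'SSS'; lowercase 's' means seconds")
    return warnings


for warning in lint_date_pattern("yyyy-mm-ddThh:mm:ss.sssZ"):
    print(warning)
```

Running it on the pattern from the message prints all four warnings, while a pattern such as `yyyy-MM-dd'T'HH:mm:ss.SSSZ` passes these particular checks. Whether the offset letter matches a value ending in `+00:00` still needs to be verified separately.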

Luis Fernandez

03/18/2022, 4:12 PM

Hey friends, I have a question regarding `Table Consuming Latency`. I have been turning various parts of Pinot off and on to see how it behaves. This time I turned off, for some time, the Kafka app that produces the records to Pinot. I saw a latency increase when I turned off the app: at least for p99, it was 160ms and now it is over a minute. When things like this happen, when do you expect Pinot to get back to its regular level? Does it ever get back? I was thinking that as the day goes by and this topic starts to get less traffic, maybe things come down, but I was wondering if it can come back some other way. Of course this is still pretty fast, but I'm wondering: if I were to take down the app for a longer time, how could that impact the p99 times?

Luis Fernandez

03/18/2022, 7:00 PM

Hey friends, it's me again. I was using Apache ab to do a simple load test against the brokers in Pinot, and we noticed that the exceptions in the server skyrocketed while ab was running. This seems to be the stack trace:

Encountered exception while processing requestId 9610 from broker Broker_pinot-broker-1.pinot-broker-headless.pinot.svc.cluster.local_8099
java.lang.NullPointerException: null
        at org.apache.pinot.core.util.trace.TraceContext.getTraceInfo(TraceContext.java:191) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:223) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:151) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:137) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
        at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]

does anyone know what this NullPointer may refer to?

Weixiang Sun

03/18/2022, 8:25 PM

In an upsert table, can we update the timestamp of a row?

Bordin Suwannatri

03/21/2022, 5:15 AM

Hi, I am trying to create a realtime table that consumes from Kafka. It is created, but its status is BAD. Please help with a recommendation. The error log is:

Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__2__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__3__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__4__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__5__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__6__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__7__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__8__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.938 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__9__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Mar 21 12:07:07 poc-pinot01 systemd-logind: Removed session 590.

Ali Atıl

03/21/2022, 7:22 AM

Hi everyone, Do indexes also work for multi-valued columns?

Diana Arnos

03/22/2022, 2:01 PM

👋 Hey there, I have a strange situation going on. I have 2 servers set up. At some point they had a problem and restarted, and they are still running fine. I can see in the logs that they are consuming data normally:

Consumed 261 events from (rate:3.1030054/s), currentOffset=763096, numRowsConsumedSoFar=288096, numRowsIndexedSoFar=288096
....
[Consumer clientId=consumer-455, groupId=] Discovered group coordinator <redacted> (id: 2147483646 rack: null)

But the controller still shows them with `dead` status, and when I try to query the data, I see in the Broker log:

No server found for request 1: select responseId from responseCount limit 1

And this is the response from the query API:

{
  "exceptions": [],
  "numServersQueried": 0,
  "numServersResponded": 0,
  "numSegmentsQueried": 0,
  "numSegmentsProcessed": 0,
  "numSegmentsMatched": 0,
  "numConsumingSegmentsQueried": 0,
  "numDocsScanned": 0,
  "numEntriesScannedInFilter": 0,
  "numEntriesScannedPostFilter": 0,
  "numGroupsLimitReached": false,
  "totalDocs": 0,
  "timeUsedMs": 0,
  "offlineThreadCpuTimeNs": 0,
  "realtimeThreadCpuTimeNs": 0,
  "segmentStatistics": [],
  "traceInfo": {},
  "minConsumingFreshnessTimeMs": 0,
  "numRowsResultSet": 0
}

How can I make the Controller see they are alive? 👀

Weixiang Sun

03/23/2022, 4:52 AM

When I try to use the lookup UDF join between a dimension table and a realtime table, it does not work, but it works between a dimension table and an offline table. Is that expected? I do not see such a restriction in https://docs.pinot.apache.org/users/user-guide-query/lookup-udf-join. Is there anything missing?

Bordin Suwannatri

03/23/2022, 8:30 AM

Hello everyone, I found an error with the transformFunction jsonPathString: I cannot use the word order in it. "transformFunction": "jsonPathString(order,'$.channel')" does not work. As a test, I modified the JSON, replacing order with hello, and used "transformFunction": "jsonPathString(hello,'$.channel')", which works. Why can't I use "order"? My real JSON messages use "order". Please help.

eywek

03/23/2022, 4:59 PM

Hello, I was wondering if it's possible to partition segments based on a field value (but without any transformation). For example, I store in Pinot events from multiple websites. Those events have a name (e.g. `purchase`, `page_view`…) and I would like to create a segment per event name (with a size limit, of course). Since those events are user defined, I can't really know how many partitions I'll have. I've seen the Murmur, HashCode… partition configs, but they don't ensure that each event type will have a dedicated segment (e.g. I don't want `page_view` and `purchase` events to be in the same segments, to avoid loading any `page_view` data when doing a query on `purchase` ones). Thank you
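
The concern about hash-based partition functions can be illustrated with a small sketch: once there are more distinct event names than partitions, at least two names must share a partition, whichever hash is used. The code below reimplements a Java-style `String.hashCode` purely for illustration, assuming the partition is the absolute hash modulo the partition count; Pinot's actual partition functions may differ in detail, and `signup` is a hypothetical event name.

```python
def java_string_hash(text: str) -> int:
    """Java-style String.hashCode, wrapped to a signed 32-bit integer."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def partition_for(value: str, num_partitions: int) -> int:
    """Assumed scheme: absolute hash modulo the number of partitions."""
    return abs(java_string_hash(value)) % num_partitions


events = ["purchase", "page_view", "signup"]  # "signup" is hypothetical
assignment = {name: partition_for(name, 2) for name in events}
print(assignment)

# With 3 event names and 2 partitions, some partition holds 2+ event types.
shared = len(set(assignment.values())) < len(events)
print("some event types share a partition:", shared)
```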

Wei Li

03/24/2022, 6:51 AM

Hi, I am setting up Pinot in AWS EKS. The clusters are successfully set up in EKS. However, when I try to create the schema and load data (Sec 3.4 in this doc https://docs.pinot.apache.org/basics/getting-started/kubernetes-quickstart) by running this script: `kubectl apply -f pinot/pinot-realtime-quickstart.yml`, I see the jobs are created but not running.

ahsen m

03/24/2022, 5:45 PM

Hello, is there any tutorial on connecting Pinot with MongoDB?

Luis Fernandez

03/25/2022, 6:44 PM

Does anyone know why a server that has been marked as Dead, had its tags updated, and had a rebalance issued afterwards would still show up in the `IdealState` in ZooKeeper?

Diogo Baeder

03/25/2022, 11:35 PM

Hi folks! Now that we're using Pinot with realtime tables in production, I'm also doing some experiments with offline tables for something else I'm developing. However, one thing I'd like to do is to be able to partition the data according to the values in some of the dimension columns. I'll follow in a thread:

Diogo Baeder

03/27/2022, 11:03 PM

Hi again folks! Related to my previous question, but not the same: what's the best partitioning strategy for a STRING column: Murmur, HashCode or ByteArray? What are the criteria I should use to choose what's the best for my case?

Diana Arnos

03/28/2022, 12:08 PM

Hello again 😬 How can I set up S3 as deep storage while using the helm chart? I tried adding the configs from this article to `controller.extra.configs`, but every time I do it the Controller starts responding with `502 Bad Gateway` and I can't see anything wrong in the logs. Results from `helm template` on the thread.

Bordin Suwannatri

03/28/2022, 3:56 PM

Hi guys, I have multiple Kafka clusters with SASL, each using a separate Kerberos setup. I don't know which realtime table parameter points to krb5.conf or to the content inside krb5.conf. I need to configure realtime tables against multiple KDCs. Please recommend a parameter or an example to use for that.

Luis Fernandez

03/28/2022, 5:28 PM

Hey friends... I was issuing rolling updates for pinot-servers with `kubectl`; however, I noticed that when I run this command I always get a brand new server and have to issue rebalances again. Is restarting servers something that requires rebalancing? I'm pretty sure there must be something funky going on with our config.

Lakshmanan Velusamy

03/28/2022, 6:45 PM

Hi Community, can dimension tables be created across different tenants with the same name?

ahsen m

03/29/2022, 1:34 AM

So I updated the values like this:

persistence:
      enabled: true
      accessMode: ReadWriteOnce
      size: 2G
      mountPath: /var/pinot/controller/data
      storageClass: ""
      extraVolumes:
        - name: gcp-credentials-volume
          secret:
            secretName: gcp-credentials
            items:
            - key: gcp_creds_json
              path: gcp_credentials.json
      extraVolumeMounts:
        - name: gcp-credentials-volume
          mountPath: /opt/pinot/gcp
          readOnly: true

But when I run `helm template testing --debug .`, the template it generates does not have any volume mount named `gcp-credentials-volume`. Any ideas?

sunny

03/29/2022, 1:43 AM

Hi, I ran into an issue with partitioning in Pinot. When I run a 'select where in' query on the partition column, it doesn't return any records, but a 'select where not in' query on the partition column seems OK. After the segment is flushed, the 'select where in' query returns the right records; for newly produced rows (before the segment is flushed), it doesn't return them.

• realtime table
• partition column: subject
• Kafka topic partitions = 3
• Pinot partition function: Murmur

Mohammed Galalen

03/29/2022, 6:01 AM

Hi team, I was trying to compile Pinot from source on a MacBook Pro M1 and got two errors during compilation, one regarding `protoc-gen-grpc-java-1.4.0-osx-x86_64` and the other regarding `com.github.eirslett:frontend-maven-plugin:1.1`. I had to upgrade `com.github.eirslett:frontend-maven-plugin` to `1.11.0` and download `protoc-gen-grpc-java-1.4.0-osx-x86_64` manually. But I couldn't run the example, and I'm getting this error:

Failed to start a Pinot [SERVER] at 15.16 since launch
java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: event executor terminated
    at org.apache.pinot.core.transport.QueryServer.start(QueryServer.java:136) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-649f5988d5746869ef6a690f4747ff4d6fb9c607]
    at org.apache.pinot.server.starter.ServerInstance.start(ServerInstance.java:165)

Kamal Chavda

03/29/2022, 5:50 PM

Is anyone using Tableau with Pinot? Getting this error when trying to connect to hosted instance:

Diogo Baeder

03/30/2022, 12:31 AM

Hey guys, a few surprises I had with 0.10.0:
• The `segmentPartitionConfig` map doesn't accept the mapping of column to partition config directly, as the table configuration documentation says. It seems it can only contain a `columnPartitionMap` field, which in turn contains the mapping between column and partition config.
• The `segmentsConfig` seems to have had its old `replicasPerPartition` renamed to `replication`, if I understand correctly. Or maybe I just don't understand where each should be used, if both are valid (although the config docs don't mention `replicasPerPartition` anymore).

Should I open a ticket on GitHub about these? Or am I getting something wrong perhaps?
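
For reference, the nesting described in the first bullet can be sketched as follows. This builds the fragment as a Python dict and prints it as JSON; the column name `memberId` and the partition count are hypothetical, and the placement under `tableIndexConfig` is my understanding of the table config layout rather than something stated in the message.

```python
import json

# Column name and numPartitions are placeholders for illustration.
table_index_config = {
    "segmentPartitionConfig": {
        "columnPartitionMap": {
            "memberId": {"functionName": "Murmur", "numPartitions": 4}
        }
    }
}

print(json.dumps({"tableIndexConfig": table_index_config}, indent=2))
```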