Nizar Hejazi
05/24/2022, 9:00 PM
select distinct (company) from role_with_company limit 1000000 -- answer: 51
Queries with less-than or greater-than predicates always return the correct results:
select count(distinct company) from role_with_company where company < '6269223774083d800011fd95' limit 1000000 -- answer: 36
select count(distinct company) from role_with_company where company > '6269223774083d800011fd95' limit 1000000 -- answer: 14
On the other hand, queries with equality predicates do not return the correct results when the segment is in CONSUMING state:
select count(distinct company) from role_with_company where company = '6269223774083d800011fd95' limit 1000000 -- answer: 0, when segment is in CONSUMING state
When the segment is COMMITTED, the query returns the correct results:
select count(distinct company) from role_with_company where company = '6269223774083d800011fd95' limit 1000000 -- answer: 1, when segment is COMMITTED
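A quick consistency check on the numbers above, added as an illustrative sketch rather than anything from the original report: assuming the less-than, greater-than, and equality predicates on the same value partition the distinct `company` values (no nulls or other edge cases), the three counts should add up to the 51 returned by the unfiltered query. The class and method names are hypothetical.

```java
class PredicatePartitionCheck {
    // Adds up the distinct counts reported for <, > and = on the same pivot value.
    static int partitionTotal(int lessThan, int greaterThan, int equalTo) {
        return lessThan + greaterThan + equalTo;
    }

    public static void main(String[] args) {
        int totalDistinct = 51; // from: select distinct (company) ...
        int lessThan = 36;      // company < '6269223774083d800011fd95'
        int greaterThan = 14;   // company > '6269223774083d800011fd95'

        // Equality count reported while the segment is COMMITTED (1) and CONSUMING (0).
        boolean committedOk = partitionTotal(lessThan, greaterThan, 1) == totalDistinct;
        boolean consumingOk = partitionTotal(lessThan, greaterThan, 0) == totalDistinct;

        System.out.println("COMMITTED counts consistent: " + committedOk);
        System.out.println("CONSUMING counts consistent: " + consumingOk);
    }
}
```

With the COMMITTED answer the counts add up to 51. With the CONSUMING answer they add up to 50, which suggests the equality predicate is the one dropping a value.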
Anyone aware of a change in behaviour that was introduced recently?
@Richard Startin @Jackie
Latest nightly build commit: 0.11.0-SNAPSHOT-438c53b-20220520
Previous nightly build commit: 0.11.0-SNAPSHOT-3403619-20220507
Hello
05/24/2022, 11:35 PM
Lars-Kristian Svenøy
05/25/2022, 10:24 AM
Tommaso Peresson
05/25/2022, 2:45 PM
`items` containing a list of `item_id`. From what I've tried, `distinctcounthllmv()` can't be used as an aggregation function in a star-tree index. Has anyone faced a similar problem? If so, how did you solve it? Is it possible to calculate the raw HLL state at ingestion time and then perform the estimation at query time?
Thanks everyone for helping
Anish Nair
05/26/2022, 3:31 AM
Atri Sharma
05/26/2022, 7:19 AM
Tiger Zhao
05/26/2022, 3:01 PM
ERROR [PinotSegmentUploadDownloadRestletResource] [jersey-server-managed-async-executor-17] Caught internal server exception while uploading segment
java.lang.NullPointerException: Table config is not available for table 'test_table_OFFLINE'
any ideas as to what is causing this? I see that the table has been created successfully.
Andy Li
05/26/2022, 5:11 PM
Fernando Barbosa
05/26/2022, 10:25 PM
Fernando Barbosa
05/26/2022, 10:25 PM
Fernando Barbosa
05/26/2022, 10:25 PM
abhinav wagle
05/26/2022, 11:25 PM
Caught exception while reading message using schema: <redacted>
java.io.EOFException: null
at org.apache.avro.io.BinaryDecoder$ByteArrayByteSource.readRaw(BinaryDecoder.java:966) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:372) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:289) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:209) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:469) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:459) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:191) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:259) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder.decode(SimpleAvroMessageDecoder.java:89) [pinot-avro-0.10.0-SNAPSHOT-shaded.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder.decode(SimpleAvroMessageDecoder.java:43) [pinot-avro-0.10.0-SNAPSHOT-shaded.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:507) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:416) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:576) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
at java.lang.Thread.run(Thread.java:829) [?:?]
Zsolt László
05/27/2022, 10:07 AM
Pinot currently relies on Pulsar client version 2.7.2. Users should make sure the Pulsar broker is compatible with this client version.
Say I have a Pulsar cluster of version 2.4.1 in place already; does that mean I won't be able to consume its traffic with the most up-to-date Pinot binary? Thanks in advance for any help!
Fernando Barbosa
05/27/2022, 12:25 PM
• schema.json: containing my table schema (after transforming from avsc, as pointed out here)
• table.json: where I am using `streamConfigs` to pass the Confluent authentication keys
Two questions:
1. Is it wrong to pass the credentials in that file? (Please disregard security issues, because this is a very local and small test.)
2. I keep getting a 500 response saying that the consumer couldn't be formed.
I would really, really appreciate your help. BTW, I am following the StarTree recipes. 🆘
Alice
05/27/2022, 1:28 PM
Fernando Barbosa
05/27/2022, 2:26 PM
"stream.kafka.decoder.prop.schema.registry.rest.url": "https://xxxxx.uk-central2.gcp.confluent.cloud",
"stream.kafka.schema.registry.url": "https://xxxxx.uk-central2.gcp.confluent.cloud",
Scott deRegt
05/27/2022, 3:57 PM
`spark` Batch Ingestion job when moving from `--master local --deploy-mode client` to `--master yarn --deploy-mode cluster` (as suggested here for production environments). I would greatly appreciate some guidance from others who have successfully configured this spark job. Details in thread 🧵
Stuart Millholland
05/27/2022, 4:43 PM
Alice
05/28/2022, 6:29 AM
Diogo Baeder
05/29/2022, 11:32 AM
Diogo Baeder
05/29/2022, 12:52 PM
`LaunchDataIngestionJob` job with a file spec, for how long do I need to keep that job spec file around? Can it be deleted right after the job finishes, if I downloaded it from somewhere before I triggered the job?
Vishal Garg
05/30/2022, 5:31 AM
select metric_1, sum(metric_2) from table where some_filter = 'x' group by 1 limit 100
If I run this query through the Pinot portal, I get an integer value for sum(metric_2), but the Pinot Java client returns a double value. I expect it to return an integer value. My queries are dynamic in nature, so I can't use type-specific getters; I always read columns as strings via `resultSet.getString(row, col)`. Is there any way to configure the Java client to read the value as an integer instead of a double?
Mahesh babu
05/30/2022, 5:57 AM
Mahesh babu
05/30/2022, 5:57 AM
Ali Atıl
05/30/2022, 11:14 AM
Mahesh babu
05/30/2022, 1:10 PM
Kevin Peng
05/30/2022, 3:39 PM
Kevin Peng
05/30/2022, 4:20 PM
] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.1:shade (default) on project pinot-kafka-2.0: Error creating shaded jar:
I really don't need pinot kafka for my current test install. Is there a way to bypass this or fix the shade issue? Has anyone run into this issue before? Before that, I ran into an issue with the spotless plugin, which I worked around by commenting it out in the pom.xml in the root folder and in the pinot-common directory.
Sowmya Gowda
05/31/2022, 6:42 AM
Cannot read single-value from Object[]: [Staff RN (Med Surg, Ortho/Neuro, GI/GU floor] for column: jobTitle
Luis Fernandez
05/31/2022, 2:13 PM
`0.10.0` we ran into this issue: https://github.com/apache/pinot/pull/8337, so we had to create a script to do the imports daily. However, for some reason the Pinot servers are exhausting their memory (32 GB), even though they are mostly at half capacity before the job runs. What are some of the reasons our Pinot servers would run out of memory from these ingestion jobs? Also, we are using the standalone job, and our script changes the input directory each day once the job finishes. Would appreciate any help!