# general
  • c

    Chengxuan Wang

    05/10/2022, 5:51 AM
    quick question:
    {\"H3IndexFilterOperator Time\":16},{\"DocIdSetOperator Time\":16}
    What does the number mean in the query traceinfo? Is it 16ms?
    j
    • 2
    • 3
  • c

    coco

    05/10/2022, 9:45 AM
    Hi team. I am using HDFS as the deep store, and I am trying to run batch ingestion with Spark on the deep store HDFS cluster. I am having difficulty using another HDFS cluster as the input of the batch job spec. Is such a deployment configuration possible?
    k
    n
    • 3
    • 12
  • t

    Timothy Spann

    05/11/2022, 4:50 PM
    I found a typo in this page https://dev.startree.ai/docs/pinot/recipes/pulsar
    m
    r
    m
    • 4
    • 8
  • l

    Luy

    05/11/2022, 5:51 PM
    It would be great if there were some docs detailing how to set up the whole environment of Pinot & ThirdEye for this on my machine.
    m
    s
    • 3
    • 8
  • l

    Luy

    05/11/2022, 7:11 PM
    I started Apache ZooKeeper, Pinot Controller, Pinot Broker, and Pinot Server. Now how can I create and add a new table in the cluster manager?
    m
    m
    n
    • 4
    • 11
  • a

    Andy Li

    05/11/2022, 9:29 PM
    Hi team, we're using Presto with Pinot and would like to support pushdown of functions like COALESCE or multi-column CASE statements on the Pinot side. This seems reasonable for predicates, as currently it looks like the pushdown logic is on aggregations / predicates. However, we're looking for some performance improvements here by having this as a SELECT pushdown instead of having to return all data to Presto for processing, as we can "aggregate" row-wise for various operators and take advantage of certain indexing, e.g. bloom filters, for COALESCE, CONCAT, etc. Are there concerns or pointers around this? @Xiang Fu
    x
    • 2
    • 14
  • s

    Saumya Upadhyay

    05/12/2022, 4:40 AM
    Do we have any option to save the Pinot ingestion time in the table, so that we can tell whether there is any latency when ingesting data from Kafka?
    m
    n
    • 3
    • 2
  • c

    coco

    05/12/2022, 8:32 AM
    I have created a batch pipeline that loads data files from a Cloudera Impala Parquet table into a Pinot cluster. How can I gracefully swap segments if the number of input files gets smaller? Like this: https://docs.pinot.apache.org/configuration-reference/job-specification#segment-name-generator-spec
    segment.name.prefix : normalizedDate
    exclude.sequence.id : false
    -- input data file
    hdfs://data/pinot_poc/input_table/yyyymmdd=20220512/data-file-0.parq
    hdfs://data/pinot_poc/input_table/yyyymmdd=20220512/data-file-1.parq
    hdfs://data/pinot_poc/input_table/yyyymmdd=20220512/data-file-2.parq
    -- pinot segment
    hdfs://data/pinot_poc/controller/segments/pinot_table/batch_2022-05-12_2022-05-12_0
    hdfs://data/pinot_poc/controller/segments/pinot_table/batch_2022-05-12_2022-05-12_1
    hdfs://data/pinot_poc/controller/segments/pinot_table/batch_2022-05-12_2022-05-12_2
    ------------------------------
    If I redo the batch and the data files are reduced to two:
    -- input data file
    hdfs://data/pinot_poc/input_table/yyyymmdd=20220512/data-file-0.parq
    hdfs://data/pinot_poc/input_table/yyyymmdd=20220512/data-file-1.parq
    segment.name: fixed
    If I have to use the 'segment.name:fixed' setting, how can I gracefully delete the segment 'batch_2022-05-12_2022-05-12_2'?
    k
    n
    • 3
    • 4
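The leftover segment described in this question can be identified with a plain set difference. The sketch below is Python and only reproduces the naming pattern visible in the listing (`batch_<date>_<date>_<sequence id>`); the helper `segment_names` is hypothetical, it calls no Pinot API, and it does not cover how the stale segment should then be deleted.

```python
def segment_names(prefix: str, day: str, file_count: int) -> set:
    """Build the segment names expected for one day, one per input file.

    The pattern mirrors the names in the listing; it is an assumption that
    the sequence id always runs from 0 to file_count - 1.
    """
    return {f"{prefix}_{day}_{day}_{i}" for i in range(file_count)}


# First run: three input files. Rerun: only two input files.
previous = segment_names("batch", "2022-05-12", 3)
current = segment_names("batch", "2022-05-12", 2)

# Segments that existed after the first run but are not regenerated by the rerun.
stale = sorted(previous - current)
print(stale)
```

With the file counts from the message, the only stale name is `batch_2022-05-12_2022-05-12_2`, the segment the question asks about.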
  • t

    Tanmay Movva

    05/12/2022, 12:08 PM
    Hello! Is there a PHP client available for Pinot?
    m
    • 2
    • 2
  • k

    Kishore G

    05/13/2022, 12:24 AM
    If you are around and would love to know how Cisco Webex is powering End-user analytics using Pinot - you can join https://cisco.webex.com/cisco/j.php?MTID=m308d0880b393a6d722394f402a23dd9d
  • n

    Nisheet

    05/13/2022, 8:20 AM
    Hi team
    d
    m
    +2
    • 5
    • 20
  • a

    Alice

    05/13/2022, 2:21 PM
    Hi team, I have a column defined like this. When I query this field, part of its value is …..>>>ignored size:5051. Does it mean the maxLength is smaller than the real size of the string content? Or does Pinot store the whole string but not show the full content when queried? { "name": "content", "dataType": "STRING", "maxLength": 17825792 }
    m
    • 2
    • 3
  • m

    Map

    05/13/2022, 5:53 PM
    Hi, according to https://docs.pinot.apache.org/basics/data-import/complex-type#ingestion-configurations with unnestFields, a record with the nested collection will unnest into multiple records.
    If we have two fields to unnest, and each field is an array of 4 elements, does it mean we would get 4 * 4 = 16 records?
    r
    • 2
    • 1
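The arithmetic in this question can be sketched in Python under one explicit assumption: that the two fields are unnested independently, so every element of the first array is paired with every element of the second. Whether Pinot actually behaves this way is exactly what the question asks, and the thread shown here does not confirm it. The function `unnest` and the sample record are hypothetical.

```python
from itertools import product


def unnest(record: dict, fields: list) -> list:
    """Expand a record into one row per combination of the listed array fields.

    Assumption: the fields are unnested independently (a cross product).
    This illustrates the 4 * 4 arithmetic and is not Pinot's implementation.
    """
    collections = [record[f] for f in fields]
    rows = []
    for combo in product(*collections):
        # Keep the non-array columns and attach one element per unnested field.
        row = {k: v for k, v in record.items() if k not in fields}
        row.update(zip(fields, combo))
        rows.append(row)
    return rows


record = {"id": 1, "a": [1, 2, 3, 4], "b": ["w", "x", "y", "z"]}
rows = unnest(record, ["a", "b"])
print(len(rows))
```

Under that assumption the count is 4 * 4 = 16; if the arrays were instead paired element by element, it would be 4.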
  • n

    Nizar Hejazi

    05/13/2022, 6:40 PM
    Created this github issue to propose returning Null (instead of default value) in response for selection queries (if a config is set): https://github.com/apache/pinot/issues/8697 Please review and let me know your input
    👍 2
    • 1
    • 1
  • a

    Alice

    05/15/2022, 2:18 AM
    Hi team. Could you please give some suggestions here? I've assigned 3 servers to a stream-type table with about 200m records and a reported size of 25G. Each server has 64G memory. Some queries returned "servers not responded" and "errorCode": 427. The Kafka topic has 3 partitions and the partition count is not going to change. Would it help if I scale up the Pinot servers to 6 for this table in this case?
    h
    m
    • 3
    • 11
  • a

    Alice

    05/16/2022, 6:56 AM
    Hi team, how can I check whether the instanceAssignmentConfigMap config has taken effect?
    m
    • 2
    • 1
  • h

    Harish Bohara

    05/16/2022, 8:41 AM
    I have 2-3 fields in metricFieldSpecs. These columns capture the time taken to do some operation (e.g. time taken from sent -> delivery of an item). Any idea how to get a histogram of this data (to be used in Superset)?
    m
    • 2
    • 1
  • k

    Karin Wolok

    05/16/2022, 2:25 PM
    If anyone wants to submit a talk: https://sessionize.com/pulsar-summit-san-francisco-2022/
    ✅ 2
  • r

    Ram

    05/17/2022, 8:41 PM
    Hello,
    👋 3
  • r

    Ram

    05/17/2022, 8:46 PM
    I'm trying to evaluate Pinot within our bank (note, we already use an existing commercial OLAP product), ingesting pre-processed streaming data from Flink + Kafka into Pinot. This all seems to work well and the latency matches our requirement. We also have a client currently streaming the real-time data from the cube, so I am trying to see whether any such streaming client API is available for querying Pinot. I could see that Pinot integrates with Presto / Trino, so can someone please point me to a link on how to implement a streaming client that queries real-time data from Pinot (specifically, the initial query may yield the initial snapshot of data, followed by delta updates on the underlying query)?
    m
    r
    +3
    • 6
    • 28
  • h

    harry singh

    05/18/2022, 9:41 AM
    Hi, I wanted to understand what we should do to handle "nulls" in aggregation queries. Pinot saves default values instead of nulls, but this will affect the final result when the default value coincides with a real data point. How are other folks handling this, and what can we do here?
    d
    m
    k
    • 4
    • 9
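The concern in this question can be made concrete with a small Python sketch. The data and the default value of 0 are made up for illustration, and nothing here uses Pinot itself; it only shows how an average shifts when nulls are stored as a default that also occurs as a genuine value.

```python
def average(values: list) -> float:
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


DEFAULT = 0  # assumed default value substituted for nulls (illustrative)

# One missing value (None) and one genuine 0 in the raw data.
raw = [10, None, 0, 20]

# What gets stored if every null is replaced by the default.
stored = [DEFAULT if v is None else v for v in raw]

# Averaging the stored values counts the null as a real 0.
avg_with_defaults = average(stored)

# The intended result: skip the null, keep the genuine 0.
avg_ignoring_nulls = average([v for v in raw if v is not None])

# Filtering out the default afterwards also drops the genuine 0.
avg_filtering_default = average([v for v in stored if v != DEFAULT])

print(avg_with_defaults, avg_ignoring_nulls, avg_filtering_default)
```

The three averages differ (7.5, 10.0 and 15.0), which is why filtering on the default value is not a safe substitute for real null handling when the default can also be a legitimate data point.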
  • s

    sunny

    05/19/2022, 2:16 AM
    Hi all, when stopping Pinot components (controller, broker, server) using the admin script, sometimes a component is not stopped and nothing is even logged.
    apache-pinot-0.9.3-bin/bin/pinot-admin.sh StopProcess -controller/-server/-broker
    How do I find out why it's not shutting down when executing the stop command? Also, is there any other way to stop a Pinot component safely? Thank you in advance.
    m
    p
    • 3
    • 6
  • k

    Karin Wolok

    05/19/2022, 9:09 AM
    🥳 Who wants to come visit in Israel? Submit to speak at the FIRST Java Conference in Tel Aviv! https://javasummitil.com https://sessionize.com/javasummit-il-22
  • k

    Karin Wolok

    05/19/2022, 10:02 AM
    📣 Current (formerly Kafka Summit) is looking for speakers!!! (Austin, TX in October) If you have a topic that you think might be valuable to this community, please submit a talk 😉 https://sessionize.com/current-2022/
  • k

    Karin Wolok

    05/19/2022, 10:51 AM
    📣 ApacheCon North America Another open CFP - Closes on THIS MONDAY, so please submit a talk ASAP! (New Orleans in October) https://apachecon.com/acna2022/
  • k

    Karin Wolok

    05/19/2022, 10:52 AM
    If you get accepted to any conferences, please let us know and we will send you swag! 😃
  • a

    Arash

    05/19/2022, 10:04 PM
    operationalizing ml
    👋 1
  • d

    Diana Arnos

    05/20/2022, 10:07 AM
    Hey there :) Does someone have a Helm chart and config example for running more than 1 ZooKeeper? For instance: I want to run 2 brokers, 2 controllers, 3 ZooKeepers and 4 servers. Is it possible?
    m
    m
    • 3
    • 6
  • m

    Matthias

    05/20/2022, 4:09 PM
    Is there any Terraform provider to manage Apache Pinot (e.g. setting up tables and streams etc.)?
    k
    • 2
    • 2
  • a

    Abhay Rawat

    05/23/2022, 10:16 AM
    Hi Pinot team, we are trying to deploy Pinot in AWS ECS. One issue that we are facing is that if one of the server instances goes down and comes back up with another instance ID (in our case, the IP address), we lose the segments registered with that server. Is there a requirement that controller/broker/server instances have a stable identifier?
    m
    • 2
    • 6