# general
  • m

    Mann Mehta

    10/11/2022, 7:21 PM
    In relational databases we have recursive queries. Is there any feature for writing recursive queries on Pinot tables? Here is an example of a SQL recursive query for a relational database: https://medium.com/swlh/recursion-in-sql-explained-graphically-679f6a0f143b
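    For readers who end up doing the recursion on the client side, here is a minimal sketch of the idea behind a recursive query: repeat a one-level lookup until no new ids appear. `fetch_children` is a hypothetical stand-in for whatever per-level query is run against the table, and the `EDGES` data is made up for illustration; nothing here is a Pinot API.

```python
from typing import Callable, Iterable, Set


def expand_recursively(
    seeds: Iterable[str],
    fetch_children: Callable[[Set[str]], Set[str]],
    max_depth: int = 32,
) -> Set[str]:
    """Repeat a one-level lookup until a fixed point is reached."""
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(max_depth):
        if not frontier:
            break
        # One "recursive step": look up children of the current frontier only.
        children = fetch_children(frontier)
        # Keep only ids not visited before, so cycles terminate.
        frontier = children - seen
        seen |= frontier
    return seen


# Hypothetical parent -> children data standing in for a table lookup.
EDGES = {"a": {"b", "c"}, "b": {"d"}, "c": set(), "d": {"a"}}


def fetch_children_from_dict(parents: Set[str]) -> Set[str]:
    out: Set[str] = set()
    for parent in parents:
        out |= EDGES.get(parent, set())
    return out


reachable = expand_recursively({"a"}, fetch_children_from_dict)
```

    Each loop iteration costs one lookup, so the number of round trips grows with the depth of the hierarchy; `max_depth` is only a safety cap.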
  • a

    abhinav wagle

    10/11/2022, 10:15 PM
    Hellos, just checking how folks in the community are working with Tenants in Pinot. Specifically, how/where do you keep a mapping of `Tenant` to `Server`? Are folks keeping this in git and executing REST APIs as described here: https://docs.pinot.apache.org/basics/components/tenant
  • p

    parthiv shah

    10/12/2022, 4:03 PM
    Does Pinot support the `with` clause in queries?
  • r

    Rohit Anilkumar

    10/12/2022, 9:07 PM
    I'm a newbie with Kubernetes and infrastructure, hence this question. Is it possible to launch Pinot components on multiple AWS EC2 instances, like we do in a multi-cluster Druid deployment? Are there any drawbacks to this approach, apart from the obvious difficulty of managing multiple servers?
  • a

    abhinav wagle

    10/12/2022, 9:33 PM
    Hellos, by default a Pinot cluster has `DefaultTenant` (all servers/brokers) created. Before re-assigning a specific broker or server to a specific tenant name, which API should be invoked to untag instances from `DefaultTenant`?
  • a

    Aaron Weiss

    10/13/2022, 2:55 PM
    The RetentionManager task that enforces the Realtime retention policy runs every 6 hours by default, per: https://dev.startree.ai/docs/pinot/concepts/segment-retention Is there any way to override that value to, say, 1 hour?
  • a

    Aaron Weiss

    10/13/2022, 2:56 PM
    Side question, I assume all the PeriodicTasks are controller level and apply across all Pinot tables?
  • m

    Matthew Kerian

    10/13/2022, 7:29 PM
    Quick question. Is granularity useful primarily for increasing efficiency when reading? We have data that is separated by nanoseconds, so I’m not sure if there’s a general rule of thumb on what the granularity should be. If we’re only ingesting ~100 rows per second on a table, is there a certain level of granularity we would want for that?
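    One way to reason about the trade-off is to treat granularity as a bucket size and look at what truncating a nanosecond timestamp to that bucket does. The sketch below is plain arithmetic with hypothetical names (`truncate_epoch_nanos`, `NANOS_PER_UNIT`); it is not how Pinot implements granularity, only an illustration of how coarser buckets collapse many distinct timestamps into one value.

```python
# Nanoseconds per unit, used to convert a bucket size into nanoseconds.
NANOS_PER_UNIT = {
    "NANOSECONDS": 1,
    "MICROSECONDS": 1_000,
    "MILLISECONDS": 1_000_000,
    "SECONDS": 1_000_000_000,
    "MINUTES": 60 * 1_000_000_000,
}


def truncate_epoch_nanos(ts_nanos: int, size: int, unit: str) -> int:
    """Round an epoch timestamp in nanoseconds down to the start of its bucket."""
    bucket = size * NANOS_PER_UNIT[unit]
    return ts_nanos - (ts_nanos % bucket)


ts = 1_665_689_340_123_456_789
per_second = truncate_epoch_nanos(ts, 1, "SECONDS")
per_15_min = truncate_epoch_nanos(ts, 15, "MINUTES")
```

    At ~100 rows per second, a one-second bucket would map roughly 100 rows to each distinct time value, while nanosecond precision keeps almost every row distinct.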
  • r

    reallyonthemove tous

    10/13/2022, 9:21 PM
    Is it possible to have two tables with the same name even if they belong to two pinot tenants?
  • d

    Dan DC

    10/14/2022, 9:49 AM
    Hi all, it's been a long time since I've shown up here :) I was wondering if the recordings of the RTA summit will be made public, and where we can find them? I was trying to watch them in Hopin but the space won't load for me. Thanks!
  • p

    Priyank Bagrecha

    10/15/2022, 7:10 AM
    How do I update the cluster config via helm charts? I'm trying to enable the v2 multi-stage query engine, and the mechanism below in values.yaml isn't working.
    cluster:
      name: my-pinot-cluster
      extra:
        configs: |-
          pinot.multistage.engine.enabled=true
          pinot.server.instance.currentDataTableVersion=4
          pinot.query.server.port=8421
          pinot.query.runner.port=8442
  • m

    Michael Latta

    10/16/2022, 4:06 AM
    I am looking into tiered storage in Pinot. The docs show configuring tenants for different tiers. 1) Do I need to query each tier separately? 2) The docs show pinot_server as the only storage type; does that mean all tiers are stored in hot storage? 3) If the answer to 2 is no, how do I configure hot storage size per tenant/server vs deep storage?
  • g

    Grace Walkuski

    10/17/2022, 4:47 PM
    Hi, I'm looking to understand the current capabilities for getting large amounts of data out of Pinot in some sort of chunks. The docs are hard to sort through. I’d need any of these to be supported with aggregation. Can it stream? Page? Shard? If not now, is it in the works? tia
  • k

    Kevin Xu

    10/18/2022, 6:10 AM
    Hi all, could someone help me find out what could cause a RealtimeToOfflineSegmentsTask minion task to be canceled after running for more than 2 hours?
    INFO [BaseMultipleSegmentsConversionExecutor] [TaskStateModelFactory-task_thread-0] RealtimeToOfflineSegmentsTask on table got canceled
  • j

    Julius Norinder

    10/18/2022, 9:43 AM
    Hi all, Julius here. I started looking into Pinot a couple of days ago and have got it up and running on minikube using the helm charts, though not without a bit of a struggle. I filed an issue (https://github.com/apache/pinot/issues/9606). I'll try to create a PR for someone to review later on. However, I'm curious about the setup. If I already have a Kafka ecosystem up and running, do I need to set up a new one for Pinot as well? It seems a bit unnecessary to set up more than one Zookeeper, for instance. Is it possible to "re-use" existing infrastructure?
  • d

    Deena Dhayalan

    10/18/2022, 11:06 AM
    Hi Team, while going through the JSON Index, I found that under DisableCrossArrayUnnest the 2nd JSONObject should be addresses[1]. Kindly update 🙂
  • d

    Deena Dhayalan

    10/18/2022, 12:28 PM
    {
      "unrecognizedProperties": {
        "/schema/dateTimeFieldSpecs/0/sampleValue": null,
        "/offline/tableIndexConfig/jsonIndexConfigs/person/maxLevels": 2,
        "/offline/tableIndexConfig/jsonIndexConfigs/person/excludeArray": false,
        "/offline/tableIndexConfig/jsonIndexConfigs/person/disableCrossArrayUnnest": true
      },
      "status": "TableConfigs jsontest successfully added"
    }
    I took the master branch, which is Pinot 0.12, but it produces the unrecognizedProperties above. In which branch can I use "jsonIndexConfigs"?
    @Mayank @Xiang Fu
  • a

    abhinav wagle

    10/18/2022, 10:04 PM
    Hello Team, is there an easy way to track the number of segments ingested by Pinot over time, either in Grafana or through some Pinot API?
  • c

    Carl

    10/19/2022, 5:37 PM
    Hi team, we have a use case for retrieving records by scanning ~10b records and filtering by date range and id (1m ids in total) without doing any aggregations, at ~100 qps at peak time. Is it possible to scale and configure Pinot to support this kind of use case? If so, what would be the best latency we can expect?
  • e

    Eric Song

    10/20/2022, 7:43 AM
    Hi folks, I have some questions about ingesting data from Apache Iceberg into Pinot, and I'd like to hear your advice. What I want to do is ingest data from Apache Iceberg elegantly, e.g. using SQL to select some data in Iceberg and using an existing plugin to generate and push segments.
  • k

    Kevin Xu

    10/25/2022, 3:14 AM
    Hi team, here are two questions: 1. Can a Pinot realtime table ingest data from two or more different topics that have the same schema? 2. If the answer to Q1 is no, are there plans to support this in the future?
  • a

    Ashish Kumar

    10/25/2022, 5:32 AM
    Hi team, any thoughts/suggestions/preferences on using the pinot-minion framework vs spark+airflow for offline batch ingestion in Pinot?
  • e

    EO W

    10/25/2022, 6:20 AM
    Is 'purge task' possible for realtime tables in the 0.11.0 release? Hi all, in the 0.11.0 release notes, I saw that segment upload is supported for all tables, including real-time tables that are not in upsert mode. The same release includes a PurgeTaskGenerator implementation (https://github.com/apache/pinot/pull/8589) that facilitates the purge process. However, it only works for offline tables. https://github.com/apache/pinot/blob/release-0.11.0/pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/purge/PurgeTaskGenerator.java#L62
    if (tableConfig.getTableType() == TableType.REALTIME) {
      LOGGER.warn("Skip generating task: {} for real-time table: {}", taskType, tableName);
      continue;
    }
    I think that if it is possible to upload a segment to a real-time table, it should be possible to download an existing segment, purge records from it to create a new segment, and upload that. Is purging for realtime tables not possible yet? Or is it just that this built-in task is implemented to target only offline tables?
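    The download, purge, re-upload flow described above can be sketched as follows. This is only an illustration of the idea: `download`, `upload`, and `should_purge` are hypothetical callables, and the in-memory `STORE` stands in for real segment storage; none of them are Pinot APIs.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def purge_and_replace(
    segment_name: str,
    download: Callable[[str], List[Row]],
    upload: Callable[[str, List[Row]], None],
    should_purge: Callable[[Row], bool],
) -> int:
    """Rebuild one segment without the purged rows; return how many were dropped."""
    rows = download(segment_name)
    kept = [row for row in rows if not should_purge(row)]
    purged = len(rows) - len(kept)
    if purged:
        # Only replace the segment when something actually changed.
        upload(segment_name, kept)
    return purged


# Hypothetical in-memory "segment store" used only for this example.
STORE: Dict[str, List[Row]] = {
    "seg_0": [{"id": 1, "deleted": False}, {"id": 2, "deleted": True}],
}

purged_count = purge_and_replace(
    "seg_0",
    STORE.__getitem__,
    STORE.__setitem__,
    lambda row: bool(row["deleted"]),
)
```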
  • n

    Nizar Hejazi

    10/25/2022, 3:50 PM
    Does Pinot collect record ingestion time by default, or do we need to explicitly add it as a field to the schema and define its value using Groovy or built-in functions?
  • a

    Ashish Kumar

    10/26/2022, 9:55 AM
    Hi team, I was trying to do batch ingestion from S3 Parquet files into Pinot, but I'm getting this error (any help/pointers appreciated):
    java.lang.RuntimeException: Failed to create IngestionJobRunner instance for class - org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner
            at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:145)
            at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:121)
            at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:130)
            at org.apache.pinot.tools.Command.call(Command.java:33)
            at org.apache.pinot.tools.Command.call(Command.java:29)
            at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
            at picocli.CommandLine.access$1300(CommandLine.java:145)
    spark-submit cmd:
    export PINOT_VERSION=0.11.0
    export PINOT_DISTRIBUTION_DIR=/workspace/apache-pinot-0.11.0-bin
    
    spark-submit --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand --master local --deploy-mode client --conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins -Dlog4j2.configurationFile=${PINOT_DISTRIBUTION_DIR}/conf/pinot-ingestion-job-log4j2.xml" --conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" --jars ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar -jobSpecFile /workspace/jupyter_notebooks/_examples/coefficient.yaml
  • a

    Arthur Zhou

    10/27/2022, 6:30 PM
    Hi team, is Pinot currently supported as a data source for Grafana? I see some discussion here: https://github.com/grafana/grafana/issues/20141. What’s the current status?
  • w

    Weixiang Sun

    10/27/2022, 8:37 PM
    Hi team, when ingesting data from a Kafka topic using "stream.kafka.consumer.type": "lowlevel", how can we track the Kafka offset lag from the ingestion perspective?
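    As a reference for what "offset lag" means here, a minimal sketch: per partition, lag is the latest offset in the topic minus the offset the consumer has reached. How the two offset maps are obtained is outside this sketch; `offset_lag` and the sample numbers are hypothetical, and treating a partition with no consumed offset as starting from 0 is an assumption.

```python
from typing import Dict


def offset_lag(latest: Dict[int, int], consumed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: latest available offset minus last consumed offset."""
    lag = {}
    for partition, latest_offset in latest.items():
        # Assumption: a partition with no consumed offset yet starts from 0.
        consumed_offset = consumed.get(partition, 0)
        # Clamp at 0 in case the two snapshots were taken at different times.
        lag[partition] = max(latest_offset - consumed_offset, 0)
    return lag


latest_offsets = {0: 1_500, 1: 900, 2: 40}
consumed_offsets = {0: 1_450, 1: 900}
lag_by_partition = offset_lag(latest_offsets, consumed_offsets)
total_lag = sum(lag_by_partition.values())
```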
  • v

    vishal

    10/28/2022, 6:11 AM
    Hi Team, we have created Realtime tables and pushed data into them. In the table section we can see the reported size and estimated size; what do those mean? Also, for some tables the size shows 0 Bytes even though data is already there. Can somebody help me with this? Attaching screenshots:
  • m

    Mathieu Alexandre

    10/28/2022, 2:01 PM
    Hello, what would you advise as a backup/restore procedure? The main goal is to reproduce the prod context in another env for debugging purposes.
  • m

    Mamlesh

    11/01/2022, 4:03 AM
    Hi All, can anyone explain how to monitor Pinot on a local system? The tutorial is given only for Kubernetes. I'm using this link "https://docs.pinot.apache.org/v/release-0.10.0/operators/operating-pinot/monitoring#jmx-to-prometheus" but I'm not able to fetch metrics on port 8080. Can anyone help me out? Thanks