# general

    Ryan Clark

    07/13/2021, 4:31 PM
    JDBC: I've added the pinot-client jar file to my DataGrip as a new Driver. It detects org.apache.pinot.client.PinotDriver. When I test the connection, I get this error. Any ideas why? Does Pinot integrate well with Tableau yet?
    Copy code
    Driver class 'org.apache.commons.lang3.tuple.Pair' not found.

    Xiang Fu

    07/14/2021, 8:19 AM
    can you try to cast it to long?

    sriramdas sivasai

    07/14/2021, 5:38 PM
    hello everyone, does anyone have any idea on this? I'm using the latest release version of Pinot (0.7.1). While doing the Spark batch ingestion, it throws this. Thanks
    Copy code
    Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/api/java/function/VoidFunction
    	at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
    	at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
    	at java.base/java.lang.Class.getConstructor0(Class.java:3342)
    	at java.base/java.lang.Class.getConstructor(Class.java:2151)
    	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:295)
    	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:264)
    	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:245)
    	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:135)

    sriramdas sivasai

    07/14/2021, 8:03 PM
    hello everyone, I see the queries are not returning any response if I add any UDFs in the query, specifically on time. Here is an example query:
    Copy code
    select SUM(total_run_time) from events where user_id = 'XXXXX' GROUP BY TIMECONVERT(time,'SECONDS','HOURS')
    here is my table config
    Copy code
    {
      "OFFLINE": {
        "tableName": "events_OFFLINE",
        "tableType": "OFFLINE",
        "segmentsConfig": {
          "timeType": "SECONDS",
          "timeColumnName": "time",
          "replication": "1"
        },
        "tenants": {
          "broker": "DefaultTenant",
          "server": "DefaultTenant"
        },
        "tableIndexConfig": {
          "autoGeneratedInvertedIndex": false,
          "createInvertedIndexDuringSegmentGeneration": false,
          "loadMode": "MMAP",
          "enableDefaultStarTree": true,
          "enableDynamicStarTreeCreation": false,
          "aggregateMetrics": true,
          "nullHandlingEnabled": false
        },
        "metadata": {},
        "ingestionConfig": {
          "batchIngestionConfig": {
            "segmentIngestionType": "APPEND",
            "segmentIngestionFrequency": "DAILY"
          }
        },
        "isDimTable": false
      }
    }
    I'm actually trying this out with a small number of records (0.5 million), and the table has 1 metric, 1 timestamp, and 5 dimensions. Please let me know whether any change needs to be made in the table config to make the queries run faster. Thanks
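For context on what the query computes: TIMECONVERT(time,'SECONDS','HOURS') buckets the epoch-seconds time column into whole hours, and the query then sums total_run_time per bucket. A minimal sketch of that aggregation in plain Python, over hypothetical rows (column names follow the query; the data is made up):

```python
from collections import defaultdict

def timeconvert_seconds_to_hours(epoch_seconds: int) -> int:
    # TIMECONVERT(time, 'SECONDS', 'HOURS') is integer division by 3600
    return epoch_seconds // 3600

# Hypothetical rows: (time in epoch seconds, total_run_time)
rows = [
    (7200, 10),   # hour bucket 2
    (7260, 5),    # hour bucket 2
    (10800, 7),   # hour bucket 3
]

sums = defaultdict(int)
for t, run_time in rows:
    sums[timeconvert_seconds_to_hours(t)] += run_time

print(sums[2], sums[3])  # 15 7
```

One thing to check: grouping on a transform of the time column generally cannot use pre-aggregated indexes such as a star-tree, so this bucketing happens at query time.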

    sriramdas sivasai

    07/15/2021, 12:10 AM
    hello everyone, I'm trying to run the Spark batch ingestion job with spark-submit. While running the command, it's not able to pick up the plugins and throws the error below.
    Copy code
    2021/07/15 00:07:42.306 ERROR [PluginManager] [main] Failed to load plugin [pinot-avro] from dir [/data_ssd/spark-retry/apache-pinot-incubating-0.7.1-bin/plugins/pinot-input-format/pinot-avro]
    java.lang.IllegalArgumentException: object is not an instance of declaring class
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.pinot.spi.plugin.PluginClassLoader.<init>(PluginClassLoader.java:50) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.createClassLoader(PluginManager.java:196) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.load(PluginManager.java:187) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:157) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:123) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.<init>(PluginManager.java:104) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.<clinit>(PluginManager.java:46) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.main(LaunchDataIngestionJobCommand.java:54) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) [spark-core_2.11-2.4.6.jar:2.4.6]
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) [spark-core_2.11-2.4.6.jar:2.4.6]
    2021/07/15 00:07:42.338 ERROR [PluginManager] [main] Failed to load plugin [pinot-batch-ingestion-spark] from dir [/data_ssd/spark-retry/apache-pinot-incubating-0.7.1-bin/plugins/pinot-batch-ingestion/pinot-batch-ingestion-spark]
    java.lang.IllegalArgumentException: object is not an instance of declaring class
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.pinot.spi.plugin.PluginClassLoader.<init>(PluginClassLoader.java:50) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.createClassLoader(PluginManager.java:196) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.load(PluginManager.java:187) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:157) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:123) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.<init>(PluginManager.java:104) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.spi.plugin.PluginManager.<clinit>(PluginManager.java:46) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.main(LaunchDataIngestionJobCommand.java:54) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) [spark-core_2.11-2.4.6.jar:2.4.6]
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929) [spark-core_2.11-2.4.6.jar:2.4.6]
    	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) [spark-core_2.11-2.4.6.jar:2.4.6]
    Does anyone face this issue?
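Both of the Spark stack traces above are classpath problems: when launching via spark-submit, the Pinot plugins directory has to be passed to the JVM explicitly. A sketch of the invocation shape from the batch-ingestion docs of this era (paths and the plugin list are placeholders to adapt; verify the flags against the docs for your exact Pinot version):

```shell
export PINOT_VERSION=0.7.1
export PINOT_DISTRIBUTION_DIR=/path/to/apache-pinot-incubating-${PINOT_VERSION}-bin

spark-submit \
  --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  --master "local[2]" \
  --conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins -Dplugins.include=pinot-batch-ingestion-spark,pinot-avro" \
  --conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" \
  ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar \
  -jobSpecFile /path/to/sparkIngestionJobSpec.yaml
```

The key parts are -Dplugins.dir (so the PluginManager can find the plugin directories) and putting the pinot-all jar on the driver classpath.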

    Sevvy Yusuf

    07/15/2021, 1:44 PM
    Hi everyone, I'm trying to use the controller API to create tenants and assign them brokers and servers, but I'm running into some issues. All of our broker and server instances are created with a "DefaultTenant" tag, and when I make a POST request to /tenants I end up with a 500 error with the message "Failed to allocate broker instances to Tag", due to not having enough untagged instances. Is there a way to create the instances without the "DefaultTenant" tag? I've tried manually changing the tag to "untagged" using the /instances endpoint per this page in the docs, but I'm still running into the same issue. It works OK if I just use the /instances endpoint to update the tag, but it feels like a hack doing it that way. Can someone advise on whether I'm missing a step and/or the best approach, please? Thanks

    Evan Galpin

    07/15/2021, 3:35 PM
    Would anyone be able to point me to either docs or code that would provide lower-level detail on the structure of a segment file and how to create one? Not how to use the admin tools to create a segment, but rather what the admin tool is doing to create a segment from, for example, an Avro input file. I'm curious about the Segment Metadata Push bulk ingestion strategy[1], which seems to imply writing segments to one of a few distributed file systems first, and then informing the controller about the segments and their associated metadata. I suppose I'm looking for the generic internals to create a segment from input data. Is `SegmentGenerationUtils.java`[2] the right starting place? Thanks! [1] https://docs.pinot.apache.org/basics/data-import/batch-ingestion#3-segment-metadata-push [2] https://github.com/apache/incubator-pinot/blob/master/pinot-common/src/main/java/o[…]che/pinot/common/segment/generation/SegmentGenerationUtils.java

    Ronie Paolo

    07/15/2021, 4:59 PM
    Hello! I have an instance on which I want to deploy Pinot, and 3 other instances where a ZooKeeper cluster is deployed. I would like to connect Pinot to this quorum (the 3 ZooKeeper servers). How can I set my 3 ZooKeeper URLs in the Pinot Controller configuration file (controller.zk.str)? Or am I going about this the wrong way? I would appreciate some orientation on using Pinot with my 3 ZooKeeper servers. Thanks!
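For reference, controller.zk.str accepts the standard ZooKeeper connect-string convention: a comma-separated list of host:port pairs, so all three servers fit in the one property. A sketch of the controller.conf line, with placeholder hostnames:

```properties
controller.zk.str=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
```

The same comma-separated form should work wherever a ZooKeeper address is expected, e.g. the -zkAddress argument when starting brokers and servers.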

    Evan Galpin

    07/15/2021, 7:22 PM
    Is there any performance implication associated with the number of segments that compose a given table?

    kelv

    07/16/2021, 3:54 AM
    Hi! I'd like to build a realtime dashboard on a webpage, with panels that show the last N messages, top values in a defined recent time period, etc. Updates should be reflected on the webpage within a second, ideally. My questions are: is such a use case suited for Pinot? Is there any intention to provide a long-poll query interface, so I can minimize the number of queries repeatedly polling Pinot?

    Pedro Silva

    07/16/2021, 10:54 AM
    Hello, does Pinot have support for ingesting Avro Kafka messages? Is it on the roadmap?
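For what it's worth, Pinot does ship Avro decoders for Kafka in the pinot-avro input-format plugins, including a Confluent Schema Registry variant. A sketch of the streamConfigs entries involved (the registry URL is a placeholder; verify the property names against the docs for your version):

```json
"stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder",
"stream.kafka.decoder.prop.schema.registry.rest.url": "http://schema-registry.example.com:8081"
```

For plain Avro payloads without a registry, there is also org.apache.pinot.plugin.inputformat.avro.SimpleAvroMessageDecoder.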

    Anusha

    07/16/2021, 3:38 PM
    Hello, in Pinot I have 2 tenants, tenant A and tenant B. I want to create the same table in both tenants. Is that possible?

    suraj kamath

    07/19/2021, 5:12 AM
    Hi, I am exploring the possibility of using Apache Spark to move the segments from a realtime table to an offline table. What job type can I use in the ingestion job spec to achieve this? Has anyone achieved this? If so, it would be helpful if you could point me to a doc/wiki.
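Besides a custom Spark job, recent Pinot versions include a minion-based RealtimeToOfflineSegmentsTask that periodically moves completed realtime segments into the paired offline table. A sketch of the realtime table-config fragment that enables it, assuming a minion is running (option names should be checked against the docs for your release):

```json
"task": {
  "taskTypeConfigsMap": {
    "RealtimeToOfflineSegmentsTask": {
      "bucketTimePeriod": "1d",
      "bufferTimePeriod": "1d"
    }
  }
}
```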

    Ananth Packkildurai

    07/19/2021, 2:34 PM
    I noticed an interesting comment from Uber's recent article on Pinot usage for its support system analytical infrastructure. The article was published three days back, but is this statement still true for Pinot?
    While Pinot is good at handling our SLAs, it comes with its own challenges. Pinot is an append-only database, which means users can only append records, rather than being able to update or delete existing records. This makes it difficult to compute even simple metrics, like the number of open orders by city. Query needs to identify the latest record for each order and count if the status is open.
    Pinot also has limited query capabilities. When we started working with Pinot it was lagging in its capability to support JOIN operations with other tables. This forced us to denormalize the data before insertion into the database. Denormalizing multi-value fields, such as tags or badges, will result in an explosion of records if the database does not support complex data types like arrays. Pinot’s limited capabilities for upsert, join, and complex data types made our data modeling challenging for certain metrics.

    Map

    07/19/2021, 9:18 PM
    Hi, I have several applications and I would like to watch for a metric they expose and send it over Kafka as a message. When I ingest the Kafka messages into Pinot, is there a way to aggregate them so that only the latest message sent by each application is kept? If not (which is to say we have to keep all the messages), is there a way to query Pinot to show only the latest message for each application?
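What this describes is last-value-per-key deduplication, which is what Pinot's upsert mode provides for realtime tables (in the releases where it is available). As a sketch of the intended semantics in plain Python, with hypothetical (app_id, timestamp, value) messages:

```python
# Keep only the latest message per application, ordered by timestamp.
messages = [
    ("app-a", 100, "v1"),
    ("app-b", 105, "v1"),
    ("app-a", 110, "v2"),  # supersedes app-a's earlier message
]

latest = {}
for app_id, ts, value in messages:
    if app_id not in latest or ts > latest[app_id][0]:
        latest[app_id] = (ts, value)

print(latest)  # {'app-a': (110, 'v2'), 'app-b': (105, 'v1')}
```

Without upsert, the query-side equivalent is a "latest record per key" lookup, which is what the Uber quote further down this page is describing as expensive.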

    Lakshmanan Velusamy

    07/20/2021, 6:20 AM
    Hi community, we have a table that records events emitted by an entity (timestamp, entity_id, status (OPEN/CLOSED)). Events are sparse, emitted only when there is a state change. We want to compute, at any given point in time, how many entities are open (and also track the trend: in a time range, plot the number of entities open). Are there any time series functions to help with this?
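The metric being asked for is a running count over sparse state-change events: +1 when an entity OPENs, -1 when it CLOSEs, evaluated at each event time. A minimal sketch of that computation in plain Python, over made-up events:

```python
# Sparse state-change events: (timestamp, entity_id, status)
events = [
    (1, "e1", "OPEN"),
    (2, "e2", "OPEN"),
    (3, "e1", "CLOSED"),
    (5, "e3", "OPEN"),
]

open_count = 0
trend = []  # (timestamp, number of open entities at that point)
for ts, _entity, status in sorted(events):
    open_count += 1 if status == "OPEN" else -1
    trend.append((ts, open_count))

print(trend)  # [(1, 1), (2, 2), (3, 1), (5, 2)]
```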

    Yupeng Fu

    07/20/2021, 4:01 PM
    hey, we just published this blog on geospatial support in Pinot. If you are interested, you can also tune in to the meetup talk above

    Ryan Clark

    07/20/2021, 7:54 PM
    🧵 Complex schema (un-nesting json) not showing up in table

    suraj kamath

    07/20/2021, 9:28 PM
    Hi all, I have written down my (modest) understanding of Apache Pinot tables and segments and tried to put it in simple and fun terms. I would love it if you folks could check it out and help me build many such articles around Pinot: https://medium.com/@surajkmth29/apache-pinot-tables-and-segments-a72dc5854876 PS: If there are any comments/suggestions on the details of the blog, please drop a comment so that we can make it better and more accessible to the Pinot community.

    Abhijeet Kushe

    07/21/2021, 2:17 PM
    I am interested in getting the latest updates on kinesis-integration. This issue https://github.com/apache/incubator-pinot/issues/5648 mentions joining #kinesis-integration, but I don't see the channel here in Slack. Can someone point me to the right place to get more details?

    Neil Teng

    07/21/2021, 4:14 PM
    Hi, I am interested in how the system time is synced across nodes. I pass a Presto query like `date > now() - interval '30' minute` to Pinot. How much can I be sure about the now() function? Is it translated to an exact time in Presto and then passed to Pinot? And how much difference can it have across different Pinot nodes?
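For what it's worth: in Presto, now() is fixed per query, so the coordinator should fold now() - interval '30' minute into a single literal before the predicate is pushed to Pinot; the remaining skew is between the coordinator's clock and the data's timestamps, not between Pinot nodes. A small Python sketch of that constant-folding, assuming an epoch-milliseconds time column named "date":

```python
def rewrite_predicate(now_ms: int, minutes: int = 30) -> str:
    # Presto folds now() - interval '30' minute into one literal at planning
    # time; this mimics that fold for an epoch-milliseconds column.
    threshold_ms = now_ms - minutes * 60 * 1000
    return f"date > {threshold_ms}"

# With a fixed "query start" instant, every worker sees the same literal:
print(rewrite_predicate(1_626_883_200_000))  # date > 1626881400000
```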

    Maitraiyee Gautam

    07/21/2021, 4:34 PM
    @User and I are facing an issue with select * on Pinot: select * is not reflecting all the columns of the table, but when we select individual column names, they are returned correctly. Has anyone else faced any such problems?

    Mark Needham

    07/21/2021, 9:14 PM
    I wrote a blog post showing how to analyse GitHub events using Pinot + Streamlit - https://markhneedham.medium.com/analysing-github-events-with-apache-pinot-and-streamlit-2ed555e9fb78 piggybacking on the work of @User and @User!

    Karin Wolok

    07/22/2021, 3:04 PM
    Don't miss today's meetup! Presented by @User! 🙂 https://www.meetup.com/apache-pinot/events/277818762/

    Ryan Clark

    07/22/2021, 6:55 PM
    Can Pinot be hosted in one AWS account and read a stream from another account? Perhaps a way to give the table config an account number.

    Map

    07/22/2021, 10:01 PM
    Hi, I know we can flush a segment based on size, number of rows, or time since creation. I wonder if there is a way to only trigger a flush at a certain time of day, say midnight? I am asking because I notice it can take minutes to flush a segment, during which Pinot stops consuming new messages, and hence there would be a delay of minutes. This may throw the users off. We might be doing it totally wrong, and any suggestions would be appreciated!
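As far as I know there is no wall-clock (cron-style) flush trigger; the time threshold counts from when the consuming segment was created, so a 24h threshold only approximates a fixed time of day if consumption started at that time. A sketch of the stream-level settings involved (property names are from the 0.7.x docs; verify against your version):

```properties
"realtime.segment.flush.threshold.time": "24h",
"realtime.segment.flush.threshold.size": "0",
"realtime.segment.flush.desired.size": "200M"
```

If segment builds pause consumption for minutes, the usual levers are smaller segments or more stream partitions rather than scheduling the flush.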

    Trust Okoroego

    07/23/2021, 2:13 AM
    Hi, I get a blank screen when I open a table in the Pinot query console. Pinot version 0.7.1. I guess it's something with the UI. Anyone noticed this?

    Ryan Clark

    07/23/2021, 5:02 PM
    I'm trying to implement S3 deep storage with a controller.conf, but I believe the controller is not reaching ZK. I'm providing the zookeeper.zk.str by giving it the pinot-zookeeper endpoint. 🧵

    Trust Okoroego

    07/24/2021, 11:01 AM
    Hi group, I am trying to do a join of two realtime tables, but I get an error that my segment is empty: `presto error: null value in entry: Server_172.23.0.5_8098=null.` When I check the realtime table, I don't have segments already created, but when I query the same table without a join it returns a result.

    prateek nigam

    07/26/2021, 12:00 PM
    Data encryption at rest in Apache Pinot: when using HDFS as the deep store, does Apache Pinot support that?