# general

    vishal

    12/20/2022, 8:49 AM
Hi, I am trying to compile the Pinot code locally, but some dependency is missing in the pom file. Can somebody help with that?

    Ashish Kumar

    12/21/2022, 4:17 AM
Hi Team, I am trying to ingest data into a DIMENSION table with composite keys using the Pinot Spark batch-ingestion jar, but the ingestion seems to be incomplete, i.e. there are rows in the source that are not available in the Pinot DIMENSION table.

    vishal

    12/21/2022, 10:03 AM
Hi Team, I am trying to write a UDF (scalar function). The steps I've followed are as follows: I created a Java project with the package name org.apache.pinot.scalar.ScalarFunc. Pom file:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>org.apache.pinot.scalar.ScalarFunc</groupId>
        <artifactId>ScalarFunc</artifactId>
        <version>1.0-SNAPSHOT</version>
    
        <properties>
            <maven.compiler.source>1.8</maven.compiler.source>
            <maven.compiler.target>1.8</maven.compiler.target>
            <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        </properties>
    
        <dependencies>
            <dependency>
                <groupId>org.apache.pinot</groupId>
                <artifactId>pinot-common</artifactId>
                <version>0.11.0</version>
            </dependency>
        </dependencies>
    
    
</project>
```
Main.java:
```java
    package org.apache.pinot.scalar.ScalarFunc;
    
    import org.apache.pinot.spi.annotations.ScalarFunction;
    
    public class Main {
        public static void main(String[] args) {
    //        System.out.println("Hello world!");
        }
    
        @ScalarFunction
        static String getdata(String ref){
            return "hurray testing is working";
        }
    }
```
I created the jar file using `mvn clean install`, moved it to `pinot/plugins`, and then reinstalled Pinot. I tried to run the query `select getdata(AirTime) from airlineStats limit 10`, where AirTime is a column name, but it returns the error below:
```json
    [
      {
        "message": "QueryExecutionError:\norg.apache.pinot.spi.exception.BadQueryRequestException: Unsupported function: getdata not found\n\tat org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:304)\n\tat org.apache.pinot.core.operator.transform.TransformOperator.<init>(TransformOperator.java:65)\n\tat org.apache.pinot.core.plan.TransformPlanNode.run(TransformPlanNode.java:71)\n\tat org.apache.pinot.core.plan.SelectionPlanNode.run(SelectionPlanNode.java:71)",
        "errorCode": 200
      }
    ]
```
Am I doing anything wrong? Reference: https://docs.pinot.apache.org/users/user-guide-query/scalar-functions#scalar-functions
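For anyone hitting the same error: per the scalar-functions docs linked above, the annotated method must be public static, whereas `getdata` above is package-private, which alone can keep the function registry from finding it. A minimal sketch of the documented shape (same package and names as the snippet above; exact registration behavior is version-dependent, so treat details beyond the visibility fix as assumptions):
```java
package org.apache.pinot.scalar.ScalarFunc;

import org.apache.pinot.spi.annotations.ScalarFunction;

public class ScalarFunctions {
  // The method must be public static for Pinot's function registry
  // to discover it via the @ScalarFunction annotation scan.
  @ScalarFunction
  public static String getdata(String ref) {
    return "hurray testing is working";
  }
}
```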

    Ashish Kumar

    12/22/2022, 3:52 AM
Hi Team, is there any API to delete all the segments whose names start with a given prefix for a Pinot table?
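I'm not aware of a single prefix-based delete endpoint, but this can be scripted against the controller's segment APIs; a sketch, assuming the standard `GET /segments/{tableName}` and `DELETE /segments/{tableName}/{segmentName}` controller endpoints (controller address, table, and prefix are placeholders):
```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DeleteSegmentsByPrefix {
  public static void main(String[] args) throws Exception {
    String controller = "http://localhost:9000"; // placeholder controller address
    String table = "myTable_OFFLINE";
    String prefix = "myTable_2022-12";
    HttpClient client = HttpClient.newHttpClient();

    // List the table's segments. A real implementation should parse the
    // JSON response properly; splitting on quotes is a crude stand-in.
    HttpResponse<String> list = client.send(
        HttpRequest.newBuilder(URI.create(controller + "/segments/" + table)).GET().build(),
        HttpResponse.BodyHandlers.ofString());
    for (String token : list.body().split("\"")) {
      if (token.startsWith(prefix)) {
        // Delete each segment whose name matches the prefix.
        client.send(
            HttpRequest.newBuilder(URI.create(controller + "/segments/" + table + "/" + token))
                .DELETE().build(),
            HttpResponse.BodyHandlers.ofString());
        System.out.println("Deleted " + token);
      }
    }
  }
}
```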

    Ralph Debusmann

    12/22/2022, 11:00 AM
Hi, is it possible to post multiple SQL queries in a single request, to reduce the number of REST calls to Pinot and the network traffic?
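As far as I know, the broker's `/query/sql` endpoint takes one statement per request, so the usual workaround is client-side concurrency rather than batching; a sketch with Java's HttpClient (broker address and queries are placeholders):
```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

public class ParallelQueries {
  public static void main(String[] args) {
    String broker = "http://localhost:8099/query/sql"; // placeholder broker address
    List<String> bodies = List.of(
        "{\"sql\": \"SELECT COUNT(*) FROM airlineStats\"}",
        "{\"sql\": \"SELECT MAX(AirTime) FROM airlineStats\"}");
    HttpClient client = HttpClient.newHttpClient();

    // Fire all queries concurrently (connections may be reused),
    // then block until every response has arrived.
    List<CompletableFuture<HttpResponse<String>>> futures = bodies.stream()
        .map(body -> HttpRequest.newBuilder(URI.create(broker))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build())
        .map(req -> client.sendAsync(req, HttpResponse.BodyHandlers.ofString()))
        .collect(Collectors.toList());
    futures.forEach(f -> System.out.println(f.join().body()));
  }
}
```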

    vishal

    12/22/2022, 11:24 AM
Hi Team, I have an offline table as below:
```
    id, name, position, timestamp
    1,vishal,SDE1,
    2,Raj, SDE2
    1,vishal, SDE2
    1,vishal,NULL
```
Here I am trying to fetch data for id=1, so the SQL query would be `SELECT position FROM table WHERE id=1`, which returns 3 records. But I want only one record, namely `1,vishal,SDE2`. Can we achieve this through a scalar function? Say we pass the id as an argument, fetch all the rows for that id inside the function, and write logic that merges those rows and returns a single string which is a concatenation of all the columns. For example:
```
    SELECT data(1) FROM table;
    
    def of data func(id) {
    select all the data which has id=1 and merge those record and return one single string
    }
```
Is this possible? Can we query inside the function, or do we need to call an API to query data inside the function?
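A scalar function runs once per row and cannot issue its own queries, so this kind of row-merging is usually pushed into the query as an aggregation instead; a sketch with the Pinot Java client, assuming the timestamp column is populated and that the LASTWITHTIME aggregation is available in your Pinot version (host, table, and column names are placeholders):
```java
import org.apache.pinot.client.Connection;
import org.apache.pinot.client.ConnectionFactory;
import org.apache.pinot.client.ResultSetGroup;

public class LatestPositionExample {
  public static void main(String[] args) {
    // Placeholder broker host:port.
    Connection conn = ConnectionFactory.fromHostList("localhost:8099");
    // Collapse the rows for id=1 into one: the position with the
    // greatest timestamp wins.
    ResultSetGroup result = conn.execute(
        "SELECT id, name, LASTWITHTIME(position, \"timestamp\", 'STRING') AS position "
            + "FROM mytable WHERE id = 1 GROUP BY id, name");
    System.out.println(result.getResultSet(0).getString(0, 2));
  }
}
```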

    Shakti Singh

    12/22/2022, 3:41 PM
    👋 Hi everyone!
    👋 7

    Rostan TABET

    12/23/2022, 1:29 PM
    Hello

    Rostan TABET

    12/23/2022, 1:31 PM
For Bloom filters, the documentation states: "Bloom filter helps prune segments that do not contain any record matching an EQUALITY predicate." I wonder if that also covers IN predicates. For example, can a Bloom filter be useful for a predicate of the form `fruit IN ('banana', 'apple', 'orange')`, or do I need to rewrite it as `fruit = 'banana' OR fruit = 'apple' OR fruit = 'orange'`?

    Dileep Kancharla

    12/23/2022, 3:00 PM
Hello Team, does Apache Pinot support materialized views?

    Rohit Yadav

    12/26/2022, 7:20 AM
Hi All, is there a possibility to soft-delete Pinot tables and have instant recovery if needed? Or can I disable a table for querying only, while still allowing ingestion to happen? The requirement arises from table-management use cases where I do not want a prod table to be deleted immediately, but to have a grace period for recovery. I came across an API to disable a Pinot table that stops both querying and ingestion, but recovery can be tricky in the case of real-time tables (stream data gets evicted, and recovery is not instant on a high-QPS stream).

    Amos Bird

    12/27/2022, 7:10 AM
Hello! Since Pinot and Druid are very similar in many aspects, I wonder whether both projects come from the same original project, or maybe one is a fork of the other?

    Amos Bird

    12/28/2022, 2:45 PM
Hi! I find the concept of the raw value forward index very confusing. Is it just columnar storage of the original data? I don't see why it's called an index.

    Amos Bird

    12/28/2022, 2:58 PM
It's also interesting to see that colB is reordered before dictionary encoding in the "Dictionary-encoded forward index with bit compression (default)" example.

    Timothy Spann

    12/29/2022, 12:31 AM
My wrap-up includes some videos/articles on Pinot: https://medium.com/@tspann/2022-wrap-up-for-streaming-247cd21fd483
    👍 5

    Abhishek Dubey

    01/02/2023, 10:45 AM
Hi Team, can we add columns to the pre-aggregation list (or star-tree index) at any point after data ingestion, without impacting ingestion? I understand it'll need the table configuration to be modified, but do we need to stop ingestion?

    Shankar Uprety

    01/04/2023, 1:17 AM
    👋 Hi everyone!
    👋🏽 1
    👋 1

    Shreyans Bhavsar

    01/04/2023, 11:08 AM
Hello! Can we do an upsert into an offline table using the IngestFromFile API?

    Harshit

    01/04/2023, 11:23 AM
Hello, I am getting the following error while ingesting data via Flink:
Could not find index for column: gKey, type: FORWARD_INDEX, segment: /tmp/data/pinotServerData/key1_OFFLINE/key1_3_
Schema:
```json
    {
      "schemaName": "key",
      "dimensionFieldSpecs": [
        {
          "name": "rootKey",
          "dataType": "STRING"
        },
        {
          "name": "gKey",
          "dataType": "STRING"
        }
      ],
      "primaryKeyColumns": [
        "gKey"
      ]
    }
```
Table config:
```json
    {
      "tableName": "key",
      "tableType": "OFFLINE",
      "isDimTable": true,
      "segmentsConfig": {
        "schemaName": "key",
        "segmentPushType": "REFRESH",
        "replication": "1"
      },
      "tenants": {},
      "tableIndexConfig": {
        "loadMode": "MMAP"
      },
      "metadata": {
        "customConfigs": {}
      },
      "quota": {
        "storage": "200M"
      }
}
```

    Shreeram Goyal

    01/05/2023, 9:13 AM
Hi, I figured out that when applying transformation functions, the destination column name has to differ from the source column name, which in our use case would cause a lot of issues, as we would have to change all our queries. Is there a workaround that lets us transform a particular column while keeping the column name the same?

    Pratik Tibrewal

    01/06/2023, 8:39 AM
Does Pinot support addition or subtraction in date functions, something like:
```sql
date_trunc('week', date_parse(datestr, '%y-%m-%d')) + interval '3' day
```
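Not an authoritative answer, but when the datetime expression resolves to epoch milliseconds, plain arithmetic is a common substitute for INTERVAL syntax; a small sketch that just builds such a query string (the DATETRUNC usage and the tsMillis column are assumptions to adapt):
```java
public class DateArithmeticSketch {
  public static void main(String[] args) {
    // Instead of "+ interval '3' day", add the equivalent millisecond offset
    // to an expression that evaluates to epoch millis.
    long threeDaysMs = 3L * 24 * 60 * 60 * 1000;
    String sql = "SELECT DATETRUNC('week', tsMillis) + " + threeDaysMs
        + " AS shifted FROM mytable LIMIT 10";
    System.out.println(sql);
  }
}
```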

    Tim Berglund

    01/06/2023, 5:16 PM
    I should like to remind all three to four thousand of you that you still have all weekend to get your abstract in for the Real-Time Analytics Summit.
    ✅ 1

    Tim Berglund

    01/06/2023, 5:16 PM
    This isn’t a Pinot show as such, but given the topic, there ought to be a lot of Pinot content there. Send us your proposals! The doors close at midnight PST on Sunday. 💥
    🍷 2

    Unmesh Vijay Kadam

    01/09/2023, 8:05 AM
Can a Node.js application be connected to Pinot? If so, what packages can be used in a Node.js application to connect to Pinot?

    Prashant Korade

    01/09/2023, 6:11 PM
Hi Team, is there a way to get the ingestion timestamp of a record in Pinot when consuming records from Kafka? Thanks!

    Tim Berglund

    01/09/2023, 9:43 PM
    I know I just said a few days ago that the Real-Time Analytics Summit CFP was closing, but we’ve decided to extend it by a week. Many of you got your proposals in, but in the end I decided this was all just too close to the holidays to cut it off now. You’ve got till Jan 16. Submit your talk, if you haven’t already! https://sessionize.com/real-time-analytics-summit-2023

    Rostan TABET

    01/10/2023, 9:18 AM
Hi Pinot team, I have a question about the implementation of the Python client. The method `Cursor.fetchall` has the following docstring:
    Fetch all (remaining) rows of a query result, returning them as a
    sequence of sequences (e.g. a list of tuples). Note that the cursor's
    arraysize attribute can affect the performance of this operation.
However, the method's implementation is simply:
```python
return list(self)
```
which basically creates a list by calling `fetchone`, i.e. `self._results.pop(0)`, once for each element of the list `self._results`. I wonder if there is a reason for this, instead of something like:
```python
    res = self._results
    self._results = []
    return res
```
My main concern is possible performance issues when the query result contains many rows.

    Sachin Mittal Consultant

    01/10/2023, 5:08 PM
Hello folks, I am evaluating Pinot as a real-time data store and am consuming data from AWS Kinesis. I have figured out the config and the custom data transformations; now I just want to rerun everything from scratch. I am on macOS and built Pinot from source. Where does Pinot store all the data and configs we create through the admin? How can I delete that and start fresh by restarting all services from scratch?

    Sachin Mittal Consultant

    01/10/2023, 8:08 PM
Folks, I am trying to ingest from a Kinesis stream and I am getting this error: `Caught exception while decoding row`. These mostly seem to come from my transformation functions. I have used some built-in functions, and for those the stack trace indicated what the problems could be, and I have fixed them. I also needed some Groovy scripts; I executed those scripts standalone to check that they work, and they do. However, rows are still failing to decode and I am now unable to figure out where the problem is. The stack trace I get is something like:
```
    org.apache.pinot.shaded.com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 start byte 0x89
     at [Source: (ByteArrayInputStream); line: 1, column: 3]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:710) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidInitial(UTF8StreamJsonParser.java:3607) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeCharForError(UTF8StreamJsonParser.java:3350) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3582) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2688) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:870) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:762) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.databind.ObjectReader._bindAsTree(ObjectReader.java:2058) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.databind.ObjectReader._bindAndCloseAsTree(ObjectReader.java:2044) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.shaded.com.fasterxml.jackson.databind.ObjectReader.readTree(ObjectReader.java:1739) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.spi.utils.JsonUtils.bytesToJsonNode(JsonUtils.java:211) ~[pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder.decode(JSONMessageDecoder.java:61) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder.decode(JSONMessageDecoder.java:73) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder.decode(JSONMessageDecoder.java:37) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.spi.stream.StreamDataDecoderImpl.decode(StreamDataDecoderImpl.java:47) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:549) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:434) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
    	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:629) [pinot-all-0.13.0-SNAPSHOT-jar-with-dependencies.jar:0.13.0-SNAPSHOT-782c3c2df59d2b173ba9ef595aeabd27cb00a332]
	at java.lang.Thread.run(Thread.java:832) [?:?]
```
Is there a better way to debug this?
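One observation that may help: 0x89 is not a valid first byte of UTF-8 text (it is, for example, the first byte of the PNG magic number), so the payload reaching the JSON decoder is most likely binary (compressed, aggregated, or in a non-JSON format) rather than merely malformed JSON. Hex-dumping the first raw bytes of a record usually settles it; a minimal, self-contained sketch:
```java
public class PayloadProbe {
  public static void main(String[] args) {
    // Stand-in for the raw bytes pulled from the stream; replace with a
    // real record payload when debugging.
    byte[] payload = new byte[] {(byte) 0x89, 0x50, 0x4E, 0x47};
    StringBuilder hex = new StringBuilder();
    for (int i = 0; i < Math.min(payload.length, 16); i++) {
      hex.append(String.format("%02X ", payload[i]));
    }
    // Valid UTF-8 JSON starts with '{' (0x7B) or '[' (0x5B), possibly after whitespace.
    System.out.println("First bytes: " + hex.toString().trim());
  }
}
```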