# getting-started
  • c

    Carlos

    01/17/2023, 4:19 PM
    Thanks in advance!
  • s

    Seunghyun

    01/17/2023, 8:59 PM
    @Carlos Yes, ingesting realtime + offline data into Pinot and using it as the backend storage for Superset is a common usage pattern for Pinot. We have documentation on the Superset-Pinot integration: https://docs.pinot.apache.org/integrations/superset
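    For reference, Superset talks to Pinot through a SQLAlchemy URI provided by the pinot-dbapi driver; a minimal sketch, assuming a broker on port 8099 and a controller on port 9000 (host names are placeholders):
    Copy code
    pinot://pinot-broker:8099/query/sql?controller=http://pinot-controller:9000/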
  • c

    Carlos

    01/18/2023, 7:35 AM
    Hi @Seunghyun, thanks for your response. I have successfully connected Superset to Pinot. My problem is knowing whether it is possible to dedupe data in offline tables.
  • g

    GerardJ

    01/23/2023, 7:48 PM
    I want to ingest data from an AVRO-encoded Pulsar topic into a Pinot REALTIME table. Is that currently possible? If so, how can I do it?
  • t

    Trust Okoroego

    01/25/2023, 11:22 PM
    Hi, how can I pass basic auth credentials to an INSERT INTO statement? My Pinot controller has basic auth enabled. I tried the statement below and got an error:
    Copy code
    INSERT INTO "orders" FROM FILE '<s3://xxxxxx/>'
    OPTION(
      taskName=myTask-s3,
      input.fs.className=org.apache.pinot.plugin.filesystem.S3PinotFS,
      input.fs.prop.accessKey=xxxxxxx,
      input.fs.prop.secretKey=xxxxxxxxx,
      authToken='Basic xxxxxx=='
      input.fs.prop.region=us-east-1
    )
    I got the below error message:
    Copy code
    ProcessingException(errorCode:450, message:InternalError:
    org.apache.pinot.sql.parsers.SqlCompilationException: Caught exception while parsing query: INSERT INTO "orders"
    FROM FILE '<s3://xxxxxxxx>'
    	at org.apache.pinot.sql.parsers.CalciteSqlParser.compileToSqlNodeAndOptions(CalciteSqlParser.java:136)
    	at org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:135)
    	at org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:103)
    ...
    Caused by: org.apache.pinot.sql.parsers.SqlCompilationException: OPTION statement requires two parts separated by '='
    	at org.apache.pinot.sql.parsers.CalciteSqlParser.extractOptionsMap(CalciteSqlParser.java:486)
    	at org.apache.pinot.sql.parsers.CalciteSqlParser.compileToSqlNodeAndOptions(CalciteSqlParser.java:131)
    	... 27 more)
    Removing the authToken line, I get the below error:
    Copy code
    [
      {
        "message": "QueryExecutionError:\norg.apache.commons.httpclient.HttpException: Unable to get tasks states map. Error code 400, Error message: {\"code\":400,\"error\":\"No task is generated for table: orders, with task type: SegmentGenerationAndPushTask\"}\n\tat org.apache.pinot.common.minion.MinionClient.executeTask(MinionClient.java:123)\n\tat org.apache.pinot.core.query.executor.sql.SqlQueryExecutor.executeDMLStatement(SqlQueryExecutor.java:102)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:145)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:103)",
        "errorCode": 200
      }
    ]
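    The first stack trace ends in extractOptionsMap complaining that an option does not split into exactly two parts around '=', which points at the OPTION list itself: there is no comma after the authToken entry, and the '==' padding inside the quoted token can trip the same check. A sketch with the comma added (all values are placeholders, and whether the controller honors authToken here still depends on the auth setup):
    Copy code
    INSERT INTO "orders" FROM FILE 's3://xxxxxx/'
    OPTION(
      taskName=myTask-s3,
      input.fs.className=org.apache.pinot.plugin.filesystem.S3PinotFS,
      input.fs.prop.accessKey=xxxxxxx,
      input.fs.prop.secretKey=xxxxxxxxx,
      authToken='Basic xxxxxx==',
      input.fs.prop.region=us-east-1
    )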
  • r

    Rahul Patwari

    01/28/2023, 1:48 PM
    Hello, I am a newbie to Pinot! Referencing this doc: https://docs.pinot.apache.org/users/tutorials/schema-evolution
    Pinot only allows adding new columns to the schema. In order to drop a column, change the column name or data type, a new table has to be created.
    Is this applicable only to realtime tables, or to offline tables as well? Are there any plans to support renaming a column? Is there a reason why renaming is not supported?
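    For context, the add-column flow from that tutorial boils down to updating the schema in place and then reloading segments; a rough sketch against a local controller (schema, table and file names are placeholders):
    Copy code
    # upload the schema JSON with the new column added
    curl -X PUT -H "Content-Type: application/json" \
      -d @my_schema_with_new_column.json \
      "http://localhost:9000/schemas/mySchema"
    # reload segments so existing data gets default values for the new column
    curl -X POST "http://localhost:9000/segments/myTable/reload"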
  • k

    K.K. GAYAN SANJEEWA

    01/30/2023, 4:47 AM
    Hello all, I need some help from a theoretical perspective. I need to know where Pinot stores its data: my understanding is that Pinot keeps very recent data inside itself and after some time pushes it back somewhere else. What I need to know is: what is this other store where the old data remains? Can I use PostgreSQL for that?
  • d

    Dimuth

    02/01/2023, 10:32 AM
    Hi All,
  • d

    Dimuth

    02/01/2023, 10:35 AM
    I'm new to Pinot and am trying to use the Pinot connector with Trino. I ran a Pinot quickstart and connected it with Trino, then tried to run a simple SELECT query, but it gives errors. Are there any limitations here, or what might be going wrong?
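    For reference, the Trino side only needs a Pinot catalog file pointing at the controller; a minimal sketch, assuming the quickstart controller runs on localhost:9000:
    Copy code
    # etc/catalog/pinot.properties
    connector.name=pinot
    pinot.controller-urls=localhost:9000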
  • a

    Amol

    02/07/2023, 4:32 AM
    Does anyone know a quick (scripted) way to convert MySQL DDL into a Pinot schema.json?
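    Pinot itself does not ship such a converter as far as this thread shows, but the idea is small enough to script; a rough, hypothetical Python sketch that only handles simple one-column-per-line CREATE TABLE statements and drops every column into dimensionFieldSpecs:
    Copy code
    import json, re, sys

    # Rough mapping from common MySQL column types to Pinot data types; extend as needed.
    # Time columns likely belong in dateTimeFieldSpecs rather than dimensionFieldSpecs.
    TYPE_MAP = {
        "tinyint": "INT", "smallint": "INT", "int": "INT", "integer": "INT",
        "bigint": "LONG", "float": "FLOAT", "double": "DOUBLE", "decimal": "DOUBLE",
        "varchar": "STRING", "char": "STRING", "text": "STRING", "json": "JSON",
        "datetime": "TIMESTAMP", "timestamp": "TIMESTAMP", "date": "STRING",
        "bool": "BOOLEAN", "boolean": "BOOLEAN",
    }
    SKIP = {"create", "primary", "unique", "key", "constraint", "index", "foreign"}

    def ddl_to_schema(ddl: str, schema_name: str) -> dict:
        """Very naive parser: expects one column definition per line of the DDL."""
        dims = []
        for line in ddl.splitlines():
            m = re.match(r"\s*`?(\w+)`?\s+(\w+)", line)
            if not m or m.group(1).lower() in SKIP:
                continue
            name, mysql_type = m.group(1), m.group(2).lower()
            dims.append({"name": name, "dataType": TYPE_MAP.get(mysql_type, "STRING")})
        return {"schemaName": schema_name, "dimensionFieldSpecs": dims}

    if __name__ == "__main__":
        # usage: python ddl_to_schema.py my_table < create_table.sql > my_table_schema.json
        print(json.dumps(ddl_to_schema(sys.stdin.read(), sys.argv[1]), indent=2))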
  • p

    piby

    02/09/2023, 1:10 PM
    Hi, I am currently testing Pinot and am not able to create realtime and offline tables with the same name. I want to ingest data from Kafka into the realtime table and let Pinot handle creating and offloading segments to the offline table. Here is the error I get:
    2023/02/09 13:00:24.206 INFO [AddTableCommand] [main] Executing command: AddTable -tableConfigFile null -offlineTableConfigFile /var/pinot/data/epoch_table_offline_table_config.json -realtimeTableConfigFile null -schemaFile /var/pinot/data/epoch_table_schema.json -controllerProtocol http -controllerHost pinot-controller -controllerPort 9000 -user null -password [hidden] -exec
    2023/02/09 13:00:24.514 INFO [AddTableCommand] [main] {"code":400,"error":"TableConfigs: epoch_table already exists. Use PUT to update existing config"}
    realtime-table.yaml
  • m

    Mark Needham

    02/09/2023, 3:01 PM
    You need to call AddTable with the -update flag
  • m

    Mark Needham

    02/09/2023, 3:01 PM
    when you add the second table
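    Based on the flags already visible in the AddTable log above, the second call would look roughly like this (the realtime config path is a placeholder):
    Copy code
    bin/pinot-admin.sh AddTable \
      -realtimeTableConfigFile /var/pinot/data/epoch_table_realtime_table_config.json \
      -schemaFile /var/pinot/data/epoch_table_schema.json \
      -controllerHost pinot-controller -controllerPort 9000 \
      -update -exec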
  • j

    Julius Norinder

    02/10/2023, 3:20 PM
    Hi and Happy Friday to y'all! I have a question about Pinot and Zookeeper. We already have a Zookeeper (and Kafka) deployed in production. Do we need a second Zookeeper to handle Pinot separately or can we point Pinot to the existing one? Perhaps the answer is obvious, since the name Zookeeper would imply having more than one animal to attend to! 😄
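    For reference, Pinot components only need a Zookeeper address plus a cluster name, so sharing the existing ensemble is mostly a matter of these flags (hosts and cluster name below are placeholders); a distinct cluster name keeps Pinot's znodes apart from Kafka's:
    Copy code
    bin/pinot-admin.sh StartController -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster -controllerPort 9000
    bin/pinot-admin.sh StartBroker     -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster
    bin/pinot-admin.sh StartServer     -zkAddress zk1:2181,zk2:2181,zk3:2181 -clusterName PinotCluster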
  • a

    Ashwin Raja

    02/11/2023, 2:41 AM
    Is there a way to use minion-based ingestion to replace single segments? What I'm trying to do is:
    • Let's say a segment (my-table_0_10_0) points at a parquet file in S3: <s3://my-bucket/data/part-0000-abc.parquet>
    • I'd like to replace that segment; I have REST API access to everything, but can't exec into my controller or anything like that.
    • My table has, let's say, 100 segments.
    • If I just replace that file and rerun minion-based ingestion via my table's SegmentGenerationAndPush task, it's going to kick off 100 tasks, which I don't really want to do, since that'll take a while and I just want my one segment.
    So is there a way to kick off only the subtask for that single file/segment?
  • d

    Dhar Rawal

    02/13/2023, 8:36 PM
    Any idea how to configure the pinot-controller Kubernetes manifest and/or the pinot-controller.conf file for connecting to Zookeeper when Zookeeper requires SASL authentication? The Pinot documentation talks about setting "controller.zk.str", but says nothing about how to set the Zookeeper auth scheme to SASL or how to specify the Zookeeper user ID and password.
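    Not Pinot-specific, but the standard ZooKeeper client route is a JAAS file plus JVM flags, which on Kubernetes would be mounted into the pod and passed via the controller's JVM options; the login module and credentials below are assumptions for a digest-based SASL setup:
    Copy code
    # jaas.conf mounted into the controller pod
    Client {
      org.apache.zookeeper.server.auth.DigestLoginModule required
      username="pinot"
      password="change-me";
    };

    # added to the controller JVM options
    -Djava.security.auth.login.config=/opt/pinot/conf/jaas.conf -Dzookeeper.sasl.client=true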
  • i

    Irtisaam

    02/14/2023, 1:36 PM
    Hey everyone! I'm a beginner and using Pinot for the first time! Can anyone help me with how to run Pinot with Kafka on docker-compose? Just a basic tutorial would be very helpful!
  • l

    Lewis Yobs

    02/14/2023, 1:46 PM
    We have a 'Getting Started' document on that topic at:
    <https://dev.startree.ai/docs/pinot/getting-started/ingest-streaming-data-source>
  • i

    Irtisaam

    02/16/2023, 8:48 AM
    Hey everyone! Hope you are doing well! Could you kindly recommend an article on the workflow and architecture of Apache Pinot?
  • d

    Dhiwakar N A

    02/21/2023, 3:27 PM
    Newbie question 🙂 : How does using Pinot compare with, say, using a "Snowflake warehouse with connectors" for real-time analytics? One obvious difference is that Pinot is open source. Are there any other advantages (like operational simplicity, cost, etc.)?
  • s

    Shreeram Goyal

    03/02/2023, 8:27 AM
    Hi, I was wondering if changing the config realtime.segment.flush.threshold.rows from 100000 to 10000 and then refreshing segments would also refresh the already committed segments, i.e. would those segments now have 10000 entries? Please let me know.
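    For context, this property sits in the realtime table's streamConfigs and applies when new consuming segments are committed; a sketch of where it lives (the other required stream properties are omitted):
    Copy code
    "streamConfigs": {
      "realtime.segment.flush.threshold.rows": "10000"
    }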
  • p

    piby

    03/02/2023, 5:47 PM
    Hi, we are in the process of setting up ETL pipelines. The processed data will be stored on S3 in Parquet format, using a datetime column for partitioning. The idea is to connect a Pinot offline table with S3 and let minions handle SegmentGenerationAndPush tasks. Can Pinot handle Parquet files with snappy/lz4 compression? What about dictionary encoding? I do not see any documentation on how to add custom config for ParquetRecordReader here: https://docs.pinot.apache.org/basics/data-import/pinot-input-formats. Could someone point me to the right place to read more about it? Or any tips in general on how to efficiently store data on S3 using the Parquet format? Thanks!
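    A rough sketch of the offline table config pieces involved, loosely following the batch ingestion docs; the bucket path, file pattern and schedule are placeholders, and inputFormat selects the Parquet record reader:
    Copy code
    "ingestionConfig": {
      "batchIngestionConfig": {
        "segmentIngestionType": "APPEND",
        "segmentIngestionFrequency": "DAILY",
        "batchConfigMaps": [
          {
            "input.fs.className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
            "input.fs.prop.region": "us-east-1",
            "inputDirURI": "s3://my-bucket/processed/",
            "includeFileNamePattern": "glob:**/*.parquet",
            "inputFormat": "parquet"
          }
        ]
      }
    },
    "task": {
      "taskTypeConfigsMap": {
        "SegmentGenerationAndPushTask": {
          "schedule": "0 */10 * * * ?"
        }
      }
    }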
  • p

    pramod shenoy

    03/04/2023, 12:33 AM
    Hi, I am trying to deploy the quickstart Pinot Helm chart on a Kubernetes cluster and am getting a connection timeout from the Controller.
  • s

    Sid

    03/20/2023, 6:42 PM
    Hi team, I've been exploring Apache Pinot for the first time. I'm unable to make the filter function work on Pinot tables consuming events from Kafka. I want to filter events based on the event_names field in each Kafka event. I get the below error, and I tried setting up the Groovy setting in the controller.conf file, but still no luck: org.apache.pinot.segment.loca... java.lang.RuntimeException: Caught exception while executing filter function: Caused by: java.lang.NumberFormatException: For input string: "{event_name}" Any help would be appreciated.
  • m

    Mark Needham

    03/21/2023, 10:04 AM
    I’ve only tried it out for a basic example here - https://dev.startree.ai/docs/pinot/recipes/filtering-ingestion Can you give a bit more info on what your events look like & what the filter fn looks like too?
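    For comparison, the documented ingestion filter config has roughly this shape, where rows for which the Groovy expression evaluates to true are skipped; the event name below is a placeholder, and Groovy functions must be enabled on the cluster:
    Copy code
    "ingestionConfig": {
      "filterConfig": {
        "filterFunction": "Groovy({event_name != \"page_view\"}, event_name)"
      }
    }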
  • m

    Maksim Levin

    03/21/2023, 10:25 AM
    Hello! If I call the INSERT INTO command in PQL for batch data import, what defines the degree of parallelism in this case? The number of minions' processors?
  • a

    Albert Latacz

    03/21/2023, 10:31 AM
    Hey folks 👋 I am looking at our test setup and feeling a bit in dire straits. Currently we develop under Windows and build on Linux servers. During the build we start the required environment locally (mainly Zookeeper, Kafka and Flink), run tests against it and then tear it down at the end of the build. I'm trying to figure out the best way to incorporate Apache Pinot into this setup. Does anyone have experience with, or can recommend best practices for, testing against Apache Pinot under Windows? Is it possible to run it locally under MS Windows 10 or to start it directly from within a JVM process?
  • k

    Kavya Vishwanath Bhadrappa

    03/22/2023, 3:17 AM
    Hi team, we have a realtime Pinot table configured with ingestion aggregation. Can we also set up a star-tree index on this table?
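    For reference, a star-tree index is declared under tableIndexConfig; a minimal sketch with placeholder columns (whether it composes with ingestion aggregation on a given table is the open question here):
    Copy code
    "tableIndexConfig": {
      "starTreeIndexConfigs": [
        {
          "dimensionsSplitOrder": ["country", "device"],
          "skipStarNodeCreationForDimensions": [],
          "functionColumnPairs": ["SUM__clicks"],
          "maxLeafRecords": 10000
        }
      ]
    }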
  • k

    Kun

    03/25/2023, 9:23 AM
    Hi, all. Is it possible to change the keys of a JSON field to lower case? For example:
    Copy code
    {
      "columnName": "converted_json",
      "transformFunction": "map_keys(raw_json, x -> lower(x))"
    }
  • t

    Tanmay Varun

    03/26/2023, 6:05 AM
    Hi everybody