# troubleshooting
  • d

    Daniel Lavoie

    02/17/2021, 2:31 AM
    So it will scale horizontally based on how many files you have in a blob store
  • n

    Nick Bowles

    02/17/2021, 2:32 AM
    Do you happen to have any example spec files you can share that are redacted?
  • d

    Daniel Lavoie

    02/17/2021, 2:32 AM
    It automatically manages ingestion scheduling and task distribution.
  • n

    Nick Bowles

    02/17/2021, 2:32 AM
    I see some config options in https://github.com/apache/incubator-pinot/blob/master/pinot-minion/src/main/java/o[…]pinot/minion/executor/SegmentGenerationAndPushTaskExecutor.java
  • d

    Daniel Lavoie

    02/17/2021, 2:33 AM
    Yup that’s the task. I think @Neha Pawar or @Xiang Fu are working on preparing some docs.
  • d

    Daniel Lavoie

    02/17/2021, 2:34 AM
    Say you have 2000 files, the task will generate a sub task for each file and schedule them to available minion workers.
    👍 1
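The fan-out described above (one sub task per input file, spread across the available minion workers) can be pictured with a short sketch. The round-robin policy, the function name `plan_subtasks`, and the example file and worker names are illustrative assumptions; the actual assignment is made by Pinot's task framework and may differ.

```python
from collections import defaultdict
from itertools import cycle


def plan_subtasks(files, workers):
    """Assign one sub task per file to workers in round-robin order.

    Round-robin is an assumed policy for illustration only.
    """
    if not workers:
        raise ValueError("at least one minion worker is required")
    assignment = defaultdict(list)
    for file_uri, worker in zip(files, cycle(workers)):
        assignment[worker].append(file_uri)
    return dict(assignment)


if __name__ == "__main__":
    # Hypothetical bucket and worker names.
    files = [f"gs://example-bucket/data/part-{i:04d}.json.gz" for i in range(2000)]
    plan = plan_subtasks(files, ["minion-0", "minion-1", "minion-2", "minion-3"])
    for worker, assigned in plan.items():
        print(worker, len(assigned))
```

With more minion workers, each worker receives fewer files, which is the sense in which the task scales horizontally.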
  • n

    Nick Bowles

    02/17/2021, 2:34 AM
    I’m also using the GCP (gs) plugin to pull in, so was going to see if that’ll install on a minion and if I could convert the jobspec to command line options and try that way
  • d

    Daniel Lavoie

    02/17/2021, 2:34 AM
    Let me see if I can get an example
  • n

    Nick Bowles

    02/17/2021, 2:35 AM
    Awesome thank you so much. I’ll try to do some config on the minion and see if I can get it working
  • d

    Daniel Lavoie

    02/17/2021, 2:36 AM
    In your table config, you can configure something like this:
    "ingestionConfig": {
        "batchIngestionConfig": {
          "segmentIngestionType": "APPEND",
          "segmentIngestionFrequency": "HOURLY",
          "batchConfigMaps":[
               {
                 "inputDirURI": "gs://<input root data dir>",
                 "inputFormat": "json",
                 "includeFileNamePattern": "glob:**/*.gz",
                 "input.fs.className": "org.apache.pinot.plugin.filesystem.GcsPinotFS"
               }]
        }
    }
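Since a single missing quote or brace makes a fragment like this unparseable, it can help to sanity-check it locally before updating the table config. The sketch below wraps the fragment in an outer object and parses it; the function name `check_batch_config` and the choice of which keys to check are assumptions made for illustration, not Pinot's own validation.

```python
import json

# The fragment from the message above, wrapped in braces so it parses on its own.
FRAGMENT = """
{
  "ingestionConfig": {
    "batchIngestionConfig": {
      "segmentIngestionType": "APPEND",
      "segmentIngestionFrequency": "HOURLY",
      "batchConfigMaps": [
        {
          "inputDirURI": "gs://<input root data dir>",
          "inputFormat": "json",
          "includeFileNamePattern": "glob:**/*.gz",
          "input.fs.className": "org.apache.pinot.plugin.filesystem.GcsPinotFS"
        }
      ]
    }
  }
}
"""

# Keys this sketch chooses to check; not an official list of required fields.
CHECKED_KEYS = {"inputDirURI", "inputFormat"}


def check_batch_config(text):
    """Parse the JSON and return the batch config maps, or raise ValueError."""
    config = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    maps = config["ingestionConfig"]["batchIngestionConfig"]["batchConfigMaps"]
    for entry in maps:
        missing = CHECKED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"batch config map is missing keys: {sorted(missing)}")
    return maps


if __name__ == "__main__":
    for entry in check_batch_config(FRAGMENT):
        print(entry["inputDirURI"], entry["inputFormat"])
```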
  • d

    Daniel Lavoie

    02/17/2021, 2:37 AM
    Then call the /task/schedule endpoint, which will trigger the task.
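A minimal sketch of triggering the task programmatically. Only the path `/task/schedule` comes from the chat; the POST method, the controller address `http://localhost:9000`, the `taskType` and `tableName` query parameters, and the table name `myTable_OFFLINE` are assumptions that should be checked against the controller's own API documentation before use.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_request(controller_url, path="/task/schedule", **params):
    """Build (but do not send) a POST request for the schedule endpoint.

    The default path is the one quoted in the chat; the query parameters
    passed in are assumptions.
    """
    query = urlencode(params)
    url = controller_url.rstrip("/") + path + (f"?{query}" if query else "")
    return Request(url, method="POST")


if __name__ == "__main__":
    request = build_schedule_request(
        "http://localhost:9000",  # assumed controller address
        taskType="SegmentGenerationAndPushTask",
        tableName="myTable_OFFLINE",  # hypothetical table name
    )
    print(request.get_method(), request.full_url)
    # To actually send it:
    #   from urllib.request import urlopen
    #   with urlopen(request, timeout=30) as response:
    #       print(response.status, response.read().decode())
```

Building the request separately from sending it makes it easy to log or inspect the URL first, for example from a Jenkins job or a Kubernetes CronJob.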
  • n

    Nick Bowles

    02/17/2021, 2:37 AM
    that’s the endpoint on the controller correct?
  • d

    Daniel Lavoie

    02/17/2021, 2:38 AM
    Yes
  • n

    Nick Bowles

    02/17/2021, 2:38 AM
    oh awesome, that’s two birds with one stone then because I was trying to figure out the best way to kick off programmatically. Awesome thanks Daniel really appreciate it.
  • d

    Daniel Lavoie

    02/17/2021, 2:39 AM
    It gets better
  • d

    Daniel Lavoie

    02/17/2021, 2:39 AM
    The table also now supports a cron schedule.
    "task": {
        "taskTypeConfigsMap": {
            "SegmentGenerationAndPushTask": {
                "schedule": "0 0 * ? * * *"
            }
        }
    }
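The seven-field schedule string above looks like a Quartz-style cron expression (seconds, minutes, hours, day of month, month, day of week, optional year). Assuming that interpretation, the sketch below labels each field; read that way, `0 0 * ? * * *` fires at second 0 of minute 0 of every hour. The helper name `describe_cron` is hypothetical.

```python
# Field order of a Quartz-style cron expression (the year field is optional).
QUARTZ_FIELDS = (
    "seconds",
    "minutes",
    "hours",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
)


def describe_cron(expression):
    """Map each whitespace-separated cron field to its Quartz field name."""
    parts = expression.split()
    if not 6 <= len(parts) <= 7:
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


if __name__ == "__main__":
    for name, value in describe_cron("0 0 * ? * * *").items():
        print(f"{name:>12}: {value}")
```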
  • d

    Daniel Lavoie

    02/17/2021, 2:39 AM
    This will trigger batch ingestion for any new files found in the configured input dir
  • d

    Daniel Lavoie

    02/17/2021, 2:39 AM
    and filter out the files already processed.
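The "only new files" behaviour can be sketched as a set difference between what is currently in the input directory and what has already been ingested. How Pinot records processed files is not described in the chat; the sketch simply assumes both lists of URIs are available, and the names `select_new_files`, `listed`, and `processed` are hypothetical.

```python
def select_new_files(listed, processed):
    """Return the listed file URIs that have not been processed yet, in order."""
    seen = set(processed)
    return [uri for uri in listed if uri not in seen]


if __name__ == "__main__":
    # Hypothetical file URIs.
    listed = [
        "gs://example-bucket/data/a.json.gz",
        "gs://example-bucket/data/b.json.gz",
        "gs://example-bucket/data/c.json.gz",
    ]
    processed = ["gs://example-bucket/data/a.json.gz"]
    for uri in select_new_files(listed, processed):
        print(uri)
```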
  • n

    Nick Bowles

    02/17/2021, 2:40 AM
    another thing I was going to have to solve for 😛 was going to try and kick that off using Jenkins or a k8s cronjob
  • d

    Daniel Lavoie

    02/17/2021, 2:41 AM
    That cron will require a config on the controller: controller.task.scheduler.enabled=true
  • d

    Daniel Lavoie

    02/17/2021, 2:41 AM
    BTW, it’s important that your minion workers are configured with the required GcsPinotFS configs and credentials, just like the controller and server.
  • k

    Kha

    02/19/2021, 6:15 PM
    Hi there, I'm currently importing extremely large CSVs in batches into Pinot. Does Pinot have functionality that tells you the CSV row number if there are errors with the CSV when it is imported as a batch file into Pinot?
  • k

    Kishore G

    02/19/2021, 6:46 PM
    There is a hidden variable $rowId when you query Pinot
  • k

    Kishore G

    02/19/2021, 6:46 PM
    You can use that to debug
  • n

    Nick Bowles

    02/19/2021, 9:36 PM
    I have the mode for the batch ingest set to APPEND. I believe I tried REPLACE as well before, but I think it didn’t like that.
  • n

    Nick Bowles

    02/24/2021, 12:08 AM
    I also tried to put the column name in quotes like "item_date"
  • g

    Gergely Lendvai

    02/24/2021, 4:54 PM
    Hi everyone, I was trying to add custom UDFs to Pinot and it seems they can be used (at least I was able to use one from the query console). However, when I checked the logs I saw that none of the plugins under the plugins folder managed to load during initialization. I also tried it without my custom jar, but the result was the same. Here are some example errors:
  • g

    Gergely Lendvai

    02/24/2021, 4:54 PM
    2021/02/24 15:56:42.896 ERROR [PluginManager] [main] Failed to load plugin [pinot-thrift] from dir [/opt/pinot/plugins/pinot-input-format/pinot-thrift]
    java.lang.IllegalArgumentException: object is not an instance of declaring class
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.pinot.spi.plugin.PluginClassLoader.<init>(PluginClassLoader.java:50) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.createClassLoader(PluginManager.java:196) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.load(PluginManager.java:187) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:157) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:123) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.<init>(PluginManager.java:104) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.<clinit>(PluginManager.java:46) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:182) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    2021/02/24 15:56:42.897 ERROR [PluginManager] [main] Failed to load plugin [pinot-kafka-2.0] from dir [/opt/pinot/plugins/pinot-stream-ingestion/pinot-kafka-2.0]
    java.lang.IllegalArgumentException: object is not an instance of declaring class
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
    	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
    	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    	at org.apache.pinot.spi.plugin.PluginClassLoader.<init>(PluginClassLoader.java:50) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.createClassLoader(PluginManager.java:196) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.load(PluginManager.java:187) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:157) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.init(PluginManager.java:123) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.<init>(PluginManager.java:104) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.spi.plugin.PluginManager.<clinit>(PluginManager.java:46) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
    	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:182) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d74224a14b07bbe3e5c4909b51856886e29510df]
  • n

    Nick Bowles

    02/24/2021, 6:29 PM
    The table says the state is “BAD” although the number of reported segments is 100%. Received a few broken connection errors when ingesting, so I'm running it again, but I’m not sure if that’s related to the different query results.
  • n

    Nick Bowles

    02/26/2021, 8:52 PM
    Getting some failed tasks when doing a batch ingestion, and looking at the minion logs I see a lot of
    Caused by: java.net.SocketTimeoutException: Read timed out
    Any tips or things I need to tweak to avoid this?