# troubleshooting
  • Mahesh babu

    11/10/2022, 9:43 AM
    Hi Team, I'm trying to ingest data from S3. I followed https://docs.pinot.apache.org/basics/data-import/pinot-file-system/amazon-s3 but am facing issues with bucket access: 2022/11/10 09:34:02.415 ERROR [LaunchDataIngestionJobCommand] [main] Got exception to kick off standalone data ingestion job - software.amazon.awssdk.services.s3.model.S3Exception: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. (Service: S3, Status Code: 301, Request ID: E3AWJB2MWP964Z45, Extended Request ID: X56x0ByZ93OtW6KKpgc3FuKVfujvNd2qcqefSb/h2/0HLrGFO45qCD8cpPgQKS1Frh463mDTSI4=)
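    A 301 from S3 generally means the bucket is being addressed through the wrong region or endpoint, so one thing worth checking is that the region configured for the S3 filesystem matches the bucket's region. A minimal sketch of the relevant pinotFSSpecs entry of the ingestion job spec (the spec file itself is YAML; it is rendered here as JSON with comments purely for illustration, and the region value is an assumption):
    "pinotFSSpecs": [{
      "scheme": "s3",
      "className": "org.apache.pinot.plugin.filesystem.S3PinotFS",
      "configs": {
        // must match the region the bucket actually lives in (assumed value)
        "region": "us-west-2"
      }
    }]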
  • Varagini Karthik

    11/10/2022, 2:14 PM
    Hi Team, segments are not being created for a REALTIME table. When I checked the controller logs I found the following exception; how should I handle it?
    java.io.IOException: No space left on device
  • Varagini Karthik

    11/10/2022, 2:35 PM
    I gave 1 GB for controller/data.
  • Thomas Steinholz

    11/10/2022, 2:55 PM
    Hi team, I have two segments in my offline table that are in an “error” state. I have tried reloading the segments and checking their status, but the table seems unable to return to a good state. Any recommendations for table recovery?
  • xuyen

    11/10/2022, 6:20 PM
    Hi all, has anyone run into this issue connecting a table from Pinot to Superset? I have a timestamp column that works fine when I query it from within the Pinot dashboard, but it shows up as a Long in Superset for some reason. The format, for example: ‘2022-11-03 23:45:05.822’
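    Superset appears to type the column from what the Pinot schema reports, so a LONG epoch-millis column will surface as a Long even if the query console renders it as a date. One thing to check is whether the schema declares the column as a TIMESTAMP dateTimeFieldSpec; a minimal sketch, with the column name assumed:
    "dateTimeFieldSpecs": [{
      // hypothetical column name
      "name": "event_time",
      "dataType": "TIMESTAMP",
      "format": "1:MILLISECONDS:TIMESTAMP",
      "granularity": "1:MILLISECONDS"
    }]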
  • xuyen

    11/10/2022, 6:21 PM
    I’m having problems using it as the Time column in Superset. I manually set it as a temporal field in the Dataset settings but I’m having problems generating charts.
  • Priyank Bagrecha

    11/11/2022, 2:39 AM
    We created an offline table, and by mistake tried ingesting data from an S3 bucket that the cluster doesn't have access to. Data ingestion is done via a Spark job. We realized the mistake and triggered another ingestion job pointing at the right S3 bucket for loading the segments. However, we are still seeing access-denied errors in the logs about loading from the wrong bucket, and no segments have been loaded. It seems like segment loading on the servers is still retrying from the wrong bucket. How do we stop that and force it to load segments from the correct bucket? cc @suraj sheshadri
  • suraj sheshadri

    11/12/2022, 1:14 AM
    Hello, we are seeing that the QPS for a table with basic replication of 3, with data spread across all 15 servers, is comparatively better than the QPS for a table where we use replication groups of 3 with 5 servers each. Is there any recommendation on which approach to use, and how to improve performance in the replication-group scenario? Thanks.
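    For context, replica groups are declared in the table config, and the broker must also be told to route by replica group or it keeps fanning queries out across all servers; a minimal sketch of the relevant sections for the setup described above (OFFLINE table and tenant tag are assumptions):
    {
      "instanceAssignmentConfigMap": {
        "OFFLINE": {
          "tagPoolConfig": { "tag": "DefaultTenant_OFFLINE" },
          "replicaGroupPartitionConfig": {
            "replicaGroupBased": true,
            // 3 copies of the data, 5 servers holding each copy
            "numReplicaGroups": 3,
            "numInstancesPerReplicaGroup": 5
          }
        }
      },
      // routes each query to a single replica group
      "routing": { "instanceSelectorType": "replicaGroup" }
    }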
  • cheng

    11/13/2022, 2:36 AM
    Hello, I have a problem between Tableau Desktop (2021.4) and Apache Pinot 0.11.0 using JDBC (pinot-jdbc-client-0.11.0.jar, pinot-java-client-0.11.0.jar, async-http-client-2.12.3.jar), compiled from origin/release-0.11.0 (https://github.com/apache/pinot.git) with OpenJDK 11. These binaries are placed in the Tableau driver directory per the instructions at https://docs.pinot.apache.org/integrations/tableau. The error messages captured from the Apache Pinot log are posted in the following block. I have also tried pinot-jdbc-client-0.9.3.jar, pinot-java-client-0.9.3.jar, and async-http-client-1.9.21.jar; with those, Tableau Desktop works well with Pinot 0.11, but the 0.9.3 JDBC client does not support Pinot authentication. Could I have some tips on this? Thanks.
    ***************************************************
    You can always go to <http://localhost:9000> to play around in the query console
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Starting TaskMetricsEmitter with running frequency of 300 seconds.
    [TaskRequestId: auto] Start running task: TaskMetricsEmitter
    [TaskRequestId: auto] Finish running task: TaskMetricsEmitter in 3ms
    Starting RetentionManager with running frequency of 21600 seconds.
    [TaskRequestId: auto] Start running task: RetentionManager
    Processing 6 tables in task: RetentionManager
    Start managing retention for table: airlineStats_OFFLINE
    Invalid retention time: null null for table: airlineStats_OFFLINE, skip
    Segment lineage metadata clean-up is successfully processed for table: airlineStats_OFFLINE
    Start managing retention for table: baseballStats_OFFLINE
    Invalid retention time: null null for table: baseballStats_OFFLINE, skip
    Segment lineage metadata clean-up is successfully processed for table: baseballStats_OFFLINE
    Start managing retention for table: dimBaseballTeams_OFFLINE
    Segment push type is not APPEND for table: dimBaseballTeams_OFFLINE, skip managing retention
    Segment lineage metadata clean-up is successfully processed for table: dimBaseballTeams_OFFLINE
    Start managing retention for table: starbucksStores_OFFLINE
    Segment: starbucksStores_OFFLINE_0 of table: starbucksStores_OFFLINE has invalid end time in millis: -1
    Segment lineage metadata clean-up is successfully processed for table: starbucksStores_OFFLINE
    Start managing retention for table: githubEvents_OFFLINE
    Segment push type is not APPEND for table: githubEvents_OFFLINE, skip managing retention
    Segment lineage metadata clean-up is successfully processed for table: githubEvents_OFFLINE
    Start managing retention for table: githubComplexTypeEvents_OFFLINE
    Segment push type is not APPEND for table: githubComplexTypeEvents_OFFLINE, skip managing retention
    Segment lineage metadata clean-up is successfully processed for table: githubComplexTypeEvents_OFFLINE
    Removing aged deleted segments for all tables
    Finish processing 6/6 tables in task: RetentionManager
    [TaskRequestId: auto] Finish running task: RetentionManager in 11ms
    Starting SegmentStatusChecker with running frequency of 300 seconds.
    [TaskRequestId: auto] Start running task: SegmentStatusChecker
    Processing 6 tables in task: SegmentStatusChecker
    Reading segment sizes from 1 servers for table: airlineStats_OFFLINE with timeout: 30000ms
    Finished reading information for table: airlineStats_OFFLINE
    Reading segment sizes from 1 servers for table: baseballStats_OFFLINE with timeout: 30000ms
    Finished reading information for table: baseballStats_OFFLINE
    Reading segment sizes from 1 servers for table: dimBaseballTeams_OFFLINE with timeout: 30000ms
    Finished reading information for table: dimBaseballTeams_OFFLINE
    Reading segment sizes from 1 servers for table: starbucksStores_OFFLINE with timeout: 30000ms
    Finished reading information for table: starbucksStores_OFFLINE
    Reading segment sizes from 1 servers for table: githubEvents_OFFLINE with timeout: 30000ms
    Finished reading information for table: githubEvents_OFFLINE
    Reading segment sizes from 1 servers for table: githubComplexTypeEvents_OFFLINE with timeout: 30000ms
    Finished reading information for table: githubComplexTypeEvents_OFFLINE
    Finish processing 6/6 tables in task: SegmentStatusChecker
    [TaskRequestId: auto] Finish running task: SegmentStatusChecker in 82ms
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Starting SegmentRelocator with running frequency of 3600 seconds.
    [TaskRequestId: auto] Start running task: SegmentRelocator
    Processing 6 tables in task: SegmentRelocator
    Finish processing 6/6 tables in task: SegmentRelocator
    [TaskRequestId: auto] Finish running task: SegmentRelocator in 6ms
    Starting OfflineSegmentIntervalChecker with running frequency of 3600 seconds.
    [TaskRequestId: auto] Start running task: OfflineSegmentIntervalChecker
    Processing 6 tables in task: OfflineSegmentIntervalChecker
    Starting MinionInstancesCleanupTask with running frequency of 3600 seconds.
    [TaskRequestId: auto] Start running task: MinionInstancesCleanupTask
    [TaskRequestId: auto] Finish running task: MinionInstancesCleanupTask in 3ms
    Finish processing 6/6 tables in task: OfflineSegmentIntervalChecker
    [TaskRequestId: auto] Finish running task: OfflineSegmentIntervalChecker in 15ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/auth/info>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/cluster/info>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/cluster/configs>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/users>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables?type=realtime>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables?type=offline>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/airlineStats_OFFLINE/externalview>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/airlineStats_OFFLINE/idealstate>, content-type null status code 200 OK
    Reading segment sizes from 1 servers for table: airlineStats_OFFLINE with timeout: 30000ms
    Reading segment sizes from 1 servers for table: baseballStats_OFFLINE with timeout: 30000ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/baseballStats_OFFLINE/idealstate>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/dimBaseballTeams_OFFLINE/idealstate>, content-type null status code 200 OK
    Finished reading information for table: baseballStats_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/baseballStats_OFFLINE/size>, content-type null status code 200 OK
    Finished reading information for table: airlineStats_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/airlineStats_OFFLINE/size>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/dimBaseballTeams_OFFLINE/externalview>, content-type null status code 200 OK
    Reading segment sizes from 1 servers for table: dimBaseballTeams_OFFLINE with timeout: 30000ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/baseballStats_OFFLINE/externalview>, content-type null status code 200 OK
    Reading segment sizes from 1 servers for table: githubComplexTypeEvents_OFFLINE with timeout: 30000ms
    Finished reading information for table: dimBaseballTeams_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/dimBaseballTeams_OFFLINE/size>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubComplexTypeEvents_OFFLINE/idealstate>, content-type null status code 200 OK
    Finished reading information for table: githubComplexTypeEvents_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubComplexTypeEvents_OFFLINE/size>, content-type null status code 200 OK
    Reading segment sizes from 1 servers for table: githubEvents_OFFLINE with timeout: 30000ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubEvents_OFFLINE/idealstate>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubComplexTypeEvents_OFFLINE/externalview>, content-type null status code 200 OK
    Finished reading information for table: githubEvents_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubEvents_OFFLINE/size>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/githubEvents_OFFLINE/externalview>, content-type null status code 200 OK
    Reading segment sizes from 1 servers for table: starbucksStores_OFFLINE with timeout: 30000ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/starbucksStores_OFFLINE/idealstate>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/starbucksStores_OFFLINE/externalview>, content-type null status code 200 OK
    Finished reading information for table: starbucksStores_OFFLINE
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/starbucksStores_OFFLINE/size>, content-type null status code 200 OK
    ...
    ...
    [TaskRequestId: auto] Finish running task: RealtimeSegmentValidationManager in 2ms
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables>, content-type null status code 200 OK
    Handled request from 0:0:0:0:0:0:0:1 GET <http://localhost:9000/tables/baseballStats/schema>, content-type null status code 200 OK
    url string passed is : <http://192.168.0.165:8000/query/sql>
    Processed 
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Handled request from 192.168.0.125 GET <http://192.168.0.165:9000/v2/brokers/tenants/DefaultTenant>, content-type application/json; charset=utf-8 status code 200 OK
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Handled request from 192.168.0.125 GET <http://192.168.0.165:9000/v2/brokers/tenants/DefaultTenant>, content-type application/json; charset=utf-8 status code 200 OK
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Getting Helix leader: 192.168.0.165_9000, Helix version: 1.0.4, mtime: 1668291356337
    Starting TaskMetricsEmitter with running frequency of 300 seconds.
    [TaskRequestId: auto] Start running task: TaskMetricsEmitter
    [TaskRequestId: auto] Finish running task: TaskMetricsEmitter in 4ms
    Starting SegmentStatusChecker with running frequency of 300 seconds.
    [TaskRequestId: auto] Start running task: SegmentStatusChecker
    Processing 6 tables in task: SegmentStatusChecker
    Reading segment sizes from 1 servers for table: airlineStats_OFFLINE with timeout: 30000ms
    Finished reading information for table: airlineStats_OFFLINE
    Reading segment sizes from 1 servers for table: baseballStats_OFFLINE with timeout: 30000ms
    Finished reading information for table: baseballStats_OFFLINE
    Reading segment sizes from 1 servers for table: dimBaseballTeams_OFFLINE with timeout: 30000ms
    Finished reading information for table: dimBaseballTeams_OFFLINE
    Reading segment sizes from 1 servers for table: starbucksStores_OFFLINE with timeout: 30000ms
    Finished reading information for table: starbucksStores_OFFLINE
    Reading segment sizes from 1 servers for table: githubEvents_OFFLINE with timeout: 30000ms
    Finished reading information for table: githubEvents_OFFLINE
    Reading segment sizes from 1 servers for table: githubComplexTypeEvents_OFFLINE with timeout: 30000ms
  • Sumit Khaitan

    11/13/2022, 8:28 AM
    Hi team, I am getting a "no space left on device" error on the controller while trying to push a segment. When I checked the disk usage on the controller, it seems the /var/pinot/controller/data/<TABLE_NAME> path is storing the segment files. Shouldn't segments only be stored on the server, not on the controller?
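    For context, when no deep store is configured the controller's local data directory acts as the segment store, which is why segments accumulate under /var/pinot/controller/data. A sketch of the controller properties that move the segment store to S3 instead (controller config is a properties file, rendered here as a JSON map for illustration; the bucket name and region are assumptions):
    {
      "controller.data.dir": "s3://my-pinot-segments/controller-data",
      "pinot.controller.storage.factory.class.s3": "org.apache.pinot.plugin.filesystem.S3PinotFS",
      "pinot.controller.storage.factory.s3.region": "us-west-2",
      "pinot.controller.segment.fetcher.protocols": "file,http,s3",
      "pinot.controller.segment.fetcher.s3.class": "org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher"
    }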
  • cheng

    11/13/2022, 10:49 PM
    Hi team, I have another issue I cannot resolve. My Tableau (2021.4) works well with Pinot 0.11.0 without authentication; I am using 8 jars in total, which is not the same as the official guide, but with any of those jars missing it does not work. When basic authentication is added to Pinot, we can still log into Pinot from a browser such as Chrome as before, but Tableau fails to connect to Pinot, with the message in jpprotocolserver.log shown in the image below. In brief, Tableau 2021.4 with the 8 jars in its driver directory does not work with Pinot 0.11.0 when authentication is enabled. Does anyone have any ideas? Thanks.
  • cheng

    11/13/2022, 10:58 PM
    image.png
  • Pratik Tibrewal

    11/14/2022, 5:49 AM
    Hey, I have a table with PartialUpsert configured. Recently I added a new multi-value column of type Long to the Pinot table and to the corresponding Kafka topic. The data in the Kafka topic seems to be fine, but in the table I am always getting -9223372036854775808 in the new column in Pinot. Any suggestions on what might be the cause for this?
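    For what it's worth, -9223372036854775808 is Long.MIN_VALUE, the default null value Pinot substitutes for a LONG dimension column when a decoded record carries no value for that field, so it may be that the decoder is not finding the new field at all (for example a field-name or transform mismatch). A sketch of what the multi-value fieldSpec would look like, with the column name assumed:
    "dimensionFieldSpecs": [{
      // hypothetical column name
      "name": "event_ids",
      "dataType": "LONG",
      // false marks the column as multi-value
      "singleValueField": false
    }]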
  • Mahesh babu

    11/14/2022, 10:36 AM
    Do we have any way to load data from an S3 subdirectory recursively into Apache Pinot?
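    For reference, the standalone ingestion job spec's includeFileNamePattern takes a glob, and a ** pattern descends into subdirectories; a sketch of the relevant fields (the spec file is YAML, rendered here as JSON for illustration, with bucket and file format assumed):
    {
      "inputDirURI": "s3://my-bucket/data/",
      // 'glob:**/*.csv' matches recursively; 'glob:*.csv' matches only the top level
      "includeFileNamePattern": "glob:**/*.csv"
    }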
  • vmarchaud

    11/14/2022, 5:18 PM
    Hi, I've just upgraded to 0.11 (from 0.10), and after re-enabling Groovy on the brokers I found that some of our queries now fail with NPE errors:
    SELECT groovy('{"returnType":"STRING","isSingleValue":true}', 'arg0.toList().join('';'')', JSONEXTRACTKEY("labels", '$.*')) FROM datasource_609bc534f46c000300b29dcf_REALTIME WHERE (("timestamp" >= 1667833781565)) AND "labels" != '{}' GROUP BY 1 LIMIT 0,100
  • sunny

    11/15/2022, 1:33 AM
    Hi, in Pinot 0.11.0 I have set up a Pinot server with pinot.server.instance.currentDataTableVersion=4. Then a Trino query error occurred:
    java.lang.NullPointerException: null value in entry: Server_pay-poc-pinot-w4.ay1.krane.9rum.cc_8001=null
        at com.google.common.collect.CollectPreconditions.checkEntryNotNull(CollectPreconditions.java:33)
        at com.google.common.collect.SingletonImmutableBiMap.<init>(SingletonImmutableBiMap.java:43)
        at com.google.common.collect.ImmutableBiMap.of(ImmutableBiMap.java:81)
        at com.google.common.collect.ImmutableMap.of(ImmutableMap.java:128)
        at com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:708)
        at com.google.common.collect.ImmutableMap.copyOf(ImmutableMap.java:686)
        at io.trino.plugin.pinot.PinotSegmentPageSource.queryPinot(PinotSegmentPageSource.java:221)
        at io.trino.plugin.pinot.PinotSegmentPageSource.fetchPinotData(PinotSegmentPageSource.java:182)
        at io.trino.plugin.pinot.PinotSegmentPageSource.getNextPage(PinotSegmentPageSource.java:150)
        at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:311)
        at io.trino.operator.Driver.processInternal(Driver.java:410)
        at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
        at io.trino.operator.Driver.tryWithLock(Driver.java:698)
        at io.trino.operator.Driver.process(Driver.java:305)
        at io.trino.operator.Driver.processForDuration(Driver.java:276)
        at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1092)
        at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
        at io.trino.$gen.Trino_385____20221110_083442_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
    For reference, the error does not occur in Pinot 0.10.0. In 0.10.0 the default value of pinot.server.instance.currentDataTableVersion is 2, so it is expected that there is no problem there. When I changed the Pinot server configuration back to pinot.server.instance.currentDataTableVersion=2, the Trino query succeeded. I know that the setting in the commit below has disappeared from the Helm chart settings. I think this commit is related to the above query error, but I don't understand the commit message. Can you give me an explanation of what it means?
    This is no longer needed as presto/trino side has upgraded the DataTable version
    https://github.com/apache/pinot/pull/9255
  • Nickel Fang

    11/15/2022, 9:52 AM
    Hi Team, here is a config snippet from one of our tables:
    "segmentPartitionConfig": {
          "columnPartitionMap": {
            "trace_id": {
              "functionName": "Murmur",
              "numPartitions": 8
            }
          }
        },
    On the Kafka side, I set trace_id as the key when producing event messages, and the topic has 8 partitions. But when I fetch the segment metadata, the segment does not seem to correspond to a single partition:
    "segment.partition.metadata": "{\"columnPartitionMap\":{\"trace_id\":{\"numPartitions\":8,\"partitions\":[0,2,4,6],\"functionName\":\"Murmur\",\"functionConfig\":null}}}",
  • Luis Fernandez

    11/15/2022, 4:27 PM
    Noob question: can we ingest segments created by Pinot on GCS into another Pinot env? Say we wanted to get data from prod GCS into one of our lower envs, could we do it, and how? We tried using the OfflineDataJob but haven't been able to make it work. Any thoughts?
  • kurt

    11/15/2022, 5:59 PM
    Newbie question: are the dateTimeFieldSpec dataType docs up to date here: https://docs.pinot.apache.org/configuration-reference/schema ? I'm trying to configure a schema with this:
    "dateTimeFieldSpecs": [{
        "name": "day",
        "dataType": "STRING",
        "format" : "SIMPLE_DATE_FORMAT|yyyy-MM-dd",
        "granularity": "1:DAYS"
      }]
    I get this error message:
    invalid datetime format: SIMPLE_DATE_FORMAT|yyyy-MM-dd
    I also tried the specific format examples given in the docs, like SIMPLE_DATE_FORMAT|yyyy-MM-dd HH:mm:ss and SIMPLE_DATE_FORMAT|yyyy-MM-dd|IST, and get the same invalid datetime format error. I believe I'm running the latest version of Apache Pinot; I just installed via the official Helm chart from the GitHub repo, and the Pinot pods are running apachepinot/pinot:latest-jdk11. Is there any way I can confirm which version of Pinot I'm using with the admin tool or the web GUI?
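    For context, the pipe-separated form (SIMPLE_DATE_FORMAT|yyyy-MM-dd) is the newer syntax, and releases that predate it only accept the colon-separated form, so an older image may reject it with exactly this error. A sketch of the same spec in the older syntax:
    "dateTimeFieldSpecs": [{
      "name": "day",
      "dataType": "STRING",
      // older syntax: size:timeUnit:format:pattern
      "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd",
      "granularity": "1:DAYS"
    }]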
  • Priyank Bagrecha

    11/15/2022, 8:04 PM
    from https://dev.startree.ai/docs/pinot/concepts/segment-retention
    INFO
    There are a couple of scenarios where segments in offline tables won't be purged:
    
    * If the segment doesn't have an end time. This would happen if the segment doesn't contain a time column.
    * If the segment's table has a segmentIngestionType of REFRESH.
    In addition, segments will not be purged in real-time or offline tables if the retention period isn't specified.
    In my case, I have an offline table which doesn't have a time column. The table has a segment ingestion type of REFRESH and there is NO retention period configured on the table. Is this a reason for segments not getting deleted from disk when invoking the delete-all-segments API?
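    For comparison, retention-driven purging only kicks in when segmentsConfig declares an explicit retention period and the push type is APPEND; a minimal sketch of such a config (values assumed):
    "segmentsConfig": {
      // both unit and value must be set for the RetentionManager to act
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "30",
      // REFRESH tables are skipped by retention management
      "segmentPushType": "APPEND"
    }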
  • Ehsan Irshad

    11/16/2022, 6:30 AM
    Hi Team. We are creating a dimension table in Pinot; the documentation doesn't state that the replication property is needed in tableConfig -> segmentsConfig, but we cannot create a table without it. Can we correct the documentation? Also, what value should we set for a dimension table, as it is going to be replicated to all the tagged servers anyway?
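    For reference, a minimal sketch of a dimension table config of the kind being described (table name and quota value are assumptions):
    {
      "tableName": "dimExample",
      "tableType": "OFFLINE",
      // marks this as a dimension table, replicated to all tagged servers
      "isDimTable": true,
      "segmentsConfig": {
        // required by table-config validation despite the fan-out to every server
        "replication": "1",
        "segmentPushType": "REFRESH"
      },
      // dimension tables must declare a storage quota
      "quota": { "storage": "200M" }
    }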
  • Loïc Mathieu

    11/16/2022, 11:38 AM
    Hi Team, is there a way for a realtime table to consume from a list of Kafka topics, or from a regex?
  • Mathieu Alexandre

    11/16/2022, 1:53 PM
    Hello 👋 Has anyone already seen this upload error with ADLS as the deep store and image release-0.11.0?
    Caused by: com.azure.storage.file.datalake.models.DataLakeStorageException: Status code 412, "{"error":{"code":"ConditionNotMet","message":"The condition specified using HTTP conditional header(s) is not met.
  • Ralph Debusmann

    11/16/2022, 2:35 PM
    Hi, I am trying to ingest my first table into Pinot from Kafka... but I can't get it to work yet... can you spot what mistake I'm making here?
    sentiments-schema.json sentiments-table-realtime.json
  • Ralph Debusmann

    11/16/2022, 2:37 PM
    The messages look like this:
    {
      "datetime": "2022-11-16 14:03:35.000+00:00",
      "text": "blabla",
      "source": { "name": "twitter", "id": "1592881066515509248", "user": "papokeanu" },
      "label": "gold",
      "sentiment": { "model": "finbert", "score": 43 }
    }
  • Ralph Debusmann

    11/16/2022, 2:38 PM
    I suspect that the datetime column causes problems, but I don't see anything in the Pinot logs (or at least I can't spot it - the Pinot logs are massive...)
  • Ralph Debusmann

    11/16/2022, 2:38 PM
    Maybe it is also a problem of the flattening of the nested JSON?
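    For reference, a sketch of the two pieces that usually matter for a payload like the one above: a complexTypeConfig in the table config to flatten the nested JSON into dotted column names (source.name, sentiment.score, ...), and a dateTimeFieldSpec in the schema whose pattern matches the datetime string. The field name comes from the sample message; the date pattern is an assumption:
    // table config: flatten nested objects into dotted column names
    "ingestionConfig": {
      "complexTypeConfig": { "delimiter": "." }
    }
    // schema: a pattern for values like "2022-11-16 14:03:35.000+00:00"
    "dateTimeFieldSpecs": [{
      "name": "datetime",
      "dataType": "STRING",
      "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss.SSSZZ",
      "granularity": "1:MILLISECONDS"
    }]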
  • Ralph Debusmann

    11/16/2022, 2:38 PM
    Any help would be greatly appreciated!
  • Ralph Debusmann

    11/16/2022, 3:56 PM
    One more question - how can I drop a table + schema definition (in case I'd like to fix it)? There is only "AddTable" and no "DropTable"...
  • Thomas Steinholz

    11/16/2022, 6:12 PM
    Hi team, I am having trouble with the ORDER BY clause on the time column of the table. The query ends up timing out for tables with more than hundreds of millions of rows when requesting more than a limit of tens of records; for example, a query with a limit of 100 or 1000 returns a "pinot server not-responded" error. Are there any suggestions for improving the limit size the ORDER BY operator can handle on the time column of the table?
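    For reference, one mitigation that may help is making the time column the table's sorted column, so segment scans return rows in time order and reduce the per-query sort work; a minimal sketch of the relevant table-config section (column name assumed):
    "tableIndexConfig": {
      // for offline tables the input data must already be sorted on this column;
      // for realtime tables Pinot sorts on it when committing segments
      "sortedColumn": ["event_time"]
    }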