# troubleshooting
  • k

    Kishore G

    06/26/2020, 9:13 PM
    also in the controller
  • p

    Pradeep

    06/26/2020, 9:16 PM
    yup I did
  • p

    Pradeep

    06/26/2020, 9:17 PM
    For example, the GET /segments/{tableName}/servers API (http://ec2-3-14-65-101.us-east-2.compute.amazonaws.com:9000/help#!/Segment/getServerToSegmentsMap)
    returns segments distributed across two servers (currently I have tagged one as realtime and the other as offline)
  • k

    Kishore G

    06/26/2020, 9:18 PM
    does it print the segment uri in the log?
  • k

    Kishore G

    06/26/2020, 9:22 PM
    also, what's the metadata for that segment?
    curl -X GET --header 'Accept: application/json' 'http://<controller_host>:<port>/segments/<tableName>/<segment_name>/metadata'
  • p

    Pradeep

    06/26/2020, 9:26 PM
    I don’t actually see any segments downloaded onto the offline-tagged machine; I guess the Swagger API was just showing me the ideal state.
    Got temporary error status code: 500 while downloading segment from: http://<ip>:9000/segments/<table>/<table>__60__0__20200625T1913Z to: /home/ubuntu/pinot/data/<table>_REALTIME/<table>__60__0__20200625T1913Z.tar.gz
    org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 500 (Internal Server Error) with reason: "Failed to read response into file: /home/ubuntu/data/fileDownloadTemp/<table>/<table>__60__0__20200625T1913Z-1386265827082994" while sending request: http://<ip>:9000/segments/<table>/<table>__60__0__20200625T1913Z to controller: <ip>, version: Unknown
  • p

    Pradeep

    06/26/2020, 9:27 PM
    {
      "segment.realtime.endOffset": "78125",
      "segment.time.unit": "MILLISECONDS",
      "segment.start.time": "1593031658266",
      "segment.flush.threshold.size": "78125",
      "segment.realtime.startOffset": "0",
      "segment.end.time": "1593080340734",
      "segment.total.docs": "78125",
      "segment.table.name": "<table>_REALTIME",
      "segment.realtime.numReplicas": "1",
      "segment.creation.time": "1593112417336",
      "segment.realtime.download.url": "http://<ip>:9000/segments/<table>/<table>__57__0__20200625T1913Z",
      "segment.name": "<table>__57__0__20200625T1913Z",
      "segment.index.version": "v3",
      "custom.map": null,
      "segment.flush.threshold.time": null,
      "segment.type": "REALTIME",
      "segment.crc": "4231660115",
      "segment.realtime.status": "DONE"
    }
  • p

    Pradeep

    06/26/2020, 9:28 PM
    is the download url supposed to be something else?
  • k

    Kishore G

    06/26/2020, 9:28 PM
    it should be a valid url pointing to your s3
  • p

    Pradeep

    06/26/2020, 9:28 PM
    I do see the segments in S3 though
  • p

    Pradeep

    06/26/2020, 9:28 PM
    let me double check my config
  • k

    Kishore G

    06/26/2020, 9:28 PM
    but that url seems to be pointing to controller
  • p

    Pradeep

    06/26/2020, 9:29 PM
    yup
  • k

    Kishore G

    06/26/2020, 9:29 PM
    and not s3
  • k

    Kishore G

    06/26/2020, 9:29 PM
    that means segments were getting uploaded to controller
  • k

    Kishore G

    06/26/2020, 9:29 PM
    which is the default
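
The check discussed above (does the segment's download URL point at S3 or at the controller?) can be sketched in Python. This is a minimal sketch, not part of Pinot: the helper name `download_url_target` is hypothetical, and it only inspects the URL scheme of the `segment.realtime.download.url` field from the metadata shown earlier.

```python
from urllib.parse import urlparse


def download_url_target(metadata: dict) -> str:
    """Classify where a segment's download URL points, using its scheme only.

    Hypothetical helper for illustration. `metadata` is the JSON object
    returned by the segment metadata endpoint, already parsed into a dict.
    """
    url = metadata.get("segment.realtime.download.url") or ""
    scheme = urlparse(url).scheme.lower()
    if scheme == "s3":
        return "s3"
    if scheme in ("http", "https"):
        # In this thread, an http:// URL turned out to be the controller.
        return "http"
    return "unknown"


# Metadata fields copied from the thread (placeholders left as-is).
example = {
    "segment.name": "<table>__57__0__20200625T1913Z",
    "segment.realtime.download.url":
        "http://<ip>:9000/segments/<table>/<table>__57__0__20200625T1913Z",
}
print(download_url_target(example))
```

Running this against the metadata from the thread reports an HTTP URL, which matches the observation that the segment points at the controller rather than at S3.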
  • p

    Pradeep

    06/26/2020, 9:35 PM
    weird, I see the exact segment being present in S3
  • p

    Pradeep

    06/26/2020, 9:36 PM
    and some of the latest segments are getting uploaded into S3 too
  • p

    Pradeep

    06/26/2020, 9:41 PM
    I found this in the controller logs
    Could not get directory entry for s3://<bucket>/<dir>/<table>/<table>__57__0__20200625T1913Z
    Copy /home/ubuntu/data/fileUploadTemp/<table>__57__0__20200625T1913Z.0943bed4-49e7-4bec-999b-35e9802b3d73 from local to s3://<bucket>/<dir>/<table>/<table>__57__0__20200625T1913Z
    Processing segmentCommitEnd(Server_<ip>_8098, 78125)
    Committing segment <table>__57__0__20200625T1913Z at offset 78125 winner Server_<ip>_8098
    Committing segment metadata for segment: <table>__57__0__20200625T1913Z
  • p

    Pradeep

    06/26/2020, 9:52 PM
    Okay, I think I have a rough picture of what’s going on. On the REALTIME server, the Helix state for the segment was changed from ONLINE to OFFLINE and then to DROPPED, and in parallel, on the OFFLINE server, the segment state was changed from OFFLINE to ONLINE. The REALTIME server part seems to have executed through, but the OFFLINE part is stuck because of the S3 issue.
    1. Still not sure why the segment URL is set to the controller IP.
    2. IIUC, does having 1 replica imply that some segments might not be available for querying while they are moving across servers?
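
The sequence described above can be replayed in a toy model to see how many servers hold the segment ONLINE at each step. This is only an illustrative sketch: it does not use the Helix API, the server names and the `online_replicas_over_time` function are hypothetical, and the ordering of the transitions is an assumption taken from the description in the message.

```python
def online_replicas_over_time(initial: dict, transitions: list) -> list:
    """Replay (server, new_state, succeeded) steps and return the number of
    servers in the ONLINE state after each step."""
    states = dict(initial)
    counts = []
    for server, new_state, succeeded in transitions:
        if succeeded:
            states[server] = new_state
        # A failed transition leaves the server in its previous state.
        counts.append(sum(1 for s in states.values() if s == "ONLINE"))
    return counts


# One replica: the segment starts ONLINE on the realtime-tagged server only.
initial = {"realtime_server": "ONLINE", "offline_server": "OFFLINE"}
transitions = [
    ("realtime_server", "OFFLINE", True),
    ("realtime_server", "DROPPED", True),
    ("offline_server", "ONLINE", False),  # stuck on the failed download
]
print(online_replicas_over_time(initial, transitions))
```

In this simplified model the ONLINE count stays at zero for as long as the download on the second server keeps failing, which is the situation question 2 asks about.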
  • s

    Shounak Kulkarni

    06/29/2020, 4:26 PM
    Hey all, I wanted to know whether the number of partitions matters in the replica group concept. And if one of the replica groups has a missing segment, how do I make sure it downloads that segment from the controller?
  • s

    Sidd

    06/29/2020, 4:27 PM
    Are you talking about numInstancesPerPartition or numPartitions in segmentPartitionConfig?
  • s

    Sidd

    06/29/2020, 4:28 PM
    replica groups and partitioning are two different concepts.
  • m

    Mayank

    06/29/2020, 4:28 PM
    I think @Shounak Kulkarni is referring to realtime topic partitions.
  • s

    Shounak Kulkarni

    06/29/2020, 4:29 PM
    Yes partitions on topic
  • m

    Mayank

    06/29/2020, 4:30 PM
    Your replica group size (number of instances in a replica group) should be enough to consume the number of partitions you have, as well as to serve your QPS/latency requirements
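
As a rough sizing aid for the point above, the consuming load per instance can be estimated with simple arithmetic. This is a back-of-the-envelope sketch that assumes topic partitions are spread evenly across the instances of one replica group; the function name is hypothetical and it says nothing about QPS or latency.

```python
import math


def max_partitions_per_instance(num_partitions: int,
                                instances_per_replica_group: int) -> int:
    """Upper bound on topic partitions a single instance consumes, assuming
    an even spread across the instances of one replica group."""
    if num_partitions < 0 or instances_per_replica_group <= 0:
        raise ValueError("need num_partitions >= 0 and at least one instance")
    return math.ceil(num_partitions / instances_per_replica_group)


# Example: 8 topic partitions served by a replica group of 3 instances.
print(max_partitions_per_instance(8, 3))
```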
  • s

    Shounak Kulkarni

    06/29/2020, 4:31 PM
    OK, and what about the missing segment?
  • m

    Mayank

    06/29/2020, 4:31 PM
    Why would the segment be missing?
  • s

    Shounak Kulkarni

    06/29/2020, 4:32 PM
    This issue came up because there was not enough space on the server PV, so it was not able to download the segment and gave up after 3 retries
  • m

    Mayank

    06/29/2020, 4:32 PM
    Then you need more storage on the server
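
A quick way to confirm the storage diagnosis is to compare free space on the server's data directory with the size of the segment. The sketch below uses only the Python standard library. The function name, the `headroom` factor (an assumption that room is needed for both the downloaded archive and its extracted contents), and the example path are hypothetical.

```python
import shutil


def has_room_for_segment(data_dir: str, segment_bytes: int,
                         headroom: float = 2.0) -> bool:
    """Return True if the filesystem holding `data_dir` has at least
    headroom * segment_bytes free."""
    free = shutil.disk_usage(data_dir).free
    return free >= segment_bytes * headroom


# Example: is there room for a 500 MB segment in the current directory?
print(has_room_for_segment(".", 500 * 1024 * 1024))
```

On a server, `data_dir` would be the directory on the PV where segments are stored.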