# troubleshooting
c
Hi Team, I was playing around with the getting started stream data (meetupRsvp). I am able to load data and query it. I was trying to understand the size implications of the input data vs. the output table. I see a REST API that returns table size info, but it reports the size as 0. Any idea what could be causing this, given that I can query the table?
/tables/{tableName}/size
{
  "tableName": "meetupRsvp",
  "reportedSizeInBytes": 0,
  "estimatedSizeInBytes": 0,
  "offlineSegments": null,
  "realtimeSegments": {
    "reportedSizeInBytes": 0,
    "estimatedSizeInBytes": 0,
    "missingSegments": 0,
    "segments": {}
  }
}
Also, where on disk are the segments actually stored? Is there a config for this?
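For anyone who wants to inspect this response programmatically, here is a minimal Python sketch. It only parses the JSON body pasted above; the helper name `summarize_table_size` is made up for illustration and is not part of Pinot.

```python
import json

# Response body copied from the message above (/tables/{tableName}/size).
SAMPLE_RESPONSE = """
{
  "tableName": "meetupRsvp",
  "reportedSizeInBytes": 0,
  "estimatedSizeInBytes": 0,
  "offlineSegments": null,
  "realtimeSegments": {
    "reportedSizeInBytes": 0,
    "estimatedSizeInBytes": 0,
    "missingSegments": 0,
    "segments": {}
  }
}
"""


def summarize_table_size(body: str) -> dict:
    """Hypothetical helper: extract total and per-type sizes from a size response."""
    data = json.loads(body)
    summary = {"tableName": data["tableName"], "total": data["reportedSizeInBytes"]}
    for key in ("offlineSegments", "realtimeSegments"):
        section = data.get(key)
        if section is None:
            # The table has no segments of this type (null in the response).
            summary[key] = None
        else:
            summary[key] = {
                "reportedSizeInBytes": section["reportedSizeInBytes"],
                "segmentCount": len(section["segments"]),
            }
    return summary


summary = summarize_table_size(SAMPLE_RESPONSE)
print(summary)
```

With the body above, the realtime section has an empty `segments` map, so the API is not reporting a size for any segment at all, which matches the zero total.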
n
i believe this api doesn’t count realtime segments. meetupRsvp is a realtime only table
actual segments will be in /<java.io.tmpdir>/<millis>/dataDir
there should be a log on startup:
2020/07/09 12:04:19.699 INFO [StartControllerCommand] [main] Executing command: StartController -clusterName QuickStartCluster -controllerHost null -controllerPort 9000 -dataDir /var/folders/3z/qn6k60qs6ps1bb6s2c26gx040000gn/T/1594321444562/data/PinotControllerDir0 -zkAddress localhost:2123
c
Thanks @Neha Pawar. I'll check
I see the data dirs as 0 bytes
du -hsc /var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/1594321707531/data/PinotControllerDir0/127.0.0.1_9000/*
  0B	/var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/1594321707531/data/PinotControllerDir0/127.0.0.1_9000/fileDownloadTemp
  0B	/var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/1594321707531/data/PinotControllerDir0/127.0.0.1_9000/fileUploadTemp
  0B	/var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/1594321707531/data/PinotControllerDir0/127.0.0.1_9000/untarredFileTemp
I am able to see the data though in the UI / table
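A rough Python equivalent of the `du` check above, as a sketch. It sums file sizes rather than disk blocks, so the numbers can differ slightly from `du`. The function name and the `example.bin` file are made up for the demo; the demo runs against a throwaway temp directory, not a real Pinot data dir.

```python
import os
import tempfile


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


# Demo on a temporary directory: empty first, then with one 10-byte file.
with tempfile.TemporaryDirectory() as root:
    empty_size = dir_size_bytes(root)
    os.makedirs(os.path.join(root, "fileUploadTemp"))
    with open(os.path.join(root, "fileUploadTemp", "example.bin"), "wb") as f:
        f.write(b"x" * 10)
    filled_size = dir_size_bytes(root)

print(empty_size, filled_size)
```

Pointing `dir_size_bytes` at the `-dataDir` path printed in the startup log would give the same kind of answer as the `du` output above.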
n
if the realtime segment is still in memory, there will be no segment on disk
can you check the list of segments from the Swagger UI?
also check the similar directory on the server
c
List of segments
[
  {
    "REALTIME": [
      "meetupRsvp_REALTIME_1594321725918_0__0__1594321726045"
    ]
  }
]
n
so if there’s only one, it’s still an in-memory segment - not flushed to disk
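The rule of thumb above can be written down as a small Python sketch. It parses the segment list pasted earlier and applies the "only one realtime segment" heuristic from this thread. This is just that heuristic, not an authoritative check of segment state, and the helper name is hypothetical.

```python
import json

# Segment list copied from the message above.
SEGMENTS_RESPONSE = """
[
  {
    "REALTIME": [
      "meetupRsvp_REALTIME_1594321725918_0__0__1594321726045"
    ]
  }
]
"""


def realtime_segment_names(body: str) -> list:
    """Hypothetical helper: collect every name listed under a REALTIME key."""
    names = []
    for entry in json.loads(body):
        names.extend(entry.get("REALTIME", []))
    return names


names = realtime_segment_names(SEGMENTS_RESPONSE)

# Heuristic from the thread: a single realtime segment is probably still
# being consumed in memory and has not been flushed to disk yet.
probably_in_memory_only = len(names) == 1
print(len(names), probably_in_memory_only)
```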
if all you want to do is study segment sizes, try the batch quickstart.
c
yeah.. ok
Thanks for your comments. I was able to start batch and could see the size. Here are some logs
Tarring segment from: /var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/pinot-1594410153012/output/transcript_OFFLINE_1570863600000_1572418800000_2 to: /var/folders/v0/1cy26sms2rv0x68whkb0jc3r0000gn/T/pinot-1594410153012/output/transcript_OFFLINE_1570863600000_1572418800000_2.tar.gz
Size for segment: transcript_OFFLINE_1570863600000_1572418800000_2, uncompressed: 6.03K, compressed: 1.58K
I see the uncompressed size as 6.03K. I am assuming this is the size of the segment. Also, is it fair to assume that a Pinot query reads the tar to get the results? I am not sure why we tar the files, since they would be more costly to read. Thanks for all the help.
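To compare the two sizes across many segments, a log line like the one above can be parsed with a short Python sketch. The regex is written against that single line only, so the exact format (including the `K` suffix) is an assumption, and `parse_size_line` is a made-up helper name.

```python
import re

# Log line copied from the message above.
LOG_LINE = (
    "Size for segment: transcript_OFFLINE_1570863600000_1572418800000_2, "
    "uncompressed: 6.03K, compressed: 1.58K"
)

# Assumed format, inferred from the one line in this thread.
SIZE_PATTERN = re.compile(
    r"Size for segment: (?P<name>[^,]+), "
    r"uncompressed: (?P<uncompressed>[\d.]+)K, "
    r"compressed: (?P<compressed>[\d.]+)K"
)


def parse_size_line(line: str) -> dict:
    """Hypothetical helper: return segment name, both sizes, and their ratio."""
    match = SIZE_PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like a segment size log line")
    uncompressed = float(match.group("uncompressed"))
    compressed = float(match.group("compressed"))
    return {
        "name": match.group("name"),
        "uncompressed_k": uncompressed,
        "compressed_k": compressed,
        "ratio": uncompressed / compressed,
    }


info = parse_size_line(LOG_LINE)
print(info["name"], round(info["ratio"], 2))
```

For this segment the tarball is a bit under four times smaller than the uncompressed directory.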
n
the compressed segments are kept in the deep store. in case of the batch quickstart, the controller’s local disk acts as deep store, hence you will find the compressed segments on controller. the uncompressed segments are kept on the servers, which are used for queries
if servers go down or nodes are added, the new servers will download and uncompress the tar from the deep store
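The flow described above (tar on the deep store side, untar on the server side) can be illustrated with a self-contained Python sketch using the standard `tarfile` module. This is not Pinot's code: the directory names, `demo_segment`, and `data.bin` are all made up, and a temp directory stands in for both the deep store and the server.

```python
import os
import tarfile
import tempfile


def tar_segment(segment_dir: str, tar_path: str) -> None:
    """Pack a segment directory into a gzip-compressed tarball."""
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(segment_dir, arcname=os.path.basename(segment_dir))


def untar_segment(tar_path: str, dest_dir: str) -> None:
    """Unpack a segment tarball, as a server would after downloading it."""
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(dest_dir)


with tempfile.TemporaryDirectory() as work:
    # Stand-in for a freshly built segment directory.
    segment_dir = os.path.join(work, "output", "demo_segment")
    os.makedirs(segment_dir)
    with open(os.path.join(segment_dir, "data.bin"), "wb") as f:
        f.write(b"0" * 4096)

    # Stand-in for the deep store: holds only the compressed copy.
    deep_store = os.path.join(work, "deep_store")
    os.makedirs(deep_store)
    tar_path = os.path.join(deep_store, "demo_segment.tar.gz")
    tar_segment(segment_dir, tar_path)

    # Stand-in for a server: holds the uncompressed copy used for queries.
    server_dir = os.path.join(work, "server")
    os.makedirs(server_dir)
    untar_segment(tar_path, server_dir)

    compressed_size = os.path.getsize(tar_path)
    restored_size = os.path.getsize(
        os.path.join(server_dir, "demo_segment", "data.bin")
    )

print(compressed_size, restored_size)
```

The point of the split is that the tarball is only read once per download, while queries always hit the already-uncompressed files on the server.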
c
aah ok. So the uncompressed size will be the same in both the deep store and the servers? BTW, what is the default location of these on the servers?
n
the uncompressed size could be larger on the server because of indexes. the operator may choose to build indexes while building segments, or build only the segment and let the server build indexes once it receives the segment. So the compressed segment may or may not have indexes
c
This is what I see in the compressed folder
n
for quickstart, the location is somewhere in tmpDir/timestamp/PinotServerDataDir. it should've been printed out when starting
c
got it..thanks