Shen Wan
09/16/2020, 3:39 PMKishore G
Shen Wan
09/16/2020, 4:00 PM
select * from abc_test where service_slug='xyz'
very simple query like this
Shen Wan
09/16/2020, 4:00 PMKishore G
Kishore G
Shen Wan
09/16/2020, 4:11 PM
select count(*) from …
Shen Wan
09/16/2020, 4:12 PMNeha Pawar
Shen Wan
09/16/2020, 4:13 PMShen Wan
09/16/2020, 4:13 PMNeha Pawar
select count(*), service_slug from abc_test group by service_slug order by count(*) limit 10
and use one of those?
Shen Wan
09/16/2020, 4:14 PMKishore G
Kishore G
Shen Wan
09/16/2020, 4:16 PMShen Wan
09/16/2020, 4:16 PMShen Wan
09/16/2020, 4:18 PM
select count(*) from oas_log_test where service_slug='ofo4'
this returns nothing
Kishore G
Shen Wan
09/16/2020, 4:19 PMShen Wan
09/16/2020, 4:19 PMKishore G
Neha Pawar
Shen Wan
09/16/2020, 4:21 PMNeha Pawar
return Math.abs(value.hashCode()) % _numPartitions;
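As a rough illustration of the line above, the following Java sketch applies the same expression to a string column value. The class name HashCodePartitionSketch and the method partitionOf are made up for this example and are not Pinot APIs; only the Math.abs(value.hashCode()) % numPartitions expression comes from the thread.

```java
// Hypothetical sketch, not Pinot code: mirrors the quoted expression
// Math.abs(value.hashCode()) % _numPartitions for a String value.
class HashCodePartitionSketch {

    // Returns the partition id for a value under hash-code partitioning.
    static int partitionOf(String value, int numPartitions) {
        return Math.abs(value.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        // 16 is the numPartitions value that appears in the metadata quoted in this thread.
        int numPartitions = 16;
        for (String slug : new String[] {"ofo1", "ofo4"}) {
            System.out.println(slug + " -> partition " + partitionOf(slug, numPartitions));
        }
    }
}
```

Running it for the slugs used in the test queries shows which partition each value would land in under this function, which can then be compared against a segment's recorded partitions.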
Shen Wan
09/16/2020, 4:25 PMShen Wan
09/16/2020, 4:26 PMShen Wan
09/16/2020, 4:27 PM
service_slug column used for partition. How does Pinot handle this?
Neha Pawar
After setting the above config, the data needs to be partitioned with the same partition function and number of partitions before running the Pinot segment build and push job for an offline push. Realtime partitioning depends on how Kafka partitions the data. When emitting an event to Kafka, the user needs to supply the partitioning key and partition function to the Kafka producer API.
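To make the offline requirement concrete, here is a minimal Java sketch that groups rows by partition id before any segment build step, so that each group holds a single partition. The class and method names are hypothetical, the hash-code function is just one possible choice, and a real job would have to use whatever function and partition count the table config declares.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pre-processing sketch, not a Pinot API.
class OfflinePartitionerSketch {

    // Assumed partition function; in practice it must match the table config.
    static int partitionOf(String key, int numPartitions) {
        return Math.abs(key.hashCode()) % numPartitions;
    }

    // Groups partition-column values by partition id (sorted by id).
    static Map<Integer, List<String>> bucket(List<String> keys, int numPartitions) {
        Map<Integer, List<String>> buckets = new TreeMap<>();
        for (String key : keys) {
            buckets.computeIfAbsent(partitionOf(key, numPartitions), p -> new ArrayList<>()).add(key);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> slugs = Arrays.asList("ofo1", "ofo4", "ofo1", "xyz");
        // Each bucket would feed its own segment build in this sketch.
        bucket(slugs, 16).forEach((p, rows) -> System.out.println(p + " -> " + rows));
    }
}
```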
Kishore G
Shen Wan
09/16/2020, 4:31 PMMayank
Kishore G
Kishore G
Neha Pawar
{\"columnPartitionMap\":{\"service_slug\":{\"functionName\":\"HashCode\",\"numPartitions\":16,\"partitions\":[10]}}}
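One way to read metadata like this: a filter on the partition column can be mapped to a partition id, and a segment whose partitions list does not contain that id can be skipped. The Java sketch below illustrates that idea only; SegmentPruneSketch, mayContain, and the use of the hash-code function are assumptions for illustration, not Pinot's actual pruner.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of partition-based segment skipping.
class SegmentPruneSketch {

    // Assumed partition function (hash-code style).
    static int partitionOf(String value, int numPartitions) {
        return Math.abs(value.hashCode()) % numPartitions;
    }

    // True if the segment could hold rows where the column equals value.
    static boolean mayContain(Set<Integer> segmentPartitions, int numPartitions, String value) {
        return segmentPartitions.contains(partitionOf(value, numPartitions));
    }

    public static void main(String[] args) {
        // Values copied from the metadata above: 16 partitions, partitions [10].
        Set<Integer> partitions = new HashSet<>(Arrays.asList(10));
        System.out.println("ofo4 may be in segment: " + mayContain(partitions, 16, "ofo4"));
    }
}
```

If the data was actually partitioned with a different function or partition count than the metadata records, a check of this kind could skip segments that do hold matching rows. That is one possible explanation for an equality filter returning nothing, not a confirmed diagnosis.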
Mayank
Shen Wan
09/16/2020, 4:33 PM
ofo1 returns a bit more info
Neha Pawar
Neha Pawar
Mayank
Neha Pawar
Mayank
Shen Wan
09/16/2020, 4:38 PMMayank
column.service_slug.partitionFunction = Murmur
column.service_slug.numPartitions = 32
column.service_slug.partitionValues = 24
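The two metadata snippets quoted in this thread name different functions and partition counts (HashCode with 16 versus Murmur with 32). Below is a small Java sketch for checking a segment's metadata properties against expected values. The class and method names are hypothetical, and treating HashCode/16 as the expected side is purely an assumption for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical consistency check between segment metadata and expected config.
class PartitionMetadataCheckSketch {

    // Parses "key = value" lines such as the ones quoted above.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // True when both the function name and the partition count match.
    static boolean matches(Properties meta, String column, String expectedFunction, int expectedPartitions) {
        String prefix = "column." + column + ".";
        String function = meta.getProperty(prefix + "partitionFunction");
        String count = meta.getProperty(prefix + "numPartitions");
        return expectedFunction.equalsIgnoreCase(function)
            && count != null
            && Integer.parseInt(count.trim()) == expectedPartitions;
    }

    public static void main(String[] args) {
        Properties meta = parse(
            "column.service_slug.partitionFunction = Murmur\n"
          + "column.service_slug.numPartitions = 32\n"
          + "column.service_slug.partitionValues = 24\n");
        System.out.println("matches HashCode/16: " + matches(meta, "service_slug", "HashCode", 16));
    }
}
```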
Mayank
Neha Pawar
Mayank
Kishore G
Mayank
Mayank
where service_slug in ('ofo1')? I want to validate a theory
Neha Pawar
Mayank
Shen Wan
09/16/2020, 5:12 PM
select count(*) from oas_log_test where service_slug in ('ofo4')
returns nothing
Mayank
Mayank
Mayank
ofo4?
Neha Pawar
Shen Wan
09/16/2020, 5:38 PMShen Wan
09/16/2020, 5:41 PM
select distinct service_slug from oas_log_test where service_slug <> 'null'
Neha Pawar
Shen Wan
09/16/2020, 6:05 PMNeha Pawar
Neha Pawar
Shen Wan
09/16/2020, 6:07 PMShen Wan
09/16/2020, 6:07 PMShen Wan
09/16/2020, 6:08 PMNeha Pawar
Neha Pawar
Shen Wan
09/16/2020, 6:32 PMShen Wan
09/16/2020, 6:34 PMShen Wan
09/16/2020, 7:46 PMNeha Pawar
Neha Pawar
Shen Wan
09/16/2020, 8:20 PMKishore G
Neha Pawar
Shen Wan
09/16/2020, 8:26 PMShen Wan
09/16/2020, 8:26 PMNeha Pawar
Kishore G
Shen Wan
09/16/2020, 8:43 PMKishore G
Shen Wan
09/16/2020, 8:54 PMNeha Pawar
Shen Wan
09/16/2020, 8:59 PM
/var/pinot/server/data/segment
Shen Wan
09/16/2020, 9:00 PM
…/data/index
Neha Pawar
Shen Wan
09/16/2020, 9:14 PMShen Wan
09/16/2020, 9:15 PM
index_map
Shen Wan
09/16/2020, 10:49 PM
diskSizeInBytes. Is the remaining 40% raw data? Does this look reasonable?
Shen Wan
09/16/2020, 10:57 PMNeha Pawar
Neha Pawar
Shen Wan
09/16/2020, 11:00 PMKishore G
Shen Wan
09/16/2020, 11:01 PMKishore G
Shen Wan
09/16/2020, 11:04 PMKishore G
Kishore G
Shen Wan
09/16/2020, 11:06 PMKishore G
Kishore G
Kishore G
Shen Wan
09/16/2020, 11:07 PMShen Wan
09/16/2020, 11:08 PMKishore G
Kishore G
Shen Wan
09/16/2020, 11:09 PMKishore G
Shen Wan
09/16/2020, 11:10 PMShen Wan
09/16/2020, 11:12 PM
index_map add up to just ~60% of diskSizeInBytes?
Kishore G
Shen Wan
09/16/2020, 11:20 PMShen Wan
09/16/2020, 11:21 PMShen Wan
09/16/2020, 11:21 PM
index_map
Kishore G
Shen Wan
09/16/2020, 11:39 PMShen Wan
09/16/2020, 11:42 PMKishore G
Kishore G
Shen Wan
09/16/2020, 11:45 PMShen Wan
09/16/2020, 11:45 PMKishore G
Kishore G
Shen Wan
09/17/2020, 1:12 AM
req and resp but do not see anything related.
Shen Wan
09/17/2020, 3:56 AM
oas_log_test and created a new table oas_log_test_v2 with new schema. But the new table contains 12 million very old records and new records are not flowing in.
Do we need to reset Kafka?
Neha Pawar
Neha Pawar
Neha Pawar
Shen Wan
09/17/2020, 5:35 PM
oas_log_test is 404. External view of oas_log_test_v2 is stuck on CONSUMING.
Neha Pawar
Shen Wan
09/17/2020, 5:57 PMNeha Pawar
Neha Pawar
Shen Wan
09/17/2020, 6:00 PMShen Wan
09/17/2020, 6:01 PM
largest
Neha Pawar
largest will not remove older data from the table. That setting only tells a new table where to start consumption.
Shen Wan
09/17/2020, 6:03 PMNeha Pawar
Neha Pawar
Shen Wan
09/17/2020, 6:03 PMNeha Pawar
Shen Wan
09/17/2020, 6:04 PMShen Wan
09/17/2020, 6:04 PMNeha Pawar
Shen Wan
09/17/2020, 6:05 PMShen Wan
09/17/2020, 6:05 PMNeha Pawar
Neha Pawar
Shen Wan
09/17/2020, 6:08 PMShen Wan
09/17/2020, 6:14 PMShen Wan
09/17/2020, 6:18 PMNeha Pawar
Shen Wan
09/17/2020, 7:17 PMShen Wan
09/17/2020, 7:18 PMShen Wan
09/17/2020, 7:19 PMShen Wan
09/17/2020, 7:22 PM
2020/09/17 18:47:41.613 ERROR [LLRealtimeSegmentDataManager_oas_log_test_v2__11__0__20200917T1810Z] [oas_log_test_v2__11__0__20200917T1810Z] Could not build segment
Shen Wan
09/17/2020, 7:22 PMNeha Pawar
Neha Pawar
Shen Wan
09/17/2020, 7:47 PMShen Wan
09/17/2020, 7:47 PMShen Wan
09/17/2020, 7:50 PMShen Wan
09/17/2020, 8:00 PMShen Wan
09/17/2020, 8:04 PMNeha Pawar
Neha Pawar
} catch (Exception e) {
  segmentLogger.error("Could not build segment", e);
but I don't see the exception
Shen Wan
09/17/2020, 8:05 PMShen Wan
09/17/2020, 8:06 PMShen Wan
09/17/2020, 8:06 PMNeha Pawar
Neha Pawar
Neha Pawar
Shen Wan
09/17/2020, 8:11 PMShen Wan
09/17/2020, 8:12 PMNeha Pawar
Shen Wan
09/17/2020, 8:16 PMShen Wan
09/17/2020, 8:17 PMShen Wan
09/17/2020, 8:18 PM