murat migdisoglu
10/14/2020, 9:31 PM

Cesar
10/15/2020, 7:20 PM

Venkatesan V
10/19/2020, 5:30 PM

Chundong Wang
10/19/2020, 6:27 PM
PERCENTILETDIGEST50

Seunghyun
10/19/2020, 10:56 PM
map column? I see the MAP_VALUE function at https://docs.pinot.apache.org/users/user-guide-query/supported-transformations#multi-value-column-functions
However, I don’t see any instruction on how I can configure the schema & store map values to segments.

Sri Surya
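[Editor's note: as far as I can tell, Pinot at this time had no first-class MAP type; a map field was flattened at ingestion into two multi-value columns with `__KEYS` and `__VALUES` suffixes, which is what MAP_VALUE operates on. The column names below are illustrative, and this flattening convention should be verified against the Pinot docs. A schema sketch under that assumption:]

```json
{
  "schemaName": "myTable",
  "dimensionFieldSpecs": [
    { "name": "myMapCol__KEYS",   "dataType": "STRING", "singleValueField": false },
    { "name": "myMapCol__VALUES", "dataType": "INT",    "singleValueField": false }
  ]
}
```

A lookup would then be something like `MAP_VALUE(myMapCol__KEYS, 'someKey', myMapCol__VALUES)`.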
10/21/2020, 4:37 AM

Sri Surya
10/21/2020, 9:38 AM

Darren
ApacheDrill
10/22/2020, 3:40 AM

Derek
10/23/2020, 2:57 PM

Chundong Wang
10/23/2020, 8:05 PM

Itzik Lavon
10/24/2020, 4:44 PM

Noah Prince
10/25/2020, 6:42 PM

Noah Prince
10/26/2020, 1:47 PM
lazy mode that would set it to lazily pull segments as they are requested, using an LRU cache. It should just take some modification to the SegmentDataManager and maybe the table manager.
This would allow using S3 as the primary storage, with Pinot as the query/caching layer for long-term historical tiers of data. Similar to the tiering example, you’d have a third set of lazy servers for reading data older than 2 weeks. This is explicitly to avoid large EBS volume costs for very large data sets.
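[Editor's note: the lazy-pull idea can be sketched with a plain LinkedHashMap in LRU mode. All class and method names here are hypothetical stand-ins, not Pinot APIs:]

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the proposed "lazy" mode: segments live in S3 (deep store) and
// are pulled into a bounded local cache only when a query touches them.
public class LazySegmentCache {
    private final LinkedHashMap<String, String> cache;
    private int s3Fetches = 0; // counts simulated downloads from deep store

    public LazySegmentCache(int maxSegments) {
        // accessOrder=true makes iteration order least-recently-used first,
        // so evicting the eldest entry on overflow yields LRU behavior.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSegments;
            }
        };
    }

    // Return segment data, downloading from S3 on a cache miss.
    public String getSegment(String name) {
        String data = cache.get(name); // get() refreshes the LRU position
        if (data == null) {
            s3Fetches++;               // simulate the S3 download
            data = "data-of-" + name;  // stand-in for the segment contents
            cache.put(name, data);     // put() may evict the eldest entry
        }
        return data;
    }

    public boolean isCached(String name) { return cache.containsKey(name); }
    public int s3Fetches() { return s3Fetches; }
}
```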
My main concern is this: a moderately sized dataset for us is 130GB a day. We have some that can be in the terabyte range per day. Using 500MB segments, you’re looking at ~260 segments a day, or roughly 95k segments a year. In this case, broker pruning is very important, because any segment query sent to the lazy server means materializing data from S3. This data is mainly time series, which means segments would be in time-bound chunks. Does the Pinot broker prune segments by time? How does the broker manage segments? Does it just keep an in-memory list of all segments for all tables? If so, metadata pruning will become a bottleneck for us on most queries. I’d like to see query time scale logarithmically with the size of the data.
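[Editor's note: on the logarithmic-scaling point — if per-segment time metadata is kept in a structure sorted by segment start time, pruning non-overlapping, time-bound segments becomes a range lookup rather than a scan of every entry. A minimal sketch with hypothetical names, not how the Pinot broker is actually implemented:]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Time-based segment pruning over in-memory segment metadata. Assumes
// segments don't overlap and each covers [startTime, startTime + spanMs).
// A sorted map finds candidates in O(log n + k) instead of scanning all n.
public class TimePruner {
    private final TreeMap<Long, String> segmentsByStart = new TreeMap<>();
    private final long spanMs; // assumed fixed time span per segment

    public TimePruner(long spanMs) { this.spanMs = spanMs; }

    public void addSegment(String name, long startTime) {
        segmentsByStart.put(startTime, name);
    }

    // Return only segments whose time range intersects [queryStart, queryEnd].
    public List<String> prune(long queryStart, long queryEnd) {
        // A segment starting at s overlaps the window iff
        // s <= queryEnd and s + spanMs > queryStart, i.e. s >= queryStart - spanMs + 1.
        List<String> selected = new ArrayList<>();
        for (Map.Entry<Long, String> e : segmentsByStart
                .subMap(queryStart - spanMs + 1, true, queryEnd, true)
                .entrySet()) {
            selected.add(e.getValue());
        }
        return selected;
    }
}
```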
Other concerns for us are around data types. Pinot does not seem to support data types we commonly use, like uint64, fixed-point decimals, etc. It also doesn’t seem to support nested data structures. How difficult would this be to add? Java’s BigInteger and BigDecimal could handle the former, assuming we implemented metadata handling. Nested data types are a little more nuanced.

Dharak Kharod
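[Editor's note: a sketch of the BigInteger/BigDecimal idea for uint64 and fixed-point values — a hypothetical helper, not a Pinot API:]

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Representing types Pinot lacks natively on top of Java's
// arbitrary-precision classes.
public class WideTypes {
    // Interpret the 64 bits of `raw` as an unsigned integer.
    public static BigInteger uint64(long raw) {
        return new BigInteger(Long.toUnsignedString(raw));
    }

    // Fixed-point: store an integer count of 10^-scale units (e.g. cents),
    // and reconstruct the decimal value losslessly on read.
    public static BigDecimal fixedPoint(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }
}
```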
10/26/2020, 9:51 PM

Noah Prince
10/27/2020, 12:54 AM

lâm nguyễn hoàng
10/27/2020, 8:45 PM

Seunghyun
10/27/2020, 8:54 PM

Noah Prince
10/28/2020, 2:02 PM
columns.psf, creation.meta, index_map, metadata.properties? I’m thinking for the S3 lazy loading, it might make sense to have separate caching settings for metadata vs columns.psf. Like, you may want to eagerly load all or most of the metadata, since it’s small and means segments can be eliminated quickly.

Noah Prince
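[Editor's note: the split caching policy suggested here could look roughly like this — tiny per-segment metadata (enough for time pruning) stays resident, while the large column data is fetched only for segments that survive pruning. Hypothetical names throughout:]

```java
import java.util.HashMap;
import java.util.Map;

// Segment store with a split policy: pruning metadata is eagerly loaded and
// always resident; bulky column data (the columns.psf role) is fetched
// from S3 on demand.
public class SplitSegmentStore {
    // always resident: segmentName -> {minTime, maxTime} for pruning
    private final Map<String, long[]> metadata = new HashMap<>();
    // on-demand: segmentName -> column data (bounded elsewhere, e.g. by an LRU)
    private final Map<String, String> columnData = new HashMap<>();
    private int s3ColumnFetches = 0;

    public void registerSegment(String name, long minTime, long maxTime) {
        metadata.put(name, new long[]{minTime, maxTime}); // eager, a few bytes
    }

    // Pruning consults only resident metadata: no S3 round trip.
    public boolean mayMatch(String name, long queryStart, long queryEnd) {
        long[] range = metadata.get(name);
        return range != null && range[0] <= queryEnd && range[1] >= queryStart;
    }

    // Column data is pulled lazily, only for segments that may match.
    public String columns(String name) {
        return columnData.computeIfAbsent(name, n -> {
            s3ColumnFetches++;          // simulate the S3 download
            return "psf-of-" + n;       // stand-in for columns.psf contents
        });
    }

    public int s3ColumnFetches() { return s3ColumnFetches; }
}
```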
10/28/2020, 6:23 PM

Ravi Chikkam
10/30/2020, 12:37 AM

Ravi Chikkam
10/30/2020, 2:51 AM

Tanmay Movva
10/30/2020, 7:07 AM
TypeError: Failed to fetch and an Undocumented response for any API call.
FYI, we have deployed Pinot in K8s.

Noah Prince
11/02/2020, 3:22 PM

Kenny Bastani
11/02/2020, 10:25 PM

Greg Simons
11/02/2020, 10:34 PM

Kenny Bastani
11/02/2020, 11:04 PM
@here notifications to this channel, but in this case, it’s important. Thanks everyone. calendly.com/karin-wolok

Chundong Wang
11/04/2020, 10:03 PM

Noah Prince
11/05/2020, 4:52 PM

vmarchaud
11/06/2020, 9:43 AM

vmarchaud
11/06/2020, 10:01 AM-Dplugins.include=pinot-gcs
and that its bundled by default, however i'm trying out the 0.6.0-RC
and i got the following error:
2020/11/06 09:07:02.367 ERROR [PluginManager] [main] Failed to load plugin [pinot-gcs] from dir [/opt/pinot/plugins/pinot-file-system/pinot-gcs]
java.lang.IllegalArgumentException: object is not an instance of declaring class