# general
e
Would anyone be able to point me to either docs or code that would provide lower-level detail on the structure of a segment file and how to create one? Not how to use the admin tools to create a segment, but rather what the admin tool is doing to create a segment from an Avro input file, for example. I’m curious about the Segment Metadata Push bulk ingestion strategy [1], which seems to imply writing segments to one of a few distributed file systems first, and then informing the controller about the segments and their associated metadata. I suppose I’m looking for the generic internals to create a segment from input data. Is `SegmentGenerationUtils.java` [2] the right starting place? Thanks!
[1] https://docs.pinot.apache.org/basics/data-import/batch-ingestion#3-segment-metadata-push
[2] https://github.com/apache/incubator-pinot/blob/master/pinot-common/src/main/java/o[…]che/pinot/common/segment/generation/SegmentGenerationUtils.java
k
If you look at the `SegmentCreationMapper` class in the pinot-hadoop sub-project, it gives a fairly self-contained overview of what code is called to convert an input file to a segment. The key bit (in the `map()` method) is the call to `SegmentIndexCreationDriver.build()`.
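To make the shape of that call sequence concrete, here is a rough, self-contained sketch of the configure, init, then build pattern. Everything in it (`SketchConfig`, `SketchDriver`, `LoggingDriver`, `createSegment`, the file paths) is a hypothetical stand-in so the snippet runs without Pinot on the classpath; it is not Pinot's API, and the real driver does the actual reading and index writing. Check `SegmentCreationMapper` for the real classes and signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentBuildSketch {

    // Hypothetical stand-in for a segment generation config (NOT Pinot's class).
    static final class SketchConfig {
        final String inputFilePath;
        final String outputDir;
        final String segmentName;

        SketchConfig(String inputFilePath, String outputDir, String segmentName) {
            this.inputFilePath = inputFilePath;
            this.outputDir = outputDir;
            this.segmentName = segmentName;
        }
    }

    // Hypothetical stand-in mirroring an init-then-build driver (NOT Pinot's interface).
    interface SketchDriver {
        void init(SketchConfig config);

        // Returns the path of the segment directory this sketch pretends to write.
        String build();
    }

    // Records the steps instead of reading input or writing index files.
    static final class LoggingDriver implements SketchDriver {
        final List<String> steps = new ArrayList<>();
        private SketchConfig config;

        @Override
        public void init(SketchConfig config) {
            this.config = config;
            steps.add("init");
        }

        @Override
        public String build() {
            if (config == null) {
                throw new IllegalStateException("init() must be called before build()");
            }
            String segmentDir = config.outputDir + "/" + config.segmentName;
            steps.add("read " + config.inputFilePath);
            steps.add("write " + segmentDir);
            return segmentDir;
        }
    }

    // The flow a mapper-style caller follows: describe the job, init the driver, build.
    static String createSegment(SketchDriver driver, String inputFile, String outputDir, String segmentName) {
        driver.init(new SketchConfig(inputFile, outputDir, segmentName));
        return driver.build();
    }

    public static void main(String[] args) {
        LoggingDriver driver = new LoggingDriver();
        String segmentDir = createSegment(driver, "input/data.avro", "out", "myTable_0");
        System.out.println(segmentDir);
        System.out.println(driver.steps);
    }
}
```

For the metadata push strategy, the segment directory produced by the build step is what would then be packaged and copied to the deep store before the controller is told about it; that part is outside this sketch.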
e
Thanks Ken! Much appreciated 😁
m
Thanks @User