# troubleshooting

Tommaso Peresson

09/27/2022, 1:02 PM
Hello everybody, I have a question for you. Is it possible to modify the metadata of a segment? I would like to:
• create the segments with Spark and store them in HDFS
• move them with distcp to GCS
• load them with a metadata push to the cluster
but this leaves me with segments having
"custom.map": "{\"input.data.file.uri\":\"hdfs://***\"}",
and instead I would want to have something like
"custom.map": "{\"input.data.file.uri\":\"gs://***\"}",
so that the segment fetcher would know where to get the data from. Do you know if it's possible to do what I'm asking? Thanks
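For context, here is a minimal sketch of what the metadata-push job spec for the pipeline above might look like, assuming the standalone push runner; the bucket, project, paths and controller address are placeholders, not values from this thread (check the exact spec keys against the docs for your Pinot version):

# Hypothetical job spec for the metadata-push step; segments are assumed to already be in GCS after distcp.
executionFrameworkSpec:
  name: 'standalone'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentMetadataPushJobRunner'
jobType: SegmentMetadataPush
# The GCS copy of the segments; this is the location the push job advertises to the controller.
outputDirURI: 'gs://<bucket>/<path-to-segments>/'
includeFileNamePattern: 'glob:**/*.tar.gz'
pinotFSSpecs:
  - scheme: gs
    className: org.apache.pinot.plugin.filesystem.GcsPinotFS
    configs:
      projectId: '<gcp-project-id>'
      gcpKey: '<path-to-service-account-json>'
tableSpec:
  tableName: '<table_name>'
pinotClusterSpecs:
  - controllerURI: 'http://<controller-host>:9000'

The point of the sketch is that outputDirURI is the same GCS location the segments were copied to with distcp, so the push job reads the segment metadata from gs:// and registers those URIs with the controller.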

Neha Pawar

09/27/2022, 3:55 PM
if you’ve moved segments to gcs and provided that path as “outputDir” in your segment push job, according to the code you should not be seeing “hdfs” anymore. Are you talking about the metadata.properties inside the segment or the SegmentMetadata in zookeeper? The data fetcher only uses the one from zookeeper, which I believe should be correctly set

Tommaso Peresson

09/27/2022, 4:01 PM
Hello, I'm talking about the metadata that I can fetch from
/segments/{tableName}/{segmentName}/metadata
API, which from my understanding is stored in ZK. Correct?

Neha Pawar

09/27/2022, 4:03 PM
i believe so. but the field you want to look at is “segment.download.url”

Tommaso Peresson

09/27/2022, 4:03 PM
{
  "id": "<segment_id>",
  "simpleFields": {
    "segment.crc": "2408365506",
    "segment.creation.time": "1664198489345",
    "segment.download.url": "http://<controller-host>:9000/segments/<table_name>/<segment_id>",
    "segment.end.time": "1651536000000",
    "segment.index.version": "v3",
    "segment.push.time": "1664199765768",
    "segment.start.time": "1651536000000",
    "segment.time.unit": "MILLISECONDS",
    "segment.total.docs": "74478"
  },
  "mapFields": {
    "custom.map": {
      "input.data.file.uri": "hdfs://<path-to-segment-in-hdfs>"
    }
  },
  "listFields": {}
}
this field points to the controller directly
also, what populates the
"mapFields": {
    "custom.map": {
      "input.data.file.uri": "hdfs://<path-to-segment-in-hdfs>"
    }
  },
field?

Neha Pawar

09/27/2022, 4:05 PM
that means deep store is not set up correctly, and it is defaulting to using the controller disk. have you added the configs for gcs deep store in controller/server config? https://docs.pinot.apache.org/basics/data-import/pinot-file-system/import-from-gcp
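For reference, the linked page boils down to settings along these lines; this is a sketch with the project, key path and data dir as placeholders, so double-check the exact keys against the docs for your Pinot version:

# Controller config: use GCS as deep store and register the gs:// scheme.
controller.data.dir=gs://<bucket>/<deep-store-path>
controller.local.temp.dir=/tmp/pinot-controller-tmp
pinot.controller.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.controller.storage.factory.gs.projectId=<gcp-project-id>
pinot.controller.storage.factory.gs.gcpKey=<path-to-service-account-json>
pinot.controller.segment.fetcher.protocols=file,http,gs
pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcherFactory

# Server config: same plugin and fetcher so servers can pull segments straight from gs://.
pinot.server.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.server.storage.factory.gs.projectId=<gcp-project-id>
pinot.server.storage.factory.gs.gcpKey=<path-to-service-account-json>
pinot.server.segment.fetcher.protocols=file,http,gs
pinot.server.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcherFactory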

Tommaso Peresson

09/27/2022, 4:06 PM
That might be the issue. I'll work on this
Is there a way to check the current config of the instances?
Where are the controller/server properties read from? Are they fetched from the live controller through the API or from the local deployment? I'm asking because in my case the live cluster runs on k8s and the ingestion job runs from a different machine that just has the default config. If I modify the segment download URI manually to GCS I can reload them from the cluster and the config seems to be propagating correctly on the cluster.
OK, I fixed it by adding the config to the local deployment used to run the ingestion, thanks
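Once the GCS filesystem config is also present where the push job runs, the ZK segment metadata should come out with a deep-store download URL instead of a controller URL, roughly like the fragment below (illustrative placeholder values, not taken from the thread):

"simpleFields": {
  "segment.download.url": "gs://<bucket>/<path-to-segments>/<segment_id>.tar.gz"
}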