Apache Flink

Hi everyone,

I’m using the ‘filesystem’ connector to sink data into S3 in ‘parquet’ format through the Table API. I noticed that the PARTITIONED BY columns are missing from the Parquet files. Here are the queries I’m using:

`CREATE TABLE data_to_sink (`
	`record_id STRING NOT NULL,`
	`request_id STRING NOT NULL,`
	`source_name STRING NOT NULL,`
	`event_type STRING NOT NULL,`
	`event_name STRING NOT NULL,`
	`` `date` STRING, ``
	`results_count BIGINT`
`` ) PARTITIONED BY (record_id, source_name, `date`) WITH ( ``
    `'connector' = 'filesystem',`
    `'path' = '<S3 path>',`
    `'format' = 'parquet'`
`);`

`INSERT INTO data_to_sink`
`SELECT record_id, request_id, source_name, event_type, event_name,`
`DATE_FORMAT(TUMBLE_END(proc_time, INTERVAL '2' MINUTE), 'yyyy-MM-dd') AS record_date, COUNT(*) results_count`
`FROM data_from_source`
`GROUP BY record_id, request_id, source_name, event_type, event_name, TUMBLE(proc_time, INTERVAL '2' MINUTE);`

I can see the Parquet files being created, but when I inspected their schema with the parquet-cli tool, the record_id, source_name, and `date` fields were not there. I checked the documentation but didn’t find any setting for this. Is this expected?
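
One thing worth checking alongside the file schema is the directory layout. If the sink follows Hive-style partitioning, the partition values would be encoded in the object path as `key=value` directories rather than stored inside each file. Below is a minimal Python sketch of reading them back from a path. The path is made up for illustration, `parse_partition_path` is a hypothetical helper, and the percent-decoding step is an assumption about how special characters get escaped.

```python
from urllib.parse import unquote


def parse_partition_path(path: str) -> dict[str, str]:
    """Extract Hive-style key=value partition segments from a file path."""
    partitions = {}
    for segment in path.split("/"):
        # Only segments of the form key=value are treated as partition data.
        key, sep, value = segment.partition("=")
        if sep and key:
            # Assumption: special characters in values are percent-encoded.
            partitions[key] = unquote(value)
    return partitions


# Hypothetical object key, not taken from a real bucket.
example = "bucket/prefix/record_id=r-1/source_name=web/date=2024-01-01/part-0.parquet"
print(parse_partition_path(example))
```

If the listing under the S3 path shows directories of this shape, the three columns are recoverable from the path even though parquet-cli does not report them in the file schema.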