# troubleshooting
c
👋 Extremely new user here, just trying to load some data from Parquet files and finding the process really hard compared to Spark. Every option I see within Flink requires me to specify some sort of schema (example), but I would much prefer to just rely on Parquet’s built-in schema. Any advice?
m
Have you considered using the SQL implementation for Parquet? E.g. https://nightlies.apache.org/flink/flink-docs-stable/docs/connectors/table/formats/parquet/ ? I think that would be easier.
c
@Martijn Visser Thanks for the response. It seems like I have to specify a schema in that case too, no?
m
Yes, there’s no automatic import of Parquet’s schema to match it with the SQL type system.
Such a type system doesn’t exist in the DataStream API, so even more work is required there.
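Since the schema has to be written out by hand, one workaround is to generate the DDL from a list of column names and types. Below is a minimal sketch of that idea, not an official Flink feature: `build_create_table` and the `PARQUET_TO_FLINK_SQL` mapping are hypothetical helpers, the type labels on the left side of the mapping are illustrative, and the mapping only covers a few primitive types. The `(name, type)` pairs could, for instance, be collected with a tool such as pyarrow's `read_schema`, which is assumed here and not shown.

```python
# Hypothetical mapping from Parquet-style type labels to Flink SQL types.
# Only a handful of primitives are covered; extend it for your own data.
PARQUET_TO_FLINK_SQL = {
    "boolean": "BOOLEAN",
    "int32": "INT",
    "int64": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "string": "STRING",
    "binary": "BYTES",
}


def build_create_table(table_name, fields, path):
    """Build a Flink SQL CREATE TABLE statement for a Parquet directory.

    fields is a list of (column_name, parquet_type_label) pairs.
    """
    columns = []
    for name, parquet_type in fields:
        if parquet_type not in PARQUET_TO_FLINK_SQL:
            # Fail loudly instead of guessing a type for unknown labels.
            raise ValueError(f"no Flink SQL mapping for type {parquet_type!r}")
        columns.append(f"  `{name}` {PARQUET_TO_FLINK_SQL[parquet_type]}")

    column_block = ",\n".join(columns)
    return (
        f"CREATE TABLE `{table_name}` (\n"
        f"{column_block}\n"
        f") WITH (\n"
        f"  'connector' = 'filesystem',\n"
        f"  'path' = '{path}',\n"
        f"  'format' = 'parquet'\n"
        f")"
    )


if __name__ == "__main__":
    # Table name and path are placeholders.
    print(build_create_table(
        "my_parquet_table",
        [("f7", "double"), ("f4", "int32"), ("f99", "string")],
        "/path/to/parquet/dir",
    ))
```

The resulting string could then be passed to the Table API's `execute_sql`. Nested and logical types (decimals, timestamps, lists) would need extra handling that this sketch leaves out.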
c
Maybe I am missing something. This code from the DataStream API reference looks like a type system too:
row_type = DataTypes.ROW([
    DataTypes.FIELD('f7', DataTypes.DOUBLE()),
    DataTypes.FIELD('f4', DataTypes.INT()),
    DataTypes.FIELD('f99', DataTypes.VARCHAR()),
])
What is the difference you are referring to?