# troubleshooting
y
Hey, I have a Flink job on Java 11. After updating Flink from 1.15.2 to 1.16.2 I receive tons of these:
will be processed as GenericType. Please read the Flink documentation on "Data Types & Serialization" for details of the effect on performance and schema
On 1.15 it was totally fine and used POJO. Is there something I'm missing here? Why isn't Flink using POJO anymore?
m
I don't think anything was changed on this level. Could it be that this is a new job that you started on 1.16 and that now shows this message, while you didn't see it in 1.15 because you restarted from an existing job there? And that you would actually see the same message in 1.15 if you started a new job there too?
y
Well, we were running the code on 1.15 for months, with multiple restarts (started completely from scratch: deleted the flinkdeployment and then redeployed). The only code change I made was changing 1.15.2 to 1.16.2, and only then did these logs appear. Is there a way to enforce using POJO, maybe? Or maybe there are some default configs for "unknown type"?
m
You can consider disabling generic types; then Flink will throw an UnsupportedOperationException if it encounters a data type that would go through Kryo
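For reference, a minimal sketch of what disabling generic types could look like as configuration. It uses the documented pipeline.generic-types option; that the setting belongs in flink-conf.yaml (or the Flink configuration section of your deployment) is an assumption about this particular setup. The in-code counterpart is env.getConfig().disableGenericTypes().

```yaml
# Assumption: placed in flink-conf.yaml or the deployment's Flink configuration.
# With generic types disabled, the job fails with an UnsupportedOperationException
# instead of silently falling back to Kryo for a type.
pipeline.generic-types: false
```

This is mostly useful as a diagnostic: the exception points at the first type that would otherwise have been handled as a GenericType.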
Ah I think I know why it's being logged
Can it be that your POJO contains a generic type?
y
yep, definitely can 🙂
so it was the same before, and now I have a warning
makes sense 😉
m
Indeed
y
So, if I have, let's say, a "List" or "Map", changing it to "ArrayList" or "HashMap" may help speed things up and use a real POJO without GenericType?
m
Yes. Generic types are serialized with Kryo and that's definitely slower. See https://flink.apache.org/2020/04/15/flink-serialization-tuning-vol.-1-choosing-your-serializer-if-you-can/ for some benchmarks
🙌 1
y
Thanks, Martijn.
m
Yw
y
Last question: is the same true for state? Will state be faster if I use "HashSet" instead of "Set"?
m
That I'm not 100% sure of, but I do think so
y
Thanks once again 🙂