I have a `KeyedCoProcessFunction` in pyflink 1.16....
# troubleshooting
n
I have a
KeyedCoProcessFunction
in pyflink 1.16.0. When I call
ctx.get_current_key()
like the docs show, I just get
None
back instead of my key. Does this work for anyone else?
Makes it seem like
set_current_key
is never being called.
Seems like it works here for a
KeyedProcessFunction
I can call
set_current_key
on my own and things seem to work but 🥴
That doesn't resolve the fact that my datastream simply isn't being partitioned by selected keys. I'm doing things exactly like what is in that test module.
cc @Dian Fu in case you have any insight
_is_keyed_stream()
agrees that my connected stream is keyed
I can see my key selector properly returning the right keys in logs
Actually, I think I've just been misunderstanding how things are separated. I'm still surprised the key is not available when
process_element
is called. It seems it's only set in the timer context?
I think I was expecting unique instances of my process function to exist for each partition, but it seems that's not the case.
d
@Nathanael England I believe this is a bug…
Have created https://issues.apache.org/jira/browse/FLINK-31690 to track this issue.
n
So was it correct to think that each instance of my process function would only handle one key or would the same instance handle multiple keys?
d
@Nathanael England It will handle multiple keys as the key is already set into the state backend. The current key is not set into the context and so it’s not available to access it via
ctx.get_current_key
.
So if you don’t need to access the current key via
ctx.get_current_key
, there will be no problem. Otherwise, you need to manually compute the current key just as what you have done.
PS: Have already submitted a PR for this issue: https://github.com/apache/flink/pull/22323
🙌 1