# troubleshooting
Hey Flinkers, just circling back to an earlier issue I was encountering with the InvalidPidMappingException (related to several sinks that use exactly-once processing). Earlier today I deployed a Flink job in a GKE environment. The job had been running without issue; however, after suspending it (taking a savepoint) and restoring from that savepoint after the deployment, I began seeing the following error on the job:
Caused by: org.apache.kafka.common.errors.InvalidPidMappingException: The producer attempted to use a producer id which is not currently assigned to its transactional id.
It appears to stem from restoring the job after the deployment, since the job dies almost immediately. I’ve triply confirmed that nothing really changed in the job as it relates to transactions; however, the following changes were made, which may or may not be relevant:
• The addition of a rebalance() operator upstream of the sink.
• The topic that one of the sinks writes to has changed (the UID was not updated, but the topic is string-based, so it shouldn’t cause any schema issues).
The issue seems to affect all of the sinks that use exactly-once processing. Any advice or recommendations would be welcome, and I'm glad to share more details and specifics about the job if that would be helpful.