# random
s
Is that correct? Flink Kafka producer ack level == delivery guarantee? Like NONE = 0, AT_LEAST_ONCE = 1, EXACTLY_ONCE = "all"?
o
No: this is about the acknowledgement from Kafka, which says something about durability, not about delivery guarantees, though the two are related. In short:
• acks=0 means no guarantees on durability (not even that the data has been received by Kafka)
• acks=1 means the partition leader has stored the data, but if it crashes before the data is replicated, the data may be lost
• acks=-1/all means the data has been replicated to the in-sync replicas as well
See https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html#acks
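For reference, this is what the `acks` setting looks like on a plain kafka-clients producer (outside Flink). A minimal sketch only; the broker address and topic name are placeholders:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // "0"   -> fire-and-forget: no durability guarantee at all
        // "1"   -> the partition leader has written the record; replicas may still lag
        // "all" (or "-1") -> all in-sync replicas have the record before the ack
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value")); // placeholder topic
        }
    }
}
```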
s
many thanks @Oscar Westra van Holthe - Kind. Does that mean I can configure any acks level I want in my Kafka producer? I can't find any good documentation on how to properly pass an acks level into my Flink Kafka producer
and how to check what the default acks level is in the Flink Kafka producer
m
You just have to configure checkpointing with the proper delivery guarantees
You can't pass ack levels yourself
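The point above, as a sketch: with the `KafkaSink` builder you configure the delivery guarantee and checkpointing, and the connector picks suitable producer settings (including acks) itself. Broker address, topic name, and checkpoint interval below are assumptions:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AtLeastOnceSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10_000); // checkpoint interval (ms) is an assumption

        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092") // assumed broker address
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("my-topic")          // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // You configure the guarantee, not the producer's acks level:
                .setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
                .build();

        env.fromElements("a", "b", "c").sinkTo(sink);
        env.execute("at-least-once-sketch");
    }
}
```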
s
@Martijn Visser thanks! so my original delivery_guarantee ↔ Kafka acks level mapping is correct?
deliv_guarantee NONE          = kafka producer ack: 0
deliv_guarantee AT_LEAST_ONCE = kafka producer ack: 1
deliv_guarantee EXACTLY_ONCE  = kafka producer ack: "all" ?
o
Again: No, it is not correct. The two are related, but definitely different things. The only thing you can say is that acks=0 means you can never offer more than "at most once", but all other guarantees ("exactly once" or "at least once") depend on other settings like the use of transactions and checkpoints.
Besides, as @Martijn Visser mentioned, you cannot pass ack levels anyway, so drop them from your mind. Use checkpointing instead.
Below is a link to a guide to delivery guarantees. It does not mention producer ack settings, but it does mention that for "exactly once" you need to enable transactions (i.e., exactly-once semantics on Kafka) and specify a transaction id. Oh, and enable checkpointing.
https://medium.com/cloudera-inc/a-simple-guide-to-processing-guarantees-in-apache-flink-ca7e70431fdc
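The "exactly once" requirements mentioned above (transactions via a transactional id, plus checkpointing) can be sketched like this; broker address, topic, id prefix, and checkpoint interval are all placeholders:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceSinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing is mandatory for EXACTLY_ONCE: Kafka transactions are
        // committed as part of the checkpoint cycle.
        env.enableCheckpointing(10_000); // interval (ms) is an assumption

        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092") // assumed broker address
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("my-topic")          // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                // A transactional id prefix is required for EXACTLY_ONCE;
                // the name here is a placeholder.
                .setTransactionalIdPrefix("my-app-txn")
                .build();

        env.fromElements("a", "b", "c").sinkTo(sink);
        env.execute("exactly-once-sketch");
    }
}
```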
s
thanks @Oscar Westra van Holthe - Kind, nice article, but I still can't understand the following. For example:
1. I have Flink with checkpointing configured. In the job config I have
streamEnv.getCheckpointConfig.setCheckpointingMode(CheckpointingMode.AT_LEAST_ONCE)
2. in the KafkaSink I have not configured
.setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
What will the ack configuration in my Kafka sink be?
o
Your delivery guarantee will be “at least once”, with whatever ack configuration is suitable to make that happen.
s
Our Kafka is sometimes pretty unstable (low in-sync replicas) and our devops team blames us: "set THE acks = 1"
o
I’m sorry if this doesn’t answer your question, but there is no information to be had on this. The difference between acks=1 and acks=all is negligible with respect to delivery guarantees. It only influences how much of the world must crash to break those guarantees.
m
I think you need to bring the discussion back to "what is the desired output"
s
@Martijn Visser agree. but they claim we don't even know what an ack level is 🙂 and actually it's really interesting
m
It's an apples and oranges comparison. Low in-sync replicas are related to the way Kafka replicates; see for example https://docs.confluent.io/kafka/design/replication.html If you're looking at Flink and Kafka, that has nothing to do with how Kafka replicates messages, since that's a Kafka cluster/broker thing. Transactions in Flink and Kafka are about how you achieve exactly-once semantics. See https://www.confluent.io/blog/transactions-apache-kafka/