Artur Jablonski
09/16/2025, 1:43 PM

Amanda
09/16/2025, 3:12 PM

Abhilash Mandaliya
09/17/2025, 11:30 AM

Fabri
09/18/2025, 4:22 AM

Vaibhav Swarnkar
09/19/2025, 10:10 AM

Abhilash Mandaliya
09/19/2025, 11:33 AM
EFFECTIVELY_ONCE guarantee and the sink calls the record.fail()?

KP
09/19/2025, 6:53 PM

benjamin99
09/20/2025, 9:14 AM
{
"level": "warn",
"time": "2025-09-20T07:01:17.961738475Z",
"component": "public-rpc-server",
"error": {
"error": "oxia: failed to append to wal: 535046 can not immediately follow 459362: oxia: invalid next offset in wal",
"kind": "*errors.withStack",
"stack": null
},
"namespace": "bookkeeper",
"peer": "10.194.131.14:52392",
"shard": 6,
"message": "Write stream has been closed by error"
}
I searched for this topic on the Oxia GitHub page but found no related issues or discussions. Has anyone faced a similar issue before, or does anyone have a clue about how to resolve it?

Gergely Fábián
09/20/2025, 4:23 PM

bhasvij
09/23/2025, 4:37 PM

Lari Hotari
09/27/2025, 1:09 PM

Artur Jablonski
09/30/2025, 6:24 AM

ck_xnet
10/01/2025, 1:01 PM

Lari Hotari
10/02/2025, 4:05 AM

Fabri
10/03/2025, 8:38 PM

Praveen Gopalan
10/06/2025, 6:37 AM

zaryab
10/09/2025, 1:02 PM

Nicolas Belliard
10/15/2025, 2:18 PM
delayedDeliveryTrackerFactoryClassName.
We initially used InMemoryDelayedDeliveryTracker (because we were using Pulsar 2.7), which caused acknowledged delayed messages to be reprocessed after a broker restart, likely because its state is stored only in memory. Given our high message volume (millions), this behavior is problematic. A screenshot is available showing the lag escalation following a broker restart. We're generating delayed messages out of sequence, resulting in gaps within the acknowledged message stream. This causes non-contiguous ranges of messages to be marked as deleted or eligible for deletion. In our screenshot, the value of nonContiguousDeletedMessagesRanges is 16833.
To mitigate this, after upgrading Pulsar to version 4.0.4 we updated the broker config to use org.apache.pulsar.broker.delayed.BucketDelayedDeliveryTrackerFactory, which should persist delayed delivery metadata to disk via BookKeeper ledger buckets.
However, after switching to the bucket-based tracker, we're still seeing the same behavior post-restart. A few observations and questions:
• I checked the pulsar_delayed_message_index_loaded metric and noticed that messages are still being loaded into memory, while pulsar_delayed_message_index_bucket_total remains at zero. Is this expected? Shouldn’t the bucket tracker be persisting and loading from disk?
• Are there additional broker settings required to fully enable bucket-based delayed delivery tracking? For example:
◦ Do we need to explicitly configure delayedDeliveryTrackerBucketSize or delayedDeliveryMaxNumBuckets?
◦ Is there any dependency on topic-level settings or namespace policies that could override the broker-level tracker configuration?
◦ Could other settings interfere with delayed message persistence?
Any insights or guidance would be greatly appreciated. Thanks for your help!

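For comparison, a minimal broker.conf sketch for the bucket-based tracker looks like the following. The factory class name comes from the message above; the other keys and values are assumptions and should be verified against the broker.conf shipped with the Pulsar 4.0.x release actually deployed.

# Hedged broker.conf sketch for bucket-based delayed delivery; verify every key
# and default against the broker.conf of the Pulsar version in use.
delayedDeliveryEnabled=true
delayedDeliveryTrackerFactoryClassName=org.apache.pulsar.broker.delayed.BucketDelayedDeliveryTrackerFactory
# Assumed tuning knobs for the bucket tracker (values here are illustrative only).
delayedDeliveryMaxNumBuckets=50
delayedDeliveryMinIndexCountPerBucket=50000
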
benjamin99
10/17/2025, 3:01 AM

Jonatan Bien
10/21/2025, 7:14 PM

Margaret Figura
10/22/2025, 4:00 PM
println() or other work per message). Again, CPU usage is under 10% for all components, but I see the same small drops.
I started debugging and found that Pulsar drops messages because the Netty connection's .isWritable() returns false, which causes Pulsar to drop immediately. Per the Netty javadoc, isWritable() "returns true if and only if the I/O thread will perform the requested write operation immediately", meaning there is room available in Netty's ChannelOutboundBuffer. I found that if I increase the Netty low/high water marks the drops go away, but that isn't possible without a code change to the Pulsar broker.
I'm looking for any suggestions on different configurations I can try. Thanks!!

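For context on the change being described: raising Netty's write-buffer water marks is normally done on the server bootstrap, roughly as in the sketch below. This is not an existing Pulsar broker setting; the thresholds are made-up values, and the snippet only illustrates the kind of broker-side code change referred to above.

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelOption;
import io.netty.channel.WriteBufferWaterMark;

public class WaterMarkSketch {
    // Hypothetical thresholds; Netty's defaults are 32 KiB (low) and 64 KiB (high).
    private static final int LOW_WATER_MARK = 64 * 1024;
    private static final int HIGH_WATER_MARK = 256 * 1024;

    public static void configure(ServerBootstrap bootstrap) {
        // Channel.isWritable() flips to false once the bytes pending in the
        // ChannelOutboundBuffer exceed the high water mark, and back to true
        // only after they drain below the low water mark.
        bootstrap.childOption(ChannelOption.WRITE_BUFFER_WATER_MARK,
                new WriteBufferWaterMark(LOW_WATER_MARK, HIGH_WATER_MARK));
    }
}
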
Vaibhav Swarnkar
10/25/2025, 7:26 PM

Kiryl Valkovich
10/26/2025, 7:42 PM

Andrew
10/29/2025, 5:11 AM

David K
10/29/2025, 12:47 PM

Jack Pham
10/29/2025, 5:41 PM
..<subscription>-<consumerName>-DLQ) be as well, since it uses the consumer name in the producer's name? Will the consumer stop consuming messages if this happens?
We are using Pulsar client 4.0.0, where the producer name is constructed as:
.producerName(String.format("%s-%s-%s-DLQ", this.topicName, this.subscription, this.consumerName))

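This does not answer the length-limit question, but as a point of reference, here is a minimal sketch (with hypothetical topic, subscription, and consumer names) of the client-side settings involved: an explicitly short consumerName, and an explicit deadLetterTopic that replaces the default <topic>-<subscription>-DLQ style name, so the names derived from them stay short.

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.DeadLetterPolicy;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.client.api.SubscriptionType;

public class DlqConsumerSketch {
    public static Consumer<byte[]> subscribe(PulsarClient client) throws Exception {
        return client.newConsumer(Schema.BYTES)
                .topic("persistent://my-tenant/my-ns/orders")   // hypothetical topic
                .subscriptionName("orders-sub")                 // hypothetical subscription
                .subscriptionType(SubscriptionType.Shared)
                // Keeping the consumer name short keeps any name derived from it short too.
                .consumerName("c1")
                .deadLetterPolicy(DeadLetterPolicy.builder()
                        .maxRedeliverCount(5)
                        // Explicit DLQ topic name instead of relying on default naming.
                        .deadLetterTopic("persistent://my-tenant/my-ns/orders-dlq")
                        .build())
                .subscribe();
    }
}
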
Romain
10/29/2025, 7:23 PM
schemaValidationEnforced=true and isAllowAutoUpdateSchema=false (only under an approved process), so only admins can push schemas.
Here’s the issue: when a consumer is configured with a DeadLetterPolicy and a message fails too many times (or is negatively acknowledged repeatedly), the client will publish the message to a dead-letter topic (default name <topic>-<subscription>-DLQ) after the redelivery threshold.
That topic doesn't necessarily exist ahead of time, so when it's first used it may trigger topic creation and/or schema registration. Because our namespace forbids auto schema updates and enforces schemas, this can fail - the consumer isn't authorized to register the schema for the DLQ topic.
To work around this, we’re creating a separate namespace (e.g., <namespace>-dlq) where:
• isAllowAutoUpdateSchema=true
• schemaValidationEnforced=false
• so consumers can safely publish DLQ messages without schema conflicts.
Is this the recommended approach? Is there a cleaner way to allow DLQ schema creation while keeping production namespaces locked down?
Any official guidance or community best practices would be really appreciated 🙏
Thanks!

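One alternative worth considering, sketched below under assumptions (hypothetical admin URL, DLQ topic name, and payload class), is to have an admin pre-create the DLQ topic and register its schema up front, so consumers never need schema-registration permission and the production namespace can stay locked down. Whether this is sufficient with schemaValidationEnforced=true should be verified in a test namespace first.

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.Schema;

public class DlqProvisioningSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical admin endpoint and DLQ topic name; adjust to your environment.
        String dlqTopic = "persistent://my-tenant/my-ns/orders-my-sub-DLQ";

        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://pulsar-broker:8080")
                .build()) {
            // Create the DLQ topic up front so the consumer's internal DLQ producer
            // never triggers topic auto-creation.
            admin.topics().createNonPartitionedTopic(dlqTopic);
            // Register the same schema the source topic uses, as an admin, so the
            // locked-down namespace policies never have to be relaxed.
            admin.schemas().createSchema(dlqTopic, Schema.AVRO(OrderEvent.class).getSchemaInfo());
        }
    }

    // Hypothetical payload type standing in for the real message schema.
    public static class OrderEvent {
        public String orderId;
        public double amount;
    }
}
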
Francesco Animali
10/30/2025, 8:52 AM

Chaitanya Gudipati
11/05/2025, 3:48 PM

Jack Pham
11/05/2025, 11:11 PM