https://pulsar.apache.org/ logo
Join SlackCommunities
Powered by
# general
  • s

    Soumya Ghosh

    01/16/2025, 8:49 PM
    We are observing an intermittent issue where consumer does not receive messages but there is a non-zero backlog in the subscription. Setup summary: Pulsar 4.0.1, 3 brokers, 5 bookies, 3 ZKs, Produce and Consume through Pulsar Java client (3.3.3) When we noticed this issue first, we couldn’t figure out what was and then after 2-3 minutes the data in backlog was delivered. But during that we did run CLI to consume messages from Earliest and received no messages. This time when issue was observed again, we again ran CLI from the broker node itself to consume messages from earliest. We observed that no messages were delivered for 10 minutes and then suddenly all backlog messages were delivered to original consumer and CLI consumer. The backlog in Java consumer was 30. The backlog in CLI consumer was 2697 as we were configured the subscription to consume from earliest message. Topics details Partition - 1 Dispatch throttling - 1 MB/s Publish throttling - 1 MB/s Retention - 1 day Message TTL - 1 day Pulsar topic stats, noted that blockedConsumerOnUnackedMsgs and blockedSubscriptionOnUnackedMsgs was false when backlog messages were not delivered.
    Copy code
    {
      "msgRateIn": 0,
      "msgThroughputIn": 0,
      "msgRateOut": 0,
      "msgThroughputOut": 0,
      "bytesInCounter": 270413,
      "msgInCounter": 2697,
      "systemTopicBytesInCounter": 0,
      "bytesOutCounter": 358717,
      "msgOutCounter": 3577,
      "bytesOutInternalCounter": 0,
      "averageMsgSize": 0,
      "msgChunkPublished": false,
      "storageSize": 270413,
      "backlogSize": 270413,
      "backlogQuotaLimitSize": -1,
      "backlogQuotaLimitTime": -1,
      "oldestBacklogMessageAgeSeconds": 29328,
      "oldestBacklogMessageSubscriptionName": "java-sdk",
      "publishRateLimitedTimes": 0,
      "earliestMsgPublishTimeInBacklogs": 0,
      "offloadedStorageSize": 268368,
      "lastOffloadLedgerId": 0,
      "lastOffloadSuccessTimeStamp": 0,
      "lastOffloadFailureTimeStamp": 0,
      "ongoingTxnCount": 0,
      "abortedTxnCount": 0,
      "committedTxnCount": 0,
      "publishers": [],
      "waitingPublishers": 0,
      "subscriptions": {
        "console_sub": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "bytesOutCounter": 0,
          "msgOutCounter": 0,
          "msgRateRedeliver": 0,
          "messageAckRate": 0,
          "chunkedMessageRate": 0,
          "msgBacklog": 2697,
          "backlogSize": 270413,
          "earliestMsgPublishTimeInBacklog": 0,
          "msgBacklogNoDelayed": 2697,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "msgInReplay": 0,
          "unackedMessages": 0,
          "type": "Failover",
          "msgRateExpired": 0,
          "totalMsgExpired": 0,
          "lastExpireTimestamp": 0,
          "lastConsumedFlowTimestamp": 0,
          "lastConsumedTimestamp": 0,
          "lastAckedTimestamp": 0,
          "lastMarkDeleteAdvancedTimestamp": 0,
          "consumers": [
            {
              "msgRateOut": 0,
              "msgThroughputOut": 0,
              "bytesOutCounter": 0,
              "msgOutCounter": 0,
              "msgRateRedeliver": 0,
              "messageAckRate": 0,
              "chunkedMessageRate": 0,
              "availablePermits": 1000,
              "unackedMessages": 0,
              "avgMessagesPerEntry": 0,
              "blockedConsumerOnUnackedMsgs": false,
              "drainingHashesCount": 0,
              "drainingHashesClearedTotal": 0,
              "drainingHashesUnackedMessages": 0,
              "lastAckedTimestamp": 0,
              "lastConsumedTimestamp": 0,
              "lastConsumedFlowTimestamp": 0,
              "lastAckedTime": "1970-01-01T00:00:00Z",
              "lastConsumedTime": "1970-01-01T00:00:00Z"
            }
          ],
          "isDurable": true,
          "isReplicated": false,
          "allowOutOfOrderDelivery": false,
          "consumersAfterMarkDeletePosition": {},
          "drainingHashesCount": 0,
          "drainingHashesClearedTotal": 0,
          "drainingHashesUnackedMessages": 0,
          "nonContiguousDeletedMessagesRanges": 0,
          "nonContiguousDeletedMessagesRangesSerializedSize": 0,
          "delayedMessageIndexSizeInBytes": 0,
          "subscriptionProperties": {},
          "filterProcessedMsgCount": 0,
          "filterAcceptedMsgCount": 0,
          "filterRejectedMsgCount": 0,
          "filterRescheduledMsgCount": 0,
          "durable": true,
          "replicated": false
        },
        "java-sdk": {
          "msgRateOut": 0,
          "msgThroughputOut": 0,
          "bytesOutCounter": 358717,
          "msgOutCounter": 3577,
          "msgRateRedeliver": 0,
          "messageAckRate": 0,
          "chunkedMessageRate": 0,
          "msgBacklog": 30,
          "backlogSize": 3069,
          "earliestMsgPublishTimeInBacklog": 0,
          "msgBacklogNoDelayed": 30,
          "blockedSubscriptionOnUnackedMsgs": false,
          "msgDelayed": 0,
          "msgInReplay": 0,
          "unackedMessages": 0,
          "type": "Failover",
          "msgRateExpired": 0,
          "totalMsgExpired": 0,
          "lastExpireTimestamp": 0,
          "lastConsumedFlowTimestamp": 0,
          "lastConsumedTimestamp": 0,
          "lastAckedTimestamp": 0,
          "lastMarkDeleteAdvancedTimestamp": 0,
          "consumers": [
            {
              "msgRateOut": 0,
              "msgThroughputOut": 0,
              "bytesOutCounter": 0,
              "msgOutCounter": 0,
              "msgRateRedeliver": 0,
              "messageAckRate": 0,
              "chunkedMessageRate": 0,
              "availablePermits": 1000,
              "unackedMessages": 0,
              "avgMessagesPerEntry": 0,
              "blockedConsumerOnUnackedMsgs": false,
              "drainingHashesCount": 0,
              "drainingHashesClearedTotal": 0,
              "drainingHashesUnackedMessages": 0,
              "lastAckedTimestamp": 0,
              "lastConsumedTimestamp": 0,
              "lastConsumedFlowTimestamp": 0,
              "lastAckedTime": "1970-01-01T00:00:00Z",
              "lastConsumedTime": "1970-01-01T00:00:00Z"
            }
          ],
          "isDurable": true,
          "isReplicated": false,
          "allowOutOfOrderDelivery": false,
          "consumersAfterMarkDeletePosition": {},
          "drainingHashesCount": 0,
          "drainingHashesClearedTotal": 0,
          "drainingHashesUnackedMessages": 0,
          "nonContiguousDeletedMessagesRanges": 0,
          "nonContiguousDeletedMessagesRangesSerializedSize": 0,
          "delayedMessageIndexSizeInBytes": 0,
          "subscriptionProperties": {},
          "filterProcessedMsgCount": 0,
          "filterAcceptedMsgCount": 0,
          "filterRejectedMsgCount": 0,
          "filterRescheduledMsgCount": 0,
          "durable": true,
          "replicated": false
        }
      },
      "replication": {},
      "nonContiguousDeletedMessagesRanges": 0,
      "nonContiguousDeletedMessagesRangesSerializedSize": 0,
      "delayedMessageIndexSizeInBytes": 0,
      "compaction": {
        "lastCompactionRemovedEventCount": 0,
        "lastCompactionSucceedTimestamp": 0,
        "lastCompactionFailedTimestamp": 0,
        "lastCompactionDurationTimeInMills": 0
      },
      "metadata": {
        "partitions": 1,
        "deleted": false
      },
      "partitions": {}
    }
    Attached screenshots, backlog of 2.7k is captured when CLI consumer is started and it recovers after 10 minutes. The metrics for dispatch rate and dispatch throughput spikes at 12:13 when messages are delivered to CLI and a Java application.
    l
    • 2
    • 3
  • d

    David K

    01/16/2025, 9:06 PM
    Copy code
    "lastAckedTime" : "1970-01-01T00:00:00Z",
     "lastConsumedTime" : "1970-01-01T00:00:00Z"
  • d

    David K

    01/16/2025, 9:06 PM
    Seems a bit off.
  • s

    Sandesh Shingare

    01/17/2025, 1:18 PM
    Hello, What is the default value of
    maxPendingMessages
    in pulsar Java producer client? The doc says 0, meaning the check will be disabled by default. We have set
    blockIfQueueFull
    to true in our producer without explicitly setting
    maxPendingMessages
    config. In this case at what queue size the producer send calls will be blocked? I see 166 being shown as
    maxPendingMessages
    in the producer stats when I run my application. Can someone please help understand this? Pulsar client version: 4.0.1
    • 1
    • 1
  • j

    Jerome B

    01/17/2025, 11:24 PM
    Hi, is it possible that using backoff on a topic with subscription with a lot of message not consumed (I mean about 1 million), will overload pulsar ?
    l
    • 2
    • 7
  • b

    bmtrivedi

    01/19/2025, 3:22 AM
    Hi. Can zombie writes happen in a partitioned topic? If yes, how is it resolved? Consider a scenario that a single publisher is publishing to a partitioned topic. The publisher sends a publish request which times out (but that request is lingering in the Pulsar ecosystem, either in broker or bookkeeper) and then publisher retries the request which is successfully written. After some successful publishes, the first timed out lingering request went ahead and wrote the events in the partition, and it will then break the order of events.
    l
    • 2
    • 1
  • l

    Lari Hotari

    01/20/2025, 3:41 PM
    📯 [ANNOUNCE] Apache Pulsar 3.0.9 released The Apache Pulsar team is proud to announce Apache Pulsar version 3.0.9. Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication. For Pulsar release details and downloads, visit: https://pulsar.apache.org/download Release Notes are at: https://pulsar.apache.org/release-notes/versioned/pulsar-3.0.9/ We would like to thank the contributors that made the release possible. Regards, The Pulsar Team
    🎉 1
  • l

    Lari Hotari

    01/20/2025, 3:42 PM
    📯 [ANNOUNCE] Apache Pulsar 3.3.4 released The Apache Pulsar team is proud to announce Apache Pulsar version 3.3.4. Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication. For Pulsar release details and downloads, visit: https://pulsar.apache.org/download Release Notes are at: https://pulsar.apache.org/release-notes/versioned/pulsar-3.3.4/ We would like to thank the contributors that made the release possible. Regards, The Pulsar Team
    🎉 1
  • l

    Lari Hotari

    01/20/2025, 3:43 PM
    🎆 [ANNOUNCE] Apache Pulsar 4.0.2 released pulsarlogo The Apache Pulsar team is proud to announce Apache Pulsar version 4.0.2. Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication. For Pulsar release details and downloads, visit: https://pulsar.apache.org/download Release Notes are at: https://pulsar.apache.org/release-notes/versioned/pulsar-4.0.2/ We would like to thank the contributors that made the release possible. Regards, The Pulsar Team
    🎉 3
  • o

    ojoxdan

    01/21/2025, 3:52 PM
    Good day everyone, I had previoulsy set the dispatch rate of a topic at 150 msg/s then later reset to unlimted with the value
    -1
    , after the last configuration I was expecting the rate to increase but was stuck at 150, Do I need to clear or restart something in in my k8 cluster ? @David K @Yu Wei Sung @Lari Hotari
    d
    l
    • 3
    • 13
  • a

    Alexander Brown

    01/22/2025, 5:26 PM
    This might be a simple answer, but we have pulsar helm chart 3.8.0 and running with oxia in k8s. 3 brokers/3 bookies, and at 400k msg/s latencies are 10ms, but at ~600k to 1000k latency spikes to high number like 25s. Bookies are running nvme, and I'm wondering if we just need to add more bookies or if there are some easy places to increase msg throughput?
    l
    • 2
    • 32
  • p

    Preet Patel

    01/23/2025, 9:25 AM
    Hi, I m trying to upgrade pulsar-3.0.2 to 4.0.1 inside kubernetes cluster, Do we have any utility / function available through which we can port the data during upgradation. like, I want tenants, namespaces, topics and data inside topics to be persisted throughout the upgrade and do not want any kind of data loss.
    l
    • 2
    • 1
  • o

    Olivier NOUGUIER

    01/24/2025, 2:04 PM
    👋 Hi everyone! I like to reproduce my zio-pravega connector for Pulsar, should be easy at 90%. But I cannot find a way to share an (aka open a running) pulsar transaction, is it possible ?
    a
    d
    • 3
    • 6
  • l

    Lari Hotari

    01/27/2025, 8:30 AM
    For users running Pulsar Docker containers on Apple M4 on MacOS Sequoia 15.2, there's a JVM bug that causes the JVM to crash and it results in an error message "Pulsar requires Java 17 or later". The workaround is to pass _JAVA_OPTIONS=-XX:UseSVE=0 environment variable. (this JVM setting is only supported on arm64 platforms) example of using
    -e "_JAVA_OPTIONS=-XX:UseSVE=0"
    to pass the environment option on the docker command line:
    Copy code
    docker run --rm -it -p 6650:6650 -p 8080:8080 -e "_JAVA_OPTIONS=-XX:UseSVE=0" apachepulsar/pulsar:4.0.2 bin/pulsar standalone -nfw -nss
    The problem will be fixed in Pulsar 4.0.3 docker images which will include the latest Corretto OpenJDK release which contains the fix. More details in https://github.com/apache/pulsar/issues/23891#issuecomment-2615087017 .
    👍 2
  • s

    Shubh Vanvat

    01/27/2025, 2:14 PM
    Hi, i am trying to setup dekaf in my kubernetes setup is there any helm chart available to streamline the process? any step by step guide to get started will be helpful
    a
    k
    • 3
    • 16
  • s

    Soumya Ghosh

    01/29/2025, 1:50 PM
    We are observing an intermittent issue where Pulsar producer client is unable to produce message to a partitioned topic, it is failing with
    org.apache.pulsar.client.api.PulsarClientException$TimeoutException
    after 30 seconds. Setup summary: Setup summary: Pulsar 4.0.1, 3 brokers, 5 bookies, 3 ZKs, Produce and Consume through Pulsar Java client (4.0.1) We also upgrade Pulsar cluster and client version to 4.0.2 but still observing the same issue.
    Copy code
    // Pulsar client
    this.pulsarClient =
              PulsarClient.builder()
                  .serviceUrl(connectionUrl)
                  .memoryLimit(0, SizeUnit.MEGA_BYTES)
                  .build();
    
    // Producer configurations
    ProducerBuilder<T> builder =
            pulsarClient
                .newProducer(schema)
                .producerName(producerName)
                .topic(topicName)
                .compressionType(CompressionType.LZ4)
                .enableBatching(true)
                .batchingMaxPublishDelay(30, TimeUnit.MILLISECONDS)
                .blockIfQueueFull(true);
    Pulsar brokers, bookies and client instances are running in the same AWS VPC, so chances of network outage are quite low. Brokers are configured to use
    advertisedAddress
    ,
    advertisedListeners
    is not configured In the client application, the Pulsar client and Producer object is initialized once and then messages are produced in sporadic intervals. There is no other producer or consumer running on the cluster and there is more than sufficient resources available to cater to produce request. Following are attached screenshots of client application and broker node which owns this topic. Time frame - 10:30 UTC We observed that after message failed to produce in 30 seconds, Pulsar producer clients were closed due to keep-alive timeout expiry and then recreated. At the same time we see in brokers logs
    Connection reset by peer
    Any thoughts on what could be going wrong here?
    l
    s
    • 3
    • 29
  • d

    Daniel Kaminski

    01/29/2025, 3:54 PM
    Hi, we observe a strange behaviour in our pulsar cluster which I would like to clarify if this is intended. We ran in a HA setup which means when I create a topology this will be replicated in the second cluster as well. So one of our customers created a namespace with the property schemaValidationEnforced: true but this seems to cause some issues for the system topics.
    [pulsar-io-6-4] WARN  org.apache.pulsar.broker.service.AbstractReplicator - [<persistent://tenant/namespace/__change_events> | pulsar-prod-ha1-->pulsar-prod-ha2] Failed to create remote producer (org.apache.pulsar.client.api.PulsarClientException$IncompatibleSchemaException: {"errorMsg":"org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema caused by org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema","reqId":3006917652050389956, "remote":"our.pulsar.cluster/<ip-address>:6651", "local":"/<ip-address>:34314"}), retrying in 57.934 s
    2025-01-29T14:41:26,689+0000 [pulsar-io-6-4] ERROR org.apache.pulsar.client.impl.ProducerImpl - [<persistent://tenant/namespace/__change_events>] [pulsar.repl.pulsar-prod-ha1-->pulsar-prod-ha2] Failed to create producer: {"errorMsg":"org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema caused by org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema","reqId":3006917652050389956, "remote":"our.pulsar.cluster/<ip-address>:6651", "local":"/<ip-address>"}
    025-01-29T14:41:26,689+0000 [pulsar-io-6-4] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0x755f1150, L:/100.70.32.147:34314 - R:rour.pulsar.cluster/<ip-address>:6651] Received error from server: org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema caused by org.apache.pulsar.broker.service.schema.exceptions.IncompatibleSchemaException: Producers cannot connect or send message without a schema to topics with a schema
    I turned off schemaValidationEnforced off on namespace level and i can confirm that the error is gone and I do not see it for any system topic for that specific namespace. Can you confirm to me that this is intended? I guess the schemaValidationEnforced: true is inherited to the topic but I would be curious how this is solved when it comes to the system topics? We ran currently the pulsar LTS version 3.0.8.1.
    l
    • 2
    • 2
  • l

    Lari Hotari

    01/30/2025, 8:45 AM
    Pulsar 4.0.2 release contains a regression related to
    seek
    by timestamp. Please check the updated release notes for the workaround. Details also in this comment: https://github.com/apache/pulsar/issues/23910#issuecomment-2623862571
  • c

    Conor

    01/31/2025, 10:42 AM
    Hello, If someone can spare a moment, could you take a look at https://github.com/apache/pulsar/issues/23890? Our dev environment has been down all week, and I'm worried the same issue could occur in a production cluster
  • a

    Amy Krishnamohan

    01/31/2025, 9:04 PM
    Hello! Today, StreamNative published a blogpost that displays quite a remarkable benchmark result. This technology is based on Oxia (metadata coordinator that replace zookeeper) and leaderless architecture in Ursa engine. To learn more, please check out this blogpost. https://streamnative.io/blog/how-we-run-a-5-gb-s-kafka-workload-for-just-50-per-hour
    👍 1
    🎆 2
    m
    d
    • 3
    • 2
  • s

    Soumya Ghosh

    02/03/2025, 10:19 AM
    Hello all, if someone can assist with this - https://github.com/apache/pulsar/issues/23920 This is an extension of slack discussion thread
    l
    • 2
    • 3
  • h

    hulusi

    02/03/2025, 12:15 PM
    hello everyone how can I detect my system helath check some time bookie is shuting down... can someone help me please ASAP!
  • s

    Suvrajit Manna

    02/03/2025, 2:31 PM
    Hi Everyone, Is there any way to mimic
    ZeroQueueConsumerImpl
    like behaviour in case of Partitioned Topic? I was going through this issue: https://github.com/apache/pulsar-client-python/issues/43 Is any of the workaround mentioned, available for Pulsar Java Client ?
  • s

    Slackbot

    02/04/2025, 3:11 PM
    message has been deleted
  • g

    Girish Sharma

    02/05/2025, 7:35 AM
    I am trying to follow the bundle allocation flow but not getting any head or tales based on just the info/debug logs. Suppose we create a namespace and a topic in that namespace. Assumption is that the namespace is part of a ns-isolation policy and there are specific primary brokers that namespace can be hosted on. As per code, when a lookup happens, it should eventually go in ModularLoadManager::selectBroker but in the logs, I am not seeing the relevant logs.. Are there other places where this decision is made (considering the isolation regex in mind)? Adding debugger is difficult because the decision can happen on any of the broker and we want to replicate it on a multi broker setup with different isolation groups
  • s

    Shaun

    02/06/2025, 12:59 AM
    Should I expect to see consistent behavior between
    persistent
    vs
    non-persistent
    topics when using a subscription type of
    failover
    ? I am seeing that the broker never sends notification on active and inactive status changes to
    non-persistent
    topics like it does for
    persistent
    topics. Is this expected?
    • 1
    • 2
  • c

    Conor

    02/06/2025, 10:04 AM
    If someone can spare a moment, could you take a look at https://github.com/apache/pulsar/issues/23890?
    I haven't found a solution for this, so I am going to go ahead and delete the ledgers mentioned in the error messages from the Bookies manually. If anyone knows a more correct solution please let me know
  • t

    Thomas MacKenzie

    02/06/2025, 5:39 PM
    I'm looking at region / zone awareness, is there any other way to set the placement policies than using the
    pulsar-admin
    client? I've been looking in the configuration in the helm chart and directly in the broker and bookkeeper instances confs but did not find anything. Thanks https://pulsar.apache.org/docs/4.0.x/administration-isolation-bookie/
    s
    • 2
    • 2
  • s

    Sreenath Sasidharan

    02/06/2025, 8:43 PM
    Hello, I am trying to set a service account name(
    serviceAccountName
    ) allowing access to our s3 buckets to pulsar functions worker in Kubernetes. Based on what I read from the documentation: https://pulsar.apache.org/docs/4.0.x/functions-runtime-kubernetes/ and source code, I dont see a way to do it for python via
    customRuntimeOptions
    . Is that correct or have I missed something?
    d
    l
    • 3
    • 18
  • s

    Soumya Ghosh

    02/11/2025, 9:33 AM
    Hello, I am getting an error in initializing cluster metadata. Setup - 3 ZK, 5 bookies, 3 broker, 1 proxy node. Command
    bin/pulsar initialize-cluster-metadata --cluster <cluster_name> --metadata-store zk:zk_1,zk_2,zk_3:2181 --configuration-metadata-store zk:zk_1,zk_2,zk_3:2181             --web-service-url http://<broker_r53>:8080  --broker-service-url pulsar://<broker_r53>:6650
    Output :
    Killed
    Are there any flags to get verbose error?
    d
    • 2
    • 3
1...192021...155Latest