venice #github-notifications

GitHub

06/25/2025, 9:23 PM

<https://github.com/linkedin/venice/tree/main|main>

by misyel

<https://github.com/linkedin/venice/commit/4d118997c852b175f960de4298fe9e14b41d6fc4|4d118997>

- [controller] Add more logging to deferred version swap service (#1888) linkedin/venice

GitHub

06/25/2025, 10:58 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by sushantmane

<https://github.com/linkedin/venice/commit/af839cef301b18f5af02b2937637e92da5cfb24c|af839cef>

- [admin-tool] Add support to update largest used RT version in admin tool (#1894) linkedin/venice

GitHub

06/25/2025, 11:32 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by sushantmane

<https://github.com/linkedin/venice/commit/c1382bd5b492ee696ebf48a2bd39fd89c5e02d95|c1382bd5>

- [server] Delete RT offset emission metrics (#1895) linkedin/venice

GitHub

06/25/2025, 11:38 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by minhmo1620

<https://github.com/linkedin/venice/commit/f6b9f4dd6e66c3c3445ce746f89f8e34bd71d422|f6b9f4dd>

- [controller] Refactor code to reduce call to check access in CreateVersion (#1905) linkedin/venice

GitHub

06/26/2025, 5:43 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by haoxu07

<https://github.com/linkedin/venice/commit/93fdd6faefce54e8c88c548461adbf15b9dea1f6|93fdd6fa>

- [tc] Add count by value support on client side (#1857) linkedin/venice

GitHub

06/26/2025, 5:58 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by misyel

<https://github.com/linkedin/venice/commit/a411641599fd8010167dabf2b875867e948800b4|a4116415>

- [server] Mark latch as created in SIT for future versions (#1897) linkedin/venice

GitHub

06/26/2025, 6:00 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by misyel

<https://github.com/linkedin/venice/commit/2ff2583d3cb1ccb63c31c179b1cddbd1421a574f|2ff2583d>

- [controller] Use controller client for roll forward (#1889) linkedin/venice

GitHub

06/26/2025, 6:20 PM

#1903 [tc] Add count by bucket support on client side

Pull request opened by LeoLeo718

## [tc] Add count by bucket support on client side

## Problem Statement

Currently, Venice thin client lacks the ability to perform bucket-based counting and aggregation operations on the client side. This limitation means that clients need to fetch all data and perform bucket aggregations manually, which is inefficient and can lead to unnecessary data transfer and processing overhead.

We need to implement a countGroupByBucket functionality that allows clients to:
• Count occurrences of values that match specific predicate conditions (buckets)
• Support all predicate types (IntPredicate, LongPredicate, FloatPredicate, DoublePredicate, StringPredicate)
• Support complex predicate combinations (AND, OR, nested combinations)
• Return bucket counts for multiple fields in a single request
• Handle all predicate methods (equalTo, greaterThan, greaterOrEquals, lowerThan, lowerOrEquals, anyOf)

## Solution

Implemented a new countGroupByBucket operation in the Venice thin client using pure client-side computation:

Key Features:
• Predicate Support: Handles all predicate types with comprehensive method coverage
• Bucket Aggregation: Groups data into buckets based on predicate conditions
• Complex Combinations: Supports AND/OR combinations and nested predicate logic
• Null Handling: Properly handles null values in predicate evaluation
• Utf8 Conversion: Automatically converts Avro Utf8 values to String for string predicates
• Pure Client-Side: Uses project() to fetch only required fields, then performs aggregation locally
• Multi-Field Support: Can aggregate multiple fields with different predicate types in a single request

Implementation Details:
• AvroComputeAggregationRequestBuilder: Builds aggregation requests with validation:
  • Validates bucket definitions (non-null, non-empty bucket maps)
  • Validates field names (non-null, non-empty, must exist in schema)
  • Projects required fields for client-side processing
  • Supports chaining multiple countGroupByBucket calls
• AvroComputeAggregationResponse: Processes results:
  • Evaluates predicates against field values
  • Aggregates counts across all compute results for each bucket
  • Handles Utf8 to String conversion for string fields
  • Returns bucket name to count mappings
  • Supports generic type casting with proper error handling

### Code changes
• Added new code behind existing interfaces. No new configs needed.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`).
• Validated proper exception handling in multi-threaded code to avoid silent thread termination.

Concurrency Details:
• Each request creates its own response object - no shared state
• Result aggregation happens in request-scoped maps
• Predicate evaluation is stateless and thread-safe
• No blocking calls or synchronization needed - pure computation
• Thread-safe by design - immutable request builders and response objects

## How was this PR tested?
• New unit tests added.
• New integration tests added.
• Modified or extended existing tests.
• Verified backward compatibility (if applicable).

Unit tests added:
• AvroComputeAggregationRequestBuilderTest: Comprehensive validation tests
  • Valid configurations with single and multiple fields
  • Invalid bucket definitions (null, empty)
  • Null and empty field validation
  • Non-existent field detection
  • Chaining multiple countGroupByBucket calls
  • Boundary cases for field names and bucket names
• AvroComputeAggregationResponseTest: Result processing tests
  • All predicate type methods (IntPredicate, LongPredicate, FloatPredicate, DoublePredicate, StringPredicate)
  • Complex predicate combinations (AND, OR, nested)
  • Null value handling in predicate evaluation
  • Utf8 value conversion for string fields
  • Type casting error scenarios
  • Mixed data types handling
  • Edge cases for all predicate methods

Integration tests added:
• ReadComputeValidationTest.testCountGroupByBucketAggregation: End-to-end test
  • Creates store with multiple field types (int, long, float, double, string)
  • Populates with test data covering various scenarios
  • Verifies bucket aggregation on all field types
  • Tests all predicate methods for each type
  • Verifies complex predicate combinations
  • Tests with various data patterns and bucket definitions

Test Coverage:
• Input validation: 100% coverage of error cases
• Response processing: All predicate types and methods covered
• Integration: Full workflow from request to response validated
• Type casting: Proper error handling for type mismatches
• Predicate evaluation: All combination scenarios tested
• Edge cases: Null values, empty results, boundary conditions

## Does this PR introduce any user-facing or breaking changes?
• No. You can skip the rest of this section.
• Yes. Clearly explain the behavior change and its impact.

This is a new feature that doesn't affect existing functionality:
• All changes are backward compatible
• New functionality is opt-in through the new countGroupByBucket API
• Existing client operations remain unchanged
• No changes to wire protocol or storage format
• Existing countGroupByValue functionality remains unaffected

linkedin/venice
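
The bucket-counting idea described in #1903 can be sketched in plain Java. This is only an illustration of the technique, not the Venice thin-client API: the class `BucketCountSketch`, the method `countByBucket`, and the use of `java.util.function.Predicate` in place of Venice's own predicate types are all assumptions made for the example, as is the choice to skip null values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class BucketCountSketch {
  /**
   * Counts, per bucket name, how many values satisfy that bucket's predicate.
   * A value may fall into several buckets. Null values are skipped (an assumption
   * of this sketch, not necessarily what Venice does).
   */
  public static <T> Map<String, Integer> countByBucket(List<T> values, Map<String, Predicate<T>> buckets) {
    if (buckets == null || buckets.isEmpty()) {
      throw new IllegalArgumentException("bucket definitions must be non-null and non-empty");
    }
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String name : buckets.keySet()) {
      counts.put(name, 0); // every bucket appears in the result, even with zero matches
    }
    for (T value : values) {
      if (value == null) {
        continue;
      }
      for (Map.Entry<String, Predicate<T>> bucket : buckets.entrySet()) {
        if (bucket.getValue().test(value)) {
          counts.merge(bucket.getKey(), 1, Integer::sum);
        }
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    Map<String, Predicate<Integer>> buckets = new LinkedHashMap<>();
    buckets.put("young", age -> age < 30);
    // An AND combination: at least 30 and below 60.
    Predicate<Integer> atLeast30 = age -> age >= 30;
    buckets.put("middle", atLeast30.and(age -> age < 60));
    buckets.put("senior", age -> age >= 60);

    List<Integer> ages = Arrays.asList(25, 31, 45, null, 62, 19);
    System.out.println(countByBucket(ages, buckets)); // prints {young=2, middle=2, senior=1}
  }
}
```

Each call builds its own result map, which mirrors the "request-scoped maps" point in the concurrency notes: nothing is shared between calls.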

GitHub

06/26/2025, 10:44 PM

#1907 [log-compaction][logs][controller] add logs for debugging & visibility to scheduled log compaction

Pull request opened by WhitneyDeng

## Problem Statement

Need more visibility and more streamlined filtering for scheduled log compaction flow

## Solution
• added [log-compaction] tag to log compaction related logs
• added cluster information to log compaction logs

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`).
• Validated proper exception handling in multi-threaded code to avoid silent thread termination.

## How was this PR tested?

N/A just logs
• New unit tests added.
• New integration tests added.
• Modified or extended existing tests.
• Verified backward compatibility (if applicable).

## Does this PR introduce any user-facing or breaking changes?
• No. You can skip the rest of this section.
• Yes. Clearly explain the behavior change and its impact.

linkedin/venice

GitHub

06/27/2025, 12:00 AM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by WhitneyDeng

<https://github.com/linkedin/venice/commit/a41f8b24030a64d921046c7fcd2c6f1c8c89a463|a41f8b24>

- [log-compaction][logs][controller] add logs for debugging & visibility to scheduled log compaction (#1907) linkedin/venice

GitHub

06/27/2025, 8:33 PM

#1908 [WIP] Support TTL Repush for stores with batch data

Pull request opened by nisargthakkar

## Problem Statement

## Solution

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`).
• Validated proper exception handling in multi-threaded code to avoid silent thread termination.

## How was this PR tested?
• New unit tests added.
• New integration tests added.
• Modified or extended existing tests.
• Verified backward compatibility (if applicable).

## Does this PR introduce any user-facing or breaking changes?
• No. You can skip the rest of this section.
• Yes. Clearly explain the behavior change and its impact.

linkedin/venice

GitHub

06/27/2025, 10:26 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by sushantmane

<https://github.com/linkedin/venice/commit/9db06ddaa13bd8c9f04cb8d6cb24d0428f7792a5|9db06dda>

- [server] Emit versioned storage quota stats and keep SIT open for current store version (#1868) linkedin/venice

GitHub

07/01/2025, 2:20 AM

#1909 [server][dvc] IngestionTaskReusableObjects.Strategy config

Pull request opened by FelixGV

The AASIT hangs on to a private property of thread-local type, where each instance can contain a 1 MB blob. When there are lots of AASIT instances, accessed from lots of ingestion threads, this can add up to a lot of big thread-local state. In a production heap dump with 191 AASIT instances, 51% of all bytes were occupied by byte[] (though not all of them came from this particular AASIT field).

With this commit, the default behavior is the same as before, but a new config allows passing an enum to select alternative behaviors:

server.ingestion.task.reusable.objects.strategy

The valid values are:
• NO_REUSE: Passes null into the relevant code paths, which results in new objects being allocated on the fly and later garbage collected.
• THREAD_LOCAL_PER_INGESTION_TASK: Avoids hot path allocations but is likely not very efficient, since there can be many AASIT and each will have its own independent set of objects, even if/when the AASIT becomes dormant. This is the first mode which was built, and it is considered stable.
• SINGLETON_THREAD_LOCAL: Likely the most efficient.

Test changes:
• Integration tests override this setting to SINGLETON_THREAD_LOCAL.

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`
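
The difference between the three strategy values named in #1909 can be sketched as follows. This is a minimal illustration of the idea and not the Venice implementation: `ReusableObjectsSketch`, `Scratch`, and `supplierFor` are hypothetical names, and only the enum constants and the 1 MB blob size come from the description.

```java
import java.util.function.Supplier;

public class ReusableObjectsSketch {
  public enum Strategy { NO_REUSE, THREAD_LOCAL_PER_INGESTION_TASK, SINGLETON_THREAD_LOCAL }

  /** Hypothetical stand-in for the reusable per-thread state; 1 MB mirrors the blob size mentioned above. */
  public static final class Scratch {
    final byte[] buffer = new byte[1024 * 1024];
  }

  /** One thread-local shared by every task: at most one Scratch per thread in total. */
  private static final ThreadLocal<Scratch> SHARED = ThreadLocal.withInitial(Scratch::new);

  /** Returns the supplier a task would call on its hot path; a null result means "allocate on the fly". */
  public static Supplier<Scratch> supplierFor(Strategy strategy) {
    switch (strategy) {
      case NO_REUSE:
        return () -> null;
      case THREAD_LOCAL_PER_INGESTION_TASK: {
        // A fresh thread-local per task: one Scratch per (task, thread) pair.
        ThreadLocal<Scratch> perTask = ThreadLocal.withInitial(Scratch::new);
        return perTask::get;
      }
      case SINGLETON_THREAD_LOCAL:
        return SHARED::get;
      default:
        throw new IllegalArgumentException("Unknown strategy: " + strategy);
    }
  }

  public static void main(String[] args) {
    Supplier<Scratch> a = supplierFor(Strategy.SINGLETON_THREAD_LOCAL);
    Supplier<Scratch> b = supplierFor(Strategy.SINGLETON_THREAD_LOCAL);
    System.out.println(a.get() == b.get()); // true: both tasks share this thread's object

    Supplier<Scratch> c = supplierFor(Strategy.THREAD_LOCAL_PER_INGESTION_TASK);
    Supplier<Scratch> d = supplierFor(Strategy.THREAD_LOCAL_PER_INGESTION_TASK);
    System.out.println(c.get() == d.get()); // false: each task keeps its own object
  }
}
```

With N tasks and T threads, the per-task mode can hold up to N × T buffers while the singleton mode holds at most T, which is the memory concern the description raises.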

GitHub

07/01/2025, 4:48 AM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by FelixGV

<https://github.com/linkedin/venice/commit/17684d22b13c2948486dee2d6d6a98a0eb1688e9|17684d22>

- [server][dvc] IngestionTaskReusableObjects.Strategy config (#1909) linkedin/venice

GitHub

07/07/2025, 9:59 PM

#1911 [controller] add stats to record latency per route

Pull request opened by pthirun

## Problem Statement

## Solution

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/07/2025, 10:22 PM

#1912 [protocol] Add new store schema for store lifecycle configs

Pull request opened by misyel

## Problem Statement

After a version is swapped in the target region for a colo by colo push w/ deferred swap, we will perform validation. The validations will produce either a `PROCEED` (continue with the roll forward) or a `ROLL_BACK` (do not roll forward and roll back the target region), and the validations will be configurable via implementing the `StoreLifecycleHooks` class. To support different implementations of the `StoreLifecycleHooks` class, we will need to introduce a new field in the store schema to store the implementation to use.

## Solution

Introduce a new field, `storeLifecycleHooksConfig`, in the store metavalue and admin operation schema. The `storeLifecycleHooksConfig` will be a record and it will contain two fields in the record: `storeLifecycleHooksClassName` and `storeLifecycleHooksParams`. `storeLifecycleHooksClassName` will be the FQCN of the `StoreLifecycleHooks` implementation and `storeLifecycleHooksParams` will optionally provide additional args to the `StoreLifecycleHooks` implementation.

The new schema looks like so:

    {
      "name": "storeLifecycleHooksConfig",
      "doc": "Config to specify the class name and params to be used for the store lifecycle hooks",
      "type": [
        "null",
        {
          "name": "StoreLifecycleHooksConfig",
          "type": "record",
          "fields": [
            {"name": "storeLifecycleHooksClass", "type": "string"},
            {"name": "storeLifecycleHooksParams", "type": {"type": "map", "values": "string"}}
          ]
        }
      ],
      "default": null
    }

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`).
• Validated proper exception handling in multi-threaded code to avoid silent thread termination.

## How was this PR tested?
• New unit tests added.
• New integration tests added.
• Modified or extended existing tests.
• Verified backward compatibility (if applicable).

n/a. This pr only adds in the new avro schemas.

## Does this PR introduce any user-facing or breaking changes?
• No. You can skip the rest of this section.
• Yes. Clearly explain the behavior change and its impact.

linkedin/venice
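
A config that stores an implementation's FQCN plus a string-to-string params map, as #1912 proposes, is typically consumed by loading the class reflectively. The sketch below shows that pattern under stated assumptions: the `Hooks` interface, `ParamDrivenHooks`, and the `forceRollback` param are hypothetical and do not reflect the real `StoreLifecycleHooks` contract; only the `PROCEED` and `ROLL_BACK` outcomes come from the description.

```java
import java.util.Collections;
import java.util.Map;

public class LifecycleHooksConfigSketch {
  public enum Outcome { PROCEED, ROLL_BACK }

  /** Hypothetical stand-in for the hooks contract; not the Venice StoreLifecycleHooks class. */
  public interface Hooks {
    Outcome validate(String storeName, Map<String, String> params);
  }

  /** Example implementation: rolls back only when the (made-up) "forceRollback" param is "true". */
  public static class ParamDrivenHooks implements Hooks {
    @Override
    public Outcome validate(String storeName, Map<String, String> params) {
      return "true".equals(params.get("forceRollback")) ? Outcome.ROLL_BACK : Outcome.PROCEED;
    }
  }

  /** Instantiates the configured implementation from its fully qualified class name. */
  public static Hooks load(String className) throws ReflectiveOperationException {
    Class<?> clazz = Class.forName(className);
    return (Hooks) clazz.getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    // The class name and params would come from the store config in a real system.
    Hooks hooks = load(ParamDrivenHooks.class.getName());
    System.out.println(hooks.validate("store", Collections.emptyMap()));
    System.out.println(hooks.validate("store", Collections.singletonMap("forceRollback", "true")));
  }
}
```

Keeping the params as a map of strings, as the schema does, lets each implementation define its own keys without further schema changes.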

GitHub

07/08/2025, 8:55 PM

#1913 [controller][server] Deprecated the Helix Message Channel

Pull request opened by FelixGV

The Helix Message Channel is a legacy mechanism for letting the controller send kill job commands to the servers. It was replaced a long time ago by the Participant Store, but the config defaults did not reflect that. This commit aligns the default config with production best practices. A later commit will delete this code entirely.

Miscellaneous:
• Refactored the flow of the ParticipantStoreConsumptionTask run loop a little bit. Added extensive unit testing for it.
• Refactored the VeniceHelixAdmin::close function so that it cannot hang forever waiting for the HelixManager to disconnect...
• Added a SleepStallingMockTime to help with multi-threaded testing.

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/09/2025, 6:07 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by ranwang2024

<https://github.com/linkedin/venice/commit/9c3ad76fcca8aab78dab203c267964af9b0e3e8a|9c3ad76f>

- Integrate MultiTaskSchedulerService with HelixVeniceClusterResources, AdminSparkServer and AdminTool (#1817) linkedin/venice

GitHub

07/09/2025, 6:07 PM

#1817 Integrate MultiTaskSchedulerService with HelixVeniceClusterResources, AdminSparkServer and AdminTool

Pull request opened by ranwang2024

## Problem Statement

PR #1723 introduced MultiTaskSchedulerService, but it was never wired into the Controller. Consequently, there is no API endpoint that can invoke the auto-store-migration workflow. This PR integrates MultiTaskSchedulerService with the parent Venice Controller so that auto store migration can be triggered reliably through an exposed API.

## Solution

Wired MultiTaskSchedulerService into HelixVeniceClusterResources; a dedicated MultiTaskSchedulerService is now created for every Venice cluster when the parent Controller application starts. Introduced a new REST endpoint `/auto_migrate_store` in AdminSparkServer, implemented on both server and client sides, and packaged the client logic into the AdminTool. Operators can now trigger an automatic store migration directly from the command-line tool.

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/09/2025, 11:07 PM

#1914 [controller] Emit metric for ONLINE parent version status mismatch

Pull request opened by misyel

## Problem Statement

We have observed cases where a store is eligible to enter the DeferredVersionSwapService loop and perform a version swap, but it does not seem to enter the loop at all. This could be because the parent version status is already marked ONLINE even though a swap did not happen, leading to a version status mismatch and the roll forward never happening. This PR adds a check for this case and emits a metric when it occurs.

## Solution

Emit a metric if the parent version status is marked ONLINE and allow it to enter the DeferredVersionSwapService loop to perform a version swap

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`).
• Validated proper exception handling in multi-threaded code to avoid silent thread termination.

## How was this PR tested?
• New unit tests added.
• New integration tests added.
• Modified or extended existing tests.
• Verified backward compatibility (if applicable).

New unit and e2e test

## Does this PR introduce any user-facing or breaking changes?
• No. You can skip the rest of this section.
• Yes. Clearly explain the behavior change and its impact.

linkedin/venice

GitHub

07/10/2025, 5:50 PM

#1915 [tc][test] Add integration tests for countByValue and countByBucket functionality

Pull request opened by LeoLeo718

## Problem Statement

Currently, Venice's thin client supports `countByValue` and `countByBucket` aggregation operations, but lacks comprehensive integration tests to validate these features. This creates several issues:
• Limited Test Coverage: The existing test suite doesn't validate the complete functionality of aggregation operations
• Regression Risk: Without proper integration tests, changes to aggregation logic could introduce bugs unnoticed
• Feature Validation Gap: Users need confidence that aggregation operations work correctly across different data types and scenarios
• Development Confidence: Developers need reliable tests to ensure new features don't break existing aggregation functionality

The current implementation supports complex predicate combinations and multiple data types, but these capabilities aren't thoroughly tested in an integration environment.

## Solution

This PR adds comprehensive integration tests for `countByValue` and `countByBucket` functionality in the `ReadComputeValidationTest` class. The solution provides:

### Test Coverage
• countByValue Tests: Validates value-based aggregation with TopK functionality
• countByBucket Tests: Tests bucket-based aggregation with complex predicate combinations
• Data Type Coverage: Tests all supported types (Float, Integer, Double, Long, String)
• Predicate Coverage: Validates all predicate operations (lt, lte, gt, gte, eq, anyOf)
• Complex Scenarios: Tests AND/OR predicate combinations

### Key Features Tested

    // countByValue functionality
    - Single field aggregation (jobType, location)
    - Multi-field aggregation
    - TopK result limiting
    - Data type conversion handling

    // countByBucket functionality
    - All numeric types (Float, Integer, Double, Long)
    - String type aggregation
    - All predicate operations
    - Complex predicate combinations (AND, OR)
    - Edge cases and boundary conditions

### Performance Considerations
• Tests use realistic data sizes to validate performance
• Validates memory usage patterns for aggregation operations
• Ensures no memory leaks in predicate evaluation
• Tests concurrent access patterns where applicable

### Code Changes
• Added new integration test methods to `ReadComputeValidationTest`
• No new configuration required
• No new log lines added
• No concurrency changes (tests are single-threaded)

### Concurrency-Specific Checks
• Code has no race conditions (tests are single-threaded)
• No synchronization mechanisms needed for test code
• No blocking calls in critical sections
• Thread-safe collections not applicable for test code
• Proper exception handling in test scenarios

## How was this PR tested?
• New integration tests added: `testCountGroupByValueAggregation()` and `testCountGroupByBucketAggregation()`
• Modified existing tests: Extended `ReadComputeValidationTest` class
• Verified backward compatibility: All existing tests continue to pass
• Performance testing: Validated test execution time and memory usage
• Cross-platform testing: Tests run successfully on different environments

### Test Execution Results

    ./gradlew internalvenice-test-common:integrationTest --tests "ReadComputeValidationTest"
    # All tests pass successfully
    # testCountGroupByValueAggregation: PASS
    # testCountGroupByBucketAggregation: PASS
    # All existing tests: PASS

### Test Coverage Metrics
• Predicate Types: 100% coverage for numeric types (Int, Long, Float, Double)
• String Predicates: 33% coverage (equalTo, anyOf tested)
• Complex Predicates: 100% coverage (AND, OR combinations)
• Data Types: 100% coverage for all supported types
• Edge Cases: Comprehensive boundary condition testing

## Does this PR introduce any user-facing or breaking changes?
• No. This PR only adds integration tests and doesn't modify any existing functionality.

The changes are purely additive and don't affect:
• Existing API contracts
• User-facing interfaces
• Performance characteristics
• Backward compatibility
• Configuration requirements

All existing functionality remains unchanged, and users will not experience any differences in behavior.

linkedin/venice

GitHub

07/10/2025, 8:50 PM

#1916 [controller][config][admintool] add store level config to enable compaction scheduler

Pull request opened by pthirun

## Problem Statement

## Solution

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/10/2025, 11:35 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by FelixGV

<https://github.com/linkedin/venice/commit/66b7b57975c9cb1ba14ef1a6247498e6cee08862|66b7b579>

- [tc] Add count by bucket support on client side (#1903) linkedin/venice

GitHub

07/11/2025, 1:35 AM

#1917 [dvc] parse empty Int2ObjectMap value without throwing error

Pull request opened by arjun4084346

## Problem Statement

Right now the config value for PUBSUB_TYPE_ID_TO_POSITION_CLASS_NAME_MAP cannot be an empty string. To roll out a feature smoothly, we want to accept empty values without throwing an error.

## Solution

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`
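
The behavior requested in #1917, treating an empty config string as an empty map instead of an error, can be sketched like this. The `id:className` comma-separated format, the class `IntKeyedMapParserSketch`, and the use of `java.util.Map` instead of fastutil's `Int2ObjectMap` are all assumptions made for the example; the actual Venice config format is not stated in the description.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class IntKeyedMapParserSketch {
  /**
   * Parses a string such as "1:com.example.A,2:com.example.B" (an assumed format).
   * Null, empty, or whitespace-only input yields an empty map rather than an exception.
   */
  public static Map<Integer, String> parse(String raw) {
    if (raw == null || raw.trim().isEmpty()) {
      return Collections.emptyMap();
    }
    Map<Integer, String> result = new LinkedHashMap<>();
    for (String entry : raw.split(",")) {
      String[] parts = entry.trim().split(":", 2);
      if (parts.length != 2 || parts[1].trim().isEmpty()) {
        throw new IllegalArgumentException("Malformed entry: '" + entry + "'");
      }
      // Integer.parseInt throws NumberFormatException for a non-numeric key.
      result.put(Integer.parseInt(parts[0].trim()), parts[1].trim());
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse(""));                                // prints {}
    System.out.println(parse("0:com.example.A, 1:com.example.B")); // prints {0=com.example.A, 1=com.example.B}
  }
}
```

Malformed non-empty input still fails fast, so accepting the empty value does not hide genuine configuration mistakes.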

GitHub

07/11/2025, 4:39 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by MikeDafi

<https://github.com/linkedin/venice/commit/48f2fab3e650e68517a1ffdd5c2a4a49e0f790e2|48f2fab3>

- [controller] Store Deleted Validation Endpoint (#1869) linkedin/venice

GitHub

07/11/2025, 5:06 PM

#1910 [server][controller][test] Use PubSubPosition in in memory pubsub broker and clients

Pull request opened by sushantmane

## Use PubSubPosition in in-memory PubSub broker and clients

Replace numeric offset with InMemoryPubSubPosition in memory-only broker and client code. InMemoryPubSubPosition implements PubSubPosition and supports offset arithmetic. Update SIT tests to use InMemoryPubSubPosition.

### AI Generated Summary of Changes

This pull request introduces changes to transition from Kafka-specific abstractions to more generic PubSub abstractions in the `da-vinci-client` module. The key updates include adding support for new PubSub methods in `SharedKafkaConsumer`, replacing Kafka-specific classes with PubSub equivalents in test cases, and updating method calls to use PubSub abstractions.

### Migration to PubSub Abstractions:
• `SharedKafkaConsumer.java`: Added new methods such as `getPositionByTimestamp`, `beginningPosition`, and `endPosition`, which throw `UnsupportedOperationException` to align with the PubSub abstraction layer.
• `KafkaClusterBasedRecordThrottlerTest.java`: Replaced `InMemoryKafkaBroker` with `InMemoryPubSubBroker` and updated method calls to use `getPubSubBrokerAddress` instead of `getKafkaBootstrapServer`.

### Test Updates for PubSub Compatibility:
• `StoreIngestionTaskTest.java`: Replaced Kafka-specific classes (`InMemoryKafkaBroker`, `MockInMemoryConsumer`, etc.) with their PubSub equivalents (`InMemoryPubSubBroker`, `MockInMemoryConsumerAdapter`, etc.). Updated method calls and object initializations to align with the new PubSub abstractions.

These changes are part of a broader effort to decouple the codebase from Kafka-specific dependencies and enable support for other PubSub systems.

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/11/2025, 6:02 PM

#1918 [test] Pass Pub Sub Client Properties to change log consumer for integration test.

Pull request opened by haoxu07 on 2025-07-11

## Problem Statement

Originally, the change log consumer integration test only worked with the Kafka client. It did not support other pub sub systems.

## Solution

By adding the properties from `VeniceTwoLayerMultiRegionMultiClusterWrapper.getPubSubClientProperties` and `VeniceClusterWrapper.getPubSubClientProperties` to the consumer properties used to instantiate the change log consumer, we can reuse the same integration test logic with a specific pub sub client.

### Code changes
• Added new code behind a config. If so list the config names and their default values in the PR description.
• Introduced new log lines.
• Confirmed if logs need to be rate limited to avoid excessive logging.

### Concurrency-Specific Checks
Both reviewer and PR author to verify
• Code has no race conditions or thread safety issues.
• Proper synchronization mechanisms (e.g., `synchronized`, `RWLock`) are used where needed.
• No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
• Verified thread-safe collections are used (e.g., `ConcurrentHashMap`, `CopyOnWriteArrayList`

GitHub

07/11/2025, 6:27 PM

#1647 [server][common][controller][vpj] Materialized view projection and filter support Pull request opened by xunyin8

## Problem Statement

Consumers of a materialized view might not be interested in every data field and update event. The ability to project and filter could reduce footprint and improve ingestion performance for MV consumers.

## Solution

Add projection and filtering support for materialized views (MV) so that view consumers do not have to ingest unwanted data. Projection can be enabled by setting projection fields in the materialized view parameters. Similarly, filtering can be enabled by setting filter-by fields. These two features can be enabled separately or together. If enabled together, the filter-by fields are automatically included in the projection fields. Here is an example MV configuration to illustrate the ideas:

Record containing fields: {a, b, c, d, e}

Projecting fields: {b, c}

Filtering fields: {a}

The only filtering option for now is to skip the write if none of the filter-by fields changed. Filtering is also applied only during hybrid ingestion, since a change filter does not make sense for a batch push. With the above setup we will project and write all batch data to the MV ({a, b, c}). RT updates (full PUT or UPDATE) will project and write the resulting record to the MV ({a, b, c}) only if the value of field (a) is different from the old value. All DELETE events will be written to the MV (no filtering).

### Code changes

In order to achieve the above behavior there are several changes:

1. Previously we used pub sub message headers to perform forwarding to handle chunks during NR pass-through in remote regions. This strategy will not work with projection because, to perform projection on batch data in remote regions, the remote partition leaders need to assemble the chunks during NR pass-through. We are replacing the forwarding strategy with InMemoryChunkAssembler. To ensure leaders don't resume in between chunks, we also buffer and delay writing the chunks to the drainer until we have a fully assembled record and have produced it to the view topic(s). The view partition header code is left untouched in VPJ to remove deployment or rollback order requirements, i.e. VPJ can get ahead of the server. If the server gets ahead of VPJ that's fine too, because the server's new chunking support can function on its own. We can clean up everything once everything is deployed and stable (no more rollbacks).

2. Added enforcement in the controller to ensure view configs are immutable. The projection schema is generated when adding a new materialized view and stored with the view config. Since there can only be one schema version per view, the znode size should be manageable with compression. If this becomes a concern we can also store it separately or generate it on the fly. We also verify that the filter-by fields and projection fields exist in the latest superset or value schema and have default values.

3. Projection is performed in ComplexVeniceWriter as part of complexPut so both VPJ and leaders can use the same code for projection. Filtering is performed in MaterializedViewWriter since the current change filter is applicable only to hybrid writes.

### Concurrency-Specific Checks

Both reviewer and PR author to verify • Code has no race conditions or thread safety issues. • Proper synchronization mechanisms (e.g.,

synchronized

RWLock

) are used where needed. • No blocking calls inside critical sections that could lead to deadlocks or performance degradation. • Verified thread-safe collections are used (e.g.,

ConcurrentHashMap

CopyOnWriteArrayList

). • Validated proper exception handling in multi-threaded code to avoid silent thread termination. ## How was this PR tested? Integration tests; unit tests will be added once there is some consensus on the changes. • New unit tests added. • New integration tests added. • Modified or extended existing tests. • Verified backward compatibility (if applicable). ## Does this PR introduce any user-facing or breaking changes? • No. You can skip the rest of this section. • Yes. Clearly explain the behavior change and its impact. linkedin/venice
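The projection and change-filter rules described in #1647 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the class `MvProjectionSketch` and its methods are hypothetical, plain maps stand in for Avro records, an empty projection set is assumed to mean projection is disabled, and a write with no previous value is assumed to pass the filter.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: none of these names are Venice APIs, and maps stand in for Avro records.
public class MvProjectionSketch {
  private final boolean projectionEnabled;
  private final Set<String> projectionFields;
  private final Set<String> filterByFields;

  public MvProjectionSketch(Set<String> projectionFields, Set<String> filterByFields) {
    // Assumption: an empty projection set means projection is not enabled.
    this.projectionEnabled = !projectionFields.isEmpty();
    this.filterByFields = new LinkedHashSet<>(filterByFields);
    // When both features are enabled, filter-by fields are included in the projection.
    this.projectionFields = new LinkedHashSet<>(filterByFields);
    this.projectionFields.addAll(projectionFields);
  }

  // Keep only the projected fields, or the whole record when projection is disabled.
  public Map<String, Object> project(Map<String, Object> record) {
    if (!projectionEnabled) {
      return new LinkedHashMap<>(record);
    }
    Map<String, Object> projected = new LinkedHashMap<>();
    for (String field : projectionFields) {
      projected.put(field, record.get(field));
    }
    return projected;
  }

  // Batch data is always projected and written; the change filter does not apply.
  public Map<String, Object> onBatchPut(Map<String, Object> newValue) {
    return project(newValue);
  }

  // RT PUT/UPDATE: returns the projected record, or null when the write should be skipped
  // because none of the filter-by fields changed.
  public Map<String, Object> onRealTimeWrite(Map<String, Object> oldValue, Map<String, Object> newValue) {
    // Assumption: with no previous value there is nothing to compare, so the write goes through.
    if (oldValue != null && !filterByFields.isEmpty()) {
      boolean changed = false;
      for (String field : filterByFields) {
        if (!Objects.equals(oldValue.get(field), newValue.get(field))) {
          changed = true;
          break;
        }
      }
      if (!changed) {
        return null;
      }
    }
    return project(newValue);
  }
}
```

With projection fields {b, c} and filter-by field {a}, a batch record {a, b, c, d, e} is written to the view as {a, b, c}. An RT update that changes only b is skipped, while one that changes a is written as {a, b, c}. DELETE events bypass this logic entirely and are always forwarded.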

GitHub

07/11/2025, 6:29 PM

#1858 [fast-client] Add detailed logging with redundant filter for no replica error Pull request opened by xunyin8 ## Problem Statement Currently, when there is a no-available-replica error, we don't have enough logging/metrics to determine exactly which of the following, or what combination of them, happened: 1. Some replicas are already unavailable in routing metadata 2. Replicas were filtered out due to blocked or unhealthy instances 3. Replicas were filtered out due to the request context for retry (the original request already used the route) ## Solution Extract the blocked and unhealthy instances out of the routing strategy into AbstractStoreMetadata so they can be logged accurately when there are no available routes. Blocked/unhealthy instances are provided as excluded instances, so they are already filtered out when the routing strategies run. ### Code changes • Added new code behind a config. If so list the config names and their default values in the PR description. • Introduced new log lines. • Confirmed if logs need to be rate limited to avoid excessive logging. ### Concurrency-Specific Checks Both reviewer and PR author to verify • Code has no race conditions or thread safety issues. • Proper synchronization mechanisms (e.g.,

synchronized

RWLock

) are used where needed. • No blocking calls inside critical sections that could lead to deadlocks or performance degradation. • Verified thread-safe collections are used (e.g.,

ConcurrentHashMap

CopyOnWriteArrayList
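The diagnosis that #1858 aims to make possible could look roughly like the sketch below. It is only an illustration of separating the filtering causes so each can be logged: the class `NoReplicaDiagnosticsSketch`, its method names, and the use of plain strings for instances are assumptions, not the fast-client API, and the rate limiting of the log line (the "redundant filter" in the title) is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; class, method, and parameter names are illustrative only.
public class NoReplicaDiagnosticsSketch {
  // Replicas left after removing excluded (blocked/unhealthy) instances and
  // routes already used by the original request.
  public static List<String> availableReplicas(
      List<String> replicasInMetadata,
      Set<String> excludedInstances,
      Set<String> routesUsedByOriginalRequest) {
    List<String> available = new ArrayList<>();
    for (String replica : replicasInMetadata) {
      if (!excludedInstances.contains(replica) && !routesUsedByOriginalRequest.contains(replica)) {
        available.add(replica);
      }
    }
    return available;
  }

  // Builds a message that attributes each lost replica to one cause, so a
  // "no available replica" error can be traced to metadata, exclusion, or retry context.
  public static String explainNoReplica(
      List<String> replicasInMetadata,
      Set<String> excludedInstances,
      Set<String> routesUsedByOriginalRequest) {
    List<String> excludedHits = new ArrayList<>();
    List<String> retryHits = new ArrayList<>();
    for (String replica : replicasInMetadata) {
      if (excludedInstances.contains(replica)) {
        excludedHits.add(replica);
      } else if (routesUsedByOriginalRequest.contains(replica)) {
        retryHits.add(replica);
      }
    }
    return "No available replica: replicasInMetadata=" + replicasInMetadata
        + ", excluded (blocked/unhealthy)=" + excludedHits
        + ", already used by original request=" + retryHits;
  }
}
```

An empty `replicasInMetadata` list in the message points to the first cause (routing metadata), while non-empty `excluded` or `already used` lists point to the second and third.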

GitHub

07/11/2025, 10:05 PM

1 new commit pushed to

<https://github.com/linkedin/venice/tree/main|main>

by sixpluszero

<https://github.com/linkedin/venice/commit/7b2db60638bec2f87cb667e63c693b3bbf32906c|7b2db606>

- [samza][producer] Add batch produce support for PUT/DELETE operation in Venice Writer (#1896) linkedin/venice