# general
  • f

    Felix GV

    03/25/2025, 6:46 PM
    > (2) Have you considered running Venice as a sidecar container, as compared to your current client-library based approach? There are pros and cons, but I would like to understand your experience maintaining that client-facing library, especially since you cannot quite control the binary releases.

    Yes we did… but so far the sidecar story at LinkedIn is a bit immature, so we’re sitting on the sidelines for now. There is work on this (at LI, not in Venice specifically) so we may jump on that bandwagon eventually…

    I will say, however, that the current architecture of Da Vinci is to have the RocksDB database accessed directly from the host application’s process, via JNI, and the overhead of that is quite low (we’ve benchmarked single-digit microseconds including everything: JNI overhead, RocksDB lookup [assuming the PlainTable / all-in-RAM config], and Avro deserialization). With a sidecar, by contrast, we would hit the network stack (even though it would be on loopback) and it seems like it might be challenging to be quite as fast. Still, the maintainability benefits may warrant pursuing that approach.

    Regarding client releases and uptake, I acknowledge that it has been historically painful, and still is to some extent, but I will say that it has gotten better, at least internally at LI… Through a mix of tooling and processes, we now have much better dependency hygiene than before and our dependents do pick up new library versions much more quickly than before… That is not Venice-specific and does nothing for folks outside LI who likely have the same challenges, but I just wanted to share that data point nonetheless.
    j
    • 2
    • 2
  • f

    Felix GV

    03/25/2025, 6:46 PM
    > (3) For the partitioned case, is that a popular use case? The sharding part is a bit clear to me from reading the blog, and I imagine it’ll require tight integration to make it work: the sharding on the data preparation & push side may need to agree with the sharding on the data access layer?

    The partitioned case is definitely a power user scenario, and in that sense it is “not that popular”. I would say that it only makes sense when used as part of a “framework” or “environment” which already has a notion of partitioning. IOW, it doesn’t make much sense to use partitioned Da Vinci just like that, out of the box, because you wouldn’t know what partition to subscribe to on each host…

    At LI, we’ve had partitioned Da Vinci integrated in our search stack as well as our stream processing stack, both of which have built-in notions of partitioning. In those cases, we configure the relevant Venice stores with a “custom partitioner”, which results in the Venice data being “co-partitioned” with the other system (e.g. search or stream processing), and therefore the other system can subscribe to the correct shard in each of its own hosts (or tasks, or whatever).

    As an aside, this is where the name Da Vinci really shines… did you know that Leonardo Da Vinci left many of his works, including his masterpieces, unfinished? For example, he kept tweaking the Mona Lisa up until his death, many years after having started it. Likewise, while Venice is a full-fledged system that one can use directly without much scaffolding around it, Da Vinci is more of a building block, which you can integrate into other systems.
    j
    • 2
    • 1
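The co-partitioning idea described in the reply above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `KeyPartitioner`, `HASH_PARTITIONER`, `partitionsForTask` and the round-robin task assignment are all hypothetical names and choices made up for this sketch, not Venice or Da Vinci APIs. The point it shows is that once the store and the host framework share one partitioning function, each task knows exactly which partitions to subscribe to, and every key it processes is held locally.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of co-partitioning; none of these names are Venice APIs. */
class CoPartitioningSketch {
  /** The single partitioning function that both systems must agree on. */
  interface KeyPartitioner {
    int partitionFor(byte[] keyBytes, int partitionCount);
  }

  /** Example partitioner (assumed): hash of the key bytes, modulo the partition count. */
  static final KeyPartitioner HASH_PARTITIONER =
      (keyBytes, partitionCount) -> Math.floorMod(Arrays.hashCode(keyBytes), partitionCount);

  /** Partitions owned by one task of the host framework (round-robin assignment, assumed). */
  static List<Integer> partitionsForTask(int taskId, int taskCount, int partitionCount) {
    List<Integer> owned = new ArrayList<>();
    for (int p = taskId; p < partitionCount; p += taskCount) {
      owned.add(p);
    }
    return owned;
  }

  /** True if the given task holds the store partition that this key maps to. */
  static boolean isLocal(String key, int taskId, int taskCount, int partitionCount) {
    int partition =
        HASH_PARTITIONER.partitionFor(key.getBytes(StandardCharsets.UTF_8), partitionCount);
    return partitionsForTask(taskId, taskCount, partitionCount).contains(partition);
  }

  public static void main(String[] args) {
    int partitionCount = 8;
    int taskCount = 4;
    for (int taskId = 0; taskId < taskCount; taskId++) {
      // Each task would subscribe only to the partitions listed here.
      System.out.println("task " + taskId + " subscribes to partitions "
          + partitionsForTask(taskId, taskCount, partitionCount));
    }
  }
}
```

Without a shared partitioner there is no way to compute the subscription list per host, which is why the partitioned mode mostly makes sense inside a framework that already has its own notion of partitioning.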
  • j

    Jia

    03/26/2025, 10:06 PM
    Thanks a lot for the detailed reply @Felix GV. A follow-up question: for Da Vinci mode, do you keep a copy of the data both on the client side and on the Venice server side? Do you really need the server-side copy?
    f
    • 2
    • 2
  • f

    Felix GV

    04/17/2025, 11:29 PM
    We have a Venice talk lined up at J on the Beach in Spain next month! Who's coming 😁 ? https://www.linkedin.com/posts/j-on-the-beach_jotb25-duckdb-rocksdb-activity-7318664574696632322-nsbo
    🎉 4
    🇪🇸 3
  • g

    Gabriel Drouin

    04/22/2025, 2:28 PM
    I've run into issues while running
    ./gradlew check --continue
    with Ubuntu 24.04.1 LTS on WSL2 (Windows 11). In a nutshell: numerous integration/e2e tests fail due to timeouts. I'll set up the work environment on my MacBook Pro instead. Further details in reply
    k
    f
    • 3
    • 14
  • g

    Gabriel Drouin

    04/25/2025, 6:12 PM
    @Koorous Vargha I just noticed dark mode got enabled on the docs 👀 very nice
    🚀 3
    k
    f
    • 3
    • 6
  • g

    Gabriel Drouin

    05/07/2025, 1:15 PM
    Hey folks, question regarding tests. In this current PR, I'm adding unit tests where each test takes ~6 seconds due to exponential backoff. When running
    :internal:venice-common:test --tests "com.linkedin.venice.utils.TestHelixUtils"
    , the tests run sequentially, which takes some time. I was wondering if you knew of any way to run multiple tests in parallel. I remember that
    ./gradlew check --continue
    would spawn x threads when running the full suite, and thought perhaps I would be able to do the same thing here, or similar. Thanks! EDIT: it doesn't actually take that long, but thought it might be helpful to know in the future.
    k
    f
    • 3
    • 5
  • f

    Felix GV

    05/09/2025, 7:35 PM
    Interesting talk by @Jia about Pinterest's KV Store, used for ML feature serving:

    https://youtu.be/aCVIjDkzLM8?si=hHd-finRd2Hcw5xk

    👍 5
    m
    k
    j
    • 4
    • 9
  • g

    Gabriel Drouin

    05/11/2025, 9:25 PM
    All checks passing on this new PR 🍀 However, I think much discussion will be required in order to arrive at a proper solution we can all agree on 😅 Exciting! Have a great week everyone! 👋
    🚀 2
    k
    • 2
    • 6
  • f

    Felix GV

    06/02/2025, 5:08 PM
    One of the things we were talking about today @Gabriel Drouin
    g
    • 2
    • 5
  • g

    Gabriel Drouin

    06/10/2025, 4:49 PM
    @Felix GV A very important question I wish to solve today... Would you be able to tell me why the tests in the CI are named in French? 😆
    🤣 2
    z
    f
    s
    • 4
    • 5
  • m

    Minh Nguyen

    06/13/2025, 4:32 PM
    I have added the welcome bot to our Slack channel! Anyone who joins #C03SLQWRSLF will trigger the workflow. Any extra info you want to add?
    🚀 2
    🚢 1
    f
    g
    • 3
    • 3
  • d

    Dejan Mijic

    06/18/2025, 3:03 PM
    Hey folks, is there a working example of fast client wiring? The closest thing I found to a guide is
    com.linkedin.venice.fastclient.utils.TestClientSimulator
    . Is it an appropriate thing to follow (minus the mocks and dummies 🙂 )? Thanks!
    f
    s
    x
    • 4
    • 53
  • f

    Felix GV

    06/20/2025, 4:15 PM
    The latest conference talk is now posted online! Reshare and like this social media post, if you'd like, to promote Venice to your networks!
    ❤️ 1
  • f

    Felix GV

    06/20/2025, 4:16 PM
    Direct YouTube link here:

    https://youtu.be/hc0pgvnr3fQ

  • f

    Felix GV

    08/10/2025, 4:25 PM
    HyBRID: Hybrid Batch/Real-time Ingested Data
    ♥️ 1
    n
    g
    • 3
    • 4
  • f

    Felix GV

    08/20/2025, 3:04 PM
    New functionality, new video! Watch it to learn about: Facet Counting in Venice
    🚀 5
    • 1
    • 2
  • f

    Felix GV

    08/20/2025, 3:09 PM
    If you would like, please re-share our announcement to your networks! • LinkedIn • Bluesky • Twitter
    🎉 3
  • k

    Kacper Bielecki

    09/05/2025, 9:29 AM
    VPJ.log
  • k

    Kacper Bielecki

    09/05/2025, 9:30 AM
    Hi, we are slowly but steadily continuing to test Venice. We are now trying a bigger setup and attempting to push a 4 TB store, but the push fails. We use version 0.4.613 with a k8s setup on EKS. We are pushing with 100 partitions and 12 venice-servers. It seems that some partitions fail to allocate replicas. The root cause seems to be a failure to open the RocksDB metadata database on some venice-server pods, which manifests as exceptions like this:
    Caused by: org.rocksdb.RocksDBException: While lock file: /opt/venice/rocksdb/rocksdb/client-id-to-email-hash_v26/client-id-to-email-hash_v26_1000000000/LOCK: Resource temporarily unavailable
    I am attaching the VPJ, leader controller, and one of the venice-server logs. It seems that venice-server tries to open the metadata database from multiple threads and the locking failures are not retried. Is it a known issue? Are we doing anything wrong? 🤔
    k
    d
    +2
    • 5
    • 26
  • f

    Felix GV

    10/21/2025, 2:09 PM
    The 2nd edition of Designing Data-Intensive Applications mentions Venice!
    🚀 6
    • 1
    • 1
  • f

    Felix GV

    10/21/2025, 4:15 PM
    For those in the Bay Area, this could be an interesting meetup: https://luma.com/e7feg2i6
    🙌 2
    n
    • 2
    • 2
  • a

    Amre Shakim

    10/23/2025, 10:53 PM
    Has anyone looked into integrating Venice with Snowflake as a batch data source or batch ingestion layer? Since we already use Snowflake for transformations, I’m wondering what the main challenges or limitations might be.
    k
    n
    f
    • 4
    • 20
  • f

    Felix GV

    11/19/2025, 5:46 PM
    This is quite similar to how large values get chunked inside the Venice server. If instead of appending a flag to the key, Venice used a separate table (column family) to store chunks, then it would essentially be the same. https://www.linkedin.com/posts/ben-dicken-78797a73_postgres-uses-toast-to-store-large-variable-sized-activity-7396908837485727744-dG76?utm_source=share&utm_medium=member_ios&rcm=ACoAAAEk238BXlhz1s5hXc96bKIJJ-eWwXUnlas
    😮 1
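The comparison in the message above can be made concrete with a toy Java sketch of chunking in a single key space. Everything here is an assumption for illustration: the `#manifest` and `#chunk-N` suffixes, the 4-byte chunk limit, and the class and method names are hypothetical and are not Venice's actual chunk encoding. It only shows the general shape: a large value is split into pieces stored under flagged variants of the key, plus a manifest entry recording how many pieces to read back.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of value chunking in one key space; not Venice's actual encoding. */
class ChunkingSketch {
  // Tiny on purpose, so that the demo value is split into several chunks.
  static final int MAX_CHUNK_BYTES = 4;

  // Stand-in for the single storage table that holds manifests and chunks alike.
  private final Map<String, byte[]> table = new HashMap<>();

  // Flag suffixes appended to the user key (hypothetical format).
  private static String manifestKey(String key) {
    return key + "#manifest";
  }

  private static String chunkKey(String key, int index) {
    return key + "#chunk-" + index;
  }

  /** Splits the value into fixed-size chunks and records the chunk count in a manifest. */
  void put(String key, byte[] value) {
    int chunkCount = (value.length + MAX_CHUNK_BYTES - 1) / MAX_CHUNK_BYTES;
    for (int i = 0; i < chunkCount; i++) {
      int from = i * MAX_CHUNK_BYTES;
      int to = Math.min(from + MAX_CHUNK_BYTES, value.length);
      table.put(chunkKey(key, i), Arrays.copyOfRange(value, from, to));
    }
    table.put(manifestKey(key), Integer.toString(chunkCount).getBytes(StandardCharsets.UTF_8));
  }

  /** Reads the manifest, then reassembles the chunks in order; null if the key is absent. */
  byte[] get(String key) {
    byte[] manifest = table.get(manifestKey(key));
    if (manifest == null) {
      return null;
    }
    int chunkCount = Integer.parseInt(new String(manifest, StandardCharsets.UTF_8));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < chunkCount; i++) {
      byte[] chunk = table.get(chunkKey(key, i));
      out.write(chunk, 0, chunk.length);
    }
    return out.toByteArray();
  }

  /** Number of physical entries in the table (chunks plus manifests). */
  int storedEntries() {
    return table.size();
  }

  public static void main(String[] args) {
    ChunkingSketch store = new ChunkingSketch();
    store.put("k", "hello venice".getBytes(StandardCharsets.UTF_8));
    System.out.println(new String(store.get("k"), StandardCharsets.UTF_8));
    System.out.println(store.storedEntries() + " physical entries for 1 logical value");
  }
}
```

The separate-table variant mentioned in the message would keep the same read and write logic but move the chunk entries into a second map (a second column family), leaving only the manifest in the main table.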
  • g

    Gaojie Liu

    11/19/2025, 5:58 PM
    @Felix GV The latest Venice blog post is out today: https://www.linkedin.com/blog/engineering/infrastructure/evolution-of-the-venice-ingestion-pipeline
    🎉 6
  • f

    Felix GV

    11/19/2025, 7:00 PM
    Great to see that post finally come to light! I just re-read it and it’s even better than the first time I did! Congrats @Gaojie Liu and team!
    ➕ 2
  • z

    Zac Policzer

    11/19/2025, 7:39 PM
    Just posted on the VeniceDB LinkedIn page.
  • z

    Zac Policzer

    11/19/2025, 7:40 PM
    idk why, but using a social media account and posting as VeniceDB feels like this:
    😂 4
  • z

    Zac Policzer

    11/19/2025, 7:49 PM
    Bluesky post up as well: https://bsky.app/profile/venicedb.org/post/3m5yzjby7tk2t
    🙌 1
    f
    • 2
    • 5
  • m

    Minh Nguyen

    11/25/2025, 3:31 AM
    🎥 Proud to share our two latest ASQ videos! • Data Integrity Validation by @Lei Lu:

    https://youtu.be/sYytwZ4WJJw

    • Stateful CDC Client by @Koorous Vargha:

    https://youtu.be/6vvtmijdwUI

    Check them out and let us know your thoughts! 🚀
    🚀 2
    🎉 1
    👀 1
    venice black on white 1
    f
    • 2
    • 1