# success-stories
a
Working out some thoughts about how to present this as a tool for the community, but I got a bit triggered by the state of the trajectory visualizer. I’d like Vizz to handle ingestion of any conversation/chat export format; OpenHands trajectories are the standard right now, but I’ll be going after Open WebUI and some online platforms like Bolt that don’t export as nicely. I strove to make the UI dense with the information available from the session, without being too cluttered (it is too cluttered at the moment). I’ll drop a link to the repo in Slack when I get to a decent checkpoint.
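As a sketch of what format-agnostic ingestion could look like (every name and field here is hypothetical, not the actual Vizz code or the real export schemas): each importer maps its source format onto one normalized turn list, so the visualizer only ever sees one shape.

```python
import json

def normalize_openhands(raw: str) -> list[dict]:
    """Map an OpenHands-style trajectory export onto a flat turn list.
    Assumed shape for illustration: a JSON list of events with
    'source' and 'message' fields -- not the verified real schema."""
    turns = []
    for event in json.loads(raw):
        turns.append({
            "role": event.get("source", "agent"),
            "text": event.get("message", ""),
        })
    return turns

def normalize_openwebui(raw: str) -> list[dict]:
    """Same idea for an Open WebUI-style export (again a guessed shape:
    {'messages': [{'role': ..., 'content': ...}]})."""
    data = json.loads(raw)
    return [{"role": m["role"], "text": m["content"]} for m in data["messages"]]

# One registry, so a drag & drop handler only has to sniff the format once.
IMPORTERS = {"openhands": normalize_openhands, "openwebui": normalize_openwebui}
```

The point of the registry is that adding Bolt (or anything else) later is just one more function, not a change to the UI.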
p
not sure I follow, feels like a horizontal time-axis LangSmith of sorts? some ideas:

- make it 3D, 1st person, the message is the player, you walk up on a piano until a fork in the road; your HUD shows what happened, maybe you have a mirror (like in racing games) to see what you just talked about with the AI
- some mechanism to see forks in the road and edits of past turns
- summary of groups of turns (what Jules would call a task)
- rolling TOC as the trajectory is happening
- text diffs are a pain, image diffs pretty horrible UX, 3D diffs almost non-existent, but I feel those kinds of tools would help us synthesize a trajectory and the differences between trajectories
- we need to use the multimodality better, even for SWE-agentic workflows
- also, text diffusion models might break this iterative sequential mental model altogether(?)

https://www.youtube.com/watch?v=agOdP2Bmieg&t=1217s

https://dynamicland.org/2024/Intro/
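The "forks in the road" and "edits of past turns" ideas above map naturally onto a message tree: if each turn records its parent (as edit/regenerate features imply), a fork is simply any turn with more than one child. A minimal sketch under that assumption (the `id`/`parent` schema is illustrative, not from any real export):

```python
from collections import defaultdict

def find_forks(turns: list[dict]) -> list[str]:
    """Return the ids of turns where the conversation branches.
    Assumes each turn is {'id': ..., 'parent': ...}, with parent=None
    at the root. Editing a past turn creates a sibling of the original
    reply, so edits surface as forks too."""
    children = defaultdict(list)
    for t in turns:
        children[t["parent"]].append(t["id"])
    # Skip the None root key; a real branch point is a turn with 2+ children.
    return [pid for pid, kids in children.items()
            if pid is not None and len(kids) > 1]
```

A rolling TOC or a 3D fork-in-the-road view could both be driven off this same parent index.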
a
🙂 I love all of these ideas. For my own sake while digging into OpenHands, I wanted an end-to-end visualization to avoid bad assumptions. But yeah, it's very LangSmith/Langfuse-adjacent; I'd like to make it a dead-simple drag & drop to get an overview of the trajectory results
> make it 3D, 1st person, the message is the player, you walk up on a piano until a fork in the road; your HUD shows what happened, maybe you have a mirror (like in racing games) to see what you just talked about with the AI

Triggered from the days of XR demo weekends, haha. I'm surprised it's not 3D already
😄 1
p
I'm still appalled that we can't easily group chat with learning models; sharing threads is so 2005. SWE-agenting should be a team sport, and the UX is rough with OpenHands. E.g., why do we even need the #C08D8FJ5771 channel in 2025? IMO, it's to fill the giant gap needed for co-dev. Just because Linus wanted a DVCS to manage merges doesn't mean we should all end up in that same spacetime constraint. Trajectories, the way you're trying to visualize them, are still framed as a solitary activity; why? We might need a "google docs"/Miro-style real-time co-editing equivalent?
🙌 1
💯 1
a
> I'm still appalled that we can't easily group chat with learning models; sharing threads is so 2005

Well, so... I'm in the process of launching a 24/7 stream where the LLM/Agent is the centerpiece of the television station, but a quiet goal is to implement WebRTC/SIP conferencing and turn it into a collaborative, emergent old-style TV station 🙂 The group chat with the LLM is the easy part; coordinating participants and helping them participate is the extremely difficult UX challenge, if only because, from what I can see, nobody is doing it just yet.
👍 1
"first to market" 💀 haha. Collaborative text/voice chat should be the natural end result of where we've come so far; just an opinion, and not a strong one heh
I just appreciate the potential of global communication, especially if the computers are about to enter the chat.
The ElevenLabs audiovisual output of myself is already startling, even before any sort of GenAI people or avatars get involved
but I digress