Working out some thoughts about how to present this as a tool for the community, but I got a bit triggered by the state of the trajectory visualizer. I’d like Vizz to handle ingestion of any conversational chat export format; OpenHands trajectories are the standard right now, but I’ll be going after Open WebUI and some online platforms like Bolt that don’t export as nicely. I’ve strived to make the UI dense with the information available from the session, without being too cluttered (it is too cluttered at the moment). I’ll drop a link to the repo in Slack when I get to a decent checkpoint.
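Rough sketch of the ingestion shape I have in mind: map every export flavor onto one common "turn" record before any rendering happens. All the field names below are invented for illustration; they are not the real OpenHands or Open WebUI schemas.

```python
# Hypothetical normalizer: coerce events from different chat export
# flavors into one common {role, text} turn shape. Field names are
# made up for the sketch, not taken from any real schema.

def normalize_turn(raw: dict) -> dict:
    """Coerce one event from a recognized export flavor into {role, text}."""
    if "action" in raw:  # OpenHands-style event (assumed shape)
        return {"role": raw.get("source", "agent"),
                "text": raw.get("message", "")}
    if "chat_message" in raw:  # Open WebUI-style event (assumed shape)
        return {"role": raw["chat_message"].get("role", "user"),
                "text": raw["chat_message"].get("content", "")}
    raise ValueError(f"unrecognized export flavor: {list(raw)}")

def load_trajectory(events: list[dict]) -> list[dict]:
    """Normalize a whole exported event list into ordered turns."""
    return [normalize_turn(e) for e in events]
```

The point is just that the UI only ever sees the normalized turn list, so adding a new platform means adding one branch (or, more realistically, one adapter per format) rather than touching the visualizer.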
polite-architect-52194
07/11/2025, 1:03 AM
not sure I follow, feels like a horizontal time-axis LangSmith of sorts?
some ideas:
make it 3D, first person: the message is the player, and you walk up on a piano until a fork in the road; your HUD shows what happened, and maybe you have a mirror (like in racing games) to see what you just talked about with the AI
some mechanism to see forks in the road
and edits of past turns
summary of groups of turns (what Jules would call a task)
rolling TOC as the trajectory is happening
text diffs are a pain, image diffs are pretty horrible UX, and 3D diffs are almost non-existent, but I feel those kinds of tools would help us synthesize a trajectory and the differences between trajectories
we need to use the multimodality better, even for SWE-agentic workflows
also, text diffusion models might break this iterative sequential mental model altogether(?)
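On the diff point above: even the stdlib baseline for text diffs is line-oriented and noisy, which is part of why they're painful to synthesize visually. A minimal sketch with Python's `difflib` (just the standard library, nothing Vizz-specific):

```python
import difflib

# Two tiny "file" versions to compare, line by line.
before = ["def greet(name):", "    print('hi', name)"]
after  = ["def greet(name):", "    print('hello', name)"]

# unified_diff yields header lines, hunk markers, and +/- lines;
# even this two-line change produces several lines of scaffolding.
diff = list(difflib.unified_diff(before, after, lineterm=""))
print("\n".join(diff))
```

A one-token change already arrives wrapped in headers and hunk markers; scaling that to a whole trajectory-vs-trajectory comparison is where the synthesis tooling gap shows up.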
🙂 i love all of these ideas. For my own sake while digging into OpenHands, I wanted an end-to-end visualization to avoid bad assumptions. But yeah, it's very LangSmith/Langfuse adjacent; I'd like to make it a dead-simple drag & drop that gives you an overview of the trajectory results
ambitious-account-40737
07/11/2025, 2:38 AM
make it 3D, 1st person, the message is the player, you walk up on a piano until a fork on the road; your HUD shows what happened, maybe you have a mirror (like in racing games) to see what you just talked about with the AI
Triggered from the days of XR demo weekends, haha. I'm surprised it's not 3D already
😄 1
polite-architect-52194
07/11/2025, 2:52 AM
I'm still appalled that we can't easily group chat with language models; sharing threads is so 2005
SWE-agenting should be a team sport; the UX is rough with OpenHands. E.g., why do we even need the #C08D8FJ5771 channel in 2025? IMO, it exists to fill the giant gap around co-development. Just because Linus wanted a DVCS to manage merges doesn't mean we should all end up in that same spacetime constraint.
Trajectories, the way you're trying to visualize them, are still framed as a solitary activity; why?
we might need a "google docs" real-time co-editing/Miro equivalent?
🙌 1
💯 1
ambitious-account-40737
07/11/2025, 9:54 PM
I'm still appalled that we can't easily group chat with learning models; sharing threads is so 2005
Well, so... I'm in the process of launching a 24/7 stream where the LLM/agent is the centerpiece of the television station, but a quiet goal is to implement WebRTC/SIP conferencing and turn it into a collaborative, emergent old TV station 🙂 The group chat with the LLM is the easy part; coordinating participants and helping them participate is the extremely difficult UX challenge, if only because, from what I can see, nobody is doing it yet.
👍 1
ambitious-account-40737
07/11/2025, 9:54 PM
"first to market" 💀 haha. Collaborative text/voice chat seems like the natural end result of where we've come so far; just an opinion, and not a strong one, heh
ambitious-account-40737
07/11/2025, 9:55 PM
I just appreciate the potential of global communication, especially if the computers are about to enter the chat.
ambitious-account-40737
07/11/2025, 9:56 PM
The ElevenLabs audiovisual output of myself is already startling, before any sort of GenAI people or avatars get involved