Discussion about this post

Joshua

Thank you for this essay. Your writing brings out one of the most damning parts of agentic system development: non-determinism. When you ship an agent, the workflow is usually fixed at release. But because the behavior of the underlying LLM pipeline keeps shifting ever so slightly, owing to the non-deterministic nature of these frontier language models, the workflow would have to adjust during post-production operation, and normally it doesn’t. This renders the agent unreliable and testing nearly impossible. It is why so many eval and observability platforms (LangSmith, Arize, W&B) are proliferating, but they don’t actually fix the issue. I will have to do it manually, and that sucks.

Vasile Tiple

Great art. In vibecoding, this could be called a predictable cataclysm.

