Tag: agents
All the articles with the tag "agents".
-
Breaking Down Agent Evals: A Practitioner's Guide
Published: Part 1 of a 3-part series. Covers why traces (not code) are the source of truth for agents, the three observability primitives, run types, the metrics that matter at each level, the pass^k reliability metric, a four-step methodology for building an eval suite, and a filter-funnel argument for why no single eval method is enough.
-
How to Mitigate the Lost-in-the-Middle Effect in LLMs
Published: A look at why long contexts quietly break LLMs, why important information is easier to use at the boundaries than in the middle, and why agents that periodically restate their goals at the end of the context often work better.