GEPA: How an LLM Can Write a Better Prompt Than RL Can Train One
A walkthrough of GEPA (Agrawal et al., ICLR 2026), the reflective prompt optimiser that beats GRPO with up to 35× fewer rollouts by reading its own trace logs in plain English. It covers the four-step loop, a worked iteration on a multi-hop QA system, the Pareto trick that keeps the candidate pool diverse, and where 98% of the rollout budget actually goes.