This is a pile of garbage. Your job is to find the gold.
Series
Engineering LLM agents from problem framing to production monitoring. (Coming soon)
From REINFORCE to GRPO to agents — an eval researcher's map through RL. (Coming soon · 7 posts)
A summary of our preprint. We measure persona collapse across 10 LLMs and 1,144 personas, and show that better per-persona fidelity often makes population diversity worse.
A summary of our position paper. We argue AI welfare assessment fails for two structural reasons: indicators are co-engineered with the systems they evaluate, and there is no external validation channel that can falsify them.
The first design decision is not how smart the system should be, but where uncertainty, agency, and accountability sit. A framework for making that choice deliberately.
96 papers across algorithms, rewards, preferences, systems, and agents — organized into 5 categories with reading depth recommendations.
An NLP evaluation researcher's honest map through the algorithmic, reward, and systems landscape of reinforcement learning for language models.