Lorenzo's pile of garbage

This is a pile of garbage. Your job is to find the gold.

non-academic academic for fun

Series

Agentic System Design

Engineering LLM agents from problem framing to production monitoring.

How I Learned RL for LLMs

From REINFORCE to GRPO to agents — an eval researcher's map through RL.

7 posts

Latest

How I Learned RL for LLMs

Dissecting NeMo Gym: How NVIDIA Built a Modular Microservice Architecture for RL Verification at Scale

A deep dive into NeMo Gym's three-server-type design, 34 reward verifiers, and the infrastructure decisions that make RLVR pipelines composable at scale.

Mar 14, 2026 · 12 min read

#academic #AI #engineering blog
A Systems Engineering Approach to LLM Agents · Part 0

A Systems Engineering Approach to LLM Agents — Series Overview

Everyone's building agents. Almost nobody is engineering them. A 7-part series on the full lifecycle — from problem framing to production monitoring.

Mar 14, 2026 · 6 min read

#academic #AI #engineering blog