šŸŽ™ļø New Ep. Jay Ram | Evals Don’t Improve Agents - Environments Do: Stop Scoring, Start Training

Find out why 1,500+ Startups use Fondo: The All-in-one accounting for startups

Focus on building. We'll handle the books. 👉 Get Started

Hey There,

The newest episode of The Startup Growth Podcast is live!

Jay Ram is Founder & CEO of Hud, the evaluation and RL platform for AI agents. Hud helps startups build RL environments, run fast reward loops, and plug into any RL backend, so teams can cut costs and push last-mile accuracy once they've hit PMF. Before Hud, Jay left a lucrative quant career, shipped an AI prank-calling app that briefly hit #1 on the App Store (≈500k calls), and decided he wanted harder problems and smarter customers. He's a YC W25 alum; Hud is already used by researchers at foundation labs and is expanding into enterprise environments.

Jay's catalyst was realizing he didn't want to just talk on weekends; he wanted to build. He and his co-founders first tackled computer-use evals for labs. Inside that work, the language shifted: labs asking for "evals" really needed environments, places where you design rewards, iterate, and actually improve model behavior. Today, Jay frames Hud as the "Next.js of RL environments": opinionated lifecycle, backend-agnostic training, and infra that returns signal fast. Early on, use a foundation model; post-PMF, train your own with SFT/RL, and that's where environments matter. Looking ahead, he sees post-training speciation: domain-tuned models for finance, accounting, creative tooling, and more, because teams will own more of their stack again.

Key Topics:

  • What Hud is: tools to set up your agent for RL, define tasks, shape rewards, and plug into RFT/other RL backends.

  • From evals to environments: why scores measure but rewards improve—and how iteration loops change outcomes.

  • Pre-train vs post-train: scale vs accuracy + domain depth—and why post-training is the real edge.

  • YC W25 arc: vision matched the original app more than mid-batch; enterprise demand is catching up now.

  • & more

I hope you find this educational content valuable and can apply it in your business. Until next week - keep experimenting, keep scaling, keep building.

David J Phillips

Founder & CEO at Fondo