Blog

01

Is synthetic data good enough to train user simulators?

We investigate whether LLM-generated data can replace real user data for training good user simulators.

02

Dropping out persona makes stronger user simulators

Selectively omitting user profile information during training leads to better user simulators.

03

Reward hacking in training user simulators, not just from LLM judges

Reward hacking isn't limited to LLM judges...

Coming soon
04

HumanLM evolves during training

Tracking how model behavior and alignment shift across training.

Coming soon