01
Is synthetic data good enough to train user simulators?
We investigate whether LLM-generated data can replace real user data for training effective user simulators.
Read →
02
Dropping out persona makes stronger user simulators
Selectively omitting user profile information during training leads to better user simulators.
Read →
03
Reward hacking in training user simulators, not just from LLM judges
Reward hacking isn't limited to LLM judges...
Coming soon
04
HumanLM evolves during training
Tracking how model behavior and alignment shift across training.
Coming soon