Blog

Is synthetic data good enough to train user simulators?

We investigate whether LLM-generated data can replace real user data for training good user simulators.

Selectively omitting user profile information during training leads to better user simulators.

Reward hacking isn't limited to LLM judges...

Coming soon

Tracking how model behavior and alignment shift across training.

Coming soon