Recent work on synthetic users and behavior modeling has made important progress but leaves key gaps. Anthology conditions LLMs on rich, survey-style backstories and improves consistency and subgroup fidelity in virtual personas, yet these backstories are still largely synthetic narratives rather than traits learned from large-scale logs of how people actually act (Moon et al., 2024).
PersonaHub and Personas within Parameters move closer to data-driven personas—PersonaHub by auto-curating a billion text-based personas from web data for persona-conditioned synthetic data generation, and Personas within Parameters by fine-tuning small LMs with low-rank adapters on interaction logs to mimic specific users—but both ultimately tie behavior to particular datasets and model checkpoints; Personas within Parameters in particular requires a separate adaptation run per user, making it difficult to scale to millions of distinct personas (Ge et al., 2024; Thakur et al., 2025).
The Stanford HAI work on Generative Agent Simulations of 1,000 People shows that agents grounded in two-hour qualitative interviews can closely replicate survey responses and experimental outcomes for 1,052 real individuals, but this architecture depends on expensive, interview-style data collection and is optimized for reproducing survey attitudes rather than continuous product-interaction behavior at population scale (Park et al., 2024).
Behavior-oriented foundation models such as Be.FM and Centaur instead train a single large model on diverse behavioral or cognitive datasets to predict human decisions across many tasks (Xie et al., 2025; Binz et al., 2024).
arXiv+1 However, because all behavior is encoded in one monolithic predictor rather than an explicit, inspectable synthetic population of individuals, these approaches make it hard to study population structure, attach persistent persona profiles, or run controlled “what-if” simulations on specific user segments.