Vectorial AI: SAPIENS - Learning User Traits from Large-Scale Observational Data to Create Synthetic Users That Predict Human Behavior
We introduce SAPIENS, an AI framework that learns rich behavioral traits directly from large-scale observational data. Concretely, SAPIENS analyzes real user opinions to learn and describe the latent traits and recurring experience patterns that drive how different individuals respond to stimuli, then uses these signals to instantiate an explicit synthetic population: personas with learned characteristics and backstories. Inspired by TextGrad's idea of natural-language feedback as an analogue of gradients, and by the matrix-factorization learning algorithms widely used in recommender systems to model user traits, SAPIENS iteratively refines persona traits to shrink the gap between synthetic and real opinions, without fine-tuning the backbone model separately for each user.
Synthetic users created by SAPIENS significantly outperformed other behavior foundation models and standard LLMs in predicting human opinions given a product description.

Creating synthetic users whose fundamental traits across demographics, psychology, and experience are learned from real human opinions has broad applications. Enterprises can run simulations at different stages of product development, for example to understand pain points or to evaluate graphical, conversational, and voice interfaces against feedback that reflects real human opinions. Synthetic users are also valuable for training foundation models: they enable reward modeling over different user populations, letting models cater to diverse intents and preferences and give better personalized responses to different users. This has significant implications for training the next generation of foundation models.
Current Research on Synthetic Users and Human Behavior Simulation from Stanford, UC Berkeley, and Others
Recent work on synthetic users and behavior modeling has made important progress but leaves key gaps. Anthology conditions LLMs on rich, survey-style backstories and improves consistency and subgroup fidelity in virtual personas, yet these backstories are still largely synthetic narratives rather than traits learned from large-scale logs of how people actually act (Moon et al., 2024).

PersonaHub and Personas within Parameters move closer to data-driven personas: PersonaHub auto-curates a billion text-based personas from web data for persona-conditioned synthetic data generation, while Personas within Parameters fine-tunes small LMs with low-rank adapters on interaction logs to mimic specific users. Both, however, ultimately tie behavior to particular datasets and model checkpoints; Personas within Parameters in particular requires a separate adaptation run per user, making it difficult to scale to millions of distinct personas (Ge et al., 2024; Thakur et al., 2025).

The Stanford HAI work on Generative Agent Simulations of 1,000 People shows that agents grounded in two-hour qualitative interviews can closely replicate survey responses and experimental outcomes for 1,052 real individuals, but this architecture depends on expensive, interview-style data collection and is optimized for reproducing survey attitudes rather than continuous product-interaction behavior at population scale (Park et al., 2024).

Behavior-oriented foundation models such as Be.FM and Centaur instead train a single large model on diverse behavioral or cognitive datasets to predict human decisions across many tasks (Xie et al., 2025; Binz et al., 2024). However, because all behavior is encoded in one monolithic predictor rather than an explicit, inspectable synthetic population of individuals, these approaches make it hard to study population structure, attach persistent persona profiles, or run controlled "what-if" simulations on specific user segments.
SAPIENS addresses these gaps by learning an explicit population of synthetic users whose traits are inferred directly from large-scale observational data, rather than from LLM-generated backstories, interview transcripts, or per-user fine-tuning. It builds high-dimensional, explainable traits from many historical opinions per user and uses these traits to predict reactions to new products. This design produces more grounded, adaptable, and interpretable personas than prompt-only approaches or monolithic behavior foundation models, while remaining practical to scale to millions of synthetic users.
Performance Benchmarking, Dataset, and Methodology
Our synthetic user models achieve 84% accuracy in predicting which topics a user will focus on when reacting to a new product (for example, price, reliability, usability, or support). This is about 30% higher than the behavior foundation models Be.FM and Centaur, and than the general-purpose LLMs Claude Sonnet 4 and GPT-5, on the same benchmark.
Dataset Foundation and Research Task
Data Source
We ground our models in the Amazon product reviews dataset, a comprehensive public repository containing millions of authentic user opinions across diverse product categories. This dataset provides the real-world behavioral foundation necessary for training synthetic users in the e-commerce domain that reflect actual human responses.
Prediction Task
Given a product or feature description, our models predict the specific opinion topics that real users will discuss—not just whether feedback is positive or negative, but what aspects they'll talk about. We then measure accuracy by comparing predicted topics with actual topics.
The choice of Amazon reviews as our foundational dataset reflects a deliberate strategy: these reviews represent authentic, unstructured user feedback across an enormous variety of products and use cases. Users write about what matters to them, in their own words, without the constraints of surveys or interview scripts. This natural language data captures the true diversity of user concerns, priorities, and mental models.
Why this task is challenging. A single product description can trigger a wide range of possible reactions: one user might focus on privacy, another on ease of setup, and yet another on long-term durability. Even when we provide past reviews, they don’t directly tell us what a user will talk about next. The model has to infer deeper traits and habits (what this person tends to care about) and then generalize them to a completely new product and context.
How we evaluate. SAPIENS learns from each user’s historical reviews to understand those underlying traits and preferences. At test time, every model—SAPIENS, Be.FM, Centaur, Claude Sonnet 4, and GPT-5—gets the same inputs: a description of an unseen product plus a sample of that user’s previous reviews. Each model must predict the main topic that user will mention in their next review. We then compare the predicted topic with the ground-truth topic from the real review to compute accuracy.
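A minimal sketch of this evaluation protocol, purely for illustration; names such as `EvalExample` and `topic_accuracy`, and the toy data, are our assumptions, not the actual benchmark harness:

```python
from dataclasses import dataclass

@dataclass
class EvalExample:
    product_description: str  # unseen product, shown to every model
    past_reviews: list        # sample of the user's review history
    true_topic: str           # main topic of the user's actual next review

def topic_accuracy(predict, examples):
    """Fraction of examples where the predicted main topic matches
    the ground-truth topic from the real review."""
    hits = sum(predict(ex.product_description, ex.past_reviews) == ex.true_topic
               for ex in examples)
    return hits / len(examples)

# Toy usage with a trivial stand-in predictor:
examples = [
    EvalExample("Wireless earbuds, 30h battery", ["Battery died fast"], "battery"),
    EvalExample("Budget mechanical keyboard", ["Great value"], "price"),
]
always_battery = lambda desc, history: "battery"
print(topic_accuracy(always_battery, examples))  # 0.5
```

Every model under comparison plugs into the same `predict` interface, so accuracy differences reflect the models, not the protocol.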
Learning User Traits from Real Observation Data - Deep Dive
Our architecture is inspired by recommender-system algorithms such as matrix factorization, a popular way to model a user's latent traits and an item's latent traits in order to predict which item the user will like next. We take this a step further by describing how those traits lead to particular preferences and decisions. Each time we compare a synthetic user's predictions to real user responses, we learn something about how that persona's traits should be calibrated. Did we overestimate their price sensitivity? Underestimate their focus on aesthetics? These insights are encoded back into the user model, making subsequent predictions (synthetic opinions) match real opinions more closely. Over time, this creates a system that continuously learns and adapts to the emerging patterns of diverse users and their decision-making.
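As a rough numeric analogue, the matrix-factorization-style calibration described above can be sketched as follows. All names, dimensions, and the softmax loss are illustrative assumptions; SAPIENS works with natural-language trait descriptions rather than bare vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_products, n_topics, k = 100, 50, 8, 16

U = rng.normal(size=(n_users, k))     # latent user trait vectors
P = rng.normal(size=(n_products, k))  # latent product trait vectors
W = rng.normal(size=(k, n_topics))    # projects trait interactions to topics

def topic_scores(u, p):
    # elementwise user-product trait interaction, projected onto topics
    return (U[u] * P[p]) @ W

def sgd_step(u, p, observed_topic, lr=0.01):
    """Nudge user u's traits so the topic actually observed in their
    review scores higher (softmax cross-entropy on topic scores)."""
    scores = topic_scores(u, p)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    grad_scores = probs.copy()
    grad_scores[observed_topic] -= 1.0    # d(loss)/d(scores)
    grad_u = (W @ grad_scores) * P[p]     # chain rule back to user traits
    U[u] -= lr * grad_u
```

Each observed real opinion plays the role of a training signal: the update pulls the user's latent traits toward configurations that would have predicted the topic the user actually discussed.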
Present Product Description
Synthetic users receive detailed product or feature descriptions, just as real users would encounter them in marketing materials or product launches.
Generate Initial Response
Based on their trait profiles, synthetic users produce predictions about which topics they would discuss and what concerns they would raise.
Compare to Real Feedback
We validate synthetic responses against actual user opinions from similar products, identifying matches and misalignments.
Update Persona Traits and Backstory
Prediction errors trigger adjustments to user trait profiles, refining how characteristics map to opinion topics for future predictions.
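Put together, the four steps above form a closed loop. The sketch below shows one way to express it, in the spirit of TextGrad's natural-language gradients; `llm` is a stand-in for any text model, and the prompt wording is our assumption, not SAPIENS's actual prompts:

```python
def refine_persona(persona, product_desc, real_opinion, llm, steps=3):
    """One persona refined against one real opinion, over several rounds."""
    for _ in range(steps):
        # Steps 1-2: present the product description, generate a response
        synthetic = llm(f"Persona: {persona}\nProduct: {product_desc}\n"
                        "Predict the topics this persona's review would cover.")
        # Step 3: compare to real feedback; ask for a natural-language 'gradient'
        feedback = llm(f"Synthetic: {synthetic}\nReal: {real_opinion}\n"
                       "Explain how the persona's traits should change so the "
                       "synthetic opinion matches the real one.")
        # Step 4: update the persona's traits and backstory with that feedback
        persona = llm(f"Persona: {persona}\nFeedback: {feedback}\n"
                      "Rewrite the persona, applying the feedback.")
    return persona
```

Because the feedback is expressed in natural language and applied to the persona text, no backbone weights change; the same base model serves every synthetic user.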
Persona Architecture: Layered Trait Modeling
The accuracy of synthetic user predictions depends fundamentally on how we model user characteristics. SAPIENS models users along multiple dimensions that capture the different facets of identity and behavior that influence decision-making. This layered approach reflects a key insight: opinion topics are context-dependent, and predicting them requires understanding not just what users do, but also why they do it, what constraints they face, and what they have experienced.
Each layer contributes essential information for predicting user opinion. Behavioral patterns tell us how a user approaches decisions—whether they dive deep into technical details or prioritize ease of use. Psychographic traits reveal personality and how it influences decision-making. Background and experience determine what they are capable of evaluating and what they will compare against. Role and domain context define the broad area to which their opinions will relate.
Two users might both give a product four stars, but one discusses pricing transparency while the other focuses on integration capabilities. The difference comes from their trait combinations. Consider two Amazon reviews with the same rating for the same pair of headphones: one from a daily commuter prioritizing battery life and noise cancellation, the other from an audiophile focusing on subtle sound nuances and comfort for extended listening sessions.
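The headphone example can be made concrete with toy trait weights; all numbers here are invented for illustration. The rating is identical, but the dominant topics differ:

```python
# Assumed trait-to-topic affinities for the two reviewers described above
commuter = {"battery": 0.9, "noise_cancellation": 0.8,
            "sound_detail": 0.2, "comfort": 0.5}
audiophile = {"battery": 0.3, "noise_cancellation": 0.4,
              "sound_detail": 0.95, "comfort": 0.85}

def top_topics(traits, n=2):
    # topics this user is most likely to write about
    return sorted(traits, key=traits.get, reverse=True)[:n]

print(top_topics(commuter))    # ['battery', 'noise_cancellation']
print(top_topics(audiophile))  # ['sound_detail', 'comfort']
```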
Continuous Updates: Perception, Knowledge, and Forming Intuition
User behavior evolves constantly—new product launches, technology shifts, evolving user expectations, changing market conditions, and emerging technologies create new possibilities. Static user models quickly become obsolete in this dynamic environment.

Vectorial AI addresses this challenge through two crucial foundations:
1. A continuous pipeline of observational data: the system ingests observation data from diverse public sources, such as social media platforms like LinkedIn, Reddit, and YouTube. When integrated with enterprises, it continuously learns as companies launch new products and features, ensuring real-time relevance.
2. A three-layer continuous update system of cognition: This dynamic architecture keeps synthetic users current with evolving contexts and allows them to continuously learn and react to new stimuli effectively.
Perception Layer
Continuously reads new signals about how users perceive public and company events related to the product. This includes tracking social media discussions, product reviews, and shifts in user behavior and key events.
Knowledge Layer
Organizes incoming signals into a structured map, linking and making sense of topics discussed by users across many different sources. This allows the system to consolidate scattered insights and knowledge split across various sources.
Intuition Layer
Creates an episodic memory that tracks context, time, and reactions. This memory records specific instances of how synthetic users responded to particular stimuli at certain times, allowing the system to learn from past experiences and refine its predictions based on the outcomes of these 'episodes'.
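One plausible data model for the three layers, sketched under our own assumptions (the class and field names are invented): raw signals feed the perception layer, a topic map forms the knowledge layer, and timestamped episodes form the intuition layer:

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Episode:            # Intuition layer: one recorded stimulus-reaction pair
    timestamp: str
    stimulus: str         # e.g. a product launch or feature announcement
    reaction: str         # the synthetic user's response at that time
    outcome: str          # how well it matched real feedback

@dataclass
class SyntheticUser:
    signals: list = field(default_factory=list)                         # Perception
    knowledge: dict = field(default_factory=lambda: defaultdict(list))  # Knowledge
    episodes: list = field(default_factory=list)                        # Intuition

    def perceive(self, source, text):
        # Perception layer: continuously read new signals
        self.signals.append((source, text))

    def organize(self, topic, signal_index):
        # Knowledge layer: link a raw signal into the structured topic map
        self.knowledge[topic].append(self.signals[signal_index])

    def record(self, episode):
        # Intuition layer: remember how this user reacted, and when
        self.episodes.append(episode)
```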
"The continuous update system ensures our predictions remain accurate even as market conditions change, product positioning evolves, and user expectations shift. We're not predicting based on static historical patterns—we're simulating how users would respond today."
Vectorial AI: On a Mission to Represent the Entire World Population and Behavior
Vectorial AI is built on a simple but ambitious mission: to truly understand and model human behavior. Human-behavior simulation is emerging as a new branch of AI—distinct from LLMs that primarily capture world knowledge and from world models that simulate physical dynamics. While AGI aims for general cognitive capability, our north star is Behavior General Intelligence (BGI): systems that can represent how diverse people think, feel, and act across contexts. SAPIENS v1.0 is our first concrete step toward this goal—a state-of-the-art engine for creating synthetic users.