SimPersona: Learning Buyer Personas from Clickstreams for E-Commerce Agents
SimPersona is a framework that learns discrete buyer types from raw clickstream data to improve LLM-based web agents in e-commerce. Conventional agents tend to collapse buyer diversity into a single "average buyer" policy, ignoring how varied the buyer population is. SimPersona uses a behavior-aware VQ-VAE to induce a discrete buyer-type space from historical traffic, capturing the statistical structure of real buyer behavior as well as merchant-specific distributions. Each learned buyer type is mapped to a dedicated persona token in the LLM agent's vocabulary, so the agent can be given behavior-specific guidance. This addresses the shortcomings of hand-crafted, prompt-based personas, which tend to be brittle, hard to scale, and context-inefficient. The paper is available on arXiv (2605.14205).
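To make the discrete buyer-type space concrete, here is a minimal sketch of the vector-quantization step at the heart of any VQ-VAE: a session embedding is assigned to its nearest codebook vector, and the resulting type indices can be counted per merchant. The codebook values, the three-dimensional embeddings, and the function names are illustrative assumptions, not details from the paper, and the encoder that would produce the embeddings is omitted.

```python
import math
from collections import Counter

# Hypothetical codebook: one embedding per buyer type (values are made up).
CODEBOOK = [
    [0.9, 0.1, 0.0],  # buyer type 0
    [0.1, 0.8, 0.1],  # buyer type 1
    [0.0, 0.2, 0.9],  # buyer type 2
]

def quantize(embedding, codebook=CODEBOOK):
    """Return the index of the codebook vector nearest to `embedding` (Euclidean)."""
    distances = [math.dist(embedding, code) for code in codebook]
    return distances.index(min(distances))

def buyer_type_distribution(session_embeddings, codebook=CODEBOOK):
    """Empirical share of each buyer type among one merchant's sessions."""
    total = len(session_embeddings)
    if total == 0:
        return [0.0] * len(codebook)
    counts = Counter(quantize(e, codebook) for e in session_embeddings)
    return [counts.get(k, 0) / total for k in range(len(codebook))]

# Assumed encoder outputs for four sessions of a single merchant.
sessions = [[0.8, 0.2, 0.1], [0.1, 0.9, 0.0], [0.7, 0.1, 0.1], [0.0, 0.1, 1.0]]
types = [quantize(e) for e in sessions]
dist = buyer_type_distribution(sessions)
```

In a trained VQ-VAE the codebook is learned jointly with the encoder and decoder; the nearest-neighbor lookup shown here is only the inference-time assignment.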
Key facts
- SimPersona learns discrete buyer types from raw clickstreams
- LLM-based web agents often collapse to a single 'average buyer' policy
- Existing personalization uses hand-crafted prompt-based personas
- Hand-crafted personas are brittle, difficult to scale, and context-inefficient
- SimPersona uses a behavior-aware VQ-VAE to induce a discrete buyer-type space
- The framework captures statistical structure of real buyer behavior
- Each buyer type is mapped to a dedicated persona token in the LLM agent vocabulary
- The paper is available on arXiv with ID 2605.14205
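The mapping from buyer types to persona tokens can be sketched as follows. The token naming scheme (`<persona_k>`), the toy vocabulary, and the choice to prepend the token to the task text are assumptions made for illustration; the summary does not specify how the tokens are named or where they are placed.

```python
def persona_token(buyer_type: int) -> str:
    """Hypothetical naming scheme for the token of one buyer type."""
    return f"<persona_{buyer_type}>"

def extend_vocab(vocab: dict, num_types: int) -> dict:
    """Return a copy of `vocab` with one new token id per buyer type."""
    extended = dict(vocab)
    for k in range(num_types):
        token = persona_token(k)
        if token not in extended:
            extended[token] = len(extended)
    return extended

def build_agent_prompt(buyer_type: int, task: str) -> str:
    """Assumed conditioning: place the persona token in front of the task."""
    return f"{persona_token(buyer_type)} {task}"

# Toy vocabulary standing in for a real tokenizer's vocabulary.
base_vocab = {"<pad>": 0, "<eos>": 1, "buy": 2}
vocab = extend_vocab(base_vocab, num_types=3)
prompt = build_agent_prompt(1, "Find a laptop under $800.")
```

A single dedicated token costs one position in the context window, whereas a hand-written persona description can take many, which is one way to read the context-efficiency claim above.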
Entities
Institutions
- arXiv