ARTFEED — Contemporary Art Intelligence

SemaPop: Semantic-Persona Conditioned Population Synthesis Framework

publication · 2026-04-25

A study posted on arXiv introduces SemaPop, a semantic-conditioned, controllable framework for population synthesis. It uses large language models (LLMs) to derive persona representations from survey data, then encodes those personas as semantic embeddings that condition generation, so output can be controlled while still adhering to statistical constraints. The framework is implemented with a GAN-based architecture and uses marginal regularization to maintain distributional consistency. The work targets a shortcoming of existing data-driven methods, which rely mostly on unconditional generation, in applications such as transport planning and socio-economic analysis.
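As a rough illustration of embedding-based conditioning, the sketch below shows the common conditional-GAN pattern of concatenating a noise vector with a conditioning vector. The summary does not say how SemaPop feeds persona embeddings to its generator, so the function name `generator_input`, the noise dimension, and the toy embedding values are all assumptions made for illustration.

```python
import random
from typing import List


def generator_input(
    persona_embedding: List[float], noise_dim: int, rng: random.Random
) -> List[float]:
    """Build a conditional generator input: Gaussian noise followed by the embedding.

    Hypothetical helper: this is the generic conditional-GAN convention,
    not a confirmed detail of SemaPop.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    # The embedding is appended unchanged, so the same persona with fresh
    # noise yields different synthetic individuals under one condition.
    return noise + list(persona_embedding)


# Toy 3-dimensional persona embedding (made-up values).
toy_embedding = [0.2, -0.1, 0.7]
z = generator_input(toy_embedding, noise_dim=4, rng=random.Random(0))
```

Here `z` has seven entries: four noise values followed by the three embedding values. Resampling the noise while holding the embedding fixed is what would make generation controllable by persona in a setup like this.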

Key facts

  • SemaPop is a semantic-conditioned and controllable population synthesis framework.
  • It uses large language models (LLMs) to derive persona text from survey data.
  • Persona representations are encoded into semantic embeddings for conditioning.
  • The framework enables controllable generation under statistical constraints.
  • It uses a GAN-based architecture with marginal regularization.
  • The paper is available on arXiv as 2602.11569v2.
  • Population synthesis is essential for individual-level simulation in transport planning and socio-economic analysis.
  • Existing data-driven approaches predominantly rely on unconditional generation.
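To make the idea of marginal regularization concrete, here is a minimal sketch of one possible penalty: the total variation distance between the category frequencies in a generated batch and a target marginal. The summary does not specify which distance SemaPop uses, so the choice of total variation, the function names, and the two-category example are assumptions.

```python
from typing import List, Sequence


def batch_marginal(probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sample category probabilities into one batch-level marginal."""
    n = len(probs)
    k = len(probs[0])
    return [sum(row[j] for row in probs) / n for j in range(k)]


def marginal_penalty(
    probs: Sequence[Sequence[float]], target: Sequence[float]
) -> float:
    """Total variation distance between the batch marginal and the target marginal.

    Hypothetical regularizer: 0.0 means the batch matches the target exactly,
    1.0 means the two distributions do not overlap at all.
    """
    marginal = batch_marginal(probs)
    return 0.5 * sum(abs(m - t) for m, t in zip(marginal, target))


# Target: a 50/50 split over two categories (made-up numbers).
target = [0.5, 0.5]
balanced = [[1.0, 0.0], [0.0, 1.0]]   # batch matches the target
skewed = [[1.0, 0.0], [1.0, 0.0]]     # batch puts everyone in category 0
```

With these inputs, `marginal_penalty(balanced, target)` is 0.0 and `marginal_penalty(skewed, target)` is 0.5. In a GAN setting, a term of this kind would typically be added to the generator loss with a weighting coefficient, so that samples are pushed toward the known marginals as well as toward fooling the discriminator.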

Entities

Institutions

  • arXiv
