LLM-Guided Reward Shaping for Demographic Equity in Buildings

ai-technology · 2026-05-28

OccuReward is a framework that uses large language models (LLMs) to design reward functions for deep reinforcement learning (DRL) agents that manage building energy, with the goal of making occupant comfort more equitable across demographic groups. It introduces the Comfort Equity Index (CEI) as a feedback signal and applies iterative, equity-aware LLM reward shaping. A Soft Actor-Critic agent is deployed in CityLearn v2 with four occupant profiles derived from the ASHRAE Global Thermal Comfort Database II (13,440 votes), and the Gemini API generates the logic and weights of the reward function. The study examines whether LLM-driven reward design can reduce comfort disparities between demographic groups.
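
The summary does not give the CEI formula or the reward terms, so the following is only a minimal sketch of how an equity signal could enter a reward. Everything in it is an assumption made for illustration: the CEI definition (one minus the relative gap between the worst-off and best-off profile), the function names, the weight keys, and the profile labels.

```python
from statistics import mean


def comfort_equity_index(discomfort_by_profile):
    """Hypothetical CEI (not the paper's definition).

    Returns 1.0 when every profile has the same discomfort and
    approaches 0.0 as the gap between worst and best profile grows.
    Discomfort values are assumed to be non-negative.
    """
    values = list(discomfort_by_profile.values())
    worst, best = max(values), min(values)
    if worst == 0:
        return 1.0  # nobody is uncomfortable, so comfort is fully equal
    return 1.0 - (worst - best) / worst


def shaped_reward(energy_kwh, discomfort_by_profile, weights):
    """Hypothetical reward: penalize energy, mean discomfort, and inequity."""
    cei = comfort_equity_index(discomfort_by_profile)
    penalty = (
        weights["energy"] * energy_kwh
        + weights["comfort"] * mean(discomfort_by_profile.values())
        + weights["equity"] * (1.0 - cei)
    )
    return -penalty


# Illustrative values only; profile labels are placeholders.
discomfort = {"profile_a": 1.0, "profile_b": 1.0, "profile_c": 2.0, "profile_d": 0.0}
weights = {"energy": 1.0, "comfort": 1.0, "equity": 1.0}
reward = shaped_reward(2.0, discomfort, weights)
```

In a setup like this, the weights are the part an LLM would be asked to propose, and the CEI term is what lets the agent be penalized for comfort gaps rather than only for average discomfort.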

Key facts

OccuReward is a framework for LLM-guided reward shaping.
It targets demographic equity in grid-interactive buildings.
The Comfort Equity Index (CEI) is introduced as a novel feedback signal.
Four occupant profiles from ASHRAE Global Thermal Comfort Database II are used.
The database contains 13,440 votes.
A Soft Actor-Critic agent is deployed in CityLearn v2.
The Gemini API generates reward function logic and weights.
The study examines disparities in occupant comfort across demographics.
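
The iterative shaping loop can be pictured roughly as below. This is a sketch under stated assumptions, not the study's procedure: `propose_weights` is a hypothetical stand-in for the Gemini call (no real API is invoked), `evaluate` stands in for training the Soft Actor-Critic agent in CityLearn v2 and measuring the CEI, and the target value and step size are made up.

```python
def propose_weights(weights, cei, target_cei, step=1.0):
    """Stand-in for the LLM: raise the equity weight while CEI is below target."""
    updated = dict(weights)
    if cei < target_cei:
        updated["equity"] += step
    return updated


def refine(weights, evaluate, target_cei=0.9, max_rounds=5):
    """Alternate between evaluating the current weights and revising them.

    `evaluate` is any callable mapping a weights dict to a CEI in [0, 1].
    Returns the final weights and the (weights, cei) history per round.
    """
    history = []
    for _ in range(max_rounds):
        cei = evaluate(weights)
        history.append((dict(weights), cei))
        if cei >= target_cei:
            break  # equity target reached, stop revising
        weights = propose_weights(weights, cei, target_cei)
    return weights, history


def toy_evaluate(weights):
    """Toy environment: CEI improves linearly with the equity weight."""
    return min(1.0, weights["equity"] / 4)


final_weights, history = refine(
    {"energy": 1.0, "comfort": 1.0, "equity": 1.0}, toy_evaluate
)
```

The point of the structure is that the CEI measured after each round is fed back to whatever proposes the next set of weights, which is the role the summary assigns to the LLM.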
