Nash-MADDPG Improves Social Welfare in V2V Energy Trading by 61.6%

other · 2026-05-23

Nash-MADDPG is a multi-agent reinforcement learning framework that incorporates the Nash Bargaining Solution into Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to support vehicle-to-vehicle (V2V) energy trading among electric vehicles (EVs). It enables decentralized peer-to-peer energy exchange, which reduces reliance on the grid and lets EVs monetize surplus capacity. Nash bargaining sets an efficient bilateral price for each trade, and rewards based on proximity to the Nash-guided price steer agents toward optimal bargaining strategies. In an evaluation over 30 days of continuous operation, the framework improves social welfare by 61.6% and trading volume by 62.9% relative to a Double Auction baseline, while also achieving greater fairness. The study addresses the difficulty of coordinating self-interested EV agents with heterogeneous charging requirements and unpredictable schedules, and targets two weaknesses of existing methods: dependence on centralized optimization and limited fairness.
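
As a rough illustration of how Nash bargaining can fix a bilateral price, the sketch below assumes a simple linear model that the summary does not specify: the seller has a per-kWh reservation cost, the buyer has a per-kWh reservation value (for example, the grid price), and both receive zero utility if no trade occurs. Under those assumptions the price that maximizes the weighted Nash product has a closed form. The function names and the `seller_power` parameter are hypothetical, not taken from the paper.

```python
from typing import Optional


def nash_bargaining_price(
    seller_cost: float,
    buyer_value: float,
    seller_power: float = 0.5,
) -> Optional[float]:
    """Per-kWh price maximizing the weighted Nash product.

    Assumed model (not from the source): seller surplus is (p - seller_cost),
    buyer surplus is (buyer_value - p), and the disagreement payoff is zero.
    Maximizing (p - c)^a * (v - p)^(1 - a) over p gives p = a*v + (1 - a)*c.
    """
    if not 0.0 < seller_power < 1.0:
        raise ValueError("seller_power must lie strictly between 0 and 1")
    if buyer_value <= seller_cost:
        # No price leaves both sides better off than not trading.
        return None
    return seller_power * buyer_value + (1.0 - seller_power) * seller_cost


def nash_product(
    price: float,
    seller_cost: float,
    buyer_value: float,
    seller_power: float = 0.5,
) -> float:
    """Weighted Nash product at a given price (valid for cost <= price <= value)."""
    seller_surplus = price - seller_cost
    buyer_surplus = buyer_value - price
    return seller_surplus ** seller_power * buyer_surplus ** (1.0 - seller_power)
```

With equal bargaining power the price is the midpoint of the two reservation prices, so each side captures half of the trade surplus; raising `seller_power` moves the price toward the buyer's reservation value.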

Key facts

Nash-MADDPG integrates the Nash Bargaining Solution into Multi-Agent Deep Deterministic Policy Gradient (MADDPG)
Improves social welfare by 61.6% over Double Auction
Improves trading volume by 62.9% over Double Auction
Enables decentralized peer-to-peer energy exchange among EVs
Reduces grid dependency while monetizing surplus capacity
Nash bargaining determines efficient bilateral pricing
Rewards based on proximity to the Nash-guided price steer agent learning toward the bargained price
Evaluated over 30 days of continuous operation
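
The summary does not give the form of the price proximity reward, so the following is only one plausible shaping, under stated assumptions: a bonus that equals 1 when an agent's offered price matches the Nash price and falls linearly to 0 once the offer deviates by more than a tolerance band. The function names, `tolerance`, and `weight` are hypothetical illustrations, not the paper's definitions.

```python
def price_proximity_reward(
    offered_price: float,
    nash_price: float,
    tolerance: float,
) -> float:
    """Bonus in [0, 1]: 1 at the Nash price, 0 at or beyond `tolerance` away.

    Assumed linear shape; the source does not specify the actual function.
    """
    if tolerance <= 0.0:
        raise ValueError("tolerance must be positive")
    deviation = abs(offered_price - nash_price)
    return max(0.0, 1.0 - deviation / tolerance)


def shaped_reward(
    trade_utility: float,
    offered_price: float,
    nash_price: float,
    tolerance: float = 0.05,
    weight: float = 0.5,
) -> float:
    """Agent reward: its own trade utility plus a weighted proximity bonus."""
    bonus = price_proximity_reward(offered_price, nash_price, tolerance)
    return trade_utility + weight * bonus
```

In an MADDPG-style setup, a term like `shaped_reward` would be computed per agent at each trading step and fed to that agent's critic, so that offers near the bargained price are reinforced even when the raw trade utility is noisy.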

Sources

arXiv cs.AI — 2026-05-23