FPILOT: Inference-Time Optimization for RL Trading Agents
FPILOT (Financial Plugin Inference-time Learning for Optimal Trading) is a new framework that enhances reinforcement learning agents for portfolio management by optimizing their policy at inference time using price forecasts. Inspired by Model Predictive Control, FPILOT uses a predictive model to generate multi-step price trajectories without requiring iterative action-conditioned rollouts. At each decision step, the framework optimizes the policy against an imagined return objective derived from the predicted prices, then executes one trade step. It works as a plugin for any pre-trained agent and re-adapts the policy as the forecasts change.
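The decision loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from FPILOT itself: the policy is reduced to a vector of logits mapped to portfolio weights by a softmax, the imagined return is taken to be the sum of log portfolio returns along the forecast path with weights held fixed, and gradients are estimated by finite differences instead of backpropagation. The function names (`imagined_return`, `adapt_policy`, `trade_step`) and the forecast values are hypothetical.

```python
import math

def softmax(logits):
    """Map unconstrained logits to portfolio weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def imagined_return(logits, forecast):
    """Sum of log portfolio returns along a forecast price path.

    forecast[t][i] is the predicted price of asset i at step t (t = 0 is now).
    Holding the weights fixed over the horizon is a simplifying assumption.
    The prices do not depend on the actions, so no action-conditioned
    rollout is needed to evaluate this objective.
    """
    weights = softmax(logits)
    total = 0.0
    for prev, nxt in zip(forecast, forecast[1:]):
        growth = sum(w * n / p for w, n, p in zip(weights, nxt, prev))
        total += math.log(growth)
    return total

def adapt_policy(logits, forecast, steps=50, lr=0.5, eps=1e-5):
    """Gradient ascent on the imagined return (central finite differences)."""
    theta = list(logits)  # work on a copy; the pre-trained parameters stay intact
    for _ in range(steps):
        grad = []
        for i in range(len(theta)):
            up, down = theta[:], theta[:]
            up[i] += eps
            down[i] -= eps
            diff = imagined_return(up, forecast) - imagined_return(down, forecast)
            grad.append(diff / (2 * eps))
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

def trade_step(pretrained_logits, forecast):
    """Adapt the policy to the current forecast, then return one allocation."""
    adapted = adapt_policy(pretrained_logits, forecast)
    return softmax(adapted)

# Hypothetical data: asset 0 is forecast to rise, asset 1 to stay flat,
# asset 2 to fall. The "pre-trained" policy starts from equal weights.
pretrained = [0.0, 0.0, 0.0]
forecast = [
    [100.0, 50.0, 20.0],
    [101.0, 50.0, 19.8],
    [102.0, 50.0, 19.6],
    [103.0, 50.0, 19.4],
]
weights = trade_step(pretrained, forecast)
print(weights)
```

In this toy setting the adapted weights shift toward the asset with the rising forecast. In a receding-horizon loop, `trade_step` would be called again at the next decision step with a fresh forecast, starting from the pre-trained parameters each time.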
Key facts
- FPILOT stands for Financial Plugin Inference-time Learning for Optimal Trading
- It is a plugin inference-time optimization framework
- Inspired by Model Predictive Control (MPC)
- Uses a predictive model to generate multi-step price trajectories
- Optimizes the policy at inference time, before each trade step
- Compatible with any pre-trained agent
- Adapts policy to forecasted prices
- No iterative action-conditioned rollouts needed