ARTFEED — Contemporary Art Intelligence

Deep Reinforcement Learning for Autonomous Bearings-Only Tracking

other · 2026-05-06

A recent study published on arXiv presents a deep reinforcement learning approach to tracking a moving target using only bearing measurements. The framework formulates the observer's maneuver problem as a belief Markov decision process whose belief state is supplied by a cubature Kalman filter. A purpose-built reward function jointly optimizes the accuracy of the target position estimate and the consistency of the filter. A deep Q-network was trained over 50,000 episodes and evaluated in 5,000 Monte Carlo simulations, benchmarked against two existing methods: a perpendicular-to-bearing heuristic and D-optimal Fisher information maximization.
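The cubature Kalman filter that supplies the belief state can be sketched as a single bearing-only measurement update. This is a minimal illustration of the standard CKF update, not the paper's implementation; the state layout, variable names, and angle-wrapping handling are assumptions.

```python
import numpy as np

def ckf_update(x, P, z, R):
    """One cubature Kalman filter measurement update for a bearing
    observation z = atan2(py, px) + noise.

    Assumed (illustrative) state layout: x = [px, py, vx, vy].
    """
    n = x.size
    S = np.linalg.cholesky(P)
    # 2n cubature points: x ± sqrt(n) * columns of chol(P)
    xi = np.hstack([np.sqrt(n) * S, -np.sqrt(n) * S])
    pts = x[:, None] + xi                                 # shape (n, 2n)
    # Propagate cubature points through the bearing measurement model
    zi = np.arctan2(pts[1], pts[0])                       # shape (2n,)
    z_hat = zi.mean()                                     # predicted bearing
    Pzz = ((zi - z_hat) ** 2).mean() + R                  # innovation covariance
    Pxz = ((pts - x[:, None]) * (zi - z_hat)).mean(axis=1)  # cross-covariance
    K = Pxz / Pzz                                         # Kalman gain, shape (n,)
    # Wrap the angular innovation into (-pi, pi]
    innov = np.arctan2(np.sin(z - z_hat), np.cos(z - z_hat))
    x_new = x + K * innov
    P_new = P - np.outer(K, K) * Pzz
    return x_new, P_new
```

Each update shrinks the posterior covariance along the direction informed by the bearing, which is exactly the quantity the observer's maneuvers are chosen to improve.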

Key facts

  • Paper develops deep reinforcement learning observer control for bearings-only tracking.
  • Observer maneuver problem formulated as belief Markov decision process.
  • Belief state represented by cubature Kalman filter (CKF) posterior.
  • Reward function balances Euclidean distance and Mahalanobis distance.
  • Reward is geometric interpolation on Pareto front with β ∈ [0,1].
  • Policy implemented as deep Q-network (DQN) trained over 50,000 episodes.
  • Evaluated over 5,000 Monte Carlo episodes.
  • Compared against perpendicular-to-bearing heuristic and D-optimal Fisher information maximization.
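The reward bullet describes a geometric interpolation between the Euclidean and Mahalanobis distances along a Pareto front. One plausible reading, offered here as an assumption since the paper's exact functional form is not reproduced above, is a negative weighted geometric mean with β selecting the trade-off point:

```python
import numpy as np

def reward(est_pos, true_pos, P_pos, beta=0.5):
    """Sketch of a reward interpolating between estimation accuracy
    (Euclidean distance) and filter consistency (Mahalanobis distance).

    beta in [0, 1] picks the point on the trade-off: beta = 0 scores
    pure Euclidean error, beta = 1 pure Mahalanobis error. The exact
    form in the paper may differ; this is an illustrative assumption.
    """
    err = est_pos - true_pos
    d_e = np.linalg.norm(err)                          # Euclidean distance
    d_m = np.sqrt(err @ np.linalg.solve(P_pos, err))   # Mahalanobis distance
    # Negative weighted geometric mean of the two distances
    return -(d_e ** (1.0 - beta)) * (d_m ** beta)
```

A geometric (rather than arithmetic) combination keeps the reward sensitive to both objectives at once: it only approaches zero when both distances do.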

Entities

Institutions

  • arXiv

Sources