ARTFEED — Contemporary Art Intelligence

Data Selection Reformulated as Sequential Decision-Making

other · 2026-06-01

A new theoretical framework recasts data selection as a sequential decision-making problem in which the optimal selection sequence is obtained by dynamic programming. Data values are interpreted as encodings of that optimal sequence, which places existing methods such as Data Shapley within the same framework as myopic linear approximations. The work analyzes how selection optimality degrades with the curvature of the utility function under submodularity, which explains where current approximations fall short. To bridge theory and practice, it introduces an efficient bipartite graph-based surrogate that preserves submodular structure, allowing scalable greedy selection with provable guarantees. The approach is validated experimentally on classical tasks.
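
To make the contrast between a dynamic-programming sequence and a Shapley-style ranking concrete, here is a minimal Python sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the coverage-style utility, the three data points, and in particular the objective, which scores an ordering by the sum of the utilities of all its prefixes. The names COVERS, utility, best_suffix and shapley_values are hypothetical.

```python
from functools import lru_cache
from itertools import permutations

# Hypothetical toy data: points 0 and 1 are duplicates, point 2 is distinct.
COVERS = {0: {"a", "b", "c"}, 1: {"a", "b", "c"}, 2: {"d", "e"}}
POINTS = tuple(sorted(COVERS))


def utility(subset):
    """Assumed utility: number of distinct items covered by the subset."""
    covered = set()
    for i in subset:
        covered |= COVERS[i]
    return len(covered)


@lru_cache(maxsize=None)
def best_suffix(selected):
    """Dynamic program over subsets (selected must be a frozenset).

    Returns (total, order): the best achievable sum of prefix utilities
    for the points still to be added, and the order that achieves it.
    """
    remaining = [i for i in POINTS if i not in selected]
    if not remaining:
        return 0, ()
    best = None
    for i in remaining:
        nxt = selected | {i}  # still a frozenset, so it stays hashable
        tail_value, tail_order = best_suffix(nxt)
        cand = (utility(nxt) + tail_value, (i,) + tail_order)
        if best is None or cand[0] > best[0]:
            best = cand
    return best


def shapley_values():
    """Exact Data Shapley: average marginal gain over all orderings."""
    totals = {i: 0.0 for i in POINTS}
    perms = list(permutations(POINTS))
    for order in perms:
        seen = frozenset()
        for i in order:
            totals[i] += utility(seen | {i}) - utility(seen)
            seen = seen | {i}
    return {i: totals[i] / len(perms) for i in POINTS}


if __name__ == "__main__":
    total, dp_order = best_suffix(frozenset())
    values = shapley_values()
    shapley_order = tuple(sorted(POINTS, key=lambda i: -values[i]))
    print("DP order:", dp_order, "total:", total)
    print("Shapley values:", values, "ranking:", shapley_order)
```

In this toy case the duplicated points split their credit, so a ranking by Shapley value puts point 2 first, while the dynamic program starts with point 0 because it has the highest utility on its own. The exhaustive recursion visits every subset and is only practical for a handful of points.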

Key facts

  • Data selection is reformulated as a sequential decision-making problem
  • Optimal selection sequence arises from dynamic programming
  • Data values are encodings of the optimal sequence
  • Data Shapley is reinterpreted as a myopic linear approximation
  • Selection optimality degrades with utility curvature under submodularity
  • A bipartite graph-based surrogate enables scalable greedy selection
  • The surrogate preserves submodular structure with provable guarantees
  • Experiments are conducted on classical tasks
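
The summary does not say how the bipartite surrogate is defined, so the sketch below substitutes a standard stand-in: a facility-location objective on a weighted bipartite graph between candidate training points and target items. This objective is monotone submodular for non-negative weights, so greedy selection under a cardinality budget carries the classical (1 - 1/e) guarantee. The weight matrix and the names facility_location and greedy_select are hypothetical, and the paper's own surrogate may differ.

```python
def facility_location(weights, selected):
    """Surrogate value of a selection.

    weights[i][j] is the assumed non-negative edge weight between
    candidate i and target j; each target is credited with its
    best-matching selected candidate.
    """
    if not selected:
        return 0.0
    n_targets = len(weights[0])
    return sum(max(weights[i][j] for i in selected) for j in range(n_targets))


def greedy_select(weights, budget):
    """Pick up to `budget` candidates, each time taking the largest marginal gain."""
    selected = []
    best_cover = [0.0] * len(weights[0])  # best weight seen so far per target
    for _ in range(min(budget, len(weights))):
        best_gain, best_i = -1.0, None
        for i, row in enumerate(weights):
            if i in selected:
                continue
            # Marginal gain: improvement over the current best weight per target.
            gain = sum(max(w - c, 0.0) for w, c in zip(row, best_cover))
            if gain > best_gain:
                best_gain, best_i = gain, i
        selected.append(best_i)
        best_cover = [max(w, c) for w, c in zip(weights[best_i], best_cover)]
    return selected


# Hypothetical weights: 3 candidates (rows) by 4 targets (columns).
W = [
    [1.0, 1.0, 0.0, 0.0],  # candidate 0
    [1.0, 0.5, 0.0, 0.0],  # candidate 1, largely redundant with candidate 0
    [0.0, 0.0, 1.0, 0.5],  # candidate 2, covers different targets
]

if __name__ == "__main__":
    chosen = greedy_select(W, budget=2)
    print(chosen, facility_location(W, chosen))
```

Tracking the best weight per target keeps each marginal-gain evaluation linear in the number of targets, which is what makes the greedy loop cheap on sparse graphs. Once candidate 0 is chosen, candidate 1 adds nothing, which is the diminishing-returns behavior that submodularity describes.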
