Truthful Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

ai-technology · 2026-05-26

A new arXiv paper (arXiv:2605.24052) proposes a truthful online preference aggregation mechanism for fine-tuning large language models (LLMs) with feedback collected through mobile crowdsourcing. It targets strategic misreporting: workers may distort their feedback to gain more influence over the aggregate or a larger payment. According to the paper, existing methods such as expectation-maximization (EM) based weight estimation fail to identify the most accurate worker, so their regret grows linearly over time. The authors model the multi-agent learning process between the platform and the workers as a dynamic Bayesian game, then introduce an online weighted aggregation mechanism that adjusts worker weights dynamically to ensure truthfulness and improve learning efficiency.
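
To make the linear-regret claim concrete, here is one common way to write regret against the best single worker. The symbols p^*, p_t, and \Delta are illustrative assumptions and are not taken from the paper, whose exact definition may differ.

```latex
% Illustrative notation (assumed, not the paper's):
%   p^*  : accuracy of the most accurate worker
%   p_t  : accuracy of the platform's aggregate decision in round t
R(T) = \sum_{t=1}^{T} \left( p^{*} - p_t \right)

% If the aggregate never catches up with the best worker, i.e.
% p^{*} - p_t \ge \Delta > 0 in every round, the gap accumulates:
R(T) \ge \Delta \, T
```

Under this reading, a method that keeps a constant accuracy gap to the best worker has regret that grows in proportion to T, which is what "linear regret" means.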

Key facts

arXiv paper 2605.24052 proposes truthful online preference aggregation for LLM fine-tuning in mobile crowdsourcing.
Workers may strategically misreport feedback to maximize influence or payment.
Existing EM-based weight estimation fails to identify the most accurate worker, resulting in linear regret O(T).
A dynamic Bayesian game models the multi-agent online learning process.
A novel online weighted aggregation mechanism is proposed to ensure truthfulness.
The mechanism dynamically adjusts weights based on worker accuracy.
The approach aims to improve LLM alignment with human feedback in mobile applications like navigation.
The paper is listed on arXiv with the announcement type "cross", meaning it is a cross-listed submission.
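
The idea of adjusting worker weights by accuracy can be illustrated with a textbook weighted-majority (multiplicative weights) scheme. This is a minimal sketch, not the paper's mechanism: it assumes workers report truthfully, that the platform sees the correct label after each round, and that preferences are binary. The function names, the learning rate `eta`, and the simulated accuracies are all hypothetical.

```python
import random


def aggregate(reports, weights):
    """Weighted majority vote over binary preference reports (0 or 1)."""
    score_for_1 = sum(w for r, w in zip(reports, weights) if r == 1)
    # Ties are broken in favour of label 1.
    return 1 if score_for_1 >= sum(weights) / 2 else 0


def update_weights(reports, weights, truth, eta=0.5):
    """Shrink the weight of every worker whose report was wrong, then renormalize."""
    scaled = [w * (1 - eta) if r != truth else w for r, w in zip(reports, weights)]
    total = sum(scaled)
    return [w / total for w in scaled]


def simulate(accuracies, rounds=500, eta=0.5, seed=0):
    """Run the scheme on simulated workers with fixed (assumed) accuracies."""
    rng = random.Random(seed)
    n = len(accuracies)
    weights = [1.0 / n] * n
    mistakes = 0
    for _ in range(rounds):
        truth = rng.randint(0, 1)
        # Each worker reports the true label with probability equal to its accuracy.
        reports = [truth if rng.random() < a else 1 - truth for a in accuracies]
        if aggregate(reports, weights) != truth:
            mistakes += 1
        weights = update_weights(reports, weights, truth, eta)
    return weights, mistakes


if __name__ == "__main__":
    final_weights, mistakes = simulate([0.9, 0.6, 0.55])
    print("final weights:", final_weights)
    print("aggregate mistakes:", mistakes)
```

In this sketch the weight mass concentrates on the worker who is wrong least often. It does nothing to discourage strategic misreporting, which is the gap the paper's game-theoretic mechanism is described as addressing.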
