arXiv Paper Identifies Synchronization Bias in Federated Learning Systems
On April 26, 2024, a technical paper was released on arXiv that examines a flaw in Federated Learning (FL) systems that use the Probabilistic Synchronous Parallel (PSP) method. PSP aims to reduce synchronization delays by sampling a subset of nodes in each round. However, the authors contend that PSP presumes device behavior is static and independent, which skews synchronization so that more reliable nodes dominate the training process. As a result, devices that are often unavailable risk having their data excluded. The study describes a scenario in which node availability correlates with data distribution, so that certain data classes can be persistently under-represented and the model learns the associated features poorly. The paper, arXiv:2604.16090v1, emphasizes how correlated device failures can distort the learning process.
Key facts
- The paper was published on arXiv on April 26, 2024.
- It analyzes the Probabilistic Synchronous Parallel (PSP) technique in Federated Learning.
- PSP aims to reduce synchronization bottlenecks by sampling a subset of nodes per round.
- FL systems often involve unreliable edge devices due to mobility and power constraints.
- PSP assumes device behavior is static and independent, which the paper identifies as a key limitation.
- This assumption can lead to unfair synchronization, with highly available nodes dominating training.
- If data distribution and node availability are correlated, certain classes may be persistently under-represented.
- This under-representation can cause inefficient or ineffective learning of specific model features.
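The availability bias described in these points can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's method or experimental setup: the function name `simulate_participation`, the node counts, the availability probabilities, and the one-class-per-node simplification are all hypothetical choices made for illustration. Each round, only nodes that are online can be sampled, and a fixed-size subset is drawn from them.

```python
import random
from collections import Counter


def simulate_participation(availability, labels, sample_size, rounds, seed=0):
    """Count how often each data class contributes under per-round sampling.

    availability: dict of node_id -> probability the node is online in a round
    labels: dict of node_id -> the single class that node holds (a simplification)
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(rounds):
        # Only nodes that happen to be online this round can be sampled.
        online = [n for n, p in availability.items() if rng.random() < p]
        # Draw a subset of the online nodes, loosely mimicking subset sampling.
        chosen = rng.sample(online, min(sample_size, len(online)))
        for n in chosen:
            counts[labels[n]] += 1
    return counts


# Hypothetical setup: 10 nodes hold class "A" and are online 90% of the time;
# 10 nodes hold class "B" and are online only 10% of the time.
availability = {i: 0.9 for i in range(10)}
availability.update({i: 0.1 for i in range(10, 20)})
labels = {i: ("A" if i < 10 else "B") for i in range(20)}

counts = simulate_participation(availability, labels, sample_size=5, rounds=2000)
total = sum(counts.values())
share_b = counts["B"] / total
print(f"class B share of sampled updates: {share_b:.2%}")
```

Although class "B" is held by half of the nodes in this toy setup, its share of sampled updates should come out well below one half, because availability and data class are correlated. A real FL system would involve model updates and non-trivial data mixtures per node; the sketch only tracks which classes get to participate.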
Entities
Institutions
- arXiv