Self-Supervised and Reinforcement Learning for Android Malware Concept Drift
A recent arXiv preprint (2605.24294) proposes a framework that combines self-supervised and reinforcement learning to adapt Android malware detectors to concept drift after deployment. The method treats deployment-time maintenance as a sequential decision problem. During initialization it learns a stable latent representation through self-supervised learning; it then freezes the encoder, measures latent drift in that fixed representation space, and uses a trainable adapter and classification head for lightweight downstream adaptation. A proximal policy optimization (PPO) controller selects low-cost maintenance actions based on the detector's state, which includes utility, retention, drift signals, and update costs. Evaluated on emulator and real Android malware datasets with static and dynamic features under a causal deployment-style protocol, the RL controller shows potential for reducing retraining costs.
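To make the idea of measuring drift in a fixed representation space concrete, here is a minimal sketch. The summary does not say which drift statistic the framework uses, so the distance between window means below is an assumption chosen for simplicity, and the names `mean_vector` and `latent_drift` are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def latent_drift(reference_latents, current_latents):
    """Euclidean distance between the mean embeddings of two windows.

    Assumption: both windows were embedded by the same frozen encoder,
    so the two means live in one fixed representation space and their
    distance is comparable over time.
    """
    ref_mean = mean_vector(reference_latents)
    cur_mean = mean_vector(current_latents)
    return math.sqrt(sum((r - c) ** 2 for r, c in zip(ref_mean, cur_mean)))


# Toy embeddings standing in for encoder outputs (illustrative values only).
reference = [[0.0, 0.0], [2.0, 0.0]]  # window mean is (1, 0)
shifted = [[4.0, 4.0], [4.0, 4.0]]    # window mean is (4, 4)
drift = latent_drift(reference, shifted)  # distance from (1, 0) to (4, 4)
```

Because the encoder is frozen, a growing value here reflects a change in the incoming apps rather than a change in the representation itself, which is what makes the signal usable as controller input.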
Key facts
- Framework uses self-supervised and reinforcement learning for concept drift adaptation
- Models deployment-time maintenance as a sequential decision problem
- Learns stable latent representation via self-supervised learning during initialization
- Freezes encoder and measures latent drift in fixed representation space
- Uses trainable adapter and classification head for lightweight downstream adaptation
- Proximal policy optimization controller selects low-cost maintenance actions
- Evaluated on emulator and real Android malware datasets with static and dynamic features
- Causal deployment-style protocol used for evaluation
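The sequential decision framing can be sketched as a state, a set of maintenance actions with costs, and a reward that trades performance against update cost. Everything specific below is an assumption for illustration: the action names, the cost values, the reward shape, and the thresholds are hypothetical, and `threshold_policy` is a hand-written stand-in, not the learned PPO controller described in the summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectorState:
    """Controller observation; fields mirror the state described in the summary."""
    utility: float      # detection quality on recent data (metric is an assumption)
    retention: float    # quality retained on older data
    drift: float        # latent drift signal from the frozen encoder space
    update_cost: float  # cost of the most recent update


# Hypothetical maintenance actions and relative costs.
ACTION_COSTS = {
    "no_op": 0.0,
    "update_head": 0.1,
    "update_adapter_and_head": 0.3,
}


def reward(prev, new, action, cost_weight=1.0):
    """Change in utility and retention, minus the weighted cost of the action."""
    gain = (new.utility - prev.utility) + (new.retention - prev.retention)
    return gain - cost_weight * ACTION_COSTS[action]


def threshold_policy(state, drift_low=0.2, drift_high=0.6):
    """Rule-based placeholder: pick a more expensive update as drift grows."""
    if state.drift < drift_low:
        return "no_op"
    if state.drift < drift_high:
        return "update_head"
    return "update_adapter_and_head"


before = DetectorState(utility=0.7, retention=0.8, drift=0.4, update_cost=0.0)
action = threshold_policy(before)
after = DetectorState(utility=0.8, retention=0.8, drift=0.1, update_cost=ACTION_COSTS[action])
step_reward = reward(before, after, action)
```

A learned controller would replace `threshold_policy` with a policy trained to maximize the cumulative reward over the deployment timeline, which is where an algorithm such as PPO would fit.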
Entities
Institutions
- arXiv