Self-Supervised and Reinforcement Learning for Android Malware Concept Drift

other · 2026-05-26

A framework described in an arXiv preprint (2605.24294) uses self-supervised and reinforcement learning to adapt Android malware detectors to concept drift after deployment. It treats maintenance as a sequential decision problem. During initialization, self-supervised learning produces a stable latent representation; the encoder is then frozen, latent drift is measured in that fixed representation space, and a trainable adapter with a classification head handles lightweight downstream adaptation. A proximal policy optimization (PPO) controller selects low-cost maintenance actions based on the detector's state, which includes utility, retention, drift signals, and update cost. The framework was evaluated on emulator and real Android malware datasets with static and dynamic features under a causal deployment-style protocol, and the RL controller shows potential for reducing retraining cost.
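
The sequential decision framing can be sketched as a state, an action set, and a reward that trades quality against update cost. This is a minimal illustration and not the paper's implementation: the action names, their relative costs, the reward form, the `cost_weight` parameter, and the drift thresholds are all assumptions, and `threshold_policy` is a hand-written stand-in for the learned PPO controller.

```python
from dataclasses import dataclass

# Hypothetical action set with relative costs; the summary does not list them.
ACTION_COSTS = {"no_op": 0.0, "adapt_head": 0.2, "full_retrain": 1.0}


@dataclass(frozen=True)
class DetectorState:
    """The four state components named in the summary."""
    utility: float      # detection quality on recent data, e.g. F1 in [0, 1]
    retention: float    # quality retained on older data, in [0, 1]
    drift: float        # drift signal measured in the frozen latent space
    update_cost: float  # cost spent on updates so far


def reward(prev: DetectorState, new: DetectorState, action: str,
           cost_weight: float = 0.5) -> float:
    """Utility and retention gains minus a weighted action cost (assumed form)."""
    gain = (new.utility - prev.utility) + (new.retention - prev.retention)
    return gain - cost_weight * ACTION_COSTS[action]


def threshold_policy(state: DetectorState) -> str:
    """Stand-in for the learned controller: fixed drift thresholds, NOT PPO."""
    if state.drift < 0.1:
        return "no_op"
    if state.drift < 0.5:
        return "adapt_head"
    return "full_retrain"
```

A learned controller would replace `threshold_policy` with a policy trained on this reward, so that the thresholds and the choice between cheap and expensive updates come from experience instead of being set by hand.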

Key facts

Framework uses self-supervised and reinforcement learning for concept drift adaptation
Models deployment-time maintenance as a sequential decision problem
Learns stable latent representation via self-supervised learning during initialization
Freezes encoder and measures latent drift in fixed representation space
Uses trainable adapter and classification head for lightweight downstream adaptation
Proximal policy optimization controller selects low-cost maintenance actions
Evaluated on emulator and real Android malware datasets with static and dynamic features
Causal deployment-style protocol used for evaluation
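
Measuring drift in a fixed representation space, as listed above, can be sketched as follows. The summary does not say which drift metric the framework uses, so the centroid distance here is an assumption, and the tiny linear map `FROZEN_WEIGHTS` is only a placeholder for a frozen self-supervised encoder. Windows are scored in chronological order, and each score uses only the initialization-time reference and the window itself, in the spirit of a causal deployment-style protocol.

```python
import math
from typing import List, Sequence

Vector = List[float]

# Placeholder for the frozen encoder: a fixed linear map that is never updated.
# The weights are arbitrary illustrative values.
FROZEN_WEIGHTS = [[0.5, -0.25, 0.0], [0.1, 0.3, -0.2]]


def encode(x: Sequence[float]) -> Vector:
    """Project a 3-feature sample into the 2-dimensional latent space."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in FROZEN_WEIGHTS]


def centroid(batch: Sequence[Sequence[float]]) -> Vector:
    """Mean latent vector of a batch of samples."""
    latents = [encode(x) for x in batch]
    return [sum(col) / len(latents) for col in zip(*latents)]


def latent_drift(reference: Sequence[Sequence[float]],
                 current: Sequence[Sequence[float]]) -> float:
    """Euclidean distance between latent centroids (assumed drift metric)."""
    return math.dist(centroid(reference), centroid(current))


def drift_over_time(reference: Sequence[Sequence[float]],
                    windows: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Score time-ordered windows; no score depends on a later window."""
    return [latent_drift(reference, window) for window in windows]
```

Because the encoder never changes, scores from different time windows are directly comparable, which is what lets a controller treat the drift signal as a stable input.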
