PrivSTRUCT: A Framework for Untangling Data Purpose Compliance in Privacy Policies
A new research paper presents PrivSTRUCT, an encoder-decoder framework for analyzing the logical structure of privacy policies. It targets a common failure of automated analysis: conflating distinct data practices. Whereas existing methods treat a policy as linear text, PrivSTRUCT uses structural cues from section headings to associate sensitive data items with their purposes. Benchmarked against PoliGrapher, described as the leading tool, PrivSTRUCT extracts more than twice as many data item and purpose excerpts while preserving the structure defined by developers. Applied to a dataset of 3,756 Android apps from the Google Play Store, the framework reveals a transparency gap: the probability of developers overstating a data purpose is 20.4% higher for first-party collection and 9% higher for third-party collection. The research underscores the need for better compliance verification of app privacy practices.
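To make the idea of heading-scoped association concrete, the following is a minimal sketch, not PrivSTRUCT's actual encoder-decoder model. It assumes Markdown-style headings and uses small hypothetical keyword lists (`DATA_ITEMS`, `PURPOSES`) in place of a learned extractor. It only illustrates why pairing data items and purposes within the same section avoids mixing up unrelated practices.

```python
import re
from collections import defaultdict

# Hypothetical keyword lexicons, for illustration only.
DATA_ITEMS = ("email address", "location", "contacts")
PURPOSES = ("advertising", "analytics", "account management")

# Assumption: headings are written Markdown-style ("# Title").
HEADING = re.compile(r"^#+\s*(.+)$")


def split_by_heading(policy_text):
    """Group non-empty policy lines under the most recent heading."""
    sections = defaultdict(list)
    current = "(no heading)"
    for line in policy_text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if match:
            current = match.group(1).strip()
        elif stripped:
            sections[current].append(stripped)
    return dict(sections)


def pair_items_with_purposes(policy_text):
    """Pair data items and purposes only when they share a section."""
    pairs = set()
    for heading, lines in split_by_heading(policy_text).items():
        body = " ".join(lines).lower()
        items = [d for d in DATA_ITEMS if d in body]
        purposes = [p for p in PURPOSES if p in body]
        pairs.update((heading, d, p) for d in items for p in purposes)
    return pairs


POLICY = """\
# Information we collect
We collect your email address for account management.

# Third-party sharing
We share your location with partners for advertising.
"""

PAIRS = pair_items_with_purposes(POLICY)
for heading, item, purpose in sorted(PAIRS):
    print(f"{heading}: {item} -> {purpose}")
```

Treating the same policy as one flat block of text would also pair "email address" with "advertising", which the policy never states. Scoping the match to each section keeps the first-party and third-party practices separate.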
Key facts
- PrivSTRUCT is an encoder-decoder framework for privacy policy analysis.
- It uses structural cues from section headings to untangle data practices.
- Benchmarked against PoliGrapher, it extracts over twice as many data item and purpose excerpts.
- It was applied to 3,756 Android apps from the Google Play Store.
- For first-party collection, the probability of developers overstating a data purpose is 20.4% higher.
- For third-party collection, the probability of developers overstating a data purpose is 9% higher.
- The research is published on arXiv with ID 2604.22157.
- The paper addresses the transparency gap in privacy policy compliance.
Entities
Institutions
- Google Play Store
- arXiv