Fairness of Classifiers with Feature Constraints
A new arXiv paper (2605.00592) proposes a definition of fairness for machine learning classifiers that accounts for constraints between features. The authors argue that a decision is fair if it has a fair explanation, that is, a prime-implicant reason that contains no protected features and is computed with the constraints taken into account. They show that ignoring constraints can change fairness judgments even when no constraint links a protected feature to an unprotected one. Three classifier-level fairness definitions are introduced: all decisions have only fair explanations, at least one fair explanation exists, or changing explanations affects fairness.
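The effect of constraints on explanations can be illustrated with a small brute-force sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three Boolean features, the toy classifier `classify`, the constraint `senior_implies_experienced`, and the helper names. It enumerates subset-minimal sufficient reasons for one decision, first ignoring the constraint and then respecting it.

```python
from itertools import combinations, product

# Hypothetical Boolean features; "gender" is the only protected one.
FEATURES = ("gender", "senior", "experienced")
PROTECTED = {"gender"}


def classify(z):
    # Toy classifier (illustrative only, not from the paper).
    return z["senior"] and (z["gender"] or z["experienced"])


def senior_implies_experienced(z):
    # Assumed constraint between two unprotected features.
    return (not z["senior"]) or z["experienced"]


def no_constraint(z):
    # Treats every combination of feature values as possible.
    return True


def is_sufficient(subset, x, feasible):
    # True if every feasible instance that agrees with x on `subset`
    # receives the same decision as x.
    target = classify(x)
    for values in product([False, True], repeat=len(FEATURES)):
        z = dict(zip(FEATURES, values))
        if not feasible(z):
            continue
        if all(z[f] == x[f] for f in subset) and classify(z) != target:
            return False
    return True


def prime_implicant_reasons(x, feasible):
    # Enumerate subsets by increasing size and skip supersets of reasons
    # already found, so every reported reason is subset-minimal.
    reasons = []
    for size in range(len(FEATURES) + 1):
        for subset in combinations(FEATURES, size):
            candidate = frozenset(subset)
            if any(r <= candidate for r in reasons):
                continue
            if is_sufficient(candidate, x, feasible):
                reasons.append(candidate)
    return reasons


def is_fair(reason):
    # A reason is fair if it mentions no protected feature.
    return reason.isdisjoint(PROTECTED)


applicant = {"gender": True, "senior": True, "experienced": True}

without = prime_implicant_reasons(applicant, no_constraint)
with_c = prime_implicant_reasons(applicant, senior_implies_experienced)

for label, reasons in (("ignoring constraint", without),
                       ("with constraint", with_c)):
    for reason in reasons:
        print(label, sorted(reason), "fair" if is_fair(reason) else "unfair")
```

In this toy setting, ignoring the constraint yields two minimal reasons, one of which mentions `gender`. With the constraint in place the only minimal reason is `{senior}`, which is fair, even though the constraint relates two unprotected features only. Brute-force enumeration is exponential in the number of features, so this is only suitable for tiny examples.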
Key facts
- arXiv paper 2605.00592
- Proposes fairness definition based on prime-implicant explanations
- Protected features such as gender should not appear in fair explanations
- Constraints between features can obscure dependencies
- Ignoring constraints can change fairness even without protected-unprotected feature constraints
- Three classifier-level definitions: all decisions have only fair explanations, at least one fair explanation exists, or changing explanations affects fairness
- Starting point is the commonly accepted view that a decision should not depend on protected features
- Prime-implicant reasons used for explanation
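One possible way to write down the notions listed above is sketched here. The notation is assumed for illustration and may differ from the paper's: a classifier \kappa, a constraint set C, an instance x viewed as a set of feature-value pairs, a set P of protected features, and feat(t) for the features mentioned in a term t. The sketch reads "at least one fair explanation" as holding for each decision, which is an interpretation, and it leaves out the third classifier-level definition because the summary does not state it precisely.

```latex
% Assumed notation, not taken from the paper. Requires amsmath and amssymb.
\begin{align*}
\mathrm{Reason}_C(t, x) &\iff t \subseteq x \;\wedge\; \forall z \models C:\; \bigl(t \subseteq z \Rightarrow \kappa(z) = \kappa(x)\bigr) \\
\mathrm{PI}_C(t, x) &\iff \mathrm{Reason}_C(t, x) \;\wedge\; \nexists\, t' \subsetneq t:\; \mathrm{Reason}_C(t', x) \\
\mathrm{Fair}(t) &\iff \mathrm{feat}(t) \cap P = \emptyset \\
\text{decision on } x \text{ is fair} &\iff \exists t:\; \mathrm{PI}_C(t, x) \wedge \mathrm{Fair}(t) \\
\text{all explanations fair:} &\quad \forall x \models C\;\; \forall t:\; \mathrm{PI}_C(t, x) \Rightarrow \mathrm{Fair}(t) \\
\text{some explanation fair:} &\quad \forall x \models C\;\; \exists t:\; \mathrm{PI}_C(t, x) \wedge \mathrm{Fair}(t)
\end{align*}
```

Ignoring constraints corresponds to dropping the condition z \models C in the first line, which can change which terms count as reasons and therefore which explanations are minimal.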
Entities
Institutions
- arXiv