Fairness of Classifiers with Feature Constraints

publication · 2026-05-04

A new arXiv paper (2605.00592) proposes a definition of fairness for machine learning classifiers that accounts for constraints between features. The authors argue that a decision is fair if it has a fair explanation: a prime-implicant reason that contains no protected features, computed with the feature constraints taken into account. They show that ignoring constraints can change fairness judgments even when no constraint links a protected feature to an unprotected one. The paper introduces three classifier-level fairness definitions, based on whether all decisions have only fair explanations, whether at least one fair explanation exists, or whether changing explanations affects fairness.

Key facts

arXiv paper 2605.00592
Proposes fairness definition based on prime-implicant explanations
Protected features such as gender should not appear in fair explanations
Constraints between features can obscure a decision's dependence on other features
Ignoring constraints can change fairness judgments even when no constraint links protected and unprotected features
Three classifier-level definitions: all explanations fair, at least one fair explanation, or changing explanations
Commonly accepted notion: a decision should not depend on protected features
Prime-implicant reasons used for explanation
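
To make the role of constraints concrete, here is a minimal brute-force sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: the three Boolean features, the toy classifier, the constraint "employed implies has_income", and all function names are hypothetical and chosen only for this example. A prime-implicant explanation is taken to be a subset-minimal set of features whose values on the instance fix the decision over every feasible instance. The sketch checks the per-decision versions of the first two criteria (all explanations fair, at least one fair explanation); the third criterion is not modeled because the summary does not specify it precisely.

```python
from itertools import combinations, product

# Hypothetical toy setup, not taken from the paper.
FEATURES = ("gender", "employed", "has_income")
PROTECTED = {"gender"}

def classifier(x):
    # Toy decision rule (assumption): approve if employed and
    # (has an income or gender is True).
    return x["employed"] and (x["has_income"] or x["gender"])

def constraint(x):
    # Toy constraint among unprotected features only:
    # employed implies has_income. Gender is unconstrained.
    return (not x["employed"]) or x["has_income"]

def no_constraint(x):
    # Treat every combination of feature values as possible.
    return True

def instances(feasible):
    # Enumerate all Boolean instances that satisfy the constraint.
    for values in product([False, True], repeat=len(FEATURES)):
        x = dict(zip(FEATURES, values))
        if feasible(x):
            yield x

def is_sufficient(subset, x, feasible):
    # A subset is sufficient if every feasible instance that agrees
    # with x on the subset receives the same decision as x.
    target = classifier(x)
    return all(classifier(y) == target
               for y in instances(feasible)
               if all(y[f] == x[f] for f in subset))

def prime_implicant_explanations(x, feasible):
    # Enumerate subsets by increasing size; a sufficient subset is
    # minimal exactly when it contains no previously found explanation.
    found = []
    for size in range(len(FEATURES) + 1):
        for subset in combinations(FEATURES, size):
            s = frozenset(subset)
            if any(p <= s for p in found):
                continue
            if is_sufficient(s, x, feasible):
                found.append(s)
    return found

def is_fair(explanation):
    # An explanation is fair if it mentions no protected feature.
    return not (explanation & PROTECTED)

x = {"gender": True, "employed": True, "has_income": True}
for label, feasible in (("with constraint", constraint),
                        ("ignoring constraint", no_constraint)):
    expls = prime_implicant_explanations(x, feasible)
    print(label, [sorted(e) for e in expls],
          "all fair:", all(is_fair(e) for e in expls),
          "some fair:", any(is_fair(e) for e in expls))
```

In this toy case the constraint never mentions gender. With the constraint, the only explanation is {employed}, so every explanation is fair. Ignoring it yields two explanations, {gender, employed} and {employed, has_income}, so the "all explanations fair" judgment flips while "at least one fair explanation" still holds. Brute-force enumeration is exponential in the number of features and is only meant for small illustrations.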

