SciFACE Framework Enables Controllable Diversity in Paper Recommendations Through Facet-Aware Reranking
SciFACE (Scientific Faceted Cross-Encoder) is a reranking framework designed to address the shortcomings of existing paper recommendation tools by modeling two facets separately: Background (the problem addressed) and Method (the solution approach). It was trained on 5,891 real seed-candidate paper pairs labeled by GPT-4o-mini against facet-specific criteria, and the labels were validated against human judgments. On the CSFCube benchmark, SciFACE reached an NDCG@20 of 70.63 for Background recommendations, 5.9 points above SPECTER, and 49.06 for Method recommendations, 31.1 points above SPECTER. Compared with FaBLE without citation pre-training, SciFACE improved Method NDCG@20 by 4.1 points while using far fewer training examples: 5,891 labeled pairs versus 40,000 synthetic augmentations. The paper, arXiv:2604.16329v1, points to the data efficiency of high-quality, grounded facet labels compared with synthetic ones. In practice, the framework lets users specify why recommended papers should resemble the seed paper (same problem or same method) instead of receiving a single blended similarity score.
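To make the idea of facet-controlled reranking concrete, here is a minimal sketch of what such an interface could look like. It is not the paper's architecture: the per-facet text fields, the `rerank` function, and the `token_overlap` scorer (a toy stand-in for a trained cross-encoder) are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical scorer type: (seed_text, candidate_text) -> relevance score.
# In a real system this role would be played by a trained cross-encoder.
Scorer = Callable[[str, str], float]


def token_overlap(a: str, b: str) -> float:
    """Toy stand-in scorer: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rerank(
    seed: Dict[str, str],
    candidates: List[Dict[str, str]],
    facet: str,
    scorer: Scorer = token_overlap,
) -> List[Tuple[str, float]]:
    """Rank candidates against the seed on one facet only.

    Assumes (for illustration) that each paper is a dict with an "id"
    plus "background" and "method" text fields.
    """
    if facet not in ("background", "method"):
        raise ValueError(f"unknown facet: {facet!r}")
    scored = [(c["id"], scorer(seed[facet], c[facet])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


seed = {
    "id": "seed",
    "background": "citation recommendation for scientific papers",
    "method": "cross encoder reranking",
}
candidates = [
    {
        "id": "A",  # same problem, different method
        "background": "citation recommendation for scientific papers",
        "method": "graph neural network",
    },
    {
        "id": "B",  # different problem, same method
        "background": "protein structure prediction",
        "method": "cross encoder reranking",
    },
]

by_background = rerank(seed, candidates, "background")
by_method = rerank(seed, candidates, "method")
```

The same candidate pool yields a different top result depending on the requested facet, which is the behavior a single blended similarity score cannot express.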
Key facts
- SciFACE is a reranking framework that models Background and Method facets separately
- Trained on 5,891 real seed-candidate paper pairs labeled by GPT-4o-mini
- Labels were validated against human judgments
- Achieved 70.63 NDCG@20 on Background facet (5.9 points above SPECTER)
- Achieved 49.06 NDCG@20 on Method facet (31.1 points above SPECTER)
- Improved Method NDCG@20 by 4.1 points over FaBLE without citation pre-training
- Used 5,891 labeled pairs versus 40,000 synthetic augmentations in FaBLE
- Research announced on arXiv with identifier 2604.16329v1
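The results above are reported as NDCG@20. As a reminder of what that metric measures, the following is a minimal sketch of NDCG@k using linear gain and a log2 rank discount. The exact gain formulation used in the CSFCube evaluation is not stated here, so treat this as one common variant rather than the benchmark's official scorer. The function names are illustrative, and the sketch assumes the ranked list contains every judged candidate, so the ideal ordering can be derived from it.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 20) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ranking.

    `ranked_relevances` holds the graded relevance of each candidate in the
    order the system ranked them. Returns a value in [0, 1].
    """
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([3, 2, 1, 0])   # already in ideal order
swapped = ndcg_at_k([0, 1])         # the only relevant item is ranked second
```

A ranking already in ideal order scores 1.0, and pushing relevant papers down the list lowers the score. The figures quoted above appear to be this kind of score averaged over queries and expressed on a 0 to 100 scale.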
Entities
Institutions
- arXiv