BoostTaxo: Zero-Shot Taxonomy Induction via LLM Boosting
BoostTaxo is a boosting-style LLM framework for zero-shot taxonomy induction, described in arXiv paper 2605.12520v1. Given a set of domain terms, it identifies parents for those terms in a coarse-to-fine manner. The pipeline combines retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration. A lightweight LLM first filters candidate parents efficiently; a large-scale LLM then ranks and scores the remaining candidates for fine-grained selection, with structural features used to calibrate the scores. The approach aims to improve generalization, structural reliability, and efficiency in zero-shot and large-scale settings.
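The coarse-to-fine idea can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function name `coarse_to_fine_parent`, the `keep` parameter, and both scorers are hypothetical. A token-overlap function stands in for the lightweight LLM filter, and a small lookup table stands in for the large-scale LLM's ratings; in a real system each would be a model call.

```python
from typing import Callable, List, Tuple

# (term, candidate_parent) -> score; hypothetical interface for both stages
Scorer = Callable[[str, str], float]


def coarse_to_fine_parent(
    term: str,
    candidates: List[str],
    cheap_score: Scorer,
    strong_score: Scorer,
    keep: int = 3,
) -> List[Tuple[str, float]]:
    """Return shortlisted candidates ranked by the expensive scorer."""
    # Coarse stage: the cheap scorer prunes the pool to `keep` candidates.
    pool = [c for c in candidates if c != term]
    shortlist = sorted(pool, key=lambda c: cheap_score(term, c), reverse=True)[:keep]
    # Fine stage: the expensive scorer runs only on the shortlist.
    rated = [(c, strong_score(term, c)) for c in shortlist]
    return sorted(rated, key=lambda pair: pair[1], reverse=True)


def token_overlap(term: str, cand: str) -> float:
    """Stand-in for the lightweight filter: Jaccard overlap of word sets."""
    a, b = set(term.lower().split()), set(cand.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def toy_strong_score(term: str, cand: str) -> float:
    """Stand-in for the large-scale LLM rating: an illustrative lookup table."""
    table = {
        ("convolutional neural network", "neural network"): 0.9,
        ("convolutional neural network", "network"): 0.4,
    }
    return table.get((term, cand), 0.0)


candidates = ["neural network", "network", "database", "graph neural network"]
ranked = coarse_to_fine_parent(
    "convolutional neural network", candidates, token_overlap, toy_strong_score
)
best_parent = ranked[0][0]
```

The point of the split is cost: the expensive scorer is called at most `keep` times per term, regardless of how large the candidate pool is.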
Key facts
- BoostTaxo is a boosting-style LLM framework for zero-shot taxonomy induction.
- It takes a set of domain terms as input.
- Parent identification is performed in a coarse-to-fine manner.
- Methods include retrieval-augmented definition refinement, hybrid parent candidate selection, candidate rating, and structure-aware score calibration.
- A lightweight LLM filters candidate parents efficiently.
- A large-scale LLM ranks and scores candidate parents for fine-grained selection.
- Structural features are incorporated to calibrate scores.
- The approach targets zero-shot and large-scale scenarios.
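As an illustration of what structure-aware score calibration could look like, the sketch below adjusts raw candidate scores using the partially built taxonomy. The summary does not say which structural features BoostTaxo uses, so both features here are assumptions chosen for illustration: candidates that would create a cycle are rejected, and a hypothetical `fanout_penalty` slightly lowers the score of parents that already have many children.

```python
from typing import Dict, List, Optional


def ancestors(node: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Walk parent links upward from `node`, stopping if a loop is detected."""
    seen: List[str] = []
    cur = parent_of.get(node)
    while cur is not None and cur not in seen:
        seen.append(cur)
        cur = parent_of.get(cur)
    return seen


def calibrate(
    term: str,
    raw_scores: Dict[str, float],
    parent_of: Dict[str, Optional[str]],
    fanout_penalty: float = 0.05,  # hypothetical weight, not from the paper
) -> Dict[str, float]:
    """Adjust raw candidate scores using two assumed structural features."""
    # Count existing children per node in the partial taxonomy.
    child_count: Dict[str, int] = {}
    for parent in parent_of.values():
        if parent is not None:
            child_count[parent] = child_count.get(parent, 0) + 1

    calibrated: Dict[str, float] = {}
    for cand, score in raw_scores.items():
        # Reject a candidate that is the term itself or one of its descendants,
        # since attaching the term under it would create a cycle.
        if cand == term or term in ancestors(cand, parent_of):
            continue
        calibrated[cand] = score - fanout_penalty * child_count.get(cand, 0)
    return calibrated


# Partial taxonomy: child -> parent (illustrative data only).
parent_of = {"mammal": "animal", "dog": "canine", "cat": "mammal", "whale": "mammal"}
raw_scores = {"dog": 0.8, "mammal": 0.9, "animal": 0.7}
scores = calibrate("canine", raw_scores, parent_of)
chosen = max(scores, key=scores.get)
```

Here "dog" is dropped because it already sits below "canine", and the remaining scores are lowered in proportion to each candidate's current number of children.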
Entities
Institutions
- arXiv