Fine-Tuning Causal LLMs for Text Classification: Embedding vs. Instruction Methods

publication · 2026-05-25

A study on arXiv (ID 2512.12677) compares two ways of fine-tuning decoder-only Large Language Models (LLMs) for text classification under limited resources: embedding-based fine-tuning, which attaches a classification head to the final-token embedding, and instruction-tuning, which casts classification as a prompt-to-response task. The authors combine 4-bit quantization with Low-Rank Adaptation (LoRA) to fine-tune models of up to 8B parameters on a single GPU. Experiments on two patent benchmarks (a proprietary single-label corpus with five classes and the public WIPO-Alpha multi-label dataset with 14 categories) indicate that the embedding-based method matches or surpasses instruction-tuning on single-label classification while training 10 to 30 times fewer parameters. Instruction-tuning is competitive only in the multi-label setting.
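To make the contrast between the two formulations concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The function names, the prompt wording, the right-padding convention, the comma-separated response format, and the label strings are all assumptions chosen for illustration. The first half shows the embedding route (take the hidden state of the last real token, apply a linear head); the second half shows the instruction route (build a prompt-to-response pair, then map generated text back to labels).

```python
import math

# --- Embedding-based route: final-token embedding + classification head ---

def last_token_embedding(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token.

    hidden_states: list of per-token vectors (seq_len x hidden_size).
    attention_mask: list of 1 (real token) / 0 (padding).
    Assumes at least one real token is present.
    """
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return hidden_states[last_index]

def linear_head(embedding, weights, bias):
    """Compute one logit per class; weights is num_classes x hidden_size."""
    return [sum(w * x for w, x in zip(row, embedding)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax for single-label prediction."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# --- Instruction-tuning route: prompt-to-response pairs ---

def build_instruction_example(text, labels):
    """Build a hypothetical prompt/response training pair."""
    prompt = f"Classify the following patent text.\n\nText: {text}\n\nCategories:"
    return {"prompt": prompt, "response": " " + ", ".join(labels)}

def parse_response(generated, label_set):
    """Map generated text back to known labels; unknown strings are dropped."""
    candidates = [part.strip() for part in generated.split(",")]
    return [c for c in candidates if c in label_set]

# Toy data: 4 tokens, hidden size 3, last position is padding.
hidden_states = [[0.1, 0.2, 0.3], [0.0, 1.0, 0.0], [1.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
attention_mask = [1, 1, 1, 0]
weights = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # hypothetical 2-class head
bias = [0.0, 0.0]

embedding = last_token_embedding(hidden_states, attention_mask)
probs = softmax(linear_head(embedding, weights, bias))
predicted_class = probs.index(max(probs))

example = build_instruction_example("A rotor blade with ...", ["class_a", "class_h"])
decoded = parse_response(example["response"], {"class_a", "class_h"})
```

In this sketch the embedding route yields class probabilities directly, whereas the instruction route depends on the generated string matching a known label. For multi-label data, the softmax would typically be replaced by an independent sigmoid per class, which is again an assumption rather than a detail taken from the study.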

Key facts

arXiv paper ID: 2512.12677
Compares embedding-based vs. instruction-tuning for LLM text classification
Uses 4-bit quantization and LoRA for single-GPU fine-tuning up to 8B parameters
Experiments on two patent benchmarks: proprietary 5-class single-label and WIPO-Alpha multi-label (14 categories)
Embedding-based method matches or exceeds instruction-tuning on single-label tasks
Embedding-based method trains 10 to 30 times fewer parameters than instruction-tuning
Instruction-tuning is competitive only on multi-label classification
Models are decoder-only causal LLMs
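
The resource claims above rest on simple arithmetic that can be sketched as follows. This is an illustration under stated assumptions, not the study's configuration: the hidden size (4096), LoRA rank (16), and the presence of a bias in the classification head are hypothetical values, and the sketch does not attempt to reproduce the reported 10 to 30 times figure. The memory estimate covers quantized weights only and ignores quantization constants, activations, adapters, and optimizer state.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one frozen d_out x d_in weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)

def classification_head_params(hidden_size, num_classes, with_bias=True):
    """Parameters of a linear head on top of the final-token embedding."""
    return hidden_size * num_classes + (num_classes if with_bias else 0)

def quantized_weight_bytes(num_params, bits_per_param=4):
    """Approximate storage for the quantized weights alone."""
    return num_params * bits_per_param // 8

# Hypothetical dimensions, chosen only for illustration.
hidden_size = 4096
rank = 16

per_matrix_lora = lora_trainable_params(hidden_size, hidden_size, rank)
per_matrix_full = hidden_size * hidden_size
head_5_classes = classification_head_params(hidden_size, 5)
weights_8b_4bit = quantized_weight_bytes(8_000_000_000)

print(f"LoRA params per square projection: {per_matrix_lora:,}")
print(f"Full params per square projection: {per_matrix_full:,}")
print(f"5-class head params:               {head_5_classes:,}")
print(f"8B weights at 4 bits:              {weights_8b_4bit / 1e9:.1f} GB")
```

Under these assumed dimensions, one adapted projection trains 131,072 parameters instead of 16,777,216, and an 8B-parameter model needs roughly 4 GB for its 4-bit weights, which is why single-GPU fine-tuning becomes feasible.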
