ARTFEED — Contemporary Art Intelligence

Quantized LLM Performance in Qualitative Analysis Improved by Multi-Pass Prompt Verification

ai-technology · 2026-05-22

A study on arXiv (2605.20193) investigates how lower-bit quantization levels (8-bit, 4-bit, 3-bit, 2-bit) and quantization types affect the performance of LLaMA-3.1 (8B) in qualitative analysis. On 82 interview transcripts containing expert and non-expert responses, low-bit models show more hallucinations and greater instability, particularly on non-expert language. The authors propose a quantization-aware multi-pass prompt verification method that guides the model through controlled steps to reduce hallucinations, removes unreliable content, and passes verified results on to the next transcript. Performance was validated with human coders using NVivo and with BF16 LLaMA. According to the study, the method improves the accuracy of quantized models on qualitative tasks.
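
The flow described above (extract, verify, drop unreliable content, carry verified results forward) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the prompts, the `code | quote` output format, the verbatim-quote grounding check, and every function name here (`extract_candidates`, `verify`, `analyze_corpus`, `stub_model`) are assumptions made for the example. The model is represented by any function that maps a prompt string to text.

```python
from typing import Callable, Dict, List

# Assumption: the model is any callable from prompt text to response text,
# for example a wrapper around a locally served quantized LLaMA-3.1 (8B).
Generate = Callable[[str], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so quote matching is less brittle."""
    return " ".join(text.lower().split())


def extract_candidates(generate: Generate, transcript: str,
                       codebook: List[str]) -> List[Dict[str, str]]:
    """Pass 1: request 'code | verbatim quote' lines, seeded with verified codes."""
    prompt = (
        "Known codes: " + ", ".join(codebook) + "\n"
        "List themes in the transcript, one per line, as: code | verbatim quote\n\n"
        + transcript
    )
    candidates = []
    for line in generate(prompt).splitlines():
        if "|" not in line:
            continue  # skip malformed lines rather than guess their meaning
        code, quote = (part.strip() for part in line.split("|", 1))
        if code and quote:
            candidates.append({"code": code, "quote": quote})
    return candidates


def verify(generate: Generate, transcript: str,
           candidates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Pass 2: keep a candidate only if its quote is in the transcript
    and a second prompt confirms that the quote supports the code."""
    haystack = normalize(transcript)
    kept = []
    for item in candidates:
        if normalize(item["quote"]) not in haystack:
            continue  # quote is not in the transcript: treat as hallucinated
        answer = generate(
            "Does the quote support the code? Answer yes or no.\n"
            f"Code: {item['code']}\nQuote: {item['quote']}"
        )
        if answer.strip().lower().startswith("yes"):
            kept.append(item)
    return kept


def analyze_corpus(generate: Generate, transcripts: List[str]) -> Dict[str, list]:
    """Run both passes per transcript, carrying verified codes to the next one."""
    codebook: List[str] = []
    results = []
    for transcript in transcripts:
        verified = verify(generate, transcript,
                          extract_candidates(generate, transcript, codebook))
        results.append(verified)
        for item in verified:
            if item["code"] not in codebook:
                codebook.append(item["code"])
    return {"results": results, "codebook": codebook}


def stub_model(prompt: str) -> str:
    """Offline stand-in for a real model; its second line is a made-up quote."""
    if prompt.startswith("Does the quote"):
        return "yes"
    return "workload | too many shifts\nsalary | we are paid in gold"


if __name__ == "__main__":
    out = analyze_corpus(stub_model,
                         ["Honestly, there are too many shifts and I am tired."])
    print(out["codebook"])
```

With the stub, the invented "salary" quote fails the grounding check and is dropped, so only the "workload" code is carried into the codebook for the next transcript. A real setup would replace `stub_model` with a call to the quantized model and would likely need a looser matching rule than exact substring search.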

Key facts

  • Study examines quantization levels: 8-bit, 4-bit, 3-bit, 2-bit
  • Uses LLaMA-3.1 (8B) model
  • Data from 82 interview transcripts with expert and non-expert responses
  • Low-bit models produce more hallucinations and less stable results
  • Proposes quantization-aware multi-pass prompt verification method
  • Method reduces hallucinations through controlled steps and verification
  • Validation by human coders using NVivo and BF16 LLaMA
  • arXiv paper ID: 2605.20193

Entities

Institutions

  • arXiv
  • NVivo
