Switchcraft: AI Model Router for Agentic Tool Calling
Switchcraft is presented as the first model router built for agentic tool calling in AI systems. For each query it selects the least expensive model expected to handle the tool call correctly, reaching 82.9% accuracy while cutting inference cost by 84%, a saving of more than $3,600 per million queries. Routing decisions come from a DistilBERT-based classifier that operates within a latency budget. The research also found that larger models do not consistently outperform smaller ones on tool-use tasks, and that cheaper models can end up costing more overall because they rely on token-intensive reasoning.
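The routing idea can be illustrated with a minimal sketch. It is not Switchcraft's actual implementation: the `Candidate` type, the `route` function, the 0.8 threshold, and the fallback rule are all hypothetical, and the per-model success estimates are assumed to come from a classifier such as the DistilBERT-based one mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical model label
    cost_per_query: float  # assumed average cost in dollars
    p_correct: float       # assumed classifier estimate of a correct tool call


def route(candidates, min_p_correct=0.8):
    """Pick the cheapest candidate predicted to be accurate enough.

    If no candidate clears the threshold, fall back to the one with the
    highest predicted accuracy (an assumption, not a documented behavior).
    """
    if not candidates:
        raise ValueError("at least one candidate model is required")
    eligible = [c for c in candidates if c.p_correct >= min_p_correct]
    if eligible:
        return min(eligible, key=lambda c: c.cost_per_query)
    return max(candidates, key=lambda c: c.p_correct)


# Illustrative numbers only; none of them come from the source.
candidates = [
    Candidate("small", 0.0005, 0.91),
    Candidate("medium", 0.0020, 0.93),
    Candidate("large", 0.0100, 0.95),
]
chosen = route(candidates)
```

With these made-up scores, the small model clears the default threshold and is also the cheapest, so it is selected. Raising `min_p_correct` shifts traffic toward the more expensive models.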
Key facts
- Switchcraft is the first model router optimized for agentic tool calling.
- It achieves 82.9% accuracy, matching or exceeding the best individual model.
- Inference cost is reduced by 84%.
- Saves over $3,600 per million queries.
- Uses a DistilBERT-based classifier.
- Operates under a latency budget.
- Larger models do not consistently outperform smaller ones on tool-use tasks.
- Cheaper models can incur higher total cost due to token-intensive reasoning.
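The cost figures above can be checked with some simple arithmetic. The sketch below assumes the 84% reduction and the $3,600 saving refer to the same baseline; since the source says "over $3,600", the derived values are lower bounds. The token prices and token counts in the second part are hypothetical and only illustrate how a model with a lower per-token price can cost more per query.

```python
def implied_baseline(savings, reduction):
    """Baseline cost implied by an absolute saving and a fractional reduction."""
    return savings / reduction


def query_cost(price_per_million_tokens, tokens_used):
    """Cost of one query in dollars, given a per-million-token price."""
    return price_per_million_tokens * tokens_used / 1_000_000


# Figures from the source: 84% reduction, at least $3,600 saved per million queries.
baseline = implied_baseline(3600.0, 0.84)  # roughly $4,286 per million queries
routed = baseline - 3600.0                 # roughly $686 per million queries

# Hypothetical prices and token counts, not taken from the source.
cheap_model = query_cost(0.50, 6000)   # low price, long reasoning trace
pricey_model = query_cost(2.00, 800)   # higher price, short answer
cheap_costs_more = cheap_model > pricey_model
```

Under these assumed numbers the cheaper model costs $0.0030 per query against $0.0016 for the pricier one, because it emits 7.5 times as many tokens at a quarter of the price.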