Torchtune: PyTorch Native Post-Training Library for LLMs
Torchtune is a PyTorch-native library for post-training large language models (LLMs), supporting fine-tuning, experimentation, and deployment workflows. Many existing fine-tuning solutions favor user-friendliness, tailored recipes, or hardware optimization at the expense of transparency and flexibility; torchtune instead prioritizes modularity, adaptability, and direct access to core PyTorch components. The paper outlines torchtune's design principles, shows how they are realized in its model builders, training recipes, and distributed training stack, and evaluates the library across representative post-training settings, benchmarking it against popular fine-tuning frameworks.
Key facts
- Torchtune is a PyTorch-native library for post-training of LLMs.
- It focuses on modularity, hackability, and direct access to PyTorch components.
- The library supports fine-tuning, experimentation, and deployment workflows.
- It contrasts with frameworks that optimize for ease of use or hardware efficiency at the cost of transparency.
- The paper describes the library's model builders, training recipes, and distributed training stack.
- Evaluation is performed across representative post-training settings.
- Comparisons are made against popular fine-tuning frameworks.
- The paper is published on arXiv with ID 2605.21442.
Entities
Institutions
- PyTorch
- arXiv