Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B–70B) with limited GPU memory, when you need to train <1% of parameters with minimal accuracy loss, or for multi-adapter serving. HuggingFace's official library, integrated with the transformers ecosystem.
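As a rough sketch of the "<1% of parameters" claim: assuming a hypothetical 7B-class model (hidden size 4096, 32 layers, LoRA rank 8 applied to two attention projections per layer; all dimensions here are illustrative, not read from any real config), the trainable-parameter fraction works out to well under one percent:

```python
# Back-of-envelope LoRA parameter count (all dimensions are illustrative
# assumptions for a hypothetical 7B-class model, not from a real config).
hidden_size = 4096        # model width
num_layers = 32           # transformer blocks
rank = 8                  # LoRA rank r
targets_per_layer = 2     # e.g. two attention projections per layer

# Each targeted weight gets two low-rank factors: A (hidden x r) and B (r x hidden).
params_per_target = hidden_size * rank + rank * hidden_size
lora_params = params_per_target * targets_per_layer * num_layers

base_params = 7_000_000_000
trainable_fraction = lora_params / base_params

print(f"LoRA params: {lora_params:,}")                  # 4,194,304
print(f"trainable fraction: {trainable_fraction:.4%}")  # ~0.06%
```

At 2 bytes per parameter (fp16/bf16), an adapter of this size is on the order of 8 MB, consistent with the single-digit-megabyte adapter sizes mentioned above.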
- Initial release of parameter-efficient fine-tuning (PEFT) support for large language models (LLMs), including LoRA, QLoRA, and 25+ adapter methods.
- Enables fine-tuning of 7B–70B models on consumer GPUs by training less than 1% of model parameters, with adapters as small as 6MB.
- Provides memory-optimized workflows for single-GPU fine-tuning of even the largest models using quantization (QLoRA).
- Integrates fully with the HuggingFace transformers ecosystem and official PEFT library.
- Includes practical guides, recommended settings, and code for adapter training, merging, and multi-adapter serving.
- Offers architecture-specific configuration and compares leading parameter-efficient fine-tuning methods.
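The adapter training and merging mentioned above rest on LoRA's core identity: the fine-tuned layer computes `x W + (alpha/r) x A B`, so the trained factors can be folded into the base weight with no change in output. A minimal NumPy sketch (toy sizes; `W`, `A`, `B`, `alpha` are generic LoRA notation, not PEFT API objects):

```python
import numpy as np

# Toy LoRA forward pass and adapter merge (illustrative sizes only).
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8

W = rng.standard_normal((d_in, d_out))       # frozen base weight
A = rng.standard_normal((d_in, r)) * 0.01    # trainable down-projection
B = rng.standard_normal((r, d_out)) * 0.01   # trainable up-projection

x = rng.standard_normal((2, d_in))

scale = alpha / r
y_adapter = x @ W + scale * (x @ A @ B)      # base output + low-rank update
W_merged = W + scale * (A @ B)               # fold adapter into base weight
y_merged = x @ W_merged

assert np.allclose(y_adapter, y_merged)      # merging preserves outputs
```

Because the merged weight has the same shape as the original, a merged adapter adds zero inference latency; keeping `A` and `B` separate instead is what enables serving many small adapters on top of one shared base model.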