Unsloth Long Context Training
Description
Trains models on extended context lengths (89K tokens or more) using optimized RoPE scaling and memory-efficient Triton kernels, enabling 4x longer context windows with 30% less memory usage than Flash Attention 2.
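To make the RoPE scaling idea concrete, here is a minimal sketch of linear position interpolation, one common form of RoPE scaling. It is not taken from the Unsloth source, and the skill may use a different scaling variant. The function names, the 8192-token pretraining length, and the 4x factor are assumptions chosen for illustration.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scaling_factor=1.0):
    """Return the rotary angles for one token position.

    A scaling_factor above 1 applies linear position interpolation:
    positions are divided by the factor, so a longer sequence maps back
    into the position range seen during pretraining.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    scaled_position = position / scaling_factor
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


def apply_rope(vector, angles):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... by the given angles."""
    rotated = []
    for i, theta in enumerate(angles):
        x, y = vector[2 * i], vector[2 * i + 1]
        rotated.append(x * math.cos(theta) - y * math.sin(theta))
        rotated.append(x * math.sin(theta) + y * math.cos(theta))
    return rotated


# Hypothetical numbers: pretrained on 8192 positions, extended 4x.
original_context = 8192
target_context = 4 * original_context
factor = target_context / original_context

# The last position of the extended window reuses the angles that the
# original window assigned to its own upper bound.
extended = rope_angles(target_context, head_dim=8, scaling_factor=factor)
baseline = rope_angles(original_context, head_dim=8)
```

With these values, `extended` and `baseline` hold the same angles, which is the point of the interpolation: the model never sees a rotation outside the range it was pretrained on. The trade-off is that neighbouring positions end up closer together in angle, which is why scaled models are usually fine-tuned at the longer length.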
Repository
https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/unsloth-long-context