Trajectory Releases Concurrent Multi-LoRA Training Stack for Continual Learning
Trajectory's concurrent multi-LoRA stack reports a 2.81× experiment-throughput gain over single-tenant RL, with all code in the NovaSky-AI/SkyRL GitHub repository.

["Trajectory, in collaboration with UC Berkeley Sky Lab and Anyscale, has introduced a concurrent multi-LoRA training platform that enables continuous learning workloads. This development aims to revolutionize the way language models are updated, moving away from the traditional discontinuous jumps and towards a more dynamic and adaptive approach. The team's field report details the construction of this platform, which is open-sourced in the NovaSky-AI/SkyRL repository.", "The current paradigm for improving language models involves collecting data, training, and shipping a new version, a process that can take months and result in either remarkable or catastrophic behavior for users.
Trajectory's approach seeks to replace this cycle with continual learning, allowing models to update from live feedback and production interactions. This method could enable applications such as coding agents learning from developer corrections and support agents improving through operator interventions.", 'The concurrent multi-LoRA training stack, dubbed Continuous Multi-LoRA Training (C-LoRA), maps each experiment to a dedicated LoRA adapter on a warm, multi-tenant engine. This approach addresses four key inefficiencies in traditional stacks: cold starts, memory intensity, single-tenancy, and low job utilization.
By leveraging LoRA adapters, the platform significantly reduces memory usage and enables the concurrent execution of multiple experiments, leading to a 2.81× end-to-end experiment-throughput improvement over single-tenant training frameworks.', 'The Trajectory team tested their approach using a single H200 node with the Qwen3-4B-Instruct-2507 model, running synchronous reinforcement learning on GSM8K in an agentic setting. They achieved a speedup of 2.81× with eight concurrent multi-LoRA runs, with all concurrent experiments finishing before three serial runs back-to-back. Moreover, the team demonstrated that their approach maintains reward accuracy above 90% by step 9 across different concurrency levels.', 'While higher throughput comes at the cost of per-step latency, the results show that multi-LoRA can significantly improve experiment throughput without sacrificing model performance.
The code for the SkyRL training stack is available in the NovaSky-AI/SkyRL GitHub repository, providing a valuable resource for researchers and developers interested in advancing continual learning and multi-LoRA training techniques.']
Source: MarkTechPost