Self-adapting Robotic Agents through Online Continual Reinforcement Learning with World Model Feedback
Daily Info Board · 2026-03-05
2026-03-04T13:07:42Z
Published
AI Summary
- As learning-based robotic controllers are typically trained offline and deployed with fixed parameters, their ability to cope with unforeseen changes during operation is limited
- Biologically inspired, this work presents a framework for online Continual Reinforcement Learning that enables automated adaptation during deployment
- Building on DreamerV3, a model-based Reinforcement Learning algorithm, the proposed method leverages world model prediction residuals to detect out-of-distribution events and automatically trigger finetuning
- Adaptation progress is monitored using both task-level performance signals and internal training metrics, allowing convergence to be assessed without external supervision and domain knowledge
- The approach is validated on a variety of contemporary continuous control problems, including a quadruped robot in high-fidelity simulation and a real-world model vehicle
- Relevant metrics and their interpretation are presented and discussed, and the resulting trade-offs are described
#arXiv #paper #research/paper #Agent
Content Excerpt
As learning-based robotic controllers are typically trained offline and deployed with fixed parameters, their ability to cope with unforeseen changes during operation is limited. Inspired by biological learning, this work presents a framework for online Continual Reinforcement Learning that enables automated adaptation during deployment. Building on DreamerV3, a model-based Reinforcement Learning algorithm, the proposed method leverages world model prediction residuals to detect out-of-distribution events and automatically trigger finetuning. Adaptation progress is monitored using both task-level performance signals and internal training metrics, allowing convergence to be assessed without external supervision and domain knowledge. The approach is validated on a variety of contemporary continuous control problems, including a quadruped robot in high-fidelity simulation and a real-world model vehicle. Relevant metrics and their interpretation are presented and discussed, and the resulting trade-offs are described. The results sketch out how autonomous robotic agents could one day move beyond static training regimes toward adaptive systems capable of self-reflection and self-improvement during operation, just like their biological counterparts.
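The abstract gives no implementation details for the residual-based trigger, but the idea — watch the world model's prediction error, declare an out-of-distribution event when it departs from its running baseline, and then start finetuning — can be sketched in a few lines. Everything below (class name, window size, z-score threshold, patience) is a hypothetical illustration, not the paper's actual mechanism:

```python
import collections
import statistics

class ResidualTrigger:
    """Illustrative sketch of a residual-based OOD trigger.

    Tracks a running baseline of world-model prediction residuals and
    signals that finetuning should start once residuals stay far above
    the baseline for several consecutive steps. All parameter names and
    values are assumptions for demonstration.
    """

    def __init__(self, window=50, z_threshold=4.0, patience=3):
        self.z_threshold = z_threshold  # how many std-devs counts as OOD
        self.patience = patience        # consecutive OOD steps before triggering
        self.baseline = collections.deque(maxlen=window)
        self.ood_streak = 0

    def update(self, residual):
        """Feed one residual (e.g. ||predicted_obs - actual_obs||).

        Returns True when adaptation should be triggered.
        """
        if len(self.baseline) >= 10:  # wait until the baseline has some history
            mean = statistics.fmean(self.baseline)
            std = statistics.pstdev(self.baseline) or 1e-8  # avoid divide-by-zero
            z = (residual - mean) / std
            self.ood_streak = self.ood_streak + 1 if z > self.z_threshold else 0
        self.baseline.append(residual)
        return self.ood_streak >= self.patience


# Usage: steady residuals, then a simulated dynamics change at step 60.
trigger = ResidualTrigger()
fired_at = None
for step in range(100):
    residual = 1.0 if step < 60 else 10.0  # distribution shift at step 60
    if trigger.update(residual) and fired_at is None:
        fired_at = step  # first step at which finetuning would be triggered
```

The `patience` counter is the design choice worth noting: a single large residual (sensor noise, a rare but in-distribution event) should not restart training, so the trigger requires a sustained deviation before firing.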