Self-adapting Robotic Agents through Online Continual Reinforcement Learning with World Model Feedback
Daily Info Board · 2026-03-05
2026-03-04T13:07:42Z
Published
AI Summary
- As learning-based robotic controllers are typically trained offline and deployed with fixed parameters, their ability to cope with unforeseen changes during operation is limited
- Biologically inspired, this work presents a framework for online Continual Reinforcement Learning that enables automated adaptation during deployment
- Building on DreamerV3, a model-based Reinforcement Learning algorithm, the proposed method leverages world model prediction residuals to detect out-of-distribution events and automatically trigger finetuning
- Adaptation progress is monitored using both task-level performance signals and internal training metrics, allowing convergence to be assessed without external supervision and domain knowledge
- The approach is validated on a variety of contemporary continuous control problems, including a quadruped robot in high-fidelity simulation and a real-world model vehicle
- Relevant metrics and their interpretation are presented and discussed, and the resulting trade-offs are described
#arXiv #paper #research/paper #Agent
Content Excerpt
As learning-based robotic controllers are typically trained offline and deployed with fixed parameters, their ability to cope with unforeseen changes during operation is limited. Inspired by biological learning, this work presents a framework for online Continual Reinforcement Learning that enables automated adaptation during deployment. Building on DreamerV3, a model-based Reinforcement Learning algorithm, the proposed method leverages world model prediction residuals to detect out-of-distribution events and automatically trigger finetuning. Adaptation progress is monitored using both task-level performance signals and internal training metrics, allowing convergence to be assessed without external supervision and domain knowledge. The approach is validated on a variety of contemporary continuous control problems, including a quadruped robot in high-fidelity simulation and a real-world model vehicle. Relevant metrics and their interpretation are presented and discussed, and the resulting trade-offs are described. The results sketch out how autonomous robotic agents could one day move beyond static training regimes toward adaptive systems capable of self-reflection and self-improvement during operation, just like their biological counterparts.
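The abstract gives no implementation details for the residual-based trigger, but the idea — watch the world model's prediction error, declare an out-of-distribution event when it departs from its running baseline, and then start finetuning — can be sketched in a few lines. Everything below (class name, window size, z-score threshold, patience) is a hypothetical illustration, not the paper's actual mechanism:

```python
import collections
import statistics

class ResidualTrigger:
    """Illustrative sketch of a residual-based OOD trigger.

    Tracks a running baseline of world-model prediction residuals and
    signals that finetuning should start once residuals stay far above
    the baseline for several consecutive steps. All parameter names and
    values are assumptions for demonstration.
    """

    def __init__(self, window=50, z_threshold=4.0, patience=3):
        self.z_threshold = z_threshold  # how many std-devs counts as OOD
        self.patience = patience        # consecutive OOD steps before triggering
        self.baseline = collections.deque(maxlen=window)
        self.ood_streak = 0

    def update(self, residual):
        """Feed one residual (e.g. ||predicted_obs - actual_obs||).

        Returns True when adaptation should be triggered.
        """
        if len(self.baseline) >= 10:  # wait until the baseline has some history
            mean = statistics.fmean(self.baseline)
            std = statistics.pstdev(self.baseline) or 1e-8  # avoid divide-by-zero
            z = (residual - mean) / std
            self.ood_streak = self.ood_streak + 1 if z > self.z_threshold else 0
        self.baseline.append(residual)
        return self.ood_streak >= self.patience


# Usage: steady residuals, then a simulated dynamics change at step 60.
trigger = ResidualTrigger()
fired_at = None
for step in range(100):
    residual = 1.0 if step < 60 else 10.0  # distribution shift at step 60
    if trigger.update(residual) and fired_at is None:
        fired_at = step  # first step at which finetuning would be triggered
```

The `patience` counter is the design choice worth noting: a single large residual (sensor noise, a rare but in-distribution event) should not restart training, so the trigger requires a sustained deviation before firing.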