microsoft/RD-Agent

Daily Information Dashboard · 2026-02-14
Open-source project
Category
github_search
Source
44
Score
2026-02-14T15:42:46Z
Published

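AI Summary

Microsoft has open-sourced RD-Agent and continues to release quantitative-finance and data-science extensions. It holds the current leading public result on MLE-bench, which demonstrates the practical value of multi-agent automated R&D for industrial-grade machine learning and quantitative strategies.
#GitHub #repo #OpenSource #RD-Agent #MLE-bench #LiteLLM #Agent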

Content Excerpt

<h4 align="center">
 <img src="docs/_static/logo.png" alt="RD-Agent logo" style="width:70%; ">
 
 <a href="https://rdagent.azurewebsites.net" target="_blank">🖥️ Live Demo</a> |
 <a href="https://rdagent.azurewebsites.net/factor_loop" target="_blank">🎥 Demo Video</a> <a href="https://www.youtube.com/watch?v=JJ4JYO3HscM&list=PLALmKB0_N3_i52fhUmPQiL4jsO354uopR" target="_blank">▶️YouTube</a> |
 <a href="https://rdagent.readthedocs.io/en/latest/index.html" target="_blank">📖 Documentation</a> |
 <a href="https://aka.ms/RD-Agent-Tech-Report" target="_blank">📄 Tech Report</a> |
 <a href="#-paperwork-list"> 📃 Papers </a>
</h4>

📰 News

| 🗞️ News | 📝 Description |
| -- | ------ |
| NeurIPS 2025 Acceptance | We are thrilled to announce that our paper R&D-Agent-Quant has been accepted to NeurIPS 2025 | 
| Technical Report Release | Overall framework description and results on MLE-bench | 
| R&D-Agent-Quant Release | Apply R&D-Agent to quant trading | 
| MLE-Bench Results Released | R&D-Agent currently leads as the top-performing machine learning engineering agent on MLE-bench |
| Support LiteLLM Backend | We now fully support **LiteLLM** as our default backend for integration with multiple LLM providers. |
| General Data Science Agent | Data Science Agent |
| Kaggle Scenario release | We have released **Kaggle Agent**; try the new features! |
| Official WeChat group release | We created a WeChat group; you are welcome to join! (🗪QR Code) |
| Official Discord release | We launched our first chat channel on Discord (🗪Chat) |
| First release | **R&D-Agent** is released on GitHub |

🏆 The Best Machine Learning Engineering Agent!

MLE-bench is a comprehensive benchmark evaluating the performance of AI agents on machine learning engineering tasks. Utilizing datasets from 75 Kaggle competitions, MLE-bench provides robust assessments of AI systems' capabilities in real-world ML engineering scenarios.

R&D-Agent currently leads as the top-performing machine learning engineering agent on MLE-bench:

| Agent | Low == Lite (%) | Medium (%) | High (%) | All (%) |
|---------|--------|-----------|---------|----------|
| R&D-Agent o3(R)+GPT-4.1(D) | 51.52 ± 6.9 | 19.3 ± 5.5 | 26.67 ± 0 | 30.22 ± 1.5 |
| R&D-Agent o1-preview | 48.18 ± 2.49 | 8.95 ± 2.36 | 18.67 ± 2.98 | 22.4 ± 1.1 |
| AIDE o1-preview | 34.3 ± 2.4 | 8.8 ± 1.1 | 10.0 ± 1.9 | 16.9 ± 1.1 |

**Notes:**
**o3(R)+GPT-4.1(D)**: This version is designed both to reduce the average time per loop and to use a cost-effective combination of backend LLMs by integrating the Research Agent (o3) with the Development Agent (GPT-4.1).
**AIDE o1-preview**: Represents the previous best public result on MLE-bench, as reported in the original MLE-bench paper.
The mean and standard deviation for R&D-Agent o1-preview are computed over 5 independent seeds, and those for R&D-Agent o3(R)+GPT-4.1(D) over 6 seeds.
According to MLE-Bench, the 75 competitions are categorized into three levels of complexity: **Low==Lite** if we estimate that an experienced ML engineer can produce a sensible solution in under 2 hours, excluding the time taken to train any models; **Medium** if it takes between 2 and 10 hours; and **High** if it takes more than 10 hours.
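
As a rough illustration of how figures like these can be aggregated, the sketch below buckets a competition by the estimated-hours thresholds quoted above and formats per-seed scores as a mean ± standard deviation. The function names, the made-up per-seed scores, and the choice of the sample standard deviation (`statistics.stdev`) are assumptions for illustration; they are not taken from the R&D-Agent or MLE-bench code.

```python
from statistics import mean, stdev


def complexity_level(estimated_hours: float) -> str:
    """Bucket a competition by the time an experienced ML engineer is estimated to need."""
    if estimated_hours < 2:
        return "Low"
    if estimated_hours <= 10:
        return "Medium"
    return "High"


def summarize(scores_per_seed: list[float]) -> str:
    """Format per-seed scores (in %) as 'mean ± std', using the sample standard deviation."""
    return f"{mean(scores_per_seed):.2f} ± {stdev(scores_per_seed):.2f}"


# Made-up per-seed scores (%), one value per independent run; not real results.
hypothetical_scores = [20.0, 22.0, 24.0, 22.0, 22.0]

print(complexity_level(1.5))            # Low
print(complexity_level(6))              # Medium
print(summarize(hypothetical_scores))   # 22.00 ± 1.41
```

Whether the reported deviations use the sample or the population formula is not stated in the notes above, so treat that choice as a placeholder.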

You can inspect the detailed runs of the above results online.
R&D-Agent o1-preview detailed runs
R&D-Agent o3(R)+GPT-4.1(D) detailed runs

For running R&D-Agent on MLE-bench, refer to **MLE-bench Guide: Running ML Engineering via MLE-bench**

🥇 The First Data-Centric Quant Multi-Agent Framework!

R&D-Agent for Quantitative Finance, in short **RD-Agent(Q)**, is the first data-centric, multi-agent framework designed to automate the full-stack research and development of quantitative strategies via coordinated factor-model co-optimization.

Extensive experiments in real stock markets show that, at a cost under $10, RD-Agent(Q) achieves approximately 2× higher ARR than benchmark factor libraries while using over 70% fewer factors. It also surpasses state-of-the-art deep time-series models under smaller resource budgets. Its alternating factor–model optimization further delivers an excellent trade-off between predictive accuracy and strategy robustness.
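
For readers unfamiliar with the metric, the sketch below shows one common way to annualize a series of daily returns. It assumes that ARR denotes an annualized rate of return compounded geometrically over 252 trading days per year; the paper may define or compute ARR differently, and the function name and sample returns are hypothetical.

```python
import math


def annualized_return(daily_returns: list[float], periods_per_year: int = 252) -> float:
    """Geometric annualized return computed from simple per-period returns."""
    if not daily_returns:
        raise ValueError("daily_returns must not be empty")
    # Compound the per-period returns into a total growth factor.
    growth = math.prod(1.0 + r for r in daily_returns)
    # Rescale the growth factor to a one-year horizon.
    return growth ** (periods_per_year / len(daily_returns)) - 1.0


# Made-up example: a constant 0.1% daily return over one trading year.
print(f"{annualized_return([0.001] * 252):.4f}")
```

Under this definition, "2× higher ARR" would mean the annualized figure itself doubles, not the daily return.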

You can learn more details about **RD-Agent(Q)** through the paper and reproduce it through the documentation.

Data Science Agent Preview
Check out our demo video showcasing the current progress of our Data Science Agent under development:

https://github.com/user-attachments/assets/3eccbecb-34a4-4c81-bce4-d3f8862f7305

🌟 Introduction
<div align="center">
 <img src="docs/_static/scen.png" alt="Our focused scenario" style="width:80%; ">
</div>

R&D-Agent aims to automate the most critical and valuable aspects of the industrial R&D process. We begin by focusing on data-driven scenarios to streamline the development of models and data.
Methodologically, we have identified a framework with two key components: 'R' for proposing new ideas and 'D' for implementing them.
We believe that the automatic evolution of R&D will lead to solutions of significant industrial value.
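
The division of labour between 'R' and 'D' can be pictured as a propose, implement, and evaluate loop. The sketch below is a toy illustration only: `propose`, `implement`, and `rd_loop` are hypothetical names, the "idea" is just a candidate number, and nothing here reflects R&D-Agent's actual API or its LLM-backed agents.

```python
import random
from typing import Callable


def propose(best: float, rng: random.Random) -> float:
    """'R' step: propose a new idea, here a small perturbation of the best idea so far."""
    return best + rng.uniform(-1.0, 1.0)


def implement(idea: float, objective: Callable[[float], float]) -> float:
    """'D' step: turn the idea into a result, here by evaluating a toy objective."""
    return objective(idea)


def rd_loop(
    objective: Callable[[float], float],
    start: float,
    iterations: int,
    seed: int = 0,
) -> tuple[float, float]:
    """Alternate R and D steps, keeping an idea only when its score improves."""
    rng = random.Random(seed)
    best_idea, best_score = start, implement(start, objective)
    for _ in range(iterations):
        idea = propose(best_idea, rng)
        score = implement(idea, objective)
        if score > best_score:  # feedback: accept only improvements
            best_idea, best_score = idea, score
    return best_idea, best_score


# Toy objective whose maximum is at x = 3.
idea, score = rd_loop(lambda x: -(x - 3.0) ** 2, start=0.0, iterations=200)
print(f"best idea ≈ {idea:.2f}, score = {score:.4f}")
```

In this toy version the feedback signal is a single number; in a real R&D setting it would be experiment results fed back into the next proposal.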

<!-- Tag Cloud -->
R&D is a very general scenario. R&D-Agent can be your
💰 **Automatic Quant Factory** (🎥Demo Video|▶️YouTube)
🤖 **Data Mining Agent:** Iteratively proposing data & models (🎥Demo Video 1|▶️YouTube) (🎥Demo Video 2|▶️YouTube) and implementing them by gaining knowledge from data.
🦾 **Research Copilot:** Automatically reading research papers (🎥Demo Video|▶️YouTube) / financial reports (🎥Demo Video|▶️YouTube) and implementing model structures or building datasets.
🤖 **Kaggle Agent:** Automatic model tuning and feature engineering ([🎥Demo Video Coming Soon...]()), implemented to achieve more in competitions.
...

You can click the links…