内容摘录
<p align="center">
<img src="assets/logo-text.png" alt="LEANN Logo" width="400">
</p>
<p align="center">
<a href="https://trendshift.io/repositories/15049" target="_blank">
<img src="https://trendshift.io/api/badge/repositories/15049" alt="yichuan-w/LEANN | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/>
</a>
</p>
<p align="center">
<img src="https://img.shields.io/badge/Python-3.9%20%7C%203.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue.svg" alt="Python Versions">
<img src="https://github.com/yichuan-w/LEANN/actions/workflows/build-and-publish.yml/badge.svg" alt="CI Status">
<img src="https://img.shields.io/badge/Platform-Ubuntu%20%26%20Arch%20%26%20WSL%20%7C%20macOS%20(ARM64%2FIntel)-lightgrey" alt="Platform">
<img src="https://img.shields.io/badge/License-MIT-green.svg" alt="MIT License">
<img src="https://img.shields.io/badge/MCP-Native%20Integration-blue" alt="MCP Integration">
<a href="https://join.slack.com/t/leann-e2u9779/shared_invite/zt-3ol2ww9ic-Eg_kB8omwe6xmYVd0epr4Q">
<img src="https://img.shields.io/badge/Slack-Join-4A154B?logo=slack&logoColor=white" alt="Join Slack">
</a>
</p>
<div align="center">
<a href="https://forms.gle/rDbZf864gMNxhpTq8">
<img src="https://img.shields.io/badge/📣_Community_Survey-Help_Shape_v0.4-007ec6?style=for-the-badge&logo=google-forms&logoColor=white" alt="Take Survey">
</a>
<p>
We track <b>zero telemetry</b>. This survey is the ONLY way to tell us if you want <br>
<b>GPU Acceleration</b> or <b>More Integrations</b> next.<br>
👉 <a href="https://forms.gle/rDbZf864gMNxhpTq8"><b>Click here to cast your vote (2 mins)</b></a>
</p>
</div>
<div align="center">
<h3>💬 Join our Slack community!</h3>
<p>
We'd love for you to be part of the LEANN community!<br>
👉 <a href="https://join.slack.com/t/leann-e2u9779/shared_invite/zt-3ol2ww9ic-Eg_kB8omwe6xmYVd0epr4Q"><b>Join LEANN Slack</b></a><br>
If the invite link has expired or you have trouble joining, please <a href="https://github.com/yichuan-w/LEANN/issues">open an issue</a> and we'll help you get in!
</p>
</div>
<h2 align="center" tabindex="-1" class="heading-element" dir="auto">
The smallest vector index in the world. RAG Everything with LEANN!
</h2>
LEANN is an innovative vector database that democratizes personal AI. Transform your laptop into a powerful RAG system that can index and search through millions of documents while using **97% less storage** than traditional solutions **without accuracy loss**.
LEANN achieves this through *graph-based selective recomputation* with *high-degree preserving pruning*, computing embeddings on-demand instead of storing them all. Illustration Fig → | Paper →
**Ready to RAG Everything?** Transform your laptop into a personal AI assistant that can semantic search your **file system**, **emails**, **browser history**, **chat history** (WeChat, iMessage), **agent memory** (ChatGPT, Claude), **live data** (Slack, Twitter), **codebase**\* , or external knowledge bases (i.e., 60M documents) - all on your laptop, with zero cloud costs and complete privacy.
\* Claude Code only supports basic grep-style keyword search. **LEANN** is a drop-in **semantic search MCP service fully compatible with Claude Code**, unlocking intelligent retrieval without changing your workflow. 🔥 Check out the easy setup →
Why LEANN?
<p align="center">
<img src="assets/effects.png" alt="LEANN vs Traditional Vector DB Storage Comparison" width="70%">
</p>
**The numbers speak for themselves:** Index 60 million text chunks in just 6GB instead of 201GB. From emails to browser history, everything fits on your laptop. See detailed benchmarks for different applications below ↓
🔒 **Privacy:** Your data never leaves your laptop. No OpenAI, no cloud, no "terms of service".
🪶 **Lightweight:** Graph-based recomputation eliminates heavy embedding storage, while smart graph pruning and CSR format minimize graph storage overhead. Always less storage, less memory usage!
📦 **Portable:** Transfer your entire knowledge base between devices (even with others) with minimal cost - your personal AI memory travels with you.
📈 **Scalability:** Handle messy personal data that would crash traditional vector DBs, easily managing your growing personalized data and agent generated memory!
✨ **No Accuracy Loss:** Maintain the same search quality as heavyweight solutions while using 97% less storage.
Installation
📦 Prerequisites: Install uv
Install uv first if you don't have it. Typically, you can install it with:
🚀 Quick Install
Clone the repository to access all examples and try amazing applications,
and install LEANN from PyPI to run them immediately:
<!--
Low-resource? See "Low-resource setups" in the Configuration Guide. -->
<details>
<summary>
<strong>🔧 Build from Source (Recommended for development)</strong>
</summary>
**macOS:**
Note: DiskANN requires MacOS 13.3 or later.
**Linux (Ubuntu/Debian):**
Note: On Ubuntu 20.04, you may need to build a newer Abseil and pin Protobuf (e.g., v3.20.x) for building DiskANN. See Issue #30 for a step-by-step note.
You can manually install Intel oneAPI MKL instead of libmkl-full-dev for DiskANN. You can also use libopenblas-dev for building HNSW only, by removing --extra diskann in the command below.
**Linux (Arch Linux):**
**Linux (RHEL / CentOS Stream / Oracle / Rocky / AlmaLinux):**
See Issue #50 for more details.
</details>
Quick Start
Our declarative API makes RAG as easy as writing a config file.
Check out demo.ipynb or Open In Colab
RAG on Everything!
LEANN supports RAG on various data sources including documents (.pdf, .txt, .md), Apple Mail, Google Search History, WeChat, ChatGPT conversations, Claude conversations, iMessage conversations, and **live data from any platform through MCP (Model Context Protocol) servers** - including Slack, Twitter, and more.
Generation Model Setup
LLM Backend
LEANN supports many LLM providers for text generation (HuggingFace, Ollama, Anthropic, and Any OpenAI com…