<p align="center">
<picture>
<img src="./docs/images/logo.png" alt="WeKnora Logo" height="120"/>
</picture>
</p>
<p align="center">
<picture>
<a href="https://trendshift.io/repositories/15289" target="_blank">
<img src="https://trendshift.io/api/badge/repositories/15289" alt="Tencent%2FWeKnora | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/>
</a>
</picture>
</p>
<p align="center">
<a href="https://weknora.weixin.qq.com" target="_blank">
<img alt="官方网站" src="https://img.shields.io/badge/官方网站-WeKnora-4e6b99">
</a>
<a href="https://chatbot.weixin.qq.com" target="_blank">
<img alt="微信对话开放平台" src="https://img.shields.io/badge/微信对话开放平台-5ac725">
</a>
<a href="https://github.com/Tencent/WeKnora/blob/main/LICENSE">
<img src="https://img.shields.io/badge/License-MIT-ffffff?labelColor=d4eaf7&color=2e6cc4" alt="License">
</a>
<a href="./CHANGELOG.md">
<img alt="Version" src="https://img.shields.io/badge/version-0.3.0-2e6cc4?labelColor=d4eaf7">
</a>
</p>
<p align="center">
| <b>English</b> | <a href="./README_CN.md"><b>简体中文</b></a> | <a href="./README_JA.md"><b>日本語</b></a> |
</p>
<p align="center">
<h4 align="center">
Overview • Architecture • Key Features • Getting Started • API Reference • Developer Guide
</h4>
</p>
💡 WeKnora - LLM-Powered Document Understanding & Retrieval Framework
📌 Overview
**WeKnora** is an LLM-powered framework designed for deep document understanding and semantic retrieval, especially for handling complex, heterogeneous documents.
It adopts a modular architecture that combines multimodal preprocessing, semantic vector indexing, intelligent retrieval, and large language model inference. At its core, WeKnora follows the **RAG (Retrieval-Augmented Generation)** paradigm, enabling high-quality, context-aware answers by combining relevant document chunks with model reasoning.
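To make the RAG flow above concrete, here is a minimal sketch of the retrieve-then-prompt step. It is not WeKnora's implementation: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical, and plain term-count similarity stands in for a real embedding model and vector index.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase word tokens; a toy stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved chunks with the question into one LLM prompt."""
    joined = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


# Hypothetical document chunks, as a parser might produce them.
chunks = [
    "Invoices must be approved by a manager before payment.",
    "The cafeteria opens at 8 am on weekdays.",
    "Payment of approved invoices happens within 30 days.",
]
question = "When are approved invoices paid?"
prompt = build_prompt(question, retrieve(question, chunks))
print(prompt)
```

The resulting prompt would then be sent to whichever LLM is configured; that call is left out because it depends on the deployment.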
**Website:** https://weknora.weixin.qq.com
✨ Latest Updates
**v0.3.0 Highlights:**
🏢 **Shared Space**: Shared spaces with member invitations, knowledge bases and agents shared across members, and tenant-isolated retrieval
🧩 **Agent Skills**: Agent skills system with preloaded skills for the smart-reasoning agent and a sandboxed execution environment for security isolation
🤖 **Custom Agents**: Support for creating, configuring, and selecting custom agents with knowledge base selection modes (all/specified/disabled)
📊 **Data Analyst Agent**: Built-in Data Analyst agent with DataSchema tool for CSV/Excel analysis
🧠 **Thinking Mode**: Thinking mode support for LLMs and agents, with intelligent filtering of thinking content
🔍 **Web Search Providers**: Added Bing and Google search providers alongside DuckDuckGo
📋 **Enhanced FAQ**: Dry run for batch imports, similar questions, the matched question shown in search results, and large imports offloaded to object storage
🔑 **API Key Auth**: API Key authentication mechanism with Swagger documentation security
📎 **In-Input Selection**: Select knowledge bases and files directly in the input box with @mention display
☸️ **Helm Chart**: Complete Helm chart for Kubernetes deployment with Neo4j GraphRAG support
🌍 **i18n**: Added Korean (한국어) language support
🔒 **Security Hardening**: SSRF-safe HTTP client, enhanced SQL validation, MCP stdio transport security, sandbox-based execution
⚡ **Infrastructure**: Qdrant vector DB support, Redis ACL, configurable log level, Ollama embedding optimization, DISABLE_REGISTRATION control
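As an illustration of the knowledge base selection modes listed under Custom Agents above (all/specified/disabled), the sketch below shows one plausible way such a mode could narrow the set of knowledge bases an agent searches. The enum, the function name, and the exact semantics of each mode are assumptions made for this example, not WeKnora's actual API.

```python
from enum import Enum


class KBSelectionMode(Enum):
    """Hypothetical enum mirroring the three modes named in the release notes."""
    ALL = "all"
    SPECIFIED = "specified"
    DISABLED = "disabled"


def resolve_knowledge_bases(mode, available, specified=()):
    """Return the knowledge base ids an agent may search under the given mode.

    Assumed semantics: ALL searches every accessible knowledge base,
    SPECIFIED searches only the listed ones, DISABLED skips retrieval.
    """
    if mode is KBSelectionMode.DISABLED:
        return []
    if mode is KBSelectionMode.ALL:
        return list(available)
    # SPECIFIED: drop ids the caller cannot access, keep the requested order.
    known = set(available)
    return [kb for kb in specified if kb in known]


accessible = ["kb-handbook", "kb-faq", "kb-contracts"]
print(resolve_knowledge_bases(KBSelectionMode.SPECIFIED, accessible, ["kb-faq", "kb-missing"]))
```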
**v0.2.0 Highlights:**
🤖 **Agent Mode**: New ReAct Agent mode that can call built-in tools, MCP tools, and web search, and produces comprehensive summary reports through multiple rounds of iteration and reflection
📚 **Multi-Type Knowledge Bases**: Support for FAQ and document knowledge base types, with new features including folder import, URL import, tag management, and online entry
⚙️ **Conversation Strategy**: Support for configuring Agent models, normal mode models, retrieval thresholds, and Prompts, with precise control over multi-turn conversation behavior
🌐 **Web Search**: Support for extensible web search engines with built-in DuckDuckGo search engine
🔌 **MCP Tool Integration**: Support for extending Agent capabilities through MCP, with built-in uvx and npx launchers, supporting multiple transport methods
🎨 **New UI**: Optimized conversation interface with Agent mode/normal mode switching, a display of the tool call process, and a comprehensively upgraded knowledge base management interface
⚡ **Infrastructure Upgrade**: Introduced MQ async task management, support for automatic database migration, and fast development mode
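The Agent Mode item above describes an agent that alternates between reasoning and tool calls over several iterations. The sketch below shows the general shape of such a loop (thought, action, observation, repeated until a final answer). It is a toy under stated assumptions: `TOOLS`, `scripted_model`, and `run_agent` are hypothetical names, the tools return canned strings, and a scripted function stands in for the LLM.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function from query string to result string.
TOOLS: dict[str, Callable[[str], str]] = {
    "knowledge_search": lambda q: "Policy doc: refunds are accepted within 14 days.",
    "web_search": lambda q: "No relevant web results.",
}


def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: chooses the next step from the transcript so far."""
    if not any(line.startswith("Observation:") for line in history):
        return {
            "thought": "I should check the knowledge base first.",
            "action": "knowledge_search",
            "input": history[0],
        }
    return {
        "thought": "The observation answers the question.",
        "final": "Refunds are accepted within 14 days.",
    }


def run_agent(question: str, model=scripted_model, max_steps: int = 5) -> tuple[str, list[str]]:
    """Run thought/action/observation rounds until the model gives a final answer."""
    history = [question]
    for _ in range(max_steps):
        step = model(history)
        history.append(f"Thought: {step['thought']}")
        if "final" in step:
            return step["final"], history
        tool = TOOLS.get(step["action"])
        observation = tool(step["input"]) if tool else f"Unknown tool: {step['action']}"
        history.append(f"Action: {step['action']}[{step['input']}]")
        history.append(f"Observation: {observation}")
    # The step limit keeps a model that never finishes from looping forever.
    return "Stopped: step limit reached.", history


answer, trace = run_agent("What is the refund window?")
print("\n".join(trace))
print("Answer:", answer)
```

Keeping the transcript as a plain list makes each round visible, which is also what a UI would need in order to display the tool call process.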
🔒 Security Notice
**Important:** Starting from v0.1.3, WeKnora includes login authentication to enhance system security. For production deployments, we strongly recommend that you:
Deploy WeKnora services in internal/private network environments rather than on the public internet
Avoid exposing the service directly to public networks to prevent potential information leakage
Configure proper firewall rules and access controls for your deployment environment
Regularly update to the latest version for security patches and improvements
🏗️ Architecture
[Figure: WeKnora architecture diagram (weknora-architecture.png)]
WeKnora uses a modular design to build a complete document understanding and retrieval pipeline. Its core modules are document parsing, vector processing, the retrieval engine, and large model inference, each of which is configurable and extensible.
🎯 Key Features
**🤖 Agent Mode**: ReAct Agent mode that uses built-in tools to query knowledge bases, and MCP tools and web search tools to reach external services, producing comprehensive summary reports through multiple rounds of iteration and reflection
**🔍 Precise Understanding**: Structured content extraction from PDFs, Word documents, images and more into unified semantic views
**🧠 Intelligent Reasoning**: Leverages LLMs to understand document context and user intent for accurate Q&A and multi-turn conversations
**📚 Multi-Type Knowledge Bases**: Support for…