cactus-compute/cactus

Daily Information Dashboard · 2026-02-24
Category: Open-source projects
Source: github_search
Score: 2
Published: 2026-02-24T01:53:11Z


Content excerpt

<img src="assets/banner.jpg" alt="Logo" style="border-radius: 30px; width: 100%;">

Cactus Engine

Example response from Gemma3-270m
Cactus Graph
Benchmark 

**High-End Devices**

| Device | LFM2.5-1.2B-INT4<br>(1k-Prefill/100-Decode) | LFM2.5-VL-1.6B-INT4<br>(256px-Latency & Decode) | Whisper-Small-244-INT8<br>(30s-audio-Latency & Decode) |
|--------|--------|--------|----------|
| Mac M4 Pro | 582tps/100tps (76MB RAM) | 0.2s/98tps (87MB RAM) | 0.1s/119tps (73MB RAM) |
| iPad/Mac M4 | 379tps/66tps (30MB RAM) | 0.2s/64tps (53MB RAM) | 0.2s/100tps (122MB RAM) |
| iPad/Mac M2 | 315tps/60tps (181MB RAM) | 0.3s/58tps (426MB RAM) | 0.3s/86tps (160MB RAM) |
| iPhone 17 Pro | 327tps/48tps (108MB RAM) | 0.3s/48tps (156MB RAM) | 0.3s/114tps (177MB RAM) |
| Galaxy S25 Ultra | 255tps/37tps (1.2GB RAM) | 2.6s/34tps (2GB RAM) | 2.3s/90tps (363MB RAM) |

**Budget Devices**

| Device | LFM2-350m-INT4<br>(1k-Prefill/100-Decode) | LFM2-VL-450m-INT4<br>(256px-Latency & Decode) | Moonshine-Base-67m-INT4<br>(30s-audio-Latency & Decode) |
|--------|--------|--------|----------|
| iPhone 13 Mini (Apple A15) | 516tps/65tps (29MB RAM) | 0.3s/69tps (20MB RAM) | 0.7s/413tps (10MB RAM) |
| Pixel 6a (Google Tensor G1) | 218tps/44tps (395MB RAM) | 2.5s/36tps (631MB RAM) | 1.5s/189tps (111MB RAM) |
| Galaxy A17 5G (Exynos 1330) | 87tps/24tps (395MB RAM) | 4.1s/20tps (619MB RAM) | 4.2s/90tps (103MB RAM) |
| CMF Phone 2 Pro (MediaTek Dimensity 7300) | 146tps/21tps (394MB RAM) | 2.4s/22tps (632MB RAM) | 1.9s/119tps (112MB RAM) |
| Raspberry Pi 5 | - | - | - |
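
As a rough reading aid for the LLM columns in the benchmark tables, the sketch below converts a cell such as `582tps/100tps (76MB RAM)` into approximate wall-clock times. It assumes that the first figure is prefill throughput and the second is decode throughput, and that "1k" means 1,000 prompt tokens; the tables do not spell out either point. `estimate_seconds` is a hypothetical helper written for this illustration, not part of Cactus.

```python
import re

# Matches LLM-column cells shaped like "582tps/100tps (76MB RAM)".
# Vision and audio cells ("0.2s/98tps ...") intentionally do not match.
CELL = re.compile(r"(?P<prefill>\d+(?:\.\d+)?)tps/(?P<decode>\d+(?:\.\d+)?)tps")


def estimate_seconds(cell, prompt_tokens=1000, new_tokens=100):
    """Return (prefill_seconds, decode_seconds) for one benchmark cell.

    Assumption: the cell reads "<prefill tps>/<decode tps>", and the
    benchmark used 1,000 prompt tokens and 100 generated tokens.
    """
    match = CELL.search(cell)
    if match is None:
        raise ValueError(f"not a prefill/decode throughput cell: {cell!r}")
    prefill_tps = float(match["prefill"])
    decode_tps = float(match["decode"])
    return prompt_tokens / prefill_tps, new_tokens / decode_tps


prefill_s, decode_s = estimate_seconds("582tps/100tps (76MB RAM)")
print(f"prefill ~{prefill_s:.2f}s, decode ~{decode_s:.2f}s")
# prefill ~1.72s, decode ~1.00s
```

Under the same assumptions, the Galaxy A17 5G row (`87tps/24tps`) works out to roughly 11.5 s of prefill and 4.2 s of decode, which makes the gap between device tiers easier to compare than the raw throughput figures.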

Supported Models

| Model | Features | 
|-------|----------| 
| google/gemma-3-270m-it | completion | 
| google/functiongemma-270m-it | completion, tools | 
| LiquidAI/LFM2-350M | completion, tools, embed | 
| Qwen/Qwen3-0.6B | completion, tools, embed | 
| LiquidAI/LFM2-700M | completion, tools, embed | 
| LiquidAI/LFM2-8B-A1B | completion, tools, embed | 
| google/gemma-3-1b-it | completion | 
| LiquidAI/LFM2.5-1.2B-Thinking | completion, tools, embed | 
| LiquidAI/LFM2.5-1.2B-Instruct | completion, tools, embed | 
| Qwen/Qwen3-1.7B | completion, tools, embed | 
| LiquidAI/LFM2-2.6B | completion, tools, embed | 
| LiquidAI/LFM2-VL-450M | vision, txt & img embed, Apple NPU | 
| LiquidAI/LFM2.5-VL-1.6B | vision, txt & img embed, Apple NPU | 
| UsefulSensors/moonshine-base | transcription, speech embed | 
| openai/whisper-small | transcription, speech embed, Apple NPU | 
| openai/whisper-medium | transcription, speech embed, Apple NPU |
| snakers4/silero-vad | vad |
| nomic-ai/nomic-embed-text-v2-moe | embed | 
| Qwen/Qwen3-Embedding-0.6B | embed | 

Using this repo on Mac
Using this repo on Linux (Ubuntu/Debian)

| Command | Description |
|---------|-------------|
| cactus auth | Sets up Cactus cloud fallback (optional) (--status, --clear) |
| cactus run [model] | Opens playground (auto downloads model) |
| cactus download [model] | Downloads model to ./weights |
| cactus convert [model] [dir] | Converts model, supports LoRA merging (--lora <path>) |
| cactus build | Builds for ARM (--apple or --android) |
| cactus test | Runs tests (--ios / --android, --model [name/path], --transcribe_model [name/path], --only [test_name], --precision) |
| cactus transcribe [model] | Transcribes an audio file (--file) or live microphone input |
| cactus clean | Removes build artifacts |
| cactus --help | Shows all commands and flags (always run this) |
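
The commands in the table can be chained into a simple first-run sequence. The sketch below only builds and prints the command lines by default, so it is safe to run without the CLI installed. The model identifier is copied from the Supported Models table as a placeholder; the exact form the CLI accepts is an assumption, so check `cactus --help` first. `cactus_argv` and `run` are hypothetical helpers, not part of Cactus.

```python
import shlex
import shutil
import subprocess

# Placeholder model ID taken from the Supported Models table. Whether the
# CLI expects exactly this form is an assumption; see `cactus --help`.
MODEL = "LiquidAI/LFM2-350M"


def cactus_argv(*parts):
    """Return an argv list for the cactus CLI, e.g. ["cactus", "download", MODEL]."""
    return ["cactus", *parts]


def run(argv, dry_run=True):
    """Print the command; execute it only if dry_run is False and the CLI is on PATH."""
    line = shlex.join(argv)
    print(line)
    if not dry_run:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} is not on PATH")
        subprocess.run(argv, check=True)
    return line


steps = [
    cactus_argv("--help"),           # shows all commands and flags
    cactus_argv("download", MODEL),  # downloads the model to ./weights
    cactus_argv("run", MODEL),       # opens the playground
]

for step in steps:
    run(step)  # dry run: prints each command without executing it
```

Passing `dry_run=False` executes a step for real. Note that `cactus run` opens an interactive playground, so it is better suited to a terminal session than to a script.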

Using in your apps
Python for Mac
Rust SDK
React Native SDK
Swift Multiplatform SDK
Kotlin Multiplatform SDK
Flutter SDK
Try demo apps 
iOS Demo
Android Demo
Maintaining Organisations
Developed by Cactus Compute, Inc. (YC S25), with maintenance from:
UCLA's BruinAI 
Char (YC S25)
Yale's AI Society
National University of Singapore's AI Society
UC Irvine's AI@UCI
Imperial College's AI Society
University of Pennsylvania's AI@Penn
University of Michigan Ann Arbor's MSAIL
University of Colorado Boulder's AI Club
Contributing to Cactus
**C++ Standard**: Use C++20 features where appropriate.
**Formatting**: Follow the existing code style in the project, with one header per folder.
**Comments**: Avoid comments; make your code read like plain English.
**AI-Generated Code**: Do not blindly submit AI slop as a PR. This codebase is very complex, and AI tools miss details.
**Update docs**: Please update the docs when necessary; be intuitive and straightforward.
**Keep It Simple**: Do not go beyond the scope of the GitHub issue; avoid bloated PRs and keep the code lean.
**Benchmark Your Changes**: Test the performance impact of your changes; Cactus is performance-critical.
**Test everything**: A PR that fails to build is the biggest red flag, because it means the change was not tested.
Citation

If you use Cactus in your research, please cite it as follows:
Join The Community
Reddit Channel