Distributed
AI workloads can be distributed across many processors.
Distribution
- Distributed Llama: GitHub - b4rtaz/distributed-llama: Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.
- Exolabs: https://exolabs.net/
- Petals: GitHub - bigscience-workshop/petals: 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading. (A minimal client sketch follows this list.)
- Decentralized Distributed LLM - Using AI in Cost Effective and Environment Friendly way | by Vishal Mysore | Data Scientists from Future | Medium
- AI Horde: https://stablehorde.net/
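
To make the Petals entry concrete, here is a minimal client sketch in the style of the project's README. It assumes the petals and transformers packages are installed and that a public swarm is currently serving the chosen model; the model name is illustrative.

```python
# Minimal Petals client sketch.
# Assumptions: `pip install petals transformers` and a public swarm
# serving this model; the model name below is illustrative.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # illustrative; swarm availability varies

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Only a small slice of the model runs locally; the remaining transformer
# blocks are executed by volunteer peers over the network, BitTorrent-style.
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0]))
```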
Parallel
llama.cpp can now split a single model across multiple GPUs:
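
As a sketch of what this looks like, assuming the llama-cpp-python bindings built with GPU support (the model path and split ratios below are placeholders):

```python
# Sketch: split one GGUF model across two GPUs via llama-cpp-python.
# Assumptions: llama-cpp-python compiled with CUDA (or another GPU backend)
# and a local GGUF model file at the placeholder path.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model.gguf",  # placeholder path
    n_gpu_layers=-1,                   # offload every layer to the GPUs
    tensor_split=[0.5, 0.5],           # put half the weights on each of two GPUs
)

out = llm("Q: What is tensor parallelism? A:", max_tokens=48)
print(out["choices"][0]["text"])
```

The llama.cpp command-line tools expose the same controls through the --n-gpu-layers and --tensor-split flags.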